Wikidata:Property proposal/GS1 GPC brick code
GS1 GPC brick code
Originally proposed at Wikidata:Property proposal/Authority control
Description | The brick code is used to classify products in the GS1 Global Product Classification |
---|---|
Represents | product (Q2424752) |
Data type | External identifier |
Domain | item |
Allowed values | [1-9]\d* |
Example 1 | fortified wine (Q722338) → 10000273 |
Example 2 | beer (Q44) → 10000159 |
Example 3 | vinegar (Q41354) → 10000051 |
Source | https://www.gs1.org/standards/gpc/dec-2017 |
Planned use | import all values using mix'n'match |
Number of IDs in source | 3000 |
Expected completeness | eventually complete (Q21873974) |
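The allowed-values pattern can be checked against the three example codes. This is a minimal Python sketch: the helper name `is_valid_gpc_code` is hypothetical, and the stricter eight-digit pattern is only an assumption based on the codes quoted on this page (all of which have eight digits), not a documented GS1 rule.

```python
import re

# Pattern copied from the "Allowed values" row of the proposal.
ALLOWED = re.compile(r"[1-9]\d*")

# Assumption: every code quoted on this page has exactly 8 digits.
# This stricter pattern is illustrative only, not a documented GS1 rule.
EIGHT_DIGITS = re.compile(r"[1-9]\d{7}")


def is_valid_gpc_code(code: str, pattern: re.Pattern = ALLOWED) -> bool:
    """Return True if the whole string matches the given pattern."""
    return pattern.fullmatch(code) is not None


# Example codes taken from the proposal table.
examples = {"fortified wine": "10000273", "beer": "10000159", "vinegar": "10000051"}
for label, code in examples.items():
    print(label, code, is_valid_gpc_code(code), is_valid_gpc_code(code, EIGHT_DIGITS))
```

`fullmatch` is used so that the pattern must cover the entire string, which is how a format constraint is normally meant to be read.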
Motivation
The GS1 GPC is one of the 3 major goods and services vocabularies, along with the UNSPSC (already in Wikidata) and the CPV (EU-based, property creation pending). It is curated by GS1, the organization responsible for barcodes, and is widely used in commerce/e-commerce. Teolemon (talk) 08:50, 26 June 2018 (UTC)
Discussion
- Support. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:25, 27 June 2018 (UTC)
- Support Dhx1 (talk) 14:12, 28 June 2018 (UTC)
- Comment @Teolemon: Could the allowed values regular expression restrict the length of the code? Is there a minimum or maximum length? Dhx1 (talk) 14:14, 28 June 2018 (UTC)
- Support but Comment to @Teolemon: where is it called "brick code"? This sounds very slang and unofficial. Suggest to leave it as GS1 GPC code, with alias GS1 Global Product Classification (GPC). And is there a formatterURL, i.e. per-code pages? --Vladimir Alexiev (talk) 17:34, 28 June 2018 (UTC)
- This is their official name. There are several parts in the GPC, this is only one of them. No public formatter URL that I know of, unfortunately. But the codes are versioned and stable. Teolemon (talk) 08:35, 1 July 2018 (UTC)
- Support Jheald (talk) 15:36, 2 July 2018 (UTC)
- @Teolemon, Pigsonthewing, Dhx1, Vladimir Alexiev, Jheald: Done GS1 GPC code (P8957) Pamputt (talk) 14:27, 13 December 2020 (UTC)
Updated link to https://www.gs1.org/standards/gpc --Vladimir Alexiev (talk) 09:29, 18 December 2020 (UTC)
@PKM: wrote "GPC is going to be hugely helpful for me in a number of projects in addition to agriculture (clothing! arts and crafts!). Thanks very much for that link", and surprise! The prop is created after 2y in the freezer. Paula, you have seer Vision :-) --Vladimir Alexiev (talk) 21:04, 15 December 2020 (UTC)
GPC Scope and Structure
@Teolemon, PKM, Pigsonthewing, Dhx1, Vladimir Alexiev, Jheald:
- Please cite where it says "brick". --Vladimir Alexiev (talk) 09:29, 18 December 2020 (UTC)
- "The building block of GPC is a product code known as a brick." https://www.gs1.org/standards/gpc/how-gpc-works. - PKM (talk) 23:25, 15 December 2020 (UTC)
- If you think of GS1 as a house, there are many things that make a house, not just bricks. I took the second line of their text file and show it as a table below.
- I used this command:
csvtk head -n 2 "GS1 Combined Published_Schema as at 01062020 EN.txt"|csvtk transpose -t|csvtk cut -t -f1,3|csvtk csv2md -t|head -13|pandoc -f markdown -t mediawiki
Field | Value |
---|---|
Segment Code | 70000000 |
Segment Description | Arts/Crafts/Needlework |
Family Code | 70010000 |
Family Description | Arts/Crafts/Needlework Supplies |
Class Code | 70010200 |
Class Description | Airbrushing Supplies |
Brick Code | 10001688 |
Brick Description | Airbrushing Equipment Replacement Parts/Accessories |
Core Attribute Type Code | 20001349 |
Core Attribute Type Description | Type of Airbrushing Equipment Replacement Part/Accessory |
Core Attribute Value Code | 30008542 |
Core Attribute Value Description | AIRBRUSH CONTROL VALVE |
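For readers without csvtk, a similar Field/Value table can be produced with Python's standard library. This is a sketch and not the command actually used: it assumes the GS1 export is tab-separated with a header row, and the function name `row_as_field_value_table` and its parameters are hypothetical.

```python
import csv
import io


def row_as_field_value_table(tsv_text: str, record: int = 1, max_fields: int = 12) -> str:
    """Render one data record of a tab-separated export as a Field | Value table.

    `record` is 1-based and counts data rows after the header row.
    """
    # QUOTE_NONE keeps any quote characters in the data as literal text.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    header, data = rows[0], rows[record]
    lines = ["Field | Value |", "---|---|"]
    for field, value in list(zip(header, data))[:max_fields]:
        lines.append(f"{field} | {value} |")
    return "\n".join(lines)


# Small sample reusing the first four columns of the table above.
sample = (
    "Segment Code\tSegment Description\tFamily Code\tFamily Description\n"
    "70000000\tArts/Crafts/Needlework\t70010000\tArts/Crafts/Needlework Supplies\n"
)
print(row_as_field_value_table(sample))
```

On the real file, the text would be read from disk first and `record` chosen to pick the row of interest.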
- The structure is Segment>Family>Class>Brick (hierarchical) and then Core Attribute Type: Core Attribute Value (repeatable attributes)
- Looking at the levels, all codes are unique, and the higher levels fully deserve to be captured (why capture only Brick but not Class?). So I vote to rename this to "GS1 GPC code"
- Some levels are repetitive, eg "family 77040000 Aircraft, class 77040100 Aircraft, brick 10008049 Aircraft" so in this case it's not necessary to capture the class (but if GPC is extended in the future, we'll want to capture the class)
- In many other cases it makes sense to capture the class, eg "class 77013500 Automotive Anti-theft Products" as super-node of "brick 10003004 Car Alarms/Anti-jacking Alarms" and "brick 10005241 Immobilisers (Automotive)"
- A commercial product is described by many different aspects. Consider this example from @Nikola Tulechki: where I added the field names:
The following entry corresponds to "dried and aged testicles of wild Bison":
  segment 50000000 Food/Beverage/Tobacco
  family 50240000 Meat/Poultry/Other Animals
  class 50240100 Meat/Poultry/Other Animals - Prepared/Processed
  brick 10005768 Bison/Buffalo – Prepared/Processed
  Attribute Type 20002688 Anatomical Form: Attribute Value 30002433 TESTICLES
  Attribute Type 20002677 Non-Thermal Preservation: Attribute Value 30013615 DRY AGED
  Attribute Type 20000163 Source: Attribute Value 30013603 WILD/WILD CAUGHT
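One way to picture such an entry is as a single hierarchical path (Segment > Family > Class > Brick) plus a repeatable list of attribute type/value pairs. The Python sketch below is only an illustration of that shape; the class and field names (`Coded`, `GpcClassification`, `gpc_class`) are hypothetical and do not come from GS1 or Wikidata.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Coded:
    """A GPC code together with its English label."""
    code: str
    label: str


@dataclass
class GpcClassification:
    segment: Coded
    family: Coded
    gpc_class: Coded  # "class" is a reserved word in Python
    brick: Coded
    # Repeatable (attribute type, attribute value) pairs.
    attributes: List[Tuple[Coded, Coded]] = field(default_factory=list)

    def path(self) -> str:
        """Return the hierarchy labels joined as Segment > Family > Class > Brick."""
        levels = (self.segment, self.family, self.gpc_class, self.brick)
        return " > ".join(level.label for level in levels)


# Codes and labels copied from the bison example above.
bison = GpcClassification(
    segment=Coded("50000000", "Food/Beverage/Tobacco"),
    family=Coded("50240000", "Meat/Poultry/Other Animals"),
    gpc_class=Coded("50240100", "Meat/Poultry/Other Animals - Prepared/Processed"),
    brick=Coded("10005768", "Bison/Buffalo – Prepared/Processed"),
    attributes=[
        (Coded("20002688", "Anatomical Form"), Coded("30002433", "TESTICLES")),
        (Coded("20002677", "Non-Thermal Preservation"), Coded("30013615", "DRY AGED")),
        (Coded("20000163", "Source"), Coded("30013603", "WILD/WILD CAUGHT")),
    ],
)

print(bison.path())
for att_type, att_value in bison.attributes:
    print(f"{att_type.label}: {att_value.label}")
```

Keeping the attribute value paired with its type reflects the point that some values only make sense when read together with their attribute type.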
- Similarly, we have Common Procurement Vocabulary (P5417) but also Wikidata:Property proposal/CPV Supplementary. This allows you to say "CPV=diesel" then qualify with "supplementary=for district heating"
- In addition to "GS1 GPC code" do we need separate props for "GS1 GPC Attribute Type" and "GS1 GPC Attribute Value"? I think NO, at least for the time being
- A few of the attribute values are not stand-alone (i.e. don't make sense without the attribute type), eg
<attType code="20003029" text="Crop Production Purpose" definition="This particular cultivated crop will be grown for this specific purpose.">
  <attValue code="30000720" text="COMBINATION" definition=""/>
  <attValue code="30017727" text="PROPAGATION" definition=""/>
- Others more or less stand alone, but don't really make sense out of context, so one has to look at the attType: see "nose, scalp, throat" below (sorry, the example is a bit macabre!)
<attType code="20000609" text="Type of Pet Food/Treat"
  <attValue code="30019064" text="ANTLER"
  <attValue code="30019065" text="BIT/DICE/BITS/DROP"
  <attValue code="30005548" text="BONES"
  <attValue code="30019066" text="BULL PIZZLE"
  <attValue code="30019067" text="CHEWING ROLLS"
  <attValue code="30019068" text="EARS"
  <attValue code="30019069" text="HIDE/FUR"
  <attValue code="30019060" text="NOSE"
  <attValue code="30005550" text="PET BISCUIT/CRACKER"
  <attValue code="30019070" text="RUMEN"
  <attValue code="30019071" text="SCALP"
  <attValue code="30015614" text="SKIN"
  <attValue code="30018015" text="SPIRAL"
  <attValue code="30004265" text="STICK"
  <attValue code="30002345" text="STOMACH"
  <attValue code="30019072" text="STRIPE"
  <attValue code="30019073" text="THROAT"
- A few other values are not very useful, but stand alone, and I'm sure are already on WD
<attValue code="30002960" text="NO" definition=""/>
<attValue code="30002654" text="YES" definition=""/>
<attValue code="30002515" text="UNCLASSIFIED" definition="This term is used to describe those product attributes that are unable to be classified ...
<attValue code="30002518" text="UNIDENTIFIED" definition="This term is used to describe those product attributes that are unidentifiable...
--Vladimir Alexiev (talk) 09:29, 18 December 2020 (UTC)
Mix-n-Match Catalogs
Notified participants of WikiProject Economics
- https://mix-n-match.toolforge.org/#/catalog/4062 : hierarchical thesaurus (Segment>Family>Class>Brick)
- https://mix-n-match.toolforge.org/#/catalog/4063 : Attribute Values
Notes:
- Renamed P8957 to "GS1 GPC code" and attached it to both catalogs, so that when I create new items from the catalog (eg automatic pet water dish (Q104244383)), it can assign the GPC id.
- Put up the conversion scripts and TSVs to https://github.com/VladimirAlexiev/gs1-gpc
Problems and fixes:
- DONE: remove heading rows, MnM Import doesn't like them
- DONE: lowercase the names; this is an especially big problem for Attribute Values (see above item). Lowercasing is easy but incorrect for some cases (eg value "ROMANIA - UNCLASSIFIED")
- DONE: move trailing "Other", "Variety Packs" and " - UNCLASSIFIED" from name to descr to improve matching; and collapse such bricks that are not appropriate for WD into the parent class
- DONE: sort Attribute Values by popularity to incentivize matching
- DONE: convert names to singular
- TODO: list the Attribute Type(s) that pertain to a Value in order to clarify the value: HARD
- TODO: reload the catalogs while preserving the matches (about 20+15).
Looking for help!!! https://www.wikidata.org/wiki/Topic:Vzu6yjsddsvy5wd8 --Vladimir Alexiev (talk) 12:33, 18 December 2020 (UTC)
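Two of the clean-up steps listed above (lowercasing, and moving a trailing "Other", "Variety Packs" or " - UNCLASSIFIED" from the name to the description) could look roughly like this in Python. This is a sketch written for illustration, not the script in the linked repository; the function name `normalize_name` and the way the qualifier is stored in the description are assumptions. Singularization and the collapsing of bricks into their parent class are left out.

```python
from typing import Tuple

# Trailing qualifiers named in the notes above.
SUFFIXES = (" - UNCLASSIFIED", " Variety Packs", " Other")


def normalize_name(name: str, descr: str = "") -> Tuple[str, str]:
    """Move a trailing qualifier from the name to the description, then lowercase the name."""
    name = name.strip()
    for suffix in SUFFIXES:
        if name.lower().endswith(suffix.lower()):
            cut = len(name) - len(suffix)
            # Drop the separating space and dash from the qualifier itself.
            qualifier = name[cut:].strip(" -")
            name = name[:cut]
            descr = f"{qualifier}; {descr}" if descr else qualifier
            break
    # Plain lowercasing also lowercases proper nouns such as "ROMANIA",
    # which is the incorrect case mentioned in the notes.
    return name.lower(), descr


for raw in ("Televisions", "Televisions Other", "Televisions Variety Packs", "ROMANIA - UNCLASSIFIED"):
    print(raw, "->", normalize_name(raw))
```

With this kind of normalization, several GPC entries end up with the same name and differ only in the description, which is what makes them candidates for matching to a single item.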
Multi-valued
Removed the Single Value constraint because I think eg all the codes below should be mapped to television set (Q8075)
68010100 Televisions (GPC class)
10001400 Televisions (GPC brick)
10001405 Televisions Other (GPC brick)
10001403 Televisions Variety Packs (GPC brick)
"TVs Other" and "TVs Variety Packs" are not appropriate as WD items, but it's still useful to capture these GPC codes --Vladimir Alexiev (talk) 13:13, 18 December 2020 (UTC)