Wiktionary:Categorization
This is a Wiktionary policy, guideline or common practices page. This is a draft proposal. It is unofficial, and it is unknown whether it is widely accepted by Wiktionary editors.
This page describes categorization practices specific to Wiktionary. To find out about how to use categories in MediaWiki, the software that runs Wiktionary and Wikipedia, see Help:Category. Before you start to assign pages and subcategories to categories and create new categories, please check the conventions described in this page, and make yourself familiar with the categories already in use.
Each page is typically in at least one category. It may be in more, but an entry in a topical category should rarely, if ever, be placed in both a narrower category and a more general one.
Each category should be in at least one higher-level category, except for the highest-level category: Category:Fundamental.
For category names, the usual rules for case-sensitivity of page names apply: they are case-sensitive for all characters, but in most projects the first character is case-insensitive. Be aware that a name whose capitalization differs anywhere beyond the first character refers to a different category, so a miscapitalized name creates a new category.
All topical categories begin with a capital letter: there is "Category:Foods" rather than "Category:foods". However, the language-specific prefix such as "fr" in "Category:fr:Foods" is in lowercase.
Main branches
The root category of Wiktionary is Category:Fundamental. This category contains several main categories.
Categories are used in Wiktionary for the following main purposes:
- Grouping words by part of speech
- Grouping words by language
- Grouping words by topic
- Grouping words by etymology
- Grouping words by style of usage
- Grouping words by their common properties
- Grouping pages for purposes internal to Wiktionary
A word can also be categorized in many branches of the category tree, allowing it to be found in different ways. For example, mare is in Category:English nouns as well as in the topical category Category:en:Horses. It can thus be found both by readers browsing English nouns and by readers browsing terms related to horses. Likewise, 馬 is in Category:Japanese nouns as well as in Category:ja:Horses.
Part of speech
Each word should be in the appropriate category for language and part of speech. For example: mare is a word in English, Latin, Italian, and Romanian, and has sections for each of those languages. It thus belongs to language categories of each branch, specifically, Category:English nouns, Category:Latin nouns, Category:Italian nouns, and Category:Romanian nouns. The assignment of an entry to a part-of-speech category is usually done automatically by an inflection-line template, such as {{en-noun}}, {{es-adj}}, or the generic {{head}}, so you don't need to manually add it.
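The naming pattern in the mare example can be sketched in a few lines of Python. This is only an illustration of how the category names are composed; the function name `pos_category` is hypothetical and is not part of any Wiktionary template or module.

```python
def pos_category(language_name: str, pos_plural: str) -> str:
    """Compose a part-of-speech category name such as 'Category:English nouns'.

    Hypothetical helper for illustration only; on Wiktionary the category is
    normally assigned by the inflection-line template, not by hand.
    """
    return f"Category:{language_name} {pos_plural}"


# The four languages in which the text says "mare" has a section.
mare_languages = ["English", "Latin", "Italian", "Romanian"]
mare_categories = [pos_category(name, "nouns") for name in mare_languages]

for category in mare_categories:
    print(category)
```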
The root category for parts of speech is Category:Lemmas subcategories by language, a subcategory of Category:Fundamental. It includes Category:Nouns by language, which in turn includes Category:Ukrainian nouns.
Some terms, such as affixes, are not strictly parts of speech; others, such as acronyms, only stand in for terms that are. These terms have their own categories and are kept separate from the part-of-speech hierarchy.
Language
Category:Fundamental contains Category:All languages, which in turn contains categories named like Category:English language. Each language category contains subcategories for parts of speech, described in the section #Part of speech, and some other categories related to the language. Languages are also categorised by various other means, such as their relationship to other languages (Category:Languages by family), script (Category:Languages by script) or the places where they are spoken (Category:Languages by country). See also Wiktionary:Language categories.
Topic
Where useful, words are categorized by topic such as "Animals" or "Chemistry". The root of topical categories is Category:All topics. This root category contains only some major subcategories, such as Category:Communication or Category:Sciences; further categories on a more fine-grained level such as Category:Horses are located somewhere else, deeper in the subcategory tree. There is also Category:List of topics, which lists all topics alphabetically no matter where in the category tree they are located.
The name of a topical category refers to the objects or meanings that its member words denote; it does not refer to the member words themselves. Thus, there is "Category:Chemistry" rather than "Category:Chemical terminology" or "Category:Chemical terms", and "Category:Animals" rather than "Category:Animal names". In other words, the names of topical categories denote what the terms are about, not what they are.
Each name of a topical category has a prefix that indicates the language of the terms belonging to the category. Thus, the English category for horses is Category:en:Horses, while the Japanese one is Category:ja:Horses. The prefix consists of the language code followed by a colon.
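As a minimal sketch of the naming rules above (lowercase language code, a colon, then a capitalized topic name), the following Python function builds and checks a topical category name. The function `topical_category` is hypothetical, and its checks only cover the two capitalization rules stated on this page.

```python
def topical_category(lang_code: str, topic: str) -> str:
    """Build a language-specific topical category name, e.g. 'Category:en:Horses'.

    Hypothetical helper: it enforces only the two rules stated on this page,
    a lowercase language prefix and a topic name starting with a capital.
    """
    if not lang_code or lang_code != lang_code.lower():
        raise ValueError(f"language prefix must be lowercase: {lang_code!r}")
    if not topic or not topic[0].isupper():
        raise ValueError(f"topic name must begin with a capital letter: {topic!r}")
    return f"Category:{lang_code}:{topic}"


print(topical_category("en", "Horses"))
print(topical_category("ja", "Horses"))
```

Calling `topical_category("fr", "foods")` raises a `ValueError`, reflecting the rule that there is "Category:fr:Foods" rather than "Category:fr:foods".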
Names and subcategory structures of categories should match between languages. Thus, if Category:Horses is a subcategory of Category:Equids, then Category:en:Horses should also be a subcategory of Category:en:Equids. If Category:fr:Cryptozoology exists, Category:Cryptozoology should exist as well.
The subcategory structure of topical categories is technically maintained in "topic cat parents". This subpage tells the template {{topic cat}} what the parents of the topical category are, regardless of the language. This technique ensures consistent categorization across languages, guaranteeing the matching mentioned in the previous paragraph.
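The idea of storing parents once, independently of language, can be illustrated with a short Python sketch. It does not reproduce how {{topic cat}} is actually implemented; the table `TOPIC_PARENTS` and the function `parent_categories` are hypothetical, and the only relationship taken from this page is that Horses is a subcategory of Equids.

```python
# Hypothetical language-neutral parent table; only "Horses" -> "Equids"
# comes from the text above.
TOPIC_PARENTS = {
    "Horses": ["Equids"],
}


def parent_categories(topic, lang_code=None):
    """Return the parent categories of a topic, optionally for one language.

    Because the parents are looked up from a single shared table, every
    language automatically gets the same subcategory structure.
    """
    prefix = f"{lang_code}:" if lang_code else ""
    return [f"Category:{prefix}{parent}" for parent in TOPIC_PARENTS[topic]]


print(parent_categories("Horses"))        # language-neutral category
print(parent_categories("Horses", "en"))  # English category
print(parent_categories("Horses", "ja"))  # Japanese category
```

Keeping one table means a change to the hierarchy is made in one place and applies to all languages at once.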
"Names" are proper nouns, and are categorized by part of speech, for example in Category:Japanese proper nouns.
Etymology
Each language that has terms for which the origin is known has a category of the form "Category:(language) terms by etymology". This category contains terms categorised by the way in which they first entered the language. This includes words that were formed with certain affixes or by compounding with another word, but also terms that have unusual or notable origins. The root of etymology categories is Category:Terms by etymology by language.
In particular, languages that are known to have acquired some of their words from other languages have a subcategory named "Category:(language) terms derived from other languages", such as Category:English terms derived from other languages. Languages that in turn are the source of words in other languages have a subcategory named "Category:Terms derived from (language)", for example Category:Terms derived from English. This category lists non-English words derived from English, rather than listing English words.
When adding the etymology section to an entry, please categorize the word into the appropriate derivation categories, using templates such as {{der}}. If known, include all of the languages through which the word has come. For example: An English word derived originally from Ancient Greek via Latin and then French would use all three of the templates {{der|en|fr}}, {{der|en|la}} and {{der|en|grc}}, which in turn would add the word to Category:English terms derived from French, Category:English terms derived from Latin and Category:English terms derived from Ancient Greek respectively.
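The mapping from a chain of source languages to templates and categories can be sketched as follows. This Python snippet only mirrors the abbreviated template form shown in the example above; the name table `LANGUAGE_NAMES` and the function `derivation_markup` are hypothetical and cover just the four languages mentioned.

```python
# Hypothetical code-to-name table, limited to the languages in the example.
LANGUAGE_NAMES = {
    "en": "English",
    "fr": "French",
    "la": "Latin",
    "grc": "Ancient Greek",
}


def derivation_markup(target_code, source_codes):
    """Pair each abbreviated {{der}} call with the category it would add.

    Returns a list of (template, category) tuples, one per source language,
    in the order the source languages are given.
    """
    target_name = LANGUAGE_NAMES[target_code]
    pairs = []
    for source_code in source_codes:
        template = "{{der|" + target_code + "|" + source_code + "}}"
        category = (
            f"Category:{target_name} terms derived from "
            f"{LANGUAGE_NAMES[source_code]}"
        )
        pairs.append((template, category))
    return pairs


# English word that came from Ancient Greek via Latin and then French.
for template, category in derivation_markup("en", ["fr", "la", "grc"]):
    print(template, "->", category)
```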
Style of usage
Many words in a language are not used in all situations, or have certain connotations that are not obvious. This style of usage is categorized under Category:Terms by usage by language, with a category for each language. This category contains subcategories for terms that are stilted or formal, old-fashioned, offensive, humorous and so on.
Purposes internal to Wiktionary
Categories such as the following are special categories used for maintenance purposes, and reside outside of the main category tree:
- Category:Requests for deletion
- Category:Requests for cleanup
- Category:Requested entries — definitions that have been requested
- Category:Requests for translations by language
- Category:Requests for review of translations by language
In addition, many templates are used on Wiktionary to help in creating and maintaining entries. These are located under Category:Templates. Many languages also have templates that are specific to that language, which are contained in categories such as Category:English templates.
Category browsing
Categories only list 200 entries at a time. This can be an inconvenience to the user when a category contains more than 200 entries. To alleviate this, add an A-Z table of contents to the category by editing the category page and adding {{CategoryTOC}} to it.
See Category:Spanish lemmas for an example of this in action.
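To get a feel for why a table of contents helps, the number of listing pages a reader would otherwise have to step through follows directly from the 200-entry limit. The Python sketch below is plain arithmetic; the function name `listing_pages` is hypothetical, and treating an empty category as a single page is an assumption.

```python
import math

ENTRIES_PER_PAGE = 200  # the per-page limit stated above


def listing_pages(entry_count: int) -> int:
    """Number of listing pages needed to show all entries of a category.

    Assumption: an empty category still counts as one (empty) page.
    """
    return max(1, math.ceil(entry_count / ENTRIES_PER_PAGE))


for count in (150, 200, 201, 10_000):
    print(count, "entries ->", listing_pages(count), "page(s)")
```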
Discussions
The discussions and votes on categorization:
- Wiktionary:Votes/2007-05/Categories at end of language section
- Beer parlour, "Only main categories in All topics", October 2010
- Beer parlour, "Deprecation of topical categories", February 2011
- Wiktionary:Votes/2011-04/Lexical categories
- Wiktionary:Votes/2011-04/Derivations categories
- Wiktionary:Votes/pl-2011-05/Add en: to English topical categories, part 2
See also
- Help:Category
- m:Categorization requirements (original guidelines for category proposals and implementations)
- Special:Categories — All existing categories alphabetically.
- Special:Uncategorizedcategories — The maintenance categories, and also any other categories that have not yet been properly integrated into the main category tree.
- Special:Uncategorizedpages — Pages that should be categorized.
- Category:Fundamental — The root of the main category tree.