Wiktionary:Persian transliteration
These are the rules concerning transliteration in Persian entries.
Three transliteration schemes are used in Persian entries (see below). Among them, the Tajik transliteration should only appear in pronunciation templates. Elsewhere in entries, only the Classical and Iranian transliterations should be used. Editors have discussed and supported streamlining transliteration by entering the Classical transliteration in all templates and generating the Iranian transliteration automatically (thus displaying both from one input), but this has not been implemented yet because the modules needed to generate two transliterations are still works in progress. Until those modules are available, the transliterations are used as follows:
In pronunciation templates: Pronunciation templates must always use the Classical transliteration exclusively, as it is not possible to generate the Classical, Dari and Tajik romanizations from the Iranian romanization. If a term is exclusive to Iranian Persian, still input the Classical romanization; the pronunciation module fa-IPA can generate all romanizations from it. Simply set the module to generate an Iranian transliteration.
In link and headword templates: After streamlining, link and headword templates will also be able to generate both the Iranian and Classical romanizations, as fa-IPA does. In the meantime, there is no consensus on which transliteration to use. The most common practice is to use a modified Iranian transliteration, with q and ğ distinguished.
In quotation and conjugation templates: All modern varieties spoken in Iran after the 16th century should use the Iranian transliteration. All other varieties (including medieval varieties spoken in Iran) should use the Classical transliteration.
After streamlining is complete, it will be possible to use the classical transliteration in all cases and have the Iranian transliteration automatically generated.
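The interim rules above can be summarized as a small decision function. This is only a sketch: the function name `choose_scheme`, the template labels and the `modern_iranian` flag are hypothetical and do not correspond to any existing module or template parameter.

```python
def choose_scheme(template: str, modern_iranian: bool = False) -> str:
    """Return the transliteration scheme suggested by the interim rules.

    `template` is a hypothetical label: "pronunciation", "link",
    "headword", "quotation" or "conjugation".
    `modern_iranian` is True for varieties spoken in Iran after the
    16th century.
    """
    if template == "pronunciation":
        # Input is always Classical; fa-IPA derives the others from it.
        return "classical"
    if template in ("link", "headword"):
        # No consensus exists; this is only the most common practice.
        return "modified iranian"
    if template in ("quotation", "conjugation"):
        return "iranian" if modern_iranian else "classical"
    raise ValueError(f"unknown template kind: {template!r}")
```

For example, `choose_scheme("quotation", modern_iranian=True)` gives `"iranian"`, while a medieval quotation from Iran falls back to `"classical"`.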
Classical Transliteration
This transliteration should be used by any variety that is not modern Iranian Persian (i.e. Classical, Dari, Hazaragi and Tajik) for terms in the Arabic script. For Tajik terms in Cyrillic see the Tajik transliteration.
This is the transliteration used in all pronunciation templates.
Consonants
No. | Letter | Name of letter | Transcription | IPA |
---|---|---|---|---|
1 | ا | alif | ā, ∅(see below) | /ʔ/, /aː/, [ɑː] |
1b | آ | | ā, 'ā | /ʔaː/, [ʔɑː] |
2 | ب | bē | b | /b/ |
3 | پ | pē | p | /p/ |
4 | ت | tē | t | /t/ |
5 | ث | sē | s | /s/ |
6 | ج | jīm | j | /d͡ʒ/ |
7 | چ | čē | č | /t͡ʃ/ |
8 | ح | hē | h | /h/ |
9 | خ | xē | x | /x/ |
10 | د | dāl | d | /d/ |
11 | ذ | zāl | z | /z/ |
12 | ر | rē | r | /r/ |
13 | ز | zē | z | /z/ |
14 | ژ | žē | ž | /ʒ/ |
15 | س | sīn | s | /s/ |
16 | ش | šīn | š | /ʃ/ |
17 | ص | sād or swād | s | /s/ |
18 | ض | zād or zwād | z | /z/ |
19 | ط | tā or tōy | t | /t/ |
20 | ظ | zā or zōy | z | /z/ |
21 | ع | 'ayn | ' | /ʔ/ |
22 | غ | ğayn | ğ | /ɣ/ |
23 | ف | fē | f | /f/ |
24 | ق | qāf | q | /q/ |
25 | ک | kāf | k | /k/ |
26 | گ | gāf | g | /ɡ/ |
27 | ل | lām | l | /l/ |
28 | م | mīm | m | /m/ |
29 | ن | nūn | n | /n/, [n], [ŋ], [ɴ] |
30 | و | wāw (ma'rūf) | w, ū | /w/, /uː/ |
30 | و | wāw (majhūl) | ō | /oː/ |
31 | ه | hē | h, ∅(see below) | /h/ |
32 | ی | yē (ma'rūf) | y, ī | /j/, /iː/ |
32 | ی | yē (majhūl) | ē | /eː/ |
0 | ء | hamza | ', – | /ʔ/ |
- Dental, labial and velar stops are aspirated, with few exceptions.
- /t/ and /d/ are phonetically dental in nearly all varieties; in Hazaragi and some other varieties of Dari they are also phonemically dental.
- Alif is, with few exceptions, a glottal stop only in word-initial position.
- Geminated consonants are marked with the diacritic tashdid (ـّ) and are written with doubled letters, both in IPA and in transliteration.
- hē (ه) in word-final position may act as a placeholder for a short vowel, most commonly /a/, instead of a consonant.
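Because several letters share one romanization, the Classical transliteration cannot be mapped back to the Arabic script letter by letter. The sketch below copies the consonant values from the table above into a dictionary and groups the letters by romanization. The names `CONSONANTS`, `letters_by_romanization` and `ambiguous` are made up for illustration; ا, و and ی are left out because they also stand for vowels, and ه is included only in its consonant value.

```python
from collections import defaultdict

# Letter -> Classical transliteration, copied from the consonant table.
CONSONANTS = {
    "ب": "b", "پ": "p", "ت": "t", "ث": "s", "ج": "j", "چ": "č",
    "ح": "h", "خ": "x", "د": "d", "ذ": "z", "ر": "r", "ز": "z",
    "ژ": "ž", "س": "s", "ش": "š", "ص": "s", "ض": "z", "ط": "t",
    "ظ": "z", "ع": "'", "غ": "ğ", "ف": "f", "ق": "q", "ک": "k",
    "گ": "g", "ل": "l", "م": "m", "ن": "n", "ه": "h", "ء": "'",
}


def letters_by_romanization(table):
    """Group letters by the romanization they share."""
    groups = defaultdict(list)
    for letter, roman in table.items():
        groups[roman].append(letter)
    return dict(groups)


def ambiguous(table):
    """Keep only romanizations that stand for more than one letter."""
    groups = letters_by_romanization(table)
    return {roman: letters for roman, letters in groups.items() if len(letters) > 1}
```

With this table, `ambiguous(CONSONANTS)` contains s, z, t, h and the apostrophe, while q and ğ each map back to a single letter.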
About retroflex forms
Some regional dialects in Afghanistan, such as Hazaragi, have the retroflex consonants /ʈ/ and /ɖ/ as distinct phonemes. However, Hazaragi is treated by its speakers as a spoken form of Dari, so it does not have a standardized written form. As these phonemes are not present in standard Dari, their standardized forms are always written with tē (ت) and dāl (د) respectively.
Occasionally the nonstandard characters ٹ, ڈ or ټ, ډ are used by Hazaragi speakers, but there is no consensus on how these forms should be treated.
Vowels
The vocalization used by Classical Persian and Dari differs slightly from the vocalization used by Iranian Persian. The table below shows the vocalization used by Classical Persian and Dari.
Romanization | IPA | Final | Medial | Initial |
---|---|---|---|---|
a | /a/ | ـَ | ـَ | اَ |
ā | /aː/, [ɑː] | ـا, ـیٰ | ـا | آ |
i | /i/ | ـِ | ـِ | اِ |
ī | /iː/ | ـِى | ـِیـ | اِیـ |
ē | /eː/ | ـی | ـیـ | ایـ |
u | /u/ | ـُ | ـُ | اُ |
ū | /uː/ | ـُو | ـُو | اُو |
ō | /oː/ | ـو | ـو | او |
- Word-final short vowels are usually shown preceding a ه (hē).
- The diacritic zēr is often realized as [ɛ] in many nonstandard dialects outside of Kabul, such as the Herati dialect. However, it is /ɪ/ in standard pronunciation and in the Kabuli dialect.
Diphthongs
Romanization | IPA | Final | Medial | Initial |
---|---|---|---|---|
ay | /aj/ | ـَىْ | ـَیْـ | اَیْـ |
āy | /aːj/ | ـَاىْ | ـَایْـ | آیْـ |
aw | /aw/ | ـَوْ | ـَوْ | اَوْ |
āw | /aːw/ | ـَاوْ | ـَاوْ | آوْ |
ūy | /uːj/ | ـُوىْ | ـُویْـ | اُویْـ |
ōy | /oːj/ | ـوىْ | ـویْـ | اویْـ |
- All diphthongs are interpreted as a phonemic sequence of a vowel + semi-vowel. Consequently, if there are two adjacent vowels, at least one should become a semi-vowel (such that there are never two adjacent vowels).
Vowel diacritics
Vowel | Name | Transcription | IPA | Notes |
---|---|---|---|---|
ـَ | zabar | a | /a/, [ä], [æ] | |
ـِ | zēr | i | /i/ | May also be called zēr-i ma'rūf |
ـِ | zēr-i majhūl | | [e̞] | Only appears before glottal consonants, technically an allophone. |
ـُ | pēš | u | /u/ | May also be called pēš-i ma'rūf |
ـُ | pēš-i majhūl | | [o̞] | Only appears before glottal consonants, technically an allophone. |
ـْ | jazm | N/A | none | Vowel killer / zero-vowel diacritic. |
- Though short vowels also have ma'rūf-majhūl variants, majhūl short vowels likely will not be included in romanizations. Unlike the majhūl long vowels, which can appear anywhere, the majhūl short vowels only appear before glottal consonants, and are technically allophones.
Additional information
- ـًا, ـاً, ءً (always word-final) – an
- All forms of hamza, including ء, ؤ and ئ are transliterated as '
- The izāfa vowel is transcribed differently depending on context:
- ـِ (always word-final, after consonants) – -i
- یِ (after the long vowels ا (ā) or و (ū, ō)) - -yi
- آفریقایِ جنوبی ― āfrīqā-yi janūbī ― South Africa
- یِّ (always word-final after ی) - ī-yi
- جَمْهُوریِّ کورِیا ― jamhūrī-yi kōriyā ― Republic of Korea
- ـهٔ, the form ـهی is treated as a variant – a-yi.
- خانهٔ کلان (spelled with a hamza diacritic) ― xāna-yi kalān ― a big house
- In spoken Dari هٔ may reduce to a short e (or i). So خانهٔ کلان may be pronounced as xāne kalān. It is not known if this will be included since this is the only Izāfa vowel with which this occurs.
- خانه‌ی کلان (spelled with a non-connecting ye) ― xāna-yi kalān ― a big house
- ـِیِّـ is transliterated as iyyi, the only exception being for Izāfa when it is transliterated as ī-yi.
- ـّ (tašdīd) – geminate consonant (Arabic shadda)
- Al- assimilation الـ
- Only occurs in loaned compound terms from Arabic, as the article الـ is typically dropped from the lemma form of all Arabic loanwords.
- If الـ is followed by one of the 'sun letters' of Arabic, the lām ل assimilates to the following letter.
- If الـ is part of a conjugation where the alif is silent, the alif-lām is transliterated as l- (or -l- if there is a ZWNJ)
- حَبْلُ المَتِین (hablu l-matīn)
- حَبْلُالمَتِین (hablu-l-matīn) (word with ZWNJ)
- فِالْحَال (fi-l-hāl) (example within a single word)
- ـه - when used as a colloquial copula in the 3rd person singular (he/she/it is) - -a (with a hyphen)
- تَهْران پایتَخْتِ ایرانه. (colloquial) ― ta(h)rān pāytaxt-i ērān-a. ― Tehran is the capital of Iran.
- ZWNJ – - (hyphen)
- Various governments of Afghanistan have recommended that the suffix ـگی have a space or ZWNJ when added to a word ending in ـه. This suggestion is not always observed, even in academic settings and by media broadcasters in Afghanistan. These spellings may be included as alternative forms.
- زندهگی (recommended spelling) ― zinda-gī
- زندگی (common spelling) ― zindagī
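The izāfa rules in the list above depend only on how the romanized head word ends, which can be sketched as follows. The function name `add_izafa` is hypothetical, and endings that the rules do not mention (short i, short u and ē) are rejected rather than guessed.

```python
import unicodedata

# Endings covered by the rules above: final a (from ـهٔ), ā, ū, ō and ī.
COVERED_VOWELS = unicodedata.normalize("NFC", "aāīūō")
# Vowel endings for which the rules above give no izāfa form.
UNCOVERED_VOWELS = unicodedata.normalize("NFC", "iuē")


def add_izafa(word: str) -> str:
    """Append the Classical izāfa to a romanized word (illustrative only)."""
    word = unicodedata.normalize("NFC", word)
    if not word:
        raise ValueError("empty word")
    last = word[-1]
    if last in COVERED_VOWELS:
        # ā-yi, ū-yi, ō-yi, ī-yi and a-yi all take the glide y.
        return word + "-yi"
    if last in UNCOVERED_VOWELS:
        raise ValueError(f"no izāfa rule listed for final {last!r}")
    # After consonants the izāfa is plain -i.
    return word + "-i"
```

Applied to the examples in the list, `add_izafa("pāytaxt")` yields pāytaxt-i, `add_izafa("āfrīqā")` yields āfrīqā-yi, and `add_izafa("xāna")` yields xāna-yi.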
Iranian Transliteration
This transliteration should be used for modern Iranian Persian, particularly varieties spoken in Iran after the 16th century. Varieties of Iranian Persian spoken before the 16th century should, with some exceptions, be treated as Classical Persian.
Persian | Wiktionary | IPA | Others (dispreferred) |
---|---|---|---|
ا (word-initial) | a, o, e | (ʔ)æ, (ʔ)o, (ʔ)e | |
ا (other positions) | â | ɒː | ā |
آ | â (word-initial), 'â (other positions) | (ʔ)ɒː (word-initial), ʔɒː (other positions) | ʼā |
ب | b | b | |
پ | p | p | |
ت | t | t | |
ث | s | s | th, s̱, ṯ, s̄ |
ج | j | dʒ | ǧ |
چ | č | tʃ | ch, c |
ح | h | h | ḥ, ḩ |
خ | x | x | kh, k͟h, ḫ, ḵ |
د | d | d | |
ذ | z | z | dh, d͟h, ẕ, ḏ |
ر | r | r | |
ز | z | z | |
ژ | ž | ʒ | zh, z͟h |
س | s | s | |
ش | š | ʃ | sh, s͟h |
ص | s | s | ṣ |
ض | z | z | ḍ, ż, ẕ |
ط | t | t | ṭ, ţ |
ظ | z | z | ẓ, z̧ |
ع | ' | ʔ, ː | ʻ |
غ | ğ | ɣ, ɢ | q, gh, g͟h, ġ |
ف | f | f | |
ق | ğ | ɣ, ɢ | q, gh, g͟h, ḳ |
ک | k | k | |
گ | g | g | |
ل | l | l | |
م | m | m | |
ن | n | n | |
و (consonant) | v | v | w |
و (long vowel) | u, ô | uː, oː | ū, ō |
و (diphthong[1]) | ow | ow | au, aw |
خوا (e.g. خواندن etc.) | xâ | xɒː | xwā-, khwā- |
خوی (e.g. خوید etc.) | xi | xiː | |
ه (consonant) | h (may appear in final position after a vowel, e.g. ده (dah)) | h | |
ـه (word-final vowel) | e | e, æ | eh, a, ah |
ی (consonant) | y | j | j |
ی (long vowel) | i, ê | iː, eː | ī, ē |
ی (diphthong[2]) | ey | ej | ai, ay |
یٰ (always word-final) | â | ɒː | ā, á |
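Comparing this table with the Classical one suggests a set of regular correspondences (ā to â, ī and ē to i, ū and ō to u, short i to e, short u to o, w to v, aw to ow, ay to ey, q to ğ). The sketch below applies them to a Classical romanization. It is not the algorithm used by fa-IPA or any other module: the function name `classical_to_iranian` and the `merge_q` flag are hypothetical, mapping ē and ō to i and u (rather than ê and ô) is an assumption, and so is turning word-final a into e, which has exceptions such as na. Individual words may also differ between the two varieties in ways no rule captures.

```python
import re
import unicodedata


def _nfc(text: str) -> str:
    return unicodedata.normalize("NFC", text)


VOWELS = _nfc("aāiīuūēō")
SINGLE = {_nfc(k): _nfc(v) for k, v in {
    "ā": "â", "ī": "i", "ē": "i", "ū": "u", "ō": "u",
    "i": "e", "u": "o", "w": "v",
}.items()}


def classical_to_iranian(text: str, merge_q: bool = True) -> str:
    """Approximate an Iranian romanization from a Classical one."""
    table = dict(SINGLE)
    if merge_q:
        # Plain Iranian scheme writes ق as ğ; the "modified" scheme keeps q.
        table["q"] = _nfc("ğ")
    pattern = re.compile(
        # Diphthongs: only when no vowel or doubled glide follows.
        f"aw(?![{VOWELS}w])"
        f"|ay(?![{VOWELS}y])"
        # Word-final a (typically written with ـه).
        r"|a(?=$|[\s\-.,;:!?])"
        f"|[{''.join(table)}]"
    )

    def repl(match):
        token = match.group(0)
        if token == "aw":
            return "ow"
        if token == "ay":
            return "ey"
        if token == "a":
            return "e"
        return table[token]

    return pattern.sub(repl, _nfc(text))
```

Under these assumptions kitāb becomes ketâb, nawrōz becomes nowruz and xāna-yi kalān becomes xâne-ye kalân, in line with forms shown elsewhere on this page; passing `merge_q=False` keeps q distinct, as in the modified Iranian practice.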
Other symbols or combinations
- ـاً (-an), ءً ('an) (always word-final) – an (The position of [fatHatan] is after the alef, not before, as is the current practice with Arabic)
- ء – ' (others: ʼ)
- ؤ – ' (others: ʼ)
- ئ – ' (others: ʼ)
- ـِ (-e) (ezâfe) (always word-final, unmarked in regular writing) – -e
- ـیِ (--ye) (ezâfe) (after long vowels ا (â) or و (u), unmarked in regular writing) - -ye
- آفْریقایِ جُنوبی ― âfriğâ-ye jonubi ― South Africa
- ـیِ (--ye) (ezâfe) (always word-final with ی (i), unmarked in regular writing) - i-ye
- جُمْهوریِ تاجیکِسْتان ― jomhuri-ye tâjikestân ― Republic of Tajikistan
- ـهٔ (-h-ye) (U+0647 U+0654), sometimes written as ـه‌ی (-h-i) (always word-final) – e-ye. (Articles don't contain the hamze above "he"; it is considered a diacritic and is only used in the display using |head=. Templates link to words without the hamze.)
- ـه‌ای (-h-i) - e-yi
- نه (“no, not”) - na (an exception)
- ـّ (tashdid) – geminate consonant (Arabic shadda)
- ـَ (-a) (fathe/zor) – a (Arabic fatha)
- ـِ (-e) (kasre/zir) – e (in modern Iranian, to check cases where it's "i") (Arabic kasra)
- ـُ (-o) (zamme/pish) – o (in modern Iranian, to check cases where it's "u") (Arabic damma). Used after consonants to make a short "o" sound. If used before و (o) produces a diphthong "ow":
- نُوْروز (nowruz)
- ـّ (shadda) – geminate consonant
- ـ۟ (sukūn/sokun) - marks the absence of a vowel. Rarely used in popular Persian vocalisations, especially on final consonants. It may be necessary to use it consistently in strict vocalisations to avoid misreadings, allow automation and signal that a word IS vocalised.
- ـه (-h) (in the word-final position after consonants for a large number of words) - e (no hyphen) (note with ezâfe the preferred spelling is ـهٔ (-h-ye))
- هَفْتِه ― hafte ― week
- ـه (-h) - when used as a colloquial copula in the 3rd person singular (he/she/it is) - -e (with a hyphen)
- تِهْرون پایْتَخْتِ ایرونِه. (colloquial) ― tehrun pâytaxt-e irun-e. ― Tehran is the capital of Iran.
- ZWNJ – - (hyphen)
- The use of hyphens for etymological reasons - suffixes, compound words, etc. when no ZWNJ is used is to be discussed. E.g. currently plural form suffix ـها (-hâ) is transliterated as "-hâ" regardless if ZWNJ is present or not. (Apart from cases described above and ZWNJ, the use of hyphen is otherwise dispreferred. A space is transliterated as a space and the absence of space or ZWNJ is transliterated as nothing.)
- Below are transliteration examples to contrast the use of ZWNJ on connecting letters, space, nothing and non-connecting letters:
- ZWNJ on joining letters کتابها (ketâb-hâ) (plural of کتاب (ketâb))
- Space کتاب ها (ketâb hâ)
- Nothing (joining letters are connected) کتابها (ketâbhâ)
- Non-joining letters (no ZWNJ is used) اتوها (otuhâ) (plural of اتو (otu))
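The separator rules above (ZWNJ becomes a hyphen, a space stays a space, and nothing stays nothing) can be sketched as below. The helper `transliterate_with_separators` and the tiny `LOOKUP` table are hypothetical stand-ins for real transliteration; only the handling of separators is the point.

```python
import re

ZWNJ = "\u200c"  # zero-width non-joiner

# Hypothetical lookup of already-known romanizations, for illustration only.
LOOKUP = {
    "کتاب": "ketâb",
    "ها": "hâ",
    "کتابها": "ketâbhâ",
    "اتوها": "otuhâ",
}


def transliterate_with_separators(text: str, lookup: dict) -> str:
    """Romanize each piece and map ZWNJ to '-' and space to ' '."""
    parts = []
    # The capturing group keeps the separators in the split result.
    for token in re.split(f"([{ZWNJ} ])", text):
        if token == ZWNJ:
            parts.append("-")
        elif token == " ":
            parts.append(" ")
        elif token:
            parts.append(lookup[token])
    return "".join(parts)
```

With this table, the ZWNJ spelling of the plural gives ketâb-hâ, the spaced spelling gives ketâb hâ, and the fully joined spelling gives ketâbhâ.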
Arabic loanwords
- ـة (-h) (always word-final) – a(t) (rare, only in unadapted borrowings from Arabic, normally adapted into Persian as ـت (at) or ـه (e))
- الـ - al (normally); can be "al-" (with a hyphen) if identified as the Arabic definite article. "l" can change to the following consonant if it's a "sun letter", and "a" can change to "o" (Arabic "u") in ezâfe, e.g.
- فارِغُالتَّحْصیل (fâreğo-t-tahsil) - here "l" is assimilated to "t" and "a" is changed to "o" following Arabic grammar rules.
- الـ - l or the next consonant (assimilated for Arabic "sun letters"). The alef is silent.
- بِٱلْفِعْل (bel-fe'l) from Arabic بِٱلْفِعْلِ (bi-l-fiʕli) where the alif is silent (أَلِف الوَصْل (ʔalif al-waṣl))
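The assimilation of the article before sun letters can be sketched as a lookup. The list of sun letters below is the standard Arabic one, which this page does not spell out, and the romanized values follow the Iranian table above; the names `SUN_LETTERS` and `article_consonant` are made up for this example.

```python
# Standard Arabic sun letters with their Iranian romanization.
SUN_LETTERS = {
    "ت": "t", "ث": "s", "د": "d", "ذ": "z", "ر": "r", "ز": "z",
    "س": "s", "ش": "š", "ص": "s", "ض": "z", "ط": "t", "ظ": "z",
    "ل": "l", "ن": "n",
}


def article_consonant(next_letter: str) -> str:
    """Return the consonant that the lām of الـ is romanized as.

    Before a sun letter the lām assimilates to that letter;
    before any other letter it stays "l".
    """
    return SUN_LETTERS.get(next_letter, "l")
```

For instance, `article_consonant("ت")` returns "t", matching the -t- in fâreğo-t-tahsil, while `article_consonant("ف")` returns "l", matching bel-fe'l.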
Tajik Transliteration
This transliteration should be used for Tajik terms attested in the Cyrillic script and in the pronunciation section of Persian entries. Any Tajik terms attested in the Arabic script should be treated as Dari and use the PRS language code, which will generate the Classical Transliteration.
See also
- Tajik alphabet on Wikipedia.