Left-to-right mark: Difference between revisions

top: other software is available
 
(45 intermediate revisions by 39 users not shown)
{{Short description|Control character in bidirectional text}}
{{Merge|Right-to-left mark|Arabic letter mark|date=June 2024|discuss=Talk:Arabic letter mark}}
{{More citations needed|date=January 2019}}
The '''left-to-right mark''' ('''LRM''') is a [[control character]] (an invisible formatting character) used in the computerized [[typesetting]] of text containing a mix of left-to-right scripts (such as [[Latin script|Latin]] and [[Cyrillic script|Cyrillic]]) and right-to-left scripts (such as [[Arabic script|Arabic]], [[Syriac alphabet|Syriac]], and [[Hebrew alphabet|Hebrew]]). It is used to set the way adjacent characters are grouped with respect to text direction.
 
==Unicode==
In [[Unicode]], the LRM character is encoded at {{unichar|200E|left-to-right mark|html=}}. In [[UTF-8]] it is <code>E2 80 8E</code>. Usage is prescribed in the Unicode Bidi (bidirectional) algorithm.<ref>Unicode 12.0 standard, http://www.unicode.org/versions/Unicode12.0.0/UnicodeStandard-12.0.pdf, p. 880</ref>
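
The code point and byte sequence given above can be checked with a short script. This is only an illustrative sketch using Python's standard <code>unicodedata</code> module; the helper name <code>describe</code> is made up for this example and is not part of any standard.

```python
import unicodedata

# The left-to-right mark, written as an escape so it stays visible in source.
LRM = "\u200e"

def describe(ch):
    """Collect a few Unicode properties of a single character (hypothetical helper)."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),         # Cf = format character
        "bidi_class": unicodedata.bidirectional(ch),  # L = strong left-to-right
        "utf8": ch.encode("utf-8").hex(" ").upper(),  # bytes.hex(sep) needs Python 3.8+
    }

info = describe(LRM)
for key, value in info.items():
    print(f"{key}: {value}")
```

The LRM has no glyph of its own. What matters for layout is its bidirectional class, which is the same strong left-to-right class that Latin letters have.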
 
 
==Example of use in HTML==
Suppose the writer wishes to insert some English text (a left-to-right script) into a paragraph written in Arabic or Hebrew (a right-to-left script), with non-alphabetic characters to the right of the English text. For example, the writer wants to translate "The language C++ is a programming language used..." into Arabic. Without an LRM control character, the result looks like this:
 
<span dir="rtl">لغة C<span style="color:red">++</span> هي لغة برمجة تستخدم...</span>
 
With an LRM entered in the HTML after the ++, it looks like this, as the writer intends:
 
<span dir="rtl">لغة C<span style="color:red">++</span>&lrm; هي لغة برمجة تستخدم...</span>
 
In the first example, without an LRM control character, a [[web browser]] will render the ++ to the left of the "C". This happens because the browser recognizes that the paragraph is in a right-to-left script ([[Arabic script|Arabic]]) and positions punctuation, which is neutral as to its direction, according to the direction of the adjacent text. The LRM control character causes the punctuation to be adjacent only to left-to-right text (the "C" and the LRM), so it is positioned as if it were in left-to-right text, i.e., to the right of the preceding text.
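
The grouping described here depends on the bidirectional class of each character. The sketch below lists those classes with Python's standard <code>unicodedata</code> module; it does not run the bidirectional algorithm itself, and the helper name <code>bidi_classes</code> is hypothetical. The Unicode data classes the plus sign as <code>ES</code> (European separator), a weak type that, per UAX #9, resolves to a neutral outside a number.

```python
import unicodedata

LRM = "\u200e"
ARABIC_WORD = "\u0644\u063a\u0629"  # the Arabic word used in the example above

def bidi_classes(text):
    """Return the Unicode bidirectional class of every character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

# Without the mark, the plus signs sit between a Latin letter and Arabic text.
without_lrm = bidi_classes("C++ " + ARABIC_WORD)
# With the mark, the plus signs are enclosed by two strong left-to-right characters.
with_lrm = bidi_classes("C++" + LRM + " " + ARABIC_WORD)

print(without_lrm)
print(with_lrm)
```

In the second list the two <code>ES</code> entries are flanked by <code>L</code> on both sides, which is the situation the paragraph above describes.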
 
Some software requires using the [[HTML]] code <code>&amp;#8206;</code> or <code>&amp;lrm;</code> instead of the invisible Unicode control character itself.{{citation needed|date=April 2019}} Using the invisible control character directly could also make copy editing difficult.
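
One way to keep the mark visible during copy editing is to convert between the literal character and its entity form. The following is a minimal sketch under that assumption; <code>show_marks</code> and <code>hide_marks</code> are hypothetical helper names, not functions of any particular editor or library.

```python
import html

LRM = "\u200e"
# The two entity spellings mentioned above (8206 is decimal for 0x200E).
ENTITIES = ("&lrm;", "&#8206;")

def show_marks(text):
    """Replace each literal LRM with the visible entity &lrm;."""
    return text.replace(LRM, "&lrm;")

def hide_marks(markup):
    """Replace either entity spelling with the literal LRM character."""
    for entity in ENTITIES:
        markup = markup.replace(entity, LRM)
    return markup

# Both spellings decode to the same character under standard HTML unescaping.
assert all(html.unescape(entity) == LRM for entity in ENTITIES)

visible = show_marks("C++" + LRM)
print(visible)
```

Unlike <code>html.unescape</code>, which decodes every entity in a string, these helpers touch only the LRM, so other markup is left unchanged.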
 
==See also==
* [[Right-to-left mark]]
* [[Bidirectional text]]
 
==References==
{{reflist}}
 
==External links==
* [http://unicode.org/reports/tr9/ Unicode standard annex #9: The bidirectional algorithm]
* [https://www.fileformat.info/info/unicode/char/200e/index.htm Unicode character (U+200E)]
 
{{Unicode navigation}}
[[Category:Digital typography]]
[[Category:Unicode formatting code points]]
 
 
{{typ-stub}}