Left-to-right mark: Difference between revisions


Revision as of 18:05, 2 August 2013 by 67.210.164.158 (→Example of use in HTML). Latest revision as of 09:01, 21 July 2024 by GhostInTheMachine (→top: other software is available).
(45 intermediate revisions by 39 users not shown)
{{Short description|Control character in bidirectional text}}
{{Merge|Right-to-left mark|Arabic letter mark|date=June 2024|discuss=Talk:Arabic letter mark}}
{{More citations needed|date=January 2019}}
The '''left-to-right mark''' ('''LRM''') is a [[control character]] (an invisible formatting character) used in computerized [[typesetting]] of text containing a mix of left-to-right scripts (such as [[Latin script|Latin]] and [[Cyrillic script|Cyrillic]]) and right-to-left scripts (such as [[Arabic script|Arabic]], [[Syriac alphabet|Syriac]], and [[Hebrew alphabet|Hebrew]]). It is used to set the way adjacent characters are grouped with respect to text direction.

==Unicode==
In [[Unicode]], the LRM character is encoded at {{unichar|200E|left-to-right mark|html=}}. In [[UTF-8]] it is <code>E2 80 8E</code>. Usage is prescribed in the Unicode Bidi (bidirectional) algorithm.<ref>Unicode 12.0 standard, http://www.unicode.org/versions/Unicode12.0.0/UnicodeStandard-12.0.pdf, p. 880</ref>
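The code point and UTF-8 bytes given above can be checked with a short Python sketch. It uses only the standard library; the variable names are illustrative and not part of any standard.

```python
import unicodedata

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK

# Look up the character's properties in Python's bundled Unicode database.
lrm_name = unicodedata.name(LRM)                # official character name
lrm_category = unicodedata.category(LRM)        # general category
lrm_bidi_class = unicodedata.bidirectional(LRM) # bidirectional class

# Encode to UTF-8 and format the bytes as space-separated hex pairs.
lrm_utf8_hex = LRM.encode("utf-8").hex(" ").upper()

print(lrm_name, lrm_category, lrm_bidi_class, lrm_utf8_hex)
```

The general category <code>Cf</code> marks an invisible format character, and the bidirectional class <code>L</code> is what lets the LRM act as a strong left-to-right character in the Bidi algorithm.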
==Example of use in HTML==
Suppose the writer wishes to insert some English text (a left-to-right script) into a paragraph written in Arabic or Hebrew (a right-to-left script), with non-alphabetic characters to the right of the English text. For example, the writer wants to translate "The language C++ is a programming language used..." into Arabic. Without an LRM control character, the result looks like this:

‫ <span dir="rtl">لغة C<span style="color:red">++</span> هي لغة برمجة تستخدم...</span>

With an LRM entered in the HTML after the ++, it looks like this, as the writer intends:

‫ <span dir="rtl">لغة C<span style="color:red">++</span>&lrm; هي لغة برمجة تستخدم...</span>

In the first example, without an LRM control character, a [[web browser]] will render the ++ to the left of the "C". This happens because the browser recognizes that the paragraph is in a right-to-left script ([[Arabic script|Arabic]]) and places punctuation, which is neutral as to its direction, according to the direction of the adjacent text. The LRM control character causes the punctuation to be adjacent to only left-to-right text (the "C" and the LRM), so it is positioned as if it were in left-to-right text, i.e., to the right of the preceding text.
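The behaviour described in this section can be explored with a minimal Python sketch using only the standard library. The helper names <code>bidi_classes</code> and <code>show_lrm</code> are hypothetical and exist only for illustration. The sketch lists the bidirectional class of each character in a shortened version of the example, checks that both HTML references decode to the same character, and makes an otherwise invisible LRM visible for copy editing. It does not run the Bidi algorithm itself.

```python
import html
import unicodedata

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK

def bidi_classes(text):
    """Return (character name, bidi class) pairs for each character."""
    return [(unicodedata.name(ch), unicodedata.bidirectional(ch)) for ch in text]

# "C" is strong left-to-right (L) and the Arabic letter is strong
# right-to-left (AL). "+" has no strong direction of its own, so its
# placement depends on its neighbours; the LRM (class L) supplies a
# left-to-right neighbour on its right-hand side.
for name, cls in bidi_classes("C+" + LRM + "\u0647"):
    print(f"{cls:>3}  {name}")

# The named and numeric HTML references decode to the same character.
lrm_named = html.unescape("&lrm;")
lrm_numeric = html.unescape("&#8206;")

def show_lrm(text):
    """Make invisible LRM characters visible as &lrm; for copy editing."""
    return text.replace(LRM, "&lrm;")

print(show_lrm("C++" + LRM))
```

Writing the reference <code>&lrm;</code> in the source, rather than the raw character, keeps the mark visible to anyone editing the markup later.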
Some software requires using the [[HTML]] code <code>&#8206;</code> or <code>&lrm;</code> instead of the invisible Unicode control character itself.{{citation needed|date=April 2019}} Using the invisible control character directly could also make copy editing difficult.

==See also==
* [[Right-to-left mark]]
* [[Bidirectional text]]

==References==
{{reflist}}

==External links==
* [http://unicode.org/reports/tr9/ Unicode standard annex #9: The bidirectional algorithm]
* [https://www.fileformat.info/info/unicode/char/200e/index.htm Unicode character (U+200E)]

{{Unicode navigation}}

[[Category:Digital typography]]
[[Category:Unicode formatting code points]]