Wikipedia talk:Manual of Style/Text formatting/Archive 2

This is an archive of past discussions about Wikipedia:Manual of Style. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.


Variable markup

A new policy says:

The symbols for variables normally use the <var>...</var> HTML element, which renders the variable in italics (<var>E</var> = <var>mc</var><sup>2</sup> renders as E = mc²).

Great. We switched from doing it that way to the more efficient way at about the beginning of 2003, IIRC. Normally one might write such code between 50 and 75 times in an article that it takes one 30 minutes (under the present conventional practice) to write, and if one writes maybe ten such articles in a day, that adds up to enormously more time it takes to write them. Was this policy written by someone who's never written a math article? How did this policy get there? I see it finally being mentioned at Wikipedia talk:WikiProject Mathematics after it's been put into the style manual. Michael Hardy (talk) 20:46, 12 September 2008 (UTC)

I agree, it would impose a huge load for no apparent benefit. I doubt there was ever a consensus for this among people who actually write formulae. Richard Pinch (talk) 21:41, 12 September 2008 (UTC)

It's not a "new policy"; this isn't a policy to begin with, it's a guideline. Getting hot around the collar about "policy" is a hyperbolic red herring. If someone has a problem with semantic markup, take that up with the W3C, since said someone has a problem with XHTML in general. If someone finds <var>...</var> too inconvenient to do somehow, then don't do it; some gnome will fix it later. It's not like much of anyone but gnomes pays any attention to this or other MOS pages. And please keep it civil and a lot less WP:OWNish. I'm no noob nor an "outsider"; I've been editing MOS pages for years. Just because a handful of editors seem to sometimes like to treat this particular MOS page as if it were somehow magically special does not make it immune from editing by others, much less those with legitimate concerns that math-focused editors may be unaware of, not fully understand, or simply ignore because they aren't personal concerns of those editors or they do not personally see the benefits to resolving them. One such concern is failure in various places in Wikipedia to use the XHTML semantic phrase elements for what they were intended for, an oversight that has implications for accessibility, metadata, the semantic Web, external repurposing of Wikipedia content, etc., etc. Just because this guideline touches on mathematics in a few places does not mean it should be dictated by the convenience of math editors.

If you find <var>x</var> too difficult in some way, try ~~{{VAR|x}}~~ {{var|x}} (same output). Given that some editors simply won't care, I don't see a problem with the guideline being more flexible (will edit it in a minute to do so). The fact that this was discussed 5 years ago, before many Wikipedians were thinking about Web semantics, Web 2.0, accessibility, repurposing of content, metadata, and using simple inline templates to ease repetitive keyboarding tasks, isn't particularly persuasive. "It's more convenient the sloppy way" is not a strong argument against doing something properly. I'm frankly shocked that mathematics editors would even use such an argument to begin with, given how insistent they are that the codes and conventions they use be done with absolute precision, to the great inconvenience of all other editors (who do not notice or recognize any difference between the minus and hyphen characters, etc.). I also have to ask how many new articles are being created that use italicized variables 75 times? Surely not many. I'm also skeptical about the claim that such an article would require an extra 20 or 30 minutes to write as a result; this would only be true for someone who doesn't know how to copy-paste. As I said, a gnome (or a bot) can fix it later, so it really doesn't matter if some editor will ignore the var recommendation. If someone finds even the template version tedious, a simple solution is to write the article in a text editor, and use \\x\\\ as a temporary token, instead of ''x'', and then simply search/replace all instances of \\\ with </var> and \\ with <var> (in that order), once each, and it'll change document-wide. I do this sort of stuff all the time. Try it. Major time-saver for all sorts of things.
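The token trick described above can be sketched in a few lines of Python. This is only an illustration: the function name tokens_to_var and the sample draft string are assumptions made for the example, not anything used on Wikipedia. The order of the two replacements matters because the two-backslash opening token is a prefix of the three-backslash closing token.

```python
def tokens_to_var(text: str) -> str:
    """Turn temporary backslash tokens into <var>...</var> markup.

    Hypothetical helper for illustration: two backslashes open a variable,
    three backslashes close it.
    """
    close_token = "\\" * 3  # three literal backslashes
    open_token = "\\" * 2   # two literal backslashes
    # Replace the longer closing token first; otherwise the opening token
    # would match inside every closing token and corrupt it.
    text = text.replace(close_token, "</var>")
    return text.replace(open_token, "<var>")


# Raw string: every backslash below is a literal backslash.
draft = r"\\E\\\ = \\mc\\\<sup>2</sup>"
converted = tokens_to_var(draft)
print(converted)
```

Doing the replacements in the opposite order would turn each closing token into an opening tag followed by a stray backslash, which is presumably why the comment insists on "in that order".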

For anyone not following any of this at all, I'm not sure what to tell you other than to spend some time actually reading on the topic before dismissing it. Use of semantic markup has been one of the basic principles of web design, development and publishing (i.e., what Wikipedia is doing) since the mid-1990s. For some quick intros, see Separation of content and presentation here, and some concise W3C material on the topic. Be also aware that screen readers for people with visual problems will usually ignore italics (and bold and other non-semantic markup), but will usually indicate, one way or another, when something is marked up with one of the semantic elements. Intentionally ignoring simple semantic markup in favor of slightly simpler non-semantic markup is really pretty blatantly anti-accessibility. Let's not go there!

— SMcCandlish [talk] [cont] ‹(-¿-)› 00:05, 13 September 2008 (UTC)

PS: It's not an "in theory" matter, but an "in actuality" matter; the fact that most regular readers don't see a visual difference doesn't mean there isn't any difference. — SMcCandlish [talk] [cont] ‹(-¿-)› 00:50, 13 September 2008 (UTC)

Implementation

What you are proposing requires an enormous effort. Really, really, really enormous. I use the italics with purpose: How would a bot distinguish between italics which indicate a variable and italics which indicate emphasis? Since italics are not semantic markup, there is no truly reliable way to do that. Instead you must rely on actual humans. You can call them gnomes if you like, but they are flesh and blood, and they have better things to do. Portal:Mathematics says, "There are approximately 20097 mathematical articles in Wikipedia." I am going to spell out that number: Twenty thousand and ninety-seven. Consider group (mathematics), which is currently at FAC. The first section, "Definition and illustration", has 97 variables. In the first section of Derivative I counted 212 variables; the first section of spectral sequence past the history has 64; the first section of nuclear space has 40. Some math articles are stubs, so we can conservatively estimate the number of variables per article at 50. Using {{var}} then requires about a million changes, all of which must be done by hand.
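The "about a million" figure follows directly from the numbers quoted above. A minimal sketch of the arithmetic, using only the article count and per-section variable counts given in the comment (the variable names are made up for the example):

```python
# Figures quoted in the comment above.
math_articles = 20_097          # from Portal:Mathematics at the time
conservative_per_article = 50   # the comment's conservative estimate

# First-section variable counts reported for four sample articles.
sampled_counts = {
    "Group (mathematics)": 97,
    "Derivative": 212,
    "Spectral sequence": 64,
    "Nuclear space": 40,
}

estimated_changes = math_articles * conservative_per_article
sample_mean = sum(sampled_counts.values()) / len(sampled_counts)

print(estimated_changes)  # 1004850
print(sample_mean)        # 103.25
```

The four sampled sections average roughly twice the figure of 50 used in the estimate, which is why the comment calls that figure conservative once stubs are taken into account.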

If you can write a clever bot that consistently converts italicized variables to {{var}} and does not require human intervention, then maybe we can talk about making a million changes. Until then, this proposal is going to go nowhere; you can put it in the MoS if you like, but—despite your good intentions, and despite the fact that I fully agree this method is better—it will go nowhere. Ozob (talk) 01:11, 13 September 2008 (UTC)

While I can see the theoretical sense in the proposal, and the very large technical problems, we do need to follow what it says at the top of the page: "Before editing this page, please make sure that your revision reflects consensus." Things seem to have gone the wrong way, with a unilateral change made before consensus has been reached.

There is also the wider issue that math formatting is very broken in wikipedia. Ideally we should be using MathML <math> <mi>x</mi></math> which would correctly markup all maths content rather than a specific part. So this goes part of the way but not far enough. --Salix alba (talk) 01:40, 13 September 2008 (UTC)

Out-dented mass reply; this is a lot to cover in one message, so I'm going to number the points.

The very fact that we're in a situation where we can't distinguish from things italicized for purely stylistic/presentational reasons (book titles, foreign-language phrases, etc.), actually emphasized phrases, and variables, is precisely why Wikipedia needs to start paying more attention to Web semantics, or things will just get worse and worse.
Yes, it is a lot to do, but there is no hurry to do it (actually WP:ACCESSIBILITY people would probably disagree on that!) That it will require a lot of changes is of no concern. The switch to logical quotation had (and still has) the same consequence. So did the switch to linking of dates for autoformatting purposes years ago, and likewise its undoing. So did the deprecation of spoiler tags in articles on movies and other works of fiction (that was tens of thousands of articles). There are many other examples. That something will take a long time to trickle down is not any reason for MOS to not recommend best (instead of expedient) practices. It'll all work out in the end, as it always does (and yes, there will always be articles, especially newly created ones, that do not comply with much of anything in MOS; c'est la vie).
I have not thought much on the bot matter, but it is probably not feasible except for certain very specific cases; as with date linking, it'll be a long slow process. Oh well. Business as usual around here. AWB can help tremendously.
Group (mathematics): I could probably fix that article in under 5 minutes, by reading it and noting non-variable uses of italics, doing two global search-replace on '', and then restoring italics to the non-variables. The changes do not have to all be done by hand, though they cannot be totally automated, and yes, mistakes will sometimes be made as with anything else wiki.
Gnomes: I am one. I have over 37,000 edits, over 15,000 in article space, and most of them are formatting fixes of one kind or another when they aren't typo/grammar fixes. Of my non-articlespace edits an enormous number are to categorization, template code, etc. I'm not denigrating anyone as a gnome. Also, those of us who focus on gnoming don't have anything better to do. We like tinkering with the small bits or we wouldn't do it.
20,000: Not a scary number compared to a lot of other things like date links, etc., already mentioned. And again, there is no hurry. Start with FAs and work down, since we expect FAs to do what MOS says, and are very surprised when stubs do.
Will go nowhere: Why would it go nowhere? The rest of the difficult changes MOS has implemented have not gone nowhere. They just trickle down over time. It's normal here.
Clever bot: It's not reasonable to suggest "if you can do impossible thing here then we can talk."
This method is better: Then it's what MOS should recommend.
MathML: Yes! I fully support that. I'm shocked it has been taking so long. But this isn't a math-only issue; <var> should be used for software source code and other things too, and the section in question here doesn't say "mathematical variables" but "variables". It should probably be updated to include a code example, too, just to make that clearer.
Consensus: We're discussing now, and building toward it thereby. No one has proposed "no, this should be reverted because it is not actually what <var> is for" or some other substantive reason, only that it will take a long time and a lot of effort to fully implement, which isn't necessary right off the bat anyway. Was I bold? Sure. That's normal. It's actually rather abnormal to have a messagebox atop a page saying "don't be bold", I have to say. But, regardless, my point isn't irritating people or getting in a big debate, it's doing the right thing with the tools available, including XHTML, to make the encyclopedia better. I'll get into more of the rationales for doing this later. (Out of time right now).

— SMcCandlish [talk] [cont] ‹(-¿-)› 03:36, 13 September 2008 (UTC)
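The semi-automatic procedure described in the Group (mathematics) point above (note the non-variable italics by hand, convert everything else, leave the noted phrases alone) might look roughly like the following. This is a sketch under stated assumptions: the function name, the regular expression, and the sample text are all invented for illustration, and the list of phrases to keep italic still has to come from a human reading the article.

```python
import re
from typing import Iterable

# Matches ''...'' wiki italics but not '''...''' bold. Assumption: italics
# are not nested inside bold and do not span lines.
ITALIC = re.compile(r"(?<!')''(?!')(.+?)''(?!')")


def italics_to_var(wikitext: str, keep_italic: Iterable[str]) -> str:
    """Convert wiki italics to <var> unless the phrase is in keep_italic."""
    keep = set(keep_italic)

    def convert(match: re.Match) -> str:
        content = match.group(1)
        if content in keep:
            return match.group(0)  # leave titles, emphasis, etc. untouched
        return f"<var>{content}</var>"

    return ITALIC.sub(convert, wikitext)


sample = "Let ''G'' be a group; see ''Elements'' and '''bold''' text."
result = italics_to_var(sample, keep_italic={"Elements"})
print(result)
```

The sketch also shows the limitation raised in the thread: nothing in the markup itself says that ''Elements'' is a title rather than a variable, so the keep list is where the human judgment goes.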

Update

{{VAR}} is now {{var}} (the original {{var}} turned out to be disused cruft, so I moved it and updated the 5 or so pages that actually used it). I have also created a {{varserif}} variant. Someone elsewhere asked if XHTML treats math and programming variables differently. The answer is "no"; a var is a var is a var. The same person also says that math vars must be italicized and serif font, while computer vars must be italicized and monospace font. The {{var}} template presently italicizes and monospaces.

Firstly I have to observe that I can't find a single math article that actually adheres to this. I find all sorts of things in actual WP practice, including italicization of entire expressions (search my edit history for "economic" and you'll find me fixing one of those), italicization of variables only and the entire expression in the same font as the regular text (almost all math articles seem to go this route), and (rarely) italicization of variables inside an expression that has been serifized as a whole with a larger span or div. I was unable to locate an example of a mathematical expression in default font, but variables italicized-and-serifed. Yet that seems, if this person's facts are correct (and the source for them is correct), to be the desired result.

Secondly {{varserif}} can be used to "enforce" what is apparently actually proper math notation. I created it because I and l are pretty much indistinguishable in most sans-serif fonts, something I've found irritating for a long time and only just now got around to fixing, but it looks like it might just be quite a bit more broadly useful. Gist: Instead of attacking me, try working with me, eh? I kind of breathe wiki template code, and can whip these things up pretty darned quickly. — SMcCandlish [talk] [cont] ‹(-¿-)› 08:45, 14 September 2008 (UTC)

Discussion of rationale

SMcCandlish, you are being vague. You speak of benefits of semantic markup. Be specific, with examples. You speak of accessibility being impaired. Can you say how that could happen? I have only your assertion and no specifics.

And yes, I think you're an inexperienced outsider if you find it implausible that a new article could contain 75 italicized variables. And how does copy-and-paste speed things up when the variable in the middle of <var>m</var> is different each time (this time it's m, next time it's x, after that it's a, etc.)?

You're proposing a HUGE extra burden. And with no discussion until after you put it into the manual.

And why Wikipedia:Manual of Style (dates and numbers) rather than Wikipedia:Manual of Style (mathematics)? Michael Hardy (talk) 02:30, 13 September 2008 (UTC)

It is not my intent to be vague. Rather, I do not want to get into longwinded lectures on things like separation of presentation and content; web developers (a fairly large subset of WP editors; the geekiness of wikicoding appeals to them, and MediaWiki's partial-XHTML language makes them a natural fit, while both factors are significant turn-offs to non-technical people) already know all of this stuff and have for over a decade. I thought it would be presumptuous to go into it without being asked for it. The accessibility issue I've addressed below, with specifics. I can also go into the other reasons, but am short on time right now (can probably do it tomorrow or even later tonight [in my time zone]). The burden on editors does not seem all that huge to me, esp. given both an XHTML way to do it and a wiki template way to do it. It's not any more of a burden than filling in infoboxes and other templates, pipe-linking to articles, and all the other things we do on Wikipedia. "No discussion": We're discussing it now, which was what I expected. I'm surprised it took this long. WP:BOLD is policy, so I don't always feel the need to pre-discuss every idea for a change. I often do, and maybe I should have in this case, but I didn't think it would be resisted, as WP usually does the code-smart thing without a fuss. Why MOSTEXT instead of MOSMATH? Because it's only partially a mathematics issue, and also applies to computer source code and anywhere else a variable might be used, including in plain English (e.g. "the admissions office will only accept x number of out-of-state students per year, but refuses to disclose what number x will actually be for the 2009–2100 university year", and so on), and of course because the language is in MOSTEXT. — SMcCandlish [talk] [cont] ‹(-¿-)› 02:58, 13 September 2008 (UTC)

[copied over from WT:MOSMATH:]

Quote:

If you find <var>x</var> too difficult in some way, try ~~{{VAR|x}}~~ {{var|x}} (same output).

That's a weird suggestion. Repeatedly typing {{VAR|x}} is as bad as repeatedly typing <var>x</var>. OK, so you say there are concerns that math editors may not know about, and that somehow the simpler way of doing things impairs accessibility. Can those statements impress anyone if you don't say specifically what those concerns are or how accessibility could be impaired? Michael Hardy (talk) 02:16, 13 September 2008 (UTC)

Interpolated comment: There's a logic problem here. The idea that "the simpler way of doing things" will necessarily "enable accessibility" has no basis. Generally, the opposite has long, and consistently, proven to be true. — SMcCandlish [talk] [cont] ‹(-¿-)› 10:05, 14 September 2008 (UTC)

~~{{VAR|x}}~~ {{var|x}} is just an option for people who love wikicode and hate HTML. I've already explained this. What part of it wasn't clear? The accessibility one is simple: For most users with screen-reading software there is no difference at all between E = mc2 (E=mc2) or its partially wikified equivalent ''E'' = ''mc''2 (E = mc2), on the one hand, and E = mc2 (E = mc2). This is all presentational markup. When you do <var>E</var> = <var>mc</var><sup>2</sup> (E = mc²) there is a big difference (the variables are identified as variables, and the superscript as a superscript), and there is no impact at all on sighted users. I can go into the other reasons in more detail if requested, but really, shouldn't it be enough to know that without semantic markup, math (and other, e.g. computer science) articles are complete gibberish to a significant subset of users? (PS: Yes, I know no one would do the span thing to get a superscript; I just did not want to mingle both presentational and semantic markup in the same examples.) — SMcCandlish [talk] [cont] ‹(-¿-)› 02:47, 13 September 2008 (UTC)

My 2c: happy if <var>...</var> is presented as an alternative markup, but totally oppose anything describing it as the normal, general or recommended markup. It is overly cumbersome, unnecessary, and we will only have to rescue maths articles from well-meaning editors who try to implement the guideline without properly understanding it or the articles and mess stuff up - as I recently had to sort out a bunch of well intended but disastrous markup changes in triangle group. Gandalf61 (talk) 08:37, 13 September 2008 (UTC)

This recount of the event is incorrect. --Yecril (talk) 10:23, 1 October 2008 (UTC)

Since <var>, not <i> (which is what '' wikimarkup is, when XSLT transformed and sent to the end-user browser), is the "normal, general and recommended" markup for variables according to the W3C HTML 4 and XHTML 1 specifications (the only standards that exist for such markup), then I'm not sure what you are trying to say. I'm genuinely sorry that some dwid mangled the triangle group article and wasted your time – I have to clean up well-meaning but severely misguided edits like that all the time, so I can directly empathize – but that has absolutely nothing to do with the topic at hand. — SMcCandlish [talk] [cont] ‹(-¿-)› 10:05, 14 September 2008 (UTC)

Hmmm. W3C HTML 4.01 spec says that VAR "indicates an instance of a variable or program argument". The words "instance" and "program" clearly show that this refers to software variables, not mathematical variables. I see you have ignored my substantive points and instead focussed on technical nitpicking and condescension. As you seem to be more interested in scoring points than reaching consensus, I see no point in continuing this discussion. Gandalf61 (talk) 12:53, 14 September 2008 (UTC)

And I for my part have no idea what you are talking about, since I just went out of my way to sympathize with your position and to indicate that I understand where you are coming from. So, goodbye, I guess. PS: I cannot agree with your reading of W3C. The clauses separated by "or" are independent; if they were not, the word "program" would have preceded both of them. It emphatically does not say "program variable or [program] argument" (one concept); it says "variable" (one concept) "or program argument" (a different concept). This is moot anyway - as someone else in this debate (on this page or one of the duplicate discussions that I've tried to centralize here) has pointed out, even mathematicians and their style guides say that variables should be italicized, which is what <var> does in a semantic way. I.e., I've yet to see anyone produce any actually viable evidence that <var> is somehow not the appropriate markup. — SMcCandlish [talk] [cont] ‹(-¿-)› 08:07, 16 September 2008 (UTC)

I completely agree with Gandalf and completely oppose making this markup recommended or anything like this. We are just striving to get reasonable articles in terms of content, and overburdening editors with errands of this kind is not helping us in attaining this aim. Jakob.scholbach (talk) 09:22, 13 September 2008 (UTC)

How could we not recommend that markup be used as intended? I see a significant number of angry-ish responses, but none of them give any reason to not use the markup language properly other than convenience. If we were to use convenience as our baseline editing rationale, we should simply delete every single MOS page right now. I want to repeat that I am bordering on shock to see mathematics-focused editors insisting that the (yes, rather geeky) semantics of markup is a pain-in-the-ass waste of their precious time, when mathematicians are among the most strict sticklers for very, very precise (geeeeeeky) markup (on paper, and in wiki pages that mirror paper) that is of crucial importance to them and which is of no importance at all to anyone else. Wikipedia has given you your way pretty much totally on maths matters, much to the chagrin of many; try bending a little for the needs of others. This is supposed to be a collaborative editing environment; for it to be one, compromise is required, not domination. — SMcCandlish [talk] [cont] ‹(-¿-)› 10:05, 14 September 2008 (UTC)

{{VAR}} has advantages over <var>x</var>. The inner workings of the template can be easily changed, so at some point in the future its content could be changed to MathML or whatever. Mixing wiki markup with HTML seems wrong to me.

Does a computer variable have the same semantics as a mathematical variable? They certainly want to be formatted differently: monospace as opposed to serif italic. From Variable#Computer programming:

Variables in computer programming are very different from variables in mathematics and the apparent similarity is a source of much confusion.[citation needed] Variables in most of mathematics (those that are extensional and referentially transparent) are time-independent unknowns, while in programming a variable can associate with different values at different times (as they are intensional).

Math formatting on wikipedia has had a troubled history, we have had rendering engines which produce mathml Blahtex, various bug reports [1], we had a hard job getting any developers interested in the issue at all. --Salix alba (talk) 09:32, 13 September 2008 (UTC)

If it were done in MathML, probably another template would be needed. I think that {{v}} can be usurped, since {{view}} seems to be the same thing (to the extent they differ, probably mergeable), and it is not used manually, but is rather a part of other templates, namely the ones that put "v - e - w" and other such utility shortcuts at the far top right of navboxes, to-do lists and other contentful, boxy templates. MathML could not reasonably be applied to C++ source code and other uses of variables. What I'm talking about now is general use of variables, which presently includes most math articles (which are not doing math-specific things like serif font, etc., anyway). Math wants MathML. Count me in as a supporter. In the interim there needs to be a compelling reason to not use XHTML properly, and so far I haven't seen one. But I realize that Hardy and some others have put the ball in my court to provide the other side more compellingly, so I will do that. I just need a bit of time. I have 10+ relatives showing up in between 9 and 16 hours... Nothing is going to break in the interim. I'm not ignoring you, I'm just pulled too thin by the coincidence of timing to provide the full rationale - the version that does not presume anyone has been steeped in web development for years on end - right this instant. This will be my wikipriority #1, however. — SMcCandlish [talk] [cont] ‹(-¿-)› 10:05, 14 September 2008 (UTC)

In fact, if there's one thing that I think would help math formatting more than anything else, it would be improving <math>. When I counted variables in my previous post, I didn't count italicized functions: f(x) was counted as one variable, because only x is a variable. But f should still be italicized. And if we go down that route, we need semantic markup for everything. I can't imagine that we're going to try to make Wiki markup so expressive that it duplicates all the intricacies of MathML, so the only solution I can see is to make <math> automatically do the right thing; in particular, it should generate more HTML and fewer PNGs so that it can be used better inline. Ozob (talk) 14:58, 13 September 2008 (UTC)

Support, if I may. — SMcCandlish [talk] [cont] ‹(-¿-)› 10:05, 14 September 2008 (UTC)

SMcCandlish wrote:

and there is no impact at all on sighted users. I can go into the other reasons in more detail if requested

"Other"? As in: you've given some reasons, and there are also others? You've said it makes no difference at all to sighted users. That's not a reason for the change. You've also illustrated how it makes things more difficult for those editing articles. That's also not a reason for the change. Then you say you can go into other reasons as well.

"This change makes things difficult for those editing articles, but makes no difference in the way the article appears. In addition, there are also other reasons for the change."

How does that make sense? Michael Hardy (talk) 23:03, 13 September 2008 (UTC)

I'd like to reiterate my complete opposition to this proposal. I am working on group (mathematics), and I can assure you, SMcCandlish, that it is not simply a matter of minutes to get an article like that into good formal shape. We are talking about hours devoted to errands of this kind. To be honest, if the article were not up for FAC, I would hardly care about many style guidelines. As a matter of fact, math editors are burdened with an enormous amount of additional editorial work of this kind. Usual professional environments, like TeX, help us in doing this work elsewhere, but writing math articles is, from the markup point of view, a pain. Already!!

You are saying that 20,000 articles is not enough compared to millions of others? This seems to be pretty much nonsense, why not compare 20,000 with billions of external web pages that all need good formatting?

To conclude, I think we all agree on the overall goal to make Wikipedia as good an encyclopedia as we can. What you are proposing, apparently without any prior discussion with the involved editors (which is outrageous, btw), is a way to counteract that goal. SMcCandlish, please answer a very simple question: do you prefer 10 pages with markup where variables are put out non-italic for 1% of the readers, or do you prefer 5 pages where everybody can read variables properly, the other 5 never being written? Same question with 10 to 6, 10 to 7, 10 to 8, 10 to 9. Jakob.scholbach (talk) 00:21, 14 September 2008 (UTC)

I'm about to fall asleep at the keyboard, so I'll keep this as short as possible without sounding snippish as a result. I hear you and recognize that you are upset. However, please see WP:TEA, WP:MASTODON, etc. If, as you say, you are "outraged" then you need to step away. The sky will not fall in the interim. The worst that could possibly happen is someone could overexcitedly change some articles while not having any idea what they are doing, in which case some reverts will fix the matter. There is no enormous hurry. Secondly, try to see it from the other side. To serious web development people, what you all have done with the math articles is appalling. Unbelievably appalling. You've killed our kitten, and with unfeeling disdain. It's hard to think of an example that may resonate with hardcore math people. <a minute or so> Oh, okay, try this one: Imagine ~~someone~~ a large bloc of someones insisting that every occurrence of "π" be "pi" in every single article on Wikipedia other than those about the Greek language, on the basis that "they're the same" and "inputting that nerdy 'π' code is just too much of a pain", and reacting in a mass action against any move to undo that for reasons that seem to them to be geeky nonsense, before actually hearing out what those reasons might be, and what their ramifications might be, in full. Just thinking of that makes me empathize with how you are feeling, so I hope it has the same effect in the opposite direction. We are having only-intermittent real communication here, but I have faith it can be pretty wide-band if we let tempers cool (mine too; I've gotten fairly testy at some of the wording sent my way that I perhaps should have let roll off, duck-back-wise). And regarding the above template usurpation idea, {{v|x}} is only two characters longer than ''x''.. Anyway, please, please centralize this discussion at WT:MOSTEXT (which is the talk page of the disputed material). 
It is a very hair-pulling exercise to have to deal with detailed, emotional comments mostly from the same parties on three different talk pages. [I refer here to near-duplicate threads at WT:MOSMATH and WT:MATH.] — SMcCandlish [talk] [cont] ‹(-¿-)› 10:05, 14 September 2008 (UTC)

SMcCandlish's position

SMcCandlish keeps expressing how frustrated he feels with the inability of some of us to understand that the more difficult and cumbersome method, that would make all editing take three or four times as long, would achieve results that are no different, and that's a reason to prefer the more cumbersome methods.

He says if we don't adopt the more cumbersome methods things "will just keep getting worse". As far as I can see, the more cumbersome methods would only make things worse by being more cumbersome. In order for his comment about things "getting worse" to make sense, there would have to be something bad, that will get worse. What that bad thing is, he doesn't tell us, and the fact is that I don't know. Why doesn't SMcCandlish inform us on this matter? Or at least attempt to? The proposed change can only discourage contributions to Wikipedia by making editing difficult or impossible.

He tells us that there is some significant portion of Wikipedia users to whom articles will be incomprehensible without the more cumbersome editor-hostile conventions that he wants. Well, I could tell him that gremlins on the planet Pluto have difficulty reading Wikipedia because SMcCandlish is here, and I would at least have the virtue of being specific about who those readers are (gremlins on Pluto) and he would have just as much reason to believe they exist as I have to believe those users he mentions exist. Who are the people who are having difficulties because we're not using his proposed more cumbersome ways of editing? He doesn't say! How do those difficulties come about? Not even a hint from SMcCandlish. He asks whether it isn't enough to know those people exist. But I don't know that they exist. I don't know who they are, what they do, why or how the proposed editor-hostile methods are better for them, and SMcCandlish doesn't see fit to say. Can SMcCandlish even name one such user? If so, he hasn't tried to do so.

More than five years ago we switched from using <var>x</var> to using ''x'', and I and many others have devoted efforts over some years to improving articles by changing the obsolete form <var>x</var> to the improved ''x''. If we are asked to go back to the old form, at least some attempt should be made by someone to give some reason why that should be done. SMcCandlish doesn't want to even attempt to do that. Michael Hardy (talk) 23:49, 13 September 2008 (UTC)

Just to be clear, the affected users I mentioned in that paragraph were (obviously, I thought) the screen reader users I had been discussing in detail immediately above that. — SMcCandlish [talk] [cont] ‹(-¿-)› 10:09, 16 September 2008 (UTC)

Rational and concise – I don't know, I get it. He has explained the semantic web and given many links to aid in understanding what that is. Off the top of my head, there are two cases where the semantic tag works better than simple italicization: searches, e.g. searching for "the variable x" as opposed to searching for "x", and screen readers for visually impaired or podcast users, where the software can read "variable E equals variable mc squared" instead of reading "E equals m c two". I hope that explanation helps. Sswonk (talk) 00:31, 14 September 2008 (UTC)

OK, this part about searching SMcCandlish has never mentioned as far as I've found. Why must we wait for others than SMcCandlish to tell us what they think he might have meant? Michael Hardy (talk) 01:32, 14 September 2008 (UTC)

I don't understand Sswonk's E=mc² example. My understanding is that the markup for "squared" is <sup>2</sup> in both cases - so why does a screen reader read the first case as "squared" and the second as "two"? Or is there a proposal for a different markup for exponents as well as for variables? Gandalf61 (talk) 08:56, 14 September 2008 (UTC)

Well, it was off the top of my head, but basically the reading software would include heuristic code that "reasons" that if a character preceding a superscripted number is declared a variable with a <var> tag, that number is an exponent. That would not be a logical assumption for the software to make if the character preceding were simply italicized. The intent of semantic code is to allow software, including but not limited to search engines and screen readers, to make decisions such as this with much greater authority in a wide variety of situations. Sswonk (talk) 15:27, 14 September 2008 (UTC)
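The heuristic described in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name SpeechSketch, the function speak, and the spoken phrases are invented here, and no claim is made that any real screen reader works this way. The rule is deliberately naive and would misfire on superscripts that are not exponents.

```python
from html.parser import HTMLParser


class SpeechSketch(HTMLParser):
    """Hypothetical reader: turns a small HTML formula into spoken phrases."""

    def __init__(self):
        super().__init__()
        self.spoken = []          # phrases in reading order
        self.stack = []           # currently open tags
        self.prev_closed = None   # tag closed most recently (reset by text)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        self.prev_closed = tag

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        current = self.stack[-1] if self.stack else None
        if current == "var":
            self.spoken.append("variable " + text)
        elif current == "sup":
            # Naive rule: a superscript directly after a <var> is read as a power.
            if self.prev_closed == "var":
                self.spoken.append("to the power " + text)
            else:
                self.spoken.append("superscript " + text)
        else:
            self.spoken.append("equals" if text == "=" else text)
        self.prev_closed = None


def speak(html):
    """Return the spoken form of a small HTML formula as one string."""
    parser = SpeechSketch()
    parser.feed(html)
    parser.close()
    return " ".join(parser.spoken)


print(speak("<var>E</var> = <var>m</var><var>c</var><sup>2</sup>"))
print(speak("<i>E</i> = <i>mc</i><sup>2</sup>"))
```

With semantic markup the sketch has something to key on; with plain italics it can only fall back to reading out the formatting.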

Quite often the number after a variable is a subscript; and superscripts after variables may well be tensor notation, not exponents. Septentrionalis PMAnderson 17:08, 14 September 2008 (UTC)

...Oh, I should have said, not that he "tells us that there is some significant portion of Wikipedia users to whom articles will be incomprehensible without the more cumbersome editor-hostile conventions that he wants", but rather that he acts as if we already know that. I still don't see any reason to suspect that any such people exist. Michael Hardy (talk) 00:42, 14 September 2008 (UTC)

This thing is crap. I reject it. Loisel (talk) 10:09, 14 September 2008 (UTC)

"I don't like it" arguments are invalid in WP consensus discussions, and the community has rejected them. Please try something more responsive to the issues raised. — SMcCandlish [talk] [cont] ‹(-¿-)› 20:06, 14 September 2008 (UTC)

Patience please. Because my requests to centralize discussion here have largely been ignored, I've been trying to address all of these concerns in three different places, and am now out of time for today. I will try to get back to all of this as soon as possible, probably in a bit under 24 hours as of this timestamp. In the interim, I have addressed some of this already at both WT:MOSMATH and WT:MATH before I tiredly arrived back here again. Please respond here not there, or the discussion will continue to be pointlessly fragmented. Very short response to a couple of the above issues: "Getting worse": I thought I made that really clear, but I'll spell it out again. The fact that a pretty conceptually simple change is angsty is entirely because of the failure to fully separate the content from the presentation despite the fact that this has been advised since ca. 1993 and codified in HTML 4 (now XHTML 1) and CSS 2 ca. 1998 (or maybe 1996; I'm tired and misremembering). Continued failure to get this right will only lead to more and far worse (scale-wise) problems of this sort as WP continues to grow and grow; 20,000 is a much preferable number to 120,000. "Significant portion of Wikipedia users": I must be writing very poorly this week if the large paragraph of usability material immediately preceding my mention of this significant proportion didn't make it entirely clear that the referents are the visually impaired users I had just been talking about (was that a bad place for a paragraph break or something?), for whom in most cases non-semantic italicization simply doesn't exist. More later. I haven't even gotten to metadata, the semantic Web, more basic reasons for content-presentation separation, actually-proper mathematical style, etc. — SMcCandlish [talk] [cont] ‹(-¿-)› 20:06, 14 September 2008 (UTC)

For mathematicians: The idea of content-presentation separation is one of the things that underlies TeX and LaTeX. If you want to start a new section, you write \section{...}. You don't write \textbf{\large ...} even though the presentation might be the same. One of the advantages of \section is that when you use another style for the article (for instance \documentclass{amsart} for AMS journals instead of the standard \documentclass{article}), all the section headings will change automatically to match the new style.

To SMcCandlish: I'm afraid I don't see many practical advantages of using <var>. All the stuff about semantic Web and content-presentation separation is theoretically very nice, but how does it help us in practice? The use of screen readers by visually impaired users could be something, but again you need to be more specific. As far as I remember from the one or two times that I tried out a screen reader, they spell out words they can't pronounce, so both ''E'' = ''mc''<sup>2</sup> and <var>E</var> = <var>mc</var><sup>2</sup> will be read as "ee is em cee superscript two" (I forgot what they do with superscript, and it probably depends on the actual software you use).

My opinion: That it's a lot of work is not an argument. If it's better to use <var> then the Manual of Style should say we should use <var>. If you think that the advantages of using <var> are so small that it is not worth the effort, then ignore it. For instance, I write sin(''x'' + ''y'') even though it is better to use non-breaking spaces, as in sin(''x''&nbsp;+&nbsp;''y''), because I can't be bothered to write the latter. Similarly, I probably won't bother to use <var>.

However, it's not clear to me that it is actually better to use <var>. The benefits are small, and have to be balanced against the disadvantages. One disadvantage is that it hurts accessibility for editors. The wikicode is already hard to read, and using <var> or {{var}} or (even worse) both plus variant templates mixed together inconsistently makes it even harder to read. -- Jitse Niesen (talk) 03:24, 15 September 2008 (UTC)

I appreciate your TeX example, since it may help maths editors here wrap their minds around the XHTML semantics issue a little better. Screen readers: It will vary from application to application. Some crude ones probably just do spit out bare letters, but modern (expensive), smarter ones are certainly aware of semantic markup, and would say "variable capital E equals..." As someone else noted above, how it would handle a superscript is also going to vary from app to app; I don't have the money to prove that there is one right now that will cleverly decide something is "2 squared" or "to the second power" based on numeric context, instead of saying "superscripted 2", but I'm not sure that's relevant. If there isn't one now, some day there will be, and Wikipedia is not a temporary project - we all expect it to outlive us, or we wouldn't bother working on it. I agree with you that "it's a lot of work" isn't a solid opposition position, and that (as with everything else MOS) those who find it too much trouble will ignore it. That's fine; the encyclopedia works day after day with this happening all the time, and stuff that needs tweaking slowly gets tweaked by people who enjoy doing such cleanup work. I understand the editor readability issue. I would not advocate mixing styles at all! The question to my mind is whether one more template or element among a huge pile of them that we're all used to is somehow going to be project-fatal. I'm skeptical that it would be, of course. — SMcCandlish [talk] [cont] ‹(-¿-)› 08:20, 16 September 2008 (UTC)

<subject>This</subject> <verb>is</verb> <complement>retarded</complement> <punctuation>.</punctuation> <signature>Loisel (talk) 14:53, 15 September 2008 (UTC)</signature>

Come on, if you're going to comment, can you at least be WP:CIVIL? One more comment like that and I'll block you. --Salix alba (talk) 16:10, 15 September 2008 (UTC)

Well, that example would be incredibly crappy. But it's a slippery slope and straw man fallacy, since no one has actually proposed anything like that (and XHTML doesn't support it anyway, not having tags like <subject>). — SMcCandlish [talk] [cont] ‹(-¿-)› 08:20, 16 September 2008 (UTC)

Rationales outline

Reasons to use {{v}} – after it is usurped; its code would be that presently at {{varserif}} – for maths variables

It's a boon to accessibility – modern screen reader software can actually inform the reader that {{v|E}} = {{v|mc}}<sup>2</sup> is "variable capital E equals variable m c, superscript 2" (and maybe even better; it will depend on the screen reader app), instead of "e equals m c, superscript 2". Failure to use every feature of XHTML that can be used to enhance accessibility is a decision to intentionally alienate a substantial subset of our readers simply because we can't be bothered.
It has no negative or other effect on visual appearance, for sighted users.
{{v|E}} is only 2 characters longer than ''E'', but much richer. The "expense" of 2 characters in human editing time is practically nil, even in an article with 100 cases. If template name length were a significant factor, no one would use template names like {{fact}} or {{clarifyme}} – frequently used ones would be shortened as much as possible to things like {{f}} and {{cm}}, and people would rather have template names like {{f2}} than {{fact}} if {{f}} were already taken.
Finally get the format right, for real: The template can be customized (e.g. to use CSS to use a serif font, which I'm told above is actually the style genuinely called for in mathematics for variables). The fact that the current italicization practice here is only half of the actual desired output, and thus does not really follow mathematics standards, is a pretty severe problem to surmount for those arguing to keep the plain italics.
There is no big hurry to update the 20,000 or so maths articles that may need conversion. Conversion should start with a) Featured articles and FACs, and b) new articles. WP has no deadline.
There is no mandatory anything here - WP:MOSTEXT, and WP:MOS as a whole, is just a guideline. Editors ignore what they find too burdensome, and other gnoming editors clean up after them happily. It's just the way it works, and it seems to work just fine.
Page conversion is actually much easier than has been decried. For most maths articles, there will be few instances of italics for reasons other than denoting variables. Read the article and note these cases. You can actually just convert them to <em> for emphasis (another good semantic and accessibility move to begin with!) and <i> for typographic-only italics (e.g. book titles). All that is left italicized will be variables (and perhaps other mathematical items that are also italicized, but which you will already have converted to <i> markup). A quick global search-replace will convert these variable remainders to {{v}}. If you personally find this too much work, then just ignore it and do something else you find more interesting; someone else will deal with it later.
Microformats, the semantic Web, and Web 2.0: All of these could be their own very detailed bullet points, but it may help to keep it simple. HTML-embedded microformats are entirely dependent upon the proper use of semantic XHTML markup. The myriad potential uses for them are only barely being explored today, but already result in embeddable hCards and hCalendars, auto-generated Google maps from geo microformat data (and even several at once - hCards can embed geo metadata, etc.), and so forth. The examples at the hCard, etc., articles mostly show rather clumsy use of the microformats – just by way of <div> and <span> – but it can be done much more interestingly and seamlessly by applying the microformat classes to even more semantic elements (see some off-WP tutorials for good examples). While I have not researched every single microformat in existence, I would be shocked if there were not some already making use of <var>, and more will surely come. The rel-tag microformat could certainly make use of <var>, as could some potential uses of the XOXO mf and obviously the forthcoming measurements mf which will necessarily have to include mathematics. The semantic Web, the ongoing natural evolution of the Web into a more meaning-rich (as opposed to simply pretty) environment, more easily processed by software to do things helpful for readers, likewise depends completely on separation of presentation and content. Wikipedia must do this, or it will fall behind and become technologically and information-architecturally archaic in only a handful of coming years. The constantly-developing "Web 2.0" demands semantic content, not author-side convenience or 1994-style "just make it look right, forget other concerns" markup techniques.
Search: As someone else noted, it will make searching easier, since "E" matches all manner of things, but {{v|E}} will be quite specific.
Cleaner code: It's just better coding practice to separate presentation and content; this has been a basic Web principle since the mid-1990s (it is why CSS was developed in the first place). Keeping them separate makes maintenance and growth vastly easier, as well as all the other benefits already mentioned.
Content repurposing and code portability: WP has been built from day one with many goals in mind, and one of these is that its content be reusable any way that anyone sees fit, by any means (within the GFDL license terms). We cannot predict what forms this re-use will take, but we can be certain that some of them will use different formatting. WP content that, like maths material, depends on particular formatting in order to make sense needs to express that formatting as semantically as possible, not just visually, for the content to be genuinely portable. For instance, one re-use paradigm might convert italics to underlines, but not do this for <var>, respecting the maths world's specification that variables are italic and serif.
Consistency, and editor "education". The current practice of simply slapping italics on things that mathematicians understand to be variables is confusing or even pointless to non-specialist editors, many of whom mistakenly assume these to be excessive emphasis and remove the italics, while others come under the incorrect impression that equations in general should be completely italicized, and so forth. Use of a documented template, which in turn uses a pretty self-explanatory XHTML element, clearly identifies variables in particular as the items being marked up, and marked up in a very specific way. Over time this will a) lead to more non-specialist editors being aware that maths variables are styled a particular way and how to do it, and b) engender increased consistency of math handling throughout WP. Also, as noted, the current just-italicize styling isn't even correct mathematical style anyway, so there is no logical basis on which to retain it.
Prepares for MathML: Eventual conversion to MathML will be greatly aided, by variables already being clearly pre-identified, and thus easily search-replaced with MathML markup.
Tools: Entry can be made easier with additional tools. For example, a button could be added to the editing palette to insert this code, and various user scripts could also be devised (e.g. to convert all italics to {{v}} in a selected block). There is also already a WP user script for making {{ and }} highlighted in the editing window, and other tools will be developed over time that make editing more visually clear.
Sets example: We also need to keep in mind that what Wikipedia does generally has a large influence on what all Wikimedia Foundation projects do, and even what a huge number of non-Foundation wikis do. Good semantic markup practices here inspire them elsewhere, and lazy, sloppy coding here helps retard the spread of best practices. Also, WP editors actually making more use of semantic elements will help spur the MediaWiki developers to make the underlying web app behave more semantically as well.
The past was the best time to do this. Now is okay. Later will be worse, because there will be even more articles, and more variables in many extant articles, to convert.
It's what <var> is for: Despite someone's strained reading above, the HTML/XHTML specs clearly state that <var> is for "a variable", not "a computer variable" in particular. Specs documents like that are very, very carefully worded and reviewed by hundreds of people for precision and clarity; they mean exactly and quite literally what they say probably >99% of the time. The allegedly problematic distinction between mathematical and other variables is not helped at all by simply italicizing them both. That makes things far worse, because the markup is instantly confused with both emphasis and italics-for-display-purposes-only, by many if not most editors. The difference between the eventual {{v}} (italics and serif) and {{var}} (italics only) tags and their attendant CSS will help, not hinder, distinguishing between these two variable types and displaying them properly.
Serious maths editors are already well aware of the value of content/presentation separation and semantic markup, as TeX and LaTeX use them, and really we've all used it in our heads since we were kids (when you make a heading in a school paper, you think of it as a heading – a semantic item – not "16-point and bold just for the heck of it").
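The final search-replace step of the page-conversion item above could look roughly like the following Python sketch. It assumes the manual pass has already been done, so that every remaining ''...'' pair marks a variable; the function name italics_to_v is invented here, and {{v}} is the proposed template, not its current use. Bold ('''...''') runs are left untouched.

```python
import re

# Two apostrophes, a run without apostrophes or newlines, two apostrophes.
# The lookarounds skip ''' (bold) and ''''' (bold italic) wiki markup.
ITALIC_RUN = re.compile(r"(?<!')''([^'\n]+)''(?!')")


def italics_to_v(wikitext):
    """Rewrite each remaining ''x'' as {{v|x}}; assumes all are variables."""
    return ITALIC_RUN.sub(r"{{v|\1}}", wikitext)


before = "''f''('''x''', ''t'') = ''t''α'''x'''•'''v'''"
print(italics_to_v(before))
```

A plain text replacement like this cannot tell a variable from emphasis or a book title, which is why the read-through described above has to come first.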

Arguments for using {{var}} for computer variables

Almost all of the above; the only real difference is that the variable would only be italicized, not italicized and serif-fonted.
<var> works synergistically with other semantic markup in the CS context, including <code>, <kbd>, and <samp>. (Not all of these are presently enabled in MediaWiki; a MW Bugzilla bug report has been filed on the matter.)

Arguments against them

Not as convenient as just italicizing.
Over 20K articles to which to apply this change, some of which might use dozens of variables, or even over 100
Not practicably scriptable as a bot
Source code a little more difficult to read, due to addition of another markup item.
Comparatively low number of directly benefited readers.
Indirect benefits are not immediately apparent or important to maths editors.

That's all I can think of in one sitting, and sorry it took so long. I'm going to go self-revert the change to the guideline until this plays out more, especially since the solution now advanced requires convincing anyone with an interest in the extant {{v}} to merge it with {{view}} and convert existing deployment to the latter so that we can usurp the former. — SMcCandlish [talk] [cont] ‹(-¿-)› 10:07, 16 September 2008 (UTC)

I think you are severely underestimating how unreadable this will make wikicode for formulas. This is already present in your incorrect use of the markup. The code <var>E</var> = <var>mc</var><sup>2</sup> should in fact read <var>E</var> = <var>m</var><var>c</var><sup>2</sup>, as {{v|m}} and {{v|c}} are two separate variables and not a single variable {{v|mc}} which needs to be squared. (Or, while I'm nitpicking, it should read <var>E</var> = <var>m</var>c<sup>2</sup>, because c is a constant.) Basically you're asking to add markup to almost every other character appearing in a formula, much like the example given above of a sentence in which every word has been marked. A consequence of unreadable code is that it will increase the number of mistakes made in editing and thereby deteriorate the factual accuracy of the Wikipedia math content.

There is a reason that math editors like TeX and LaTeX assume a character that has not been marked differently to be a variable.

That being said, your suggestion of a {{v| }} template is already much better, and may in fact improve readability over the use of quotation marks, which can be confusing if a single formula contains both scalar and vector variables.

This, by the way, is somewhat of a problem with your proposal; what to do with vector variables? Those should be typeset in a non-italic boldface (as <math>\vec{x}</math> is not an option for HTML), but should semantically still be marked as being a variable. I don't think it is practically possible to provide full consistent semantic markup for formulae with simple wiki markup. The inability to apply this consistently might be a reason to not apply it at all, as inconsistent implementation may actually do more harm than good. (TimothyRias (talk) 12:46, 16 September 2008 (UTC))

As I hope is already clear, I've backed away from the "just say <var>" position, in favor of {{v}} (i.e., the code presently at {{varserif}}). For vector variables, {{vv}} is available, and it would take about 1 minute to create that, using <var> to correctly semantically identify it as a variable ([X]HTML not being any more specific than that today), but using CSS to style it as you say, non-italic and boldface - locally-specified styles override globals (that's the "C" in CSS; apologies to all readers who already know this, but I don't want to assume everyone understands how the style cascade works just because I do; such assumptions have already bitten me in the butt here). So, we should (at least so far!) be able to do the full semantic markup with XHTML, though as you note not with wikimarkup ('', etc.) It won't be godlike, but it should suffice and (perhaps more to the point for maths editors) it will also provide a really clear way to identify particular items that need to someday be converted to equivalent but differently-coded particular items in MathML when MediaWiki almost-inevitably but who-knows-when finally supports that. I would agree that an inability to apply the markup consistently would be worse than useless, but fortunately we aren't actually stuck with that, because CSS is so flexible. NB: We could also add XHTML classes to {{v}} and {{vv}} to make them further identifiable, and styleable as a class. Question: Would vv need to be non-italic but bold (only), or non-italic, bold and serif? It's 5 seconds coding difference; just want to be clear. PS: My "incorrect use of the markup" was actually an incorrect assumption about relativity; I was not certain at the time of writing that m and c needed to be treated separately or I would have coded it that way, certainly. The idea did occur to me, but I went with the simpler markup on a (bad) hunch. That's probably hysterically funny to you maths geniuses. :-) The really funny part is I actually knew that c was a constant already, from another context, and just plain forgot. So, yes, I certainly agree with your final version – if we were to propose simply using <var> which obviously isn't viable. Templates with embedded style actually get us much closer to where we want to be (MathML being the actual location...) In closing for now, yes, I am asking to add markup to almost every other character appearing in a formula. But this is already being done anyway. It's simply different, richer markup, that is slightly more complex on a character-by-character basis (more keystrokes) but far richer, and easier to read, and more likely to lead to better formatting in 6 months or 6 years or however long it takes for MediaWiki/MathML integration. — SMcCandlish [talk] [cont] ‹(-¿-)› 12:04, 17 September 2008 (UTC)
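To illustrate the point about pre-identified variables easing a later move to MathML, here is a rough Python sketch. It is an assumption-laden illustration, not a MediaWiki feature: the function names are invented, {{v}} and {{vv}} are the proposed (not yet existing) templates, and the output uses only the Presentation MathML <mi> element with its mathvariant attribute.

```python
import re
from xml.sax.saxutils import escape

# Matches the proposed {{v|...}} and {{vv|...}} template calls only.
TEMPLATE_CALL = re.compile(r"\{\{(v|vv)\|([^{}|]+)\}\}")


def template_to_mi(match):
    """Map one template call to a MathML identifier element."""
    name, symbol = match.group(1), escape(match.group(2))
    if name == "vv":
        # Vector variable: upright bold identifier.
        return '<mi mathvariant="bold">' + symbol + "</mi>"
    # Scalar variable: single-letter <mi> renders italic by default.
    return "<mi>" + symbol + "</mi>"


def to_mathml_identifiers(wikitext):
    """Replace tagged variables with <mi> elements; leave the rest alone."""
    return TEMPLATE_CALL.sub(template_to_mi, wikitext)


print(to_mathml_identifiers("{{v|t}}{{vv|x}}"))
print(to_mathml_identifiers("''t'''''x'''"))
```

The second call returns its input unchanged: with apostrophe markup alone there is nothing that marks the letters as variables, so there is nothing for a converter to find.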

Version 2

SMcCandlish, I would like to know how you would like the following formula to be entered under ideal conditions. You're allowed to pretend anything you want about Wikipedia markup for the purposes of this exercise—just tell me how, in an ideal world, you'd like this to be entered:

<math>f(\mathbf{x}, t) = t^{\alpha} \mathbf{x} \cdot \mathbf{v}</math>

Here x and t are variables and α and v are constants. At present I would enter this as follows:

''f''('''x''', ''t'') = ''t''α'''x'''•'''v'''

f(x, t) = t^αx•v

How would you like it entered? Keep in mind that x and v are vectors, so they must be boldface, and α and t are scalars, so they must be italic. Also keep in mind that only x and t are actually variables, whereas α and v are constants and f is a function. Ozob (talk) 00:02, 17 September 2008 (UTC)

Making reference to my immediately-previous discussion with TimothyRias, I suggest the following (again: this actually working would depend upon a probably-viable template usurpation of {{v}} and creation of a new bold, non-italic (and serif? you tell me...) {{vv}}):

~~{{v|f}}({{vv|x}}, {{v|t}}) = {{v|t}}α{{vv|x}}•{{vv|v}}~~ See better example below.

It's only 9 characters longer than the original, but much easier to parse, since it is not a jumble of apostrophe streams, and uses bracketing to surround things (something mathematicians are certainly familiar with!) Further benefits are that any editor around for more than a few days knows that any template almost certainly has documentation, and these things would have semantic value (esp. with classes added, e.g. class="var-scalar").

I'm not 100% certain I read your original correctly. I get the impression that your prose-form α referred to your code-form α, but the latter was not boldfaced in your example, while v was, and both are said to be constants. But shortly after you say that v is a vector variable, not a constant. I am thus confused on various points, such as whether constants should be bolded, italicized or neither, and/or whether other typographic constraints are needed. As an example, I will assume for the sake of argument (trusting your prose over your code, since after all we write in natural language better than we do in code) that:

Function: italic, sans-serif
Scalar variable: italic, serif
Vector variable: boldface, non-italic, serif
Constant: boldface, non-italic, sans-serif
f: function
α and v: constants
x: vector variable (I dropped v as a vector, since previously said to be a constant)
all others: scalar variables
All of the above is arbitrary and just for the sake of example; we might as well substitute "green text" for "serif" or "t" for "x"

and I stipulate the following:

A. {{v}} can practicably be usurped pretty quickly (it is mostly used by navbox meta-templates; its huge "what links here" is due to transclusion of them)

B. {{vv}} is presently unused

C. {{fn}} can be immediately usurped after AWB cleanup, as it is deprecated

D. A {{ct}} template could be needed because there could be special formatting concerns for constants [this may be a completely false assumption, in which case coding would be simpler] or a need to semantically ID constants as constants, even if no display effects are called for.

E. {{ct}} can be immediately usurped after AWB cleanup, as it is just a shortcut to {{Cfd top}}

F. Each of these (eventual) templates will use CSS styling to achieve the formatting listed above (and using <var> for variables, only, the rest being <span>s, and in both cases with classes identifying them by type).

G. We'll need to gain consensus here and either usurp each template needed in turn with buy-in from editors responsible for the current deployment to the extent there is any, or take the matter to WP:RFC or WP:VP to gain WP-wide consensus to usurp them all at once; preferably the former.

The code I envision is the following, assuming D, above, is true (that constants need CSS styling):

{{fn|f}}({{vv|x}}, {{v|t}}) = {{v|t}}{{ct|α}}{{vv|x}}•{{ct|v}}

And the Einstein example would be:

{{v|E}} = {{v|m}}{{ct|c}}<sup>2</sup>

I think I got that right; I'm getting quite tired (it's 5:27 a.m. my time). There may be a typo in there, but I've read it over five times and I don't find any. If there is nothing at all stylistically that needs to be done for constants, the markup would be simpler. Then again, actually identifying them positively as constants could be a good thing.

Compare that to the hard-to-parse original version:

''f''('''x''', ''t'') = ''t''α'''x'''•'''v'''

Computer code variables would simply be:

<var>foo</var> or {{var|foo}}

So, how about that?

PS: Is "•" really the "multiplication-dot" symbol? Or is it "·" or one of the various other bullet-ish things? The "•" one seems overly large to me, and looks like simply a list-item bullet. May just be an issue with my monospaced editing font, though; it looks smaller in the rendered text.

PPS: It looks to me from the <math> example that the "f" for the function is not actually a regular "f" at all, but the "florin" symbol or something very close to it: ƒ

PPPS: There may well be other things to template (exponents, roots, sets, cardinalities, whatever; I don't care about the particulars), but it should be easy.

PPPPS: The apparent desire to have the function designator be larger can also easily be done with CSS.

— SMcCandlish [talk] [cont] ‹(-¿-)› 12:04, 17 September 2008 (UTC)
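Stipulation F above can be made concrete with a short Python sketch of what the four proposed templates might expand to. Everything here is hypothetical: the function expand is invented, the templates do not exist in this form, and of the class names only "var-scalar" appears in the discussion above; "var-vector", "function", and "constant" are placeholders. The sketch emits classes only and leaves all visual styling to a stylesheet.

```python
import re
from html import escape

# Hypothetical mapping: proposed template name -> (HTML element, class).
MARKUP = {
    "v": ("var", "var-scalar"),
    "vv": ("var", "var-vector"),
    "fn": ("span", "function"),
    "ct": ("span", "constant"),
}

CALL = re.compile(r"\{\{(fn|vv|v|ct)\|([^{}|]+)\}\}")


def expand(wikitext):
    """Expand the proposed template calls into class-bearing HTML."""
    def render(match):
        element, css_class = MARKUP[match.group(1)]
        return '<{0} class="{1}">{2}</{0}>'.format(
            element, css_class, escape(match.group(2)))
    return CALL.sub(render, wikitext)


print(expand("{{fn|f}}({{vv|x}}, {{v|t}})"))
print(expand("{{v|E}} = {{v|m}}{{ct|c}}"))
```

Only the variables get <var>; the function and constant fall back to <span>, since XHTML has no dedicated element for them.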

Some issues:

There is no general consensus as to exactly how formulas should be formatted. Take vectors as an example: italic boldface or upright boldface? There is no clear consensus on this, and in the end it is up to the author's preference. (The Springer style guide, for example, suggests using bold italic vectors for physics publications and upright bold vectors for mathematics publications.) Force either as a general style guide for WP and it will upset at least one community. This will lead to endless and fruitless discussions on what is the "better" convention. The general practice in such cases is to let the editors of a particular article decide which style to use. This is, however, incompatible with using a semantic markup that fixes the visual presentation.
The above proposal will lead to the need for a very large number of templates. Mathematics is just very rich with different objects that would need different formatting. Now we have functions, scalars, vectors, variables, which can appear in different combinations, but what about tensors, vector-valued functions, or antisymmetric p-forms? The list of possible objects is pretty much endless. And in the end there is very little general consensus as to how to format them. Luckily, only a few types of objects will usually appear in a single article, and typography only needs to differentiate between the different types used while relying on the context to make clear what is what.
In light of the above issues, your proposal will inevitably lead to articles using some weird Frankenstein hybrid between your proposal and whatever is being used now as a means of direct markup. This, in my view, will provide a much bigger disservice to people reliant on screenreaders than just providing the proper formatting as straight-up markup. (I assume most more advanced screenreaders will have a setting that explicitly spells out all the formatting, which might be necessary to properly understand mathematics articles.)

Generally speaking I think your proposal should be fine for articles using only very basic equations. But for anything more advanced it will lead to a huge mess. (TimothyRias (talk) 14:18, 17 September 2008 (UTC))

That's fixable. Begin an article or section with a {{maths}} or {{physics}} that contains a <div class="maths">, and end it with a {{mathsend}} or {{physicsend}} that closes the div. The site-wide stylesheet would define all this stuff, and {{fn}} (with <span class="function">) would simply inherit: div.maths span.function { styles here; } versus div.physics span.function { different styles here; } – the CSS wouldn't actually be in {{fn}}, {{v}}, etc., just classes. If chemistry and geology and whatever use further different styles, just add more classes. Trivial.
Tensors, p-forms, etc.: Unless they need special styling to be distinguishable from each other, and there's a standard way of doing it, no need to template them. My suspicion is we can already drop constants (proposed {{ct}}) from the list. Actually we could drop everything but variables, the only one we have an XHTML element for; I just came up with the rest mainly to show that it could be done. Perhaps this is a good KISS principle case. The simpler version would just be (using non-physics function style in this case) ''f''({{vv|x}}, {{v|t}}) = {{v|t}}α{{vv|x}}•v, and {{v|E}} = {{v|m}}c<sup>2</sup>.
Not so messy after all! The point of taking it further is to show that the underlying code systems can handle whatever needs to be done, if used intelligently. All I'm actually directly advocating is <var> with appropriate styles as applied by a template with embedded classes or local CSS. Screen readers can recognize them as variables since marked up as such. I don't think users of them turning on the feature to read out all formatting (italics, etc.) can be depended upon just because someone hits a math article, and math appears in many, many pages that are not math articles. At least identifying variables will be a boon to them. — SMcCandlish [talk] [cont] ‹(-¿-)› 19:29, 17 September 2008 (UTC)
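
The inheritance scheme described above can be sketched roughly as follows. This is only an illustration under assumptions: the class names (maths, physics, function, vector) and the particular style rules are hypothetical, chosen to mirror the bold-italic versus upright-bold vector conventions mentioned earlier in the thread; no such site-wide stylesheet is claimed to exist.

```html
<!-- Hypothetical site-wide rules: same template output, different look per wrapper -->
<style>
  /* Mathematics convention (assumed): upright bold vectors */
  div.maths var.vector   { font-weight: bold; font-style: normal; }
  /* Physics convention (assumed): bold italic vectors */
  div.physics var.vector { font-weight: bold; font-style: italic; }
  /* Function names, styled per field; the values are placeholders */
  div.maths span.function   { font-style: italic; }
  div.physics span.function { font-style: normal; }
</style>

<!-- What {{maths}} ... {{mathsend}} might emit -->
<div class="maths">
  <span class="function">f</span>(<var class="vector">x</var>, <var>t</var>)
</div>

<!-- What {{physics}} ... {{physicsend}} might emit around the same template calls -->
<div class="physics">
  <span class="function">f</span>(<var class="vector">x</var>, <var>t</var>)
</div>
```

In such a setup the templates would carry only class names, so changing a field's convention would mean editing one stylesheet rule rather than every article.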

To reply to your last point, screenreaders explicitly identifying every variable in an equation as such will in most cases actually hinder the understanding of a formula. Of all the possible things that mathematics as a language expresses through formatting, whether something is a variable is probably the least relevant. The <var> tag was never intended to be used in formula markup; its main purpose is to identify variables when used inside a sentence, in which case identifying them as such may be useful. But in the context of a formula it just isn't, unless all the other semantics are also tagged and identified. (TimothyRias (talk) 08:33, 18 September 2008 (UTC))

I don't think you made any typos in the equation, and the semantics you've chosen are correct for the equations we have at hand. Regarding the multiplication dot, I agree that • is too big but it was all I could think of at the time. Also, the f really is just an italic letter f, not even stylized, but there are a very few math articles (like derivative) that use &fnof; to get an ƒ resembling what you'd find in a book.

Now that you've answered my question, I should tell you what I was really curious about. You see, one problem that occurred to me is that with semantic wikimarkup you have to specify all the semantics in advance—otherwise, you don't have appropriate templates at hand. Now, you might have proposed letting {{v}} mean "italic variable" and something like {{vb}} mean "bold variable", but that seems contrary to your intent, even though it is simple to do. Instead your idea is to semantically specify everything. So here are some other equations:

{\mbox{ch}}(f_{\mbox{!}}{\mathcal {F}}^{\bullet })=f_{*}({\mbox{ch}}({\mathcal {F}}^{\cdot }){\mbox{td}}(T_{f})).

(From Grothendieck–Hirzebruch–Riemann–Roch theorem)

H^{q}(X,L\otimes \Omega _{X/k}^{p})=0.

(From Kodaira vanishing theorem)

\int _{M}d\omega =\int _{\partial M}\omega .

(From Stokes' theorem)

And here they are in HTML:

ch(f_!F^•) = f_*(ch(F)td(T_f)).

H^q(X, L ⊗ Ω^p_X/k) = 0.

∫_M dω = ∫_∂M ω.

(Yes, the bullet is really supposed to be that big in the first equation; italic F is usually considered an acceptable substitute for calligraphic F and is sometimes used even in published books; and I've modified the notation on Stokes' theorem from that of the article not just so that it can be translated to HTML but also to reflect my own typographic preferences.)

None of the things appearing in any of these equations are vectors or variables. There are some functions (f in the first theorem and its decorated versions). But the things in the first theorem are sheaves and classes in K-theory, the things in the second theorem are cohomology groups, algebraic varieties, and line bundles, and the things in the third theorem are manifolds and differential forms. One could plausibly argue that some of these, like Ω^p_X/k, are constants, but the only reasonable interpretation of ch and td is that they are functions (they are the Chern character and the Todd class), yet it's standard to put them in roman type. (See, e.g., Fulton, Intersection Theory.) And while X and M certainly aren't changing, that's because they're spaces; usually one doesn't refer to a space as a constant.

It's very unlikely that we could agree on a small number of semantic identifiers that would appropriately distinguish all the different objects appearing in the above equations. If I might hazard a guess, one of the reasons why I think so many mathematicians have objected so vocally to your proposal is that we're aware of all the possible semantic differences (all of them, even ones you might not make) and the prospect of encoding them is nightmarish. Even if we decided to use a large number of semantic identifiers, we'd face the problem of matching standard presentation: Some functions are in roman type, not italic (an elementary example is sine and cosine). I don't know how to get around this without explicitly specifying "italic function" versus "roman function", because there will always be functions one hasn't heard of which should be upright (you'd probably never heard of ch and td before, for example). Which puts us back where we started, using {{fn|f}} but {{fnr|ch}} (fnr = "function, roman").

You know more about how web presentation works. Do you see a solution? Ozob (talk) 18:20, 17 September 2008 (UTC)

Yeah, that sounds like pretty much the same "a very large number of templates" objection that TimothyRias has, and I'm sold on it. As I replied to him (I've inserted that before your post, as there was an edit conflict but since mine is responding to numbered points in his, it's easier to follow with mine first), it wasn't my intention to create a new templated class for every possible maths object, just to show that whatever classes were wanted could be created and applied easily. The only one I personally care about is the <var> case (styled as per {{v}} and {{vv}} as needed, and with {{var}} for CS variables), since that's not just a <span> with visual style attached via a class, but is an actual XHTML element intended for variables that we are simply wasting.

That said, I think it would not be particularly difficult to create a new embedded microformat for maths-in-XHTML using divs and spans (and vars!) with CSS style classes. For every object desired, just create a class for it, with styles (for example, one might want one for functions that looked more like <math> output, with a larger, italic, serif "f", maybe something like div.maths span.function { font-size: 200%; font-style: italic; font-family: serif; }). XSLT could then be used to transform such markup into MathML or whatever by scripts, other scripts or applications could extract the data from the article (e.g. just "rip" the equation, and insert it into a LaTeX document). And so on. That's quite a bit more ambitious than simply getting <var> used more often, but it could be done pretty easily (not counting the implementation in extant articles, of course) if WP:MATH or WP:UF saw a need for it. So far it sounds like this would be too difficult to use from an editing perspective.

Oh, and there wouldn't be any particular problem with having separate templates for italic and roman functions defined generically in {{fn}} and {{fnr}} instead of as very specific objects. Some hardcore CSS purists might object, being a little too gung-ho about object-oriented Web development, but they'd be missing the point that there's nothing wrong with a more generic style class to apply if it obviates the need to have an ever-growing number of objects that would eventually exceed the ability of people to remember them all, which would clearly be the case here. But this sounds like a moot point anyway. — SMcCandlish [talk] [cont] ‹(-¿-)› 19:29, 17 September 2008 (UTC)

Just for isolated variables in text not in equations

(copied from above) To reply to your last point, screenreaders explicitly identifying every variable in an equation as such will in most cases actually hinder the understanding of a formula. Of all the possible things that mathematics as a language expresses through formatting, whether something is a variable is probably the least relevant. The <var> tag was never intended to be used in formula markup; its main purpose is to identify variables when used inside a sentence, in which case identifying them as such may be useful. But in the context of a formula it just isn't, unless all the other semantics are also tagged and identified. (TimothyRias (talk) 08:33, 18 September 2008 (UTC))

This sounds like the most sensible thing yet. Have <var> as semantic markup for variables outside of equations, and use various templates to apply the required styles. Possibly have some tag <eqn> to semantically identify an equation, but don't apply semantics to anything within it. So you would have

<eqn>''E''=''m'' ''c''<sup>2</sup></eqn> where <var>''E''</var> is energy, <var>''m''</var> is mass and <var>''c''</var> is the speed of light.

Trying to semantically mark up an equation is a very complex task and possibly beyond Wikipedia's mission. For correct semantic markup you need something of the complexity of Content-MathML to define the structure, with OpenMath Content Dictionaries to define the meaning of individual elements. Neither TeX nor Presentation-MathML is a semantic system; they define how an equation appears, not its meaning.

An incomplete attempt at semantically marking up E=m c² would be

<math xmlns="http://www.w3.org/1998/Math/MathML">
 <apply>
  <eq/>
  <ci>E</ci>
  <apply>
   <times/>
   <ci>m</ci>
   <apply>
    <power/>
    <csymbol encoding="OpenMath" 
       definitionURL="http://www.openmath.org/cd/physical_consts1.xhtml#speed_of_light"
       >c</csymbol>
    <cn>2</cn>
   </apply>
  </apply>
 </apply>
</math>

--Salix alba (talk) 09:36, 18 September 2008 (UTC)
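
For contrast, a Presentation MathML version of the same equation might look like the sketch below. It records only layout (identifiers, an operator, a superscript); nothing in it says that c is the speed of light, or that placing m next to c² means multiplication. The markup is an illustrative assumption, not taken from any article.

```html
<!-- Presentation markup: describes how E = mc² looks, not what it means -->
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mi>E</mi>
  <mo>=</mo>
  <mi>m</mi>
  <msup><mi>c</mi><mn>2</mn></msup>
</math>
```

The content version above is several times longer for the same equation, which gives a sense of the editing cost of full semantic markup.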

Agree. It would be great if the <eqn> container (either implemented as a tag or as a template) could replicate some of the features of TeX math mode. (i.e. default to italics and no automatic line wrapping.) That would actually save a lot of markup making it more probable that editors will actually pick it up. (TimothyRias (talk) 10:32, 18 September 2008 (UTC))

Well, the right solution here is probably (again) to make <math> work better. If we reached the point where it was always appropriate to use <math> for any equation or expression, then we could make it contain the marked-up equation in <eqn> (or not include it, if it turns out not to work). At that point, it might be reasonable for the MoS to specify use of <math> for any equation or expression and use of <var> for any isolated variable, constant, or other bit of notation. Unfortunately, changing the operation of <math> seems to be a bit like fantasizing about proving the Riemann hypothesis. Ozob (talk) 15:24, 18 September 2008 (UTC)

Alright; this sounds like genuine progress, and a way to word this in the guideline is pretty clear. Will take a stab at that. As for the proposed <eqn>, that would more likely be <div class="equation">, and handled with an {{eqn}} template. That's all also a strictly maths issue, so I think that the discussion has resolved itself for the purposes of WP:MOSTEXT. Thanks for your participation! :-) — SMcCandlish [talk] [cont] ‹(-¿-)› 21:29, 18 September 2008 (UTC)
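
A minimal sketch of what such an {{eqn}} template might emit, together with <var> for the isolated variables in the prose that follows the equation. The class name equation comes from the comment above; the style rule is an assumption, picking up the earlier suggestion that the container suppress automatic line wrapping.

```html
<style>
  /* Assumed rule: keep the equation on one line, as TeX math mode would */
  div.equation { white-space: nowrap; }
</style>

<!-- Hypothetical output of {{eqn}}: presentational italics only, no semantics inside -->
<div class="equation"><i>E</i> = <i>m</i><i>c</i><sup>2</sup></div>

<!-- Isolated variables in running prose carry the semantic element -->
<p>where <var>E</var> is energy, <var>m</var> is mass and <var>c</var> is the speed of light.</p>
```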

I think you jumped the gun when you changed the MOSTEXT text to say that <var> or its equivalents should be used for isolated variables. I have changed should to may - as far as I can see, consensus here is still that simple italicisation is sufficient and its use should not be denigrated. Gandalf61 (talk) 08:42, 19 September 2008 (UTC)

Hmm, I haven't actually seen anyone justify, for the regular prose and source code cases, not using XHTML as intended to be used, nor have I seen any objections raised that were not with regard to the use of <var> in mathematical contexts. Silence equals assent in consensus debates. Do you really think we need to go through this entire debate all over again? If so, what new arguments do you expect to see from either side? — SMcCandlish [talk] [cont] ‹(-¿-)› 19:26, 19 September 2008 (UTC)

I don't think you can reasonably call all the above spilled ink silence; there's a lot of objection to making <var> a "best practice", and I think you're the only person vocally arguing for its consistent use. That doesn't mean it's wrong (as I said quite a while ago, I think it's the right thing), but it's not reached consensus. (Yet?) Ozob (talk) 20:42, 19 September 2008 (UTC)

Response to SMcCandlish: if I understand you correctly, you are saying that you intended the phrase "should be marked up with the <var> element" to only apply to instances of program variables. I am afraid that wasn't how I understood the text. I understood "Variables and symbols for variables within plain-English prose and in computer source code" to include isolated mathematical variables, and the following section starting "Symbols for variables within mathematical formulas" to apply to mathematics variables used within formulae, but not in isolation. And the first example in the first section "If there are <var>x</var> apples per basket..." was an example of a mathematics variable, not a program variable, which further misled me.

Anyway, I have no strong opinion on the program variables issue, so I have amended the guideline text and examples again to try to make the distinction clearer and avoid the misinterpretation that I made previously. Gandalf61 (talk) 08:44, 20 September 2008 (UTC)

<var> is intended for maths after all

See http://www.whatwg.org/specs/web-apps/current-work/#the-var – the W3C has clarified that in fact <var> is intended to be used in mathematical formulae. I'm willing to not push on this particular sub-issue, since the WP:MATH crowd have made a reasonable case that implementation would be difficult. — SMcCandlish [talk] [cont] ‹(-¿-)› 19:26, 19 September 2008 (UTC)

I have written to WHATWG asking them to deprecate <var>. Ozob (talk) 22:48, 19 September 2008 (UTC)

I have written to WHATWG asking them not to deprecate VAR. --Yecril (talk) 19:21, 29 September 2008 (UTC)

Really? I didn't see that message come through the mailing list. Those interested in reading the discussion following my original post can read it at [2]. There didn't seem to be any particular consensus. I recall that one person mentioned that <var> is really a holdover from the early days of HTML, and it wouldn't be in the spec if it hadn't always been there. There were good points about uses of <var> in non-mathematical contexts. Eventually I made a specific proposal, namely to amend the spec to read:

"The var element represents a variable. This could be an actual variable in a programming context, or it could be a term used as a placeholder in prose. Use of var in a mathematical context is deprecated in favor of MathML content markup."

It didn't gather much of a response. WHATWG is currently spending a lot more time discussing the (very serious) class of UI redress vulnerabilities and what can be done to mitigate them. Ozob (talk) 15:38, 30 September 2008 (UTC)

Coda

The specification for var has been changed. The new description of var reads:

The var element represents a variable. This could be an actual variable in a mathematical expression or programming context, or it could just be a term used as a placeholder in prose.

In the paragraph below, the letter "n" is being used as a variable in prose:
If there are <var>n</var> pipes leading to the ice
cream factory then I expect at least <var>n</var>
flavours of ice cream to be available for purchase!
For mathematics, in particular for anything beyond the simplest of expressions, MathML is more appropriate. However, the var element can still be used to refer to specific variables that are then mentioned in MathML expressions.

In this example, Pythagoras' theorem is solved for the variable a. The expression itself is marked up with MathML, but the variable is mentioned in the figure's legend using var.
<figure>
 <math>
 <mi>a</mi>
 <mo>=</mo>
 <msqrt>
 <msup><mi>b</mi><mn>2</mn></msup>
 <mi>+</mi>
 <msup><mi>c</mi><mn>2</mn></msup>
 </msqrt>
 </math>
 <legend> Pythagoras' theorem solved for <var>a</var> </legend>

</figure>

We did mostly come to a consensus before, so I see no need to reopen the discussion. But I thought that some of you might be interested in this change to the meaning of var. Ozob (talk) 05:24, 18 December 2008 (UTC)

Removal of merge proposal

I have to admit I haven't read all of the above, but gather that most maths editors prefer to use ''x'' than <var>x</var>. What seems to have been forgotten is the merge proposal that started at Wikipedia talk:Manual of Style (dates and numbers)/Archive 111#Text formatting math section merge proposal and Wikipedia talk:WikiProject Mathematics/Archive 41#Variable format and led to the discussion above and presumably several improvements but apparently no decision on the merge. The merge proposal of the computing and maths variable sections was to Wikipedia:Manual of Style (dates and numbers), which was disputed and seems clearly wrong to me, and indeed I don't see any current overlap with that WP:MOSNUM. In fact, there is minimal overlap with MOS:MATH and a "Main" redirect to there. What is currently on this page is suitable for non-technical editors who might quickly want to know how to format scientific terms, and as such I think it should stay. The actual method used to produce italics is less important than what should be italicised (otherwise I might insist that the <dfn> tag be respected by Mediawiki). I think the merge proposal has fallen by default, so I've removed the template. --Cedders^tk 21:49, 8 October 2009 (UTC)

I forgot to say there was also some discussion at Wikipedia talk:Manual of Style/Archive 102#Text formatting merge proposal. I guess any duplication has been removed in the meantime, but if anyone can see any remaining, let's start a new discussion. --Cedders^tk 21:57, 8 October 2009 (UTC)

Bolding demonyms

In articles about cities and countries, should demonyms be in bold (in the lead)? Dabomb87 (talk) 22:14, 6 December 2008 (UTC)

No. Rich Farmbrough, 18:40 13 May 2009 (UTC).

When not to use emphasis

In the When not to use emphasis section, why does it state "The following are proposed guidelines"? If that section is being proposed, then should it not be in this guideline? --Silver Edge (talk) 05:22, 4 January 2009 (UTC)

I've not checked why that was there, but it seems out of place, and I've taken it out. --Cedders^tk 21:51, 6 October 2009 (UTC)

Animal sounds

It is standard practice in Wikipedia to italicise representations of bird songs and calls (in the real world this is also a standard, but not universal, convention). Following this discussion, Sandy suggested that it should be added to the list of italic usages, but looking at the project page I'm not sure where it fits best. Any ideas? jimfbleak (talk) 15:18, 10 April 2009 (UTC)

Petition To Change Wikipedia's Fonts To Avoid Homograph Ambiguity

Wikipedia should avoid the use of any font wherein the capital i is a homograph of the lower case l. The font currently used for the titles of articles suffers from this defect. Fonts should be chosen so that there are absolutely no homographs. Jwray (talk) 03:23, 12 April 2009 (UTC)

Latin (language) and all-capitals

There's a trend on Wikipedia for many terms in the Latin language to be set in all-caps, like so:

HISPANIA (entered as [[Hispania|HISPANIA]]; example from Names given to the Spanish language)

I understand the rationale — the Latin alphabet didn't gain lower case letterforms until after the time of the Romans. Personally, though, I find this both distracting (in much the same way as overuse of bold or italics) and less accessible (readers with astigmatism, including myself, find blocks of capital letters more difficult to read).

I'd suggest the spirit of the "don't overemphasise" rule would argue against this use of smaller capital letters, but I don't think there's ever been any discussion on the issue. I'd like to propose that this practice be discontinued. Thoughts, anyone? — OwenBlacker (Talk) 20:13, 15 April 2009 (UTC)

I agree; it is unnecessarily unreadable. I don't know when that sort of accurate rendering is necessary, but here is certainly not such a place. Even Wheelock's Latin, one of the more popular Latin textbooks, uses lowercase. -Rrius (talk) 20:57, 15 April 2009 (UTC)

Only the hopelessly pedantic would think that Latin needs to be set in small caps. Latin words should be set in italics, just as all foreign language words should be. I've changed Names given to the Spanish language#History of the term 'Spanish' accordingly. --Akhilleus (talk) 21:07, 15 April 2009 (UTC)

I agree as well. All standard Latin texts nowadays use mainly lowercase letters throughout a work, except for the first letter of names and (sometimes, depending on the publishing house) the first letter of the first word of a sentence. In secondary literature, it is standard to italicize all Latin and to retain the mostly lowercase type. Wikipedia is a kind of secondary literature and should do likewise. Jwhosler (talk) 21:15, 15 April 2009 (UTC)

I agree with the general sentiment against all-caps. However, I suggest there are occasions where all-caps are appropriate, e.g. when quoting inscriptions from walls/columns/portals ("ROMANES EUNT DOMUS", "M AGRIPPA L F COS TERTIVM FECIT"). Michael Bednarek (talk) 02:53, 16 April 2009 (UTC)

A good thought, but the common practice of secondary literature even on Latin epigraphy is to use mainly lower case letters in the same way that I described above. Jwhosler (talk) 15:11, 16 April 2009 (UTC)

Why would that require all caps, and why would it be different from English-language inscriptions? -Rrius (talk) 04:39, 16 April 2009 (UTC)

MOS:QUOTE - Minimal change, a "strict requirement". Michael Bednarek (talk) 05:25, 16 April 2009 (UTC)

Capitalization is along the lines of the other minimal changes. At any rate, see WP:ALLCAPS (especially the part regarding quoting proclamations). -Rrius (talk) 05:33, 16 April 2009 (UTC)

I agree with Rrius. Using lowercase letters in Latin texts is a "change" that has been accepted by the most rigorous of scholars throughout academia for hundreds of years; moreover, many of the manuscripts from the Middle Ages from which our modern texts derive were written in lower case letters by medieval scribes. For Wikipedia to reverse all of that and go back to all capitals would be nonsense. Jwhosler (talk) 15:11, 16 April 2009 (UTC)

Follow your sources. These should not be the Pantheon, but a secondary source (that way you get the emendations); this will almost invariably be lower case. This can be modified for uniformity within an article (there's no reason to confuse the reader by switching between upper and lower case just because you changed source), just like the choice between u and v. Septentrionalis PMAnderson 17:00, 16 April 2009 (UTC)

This looks rather like a consensus and I did highlight this discussion at Wikipedia talk:Manual of Style (capital letters) and Wikipedia talk:WikiProject Classical Greece and Rome, so I'll draft a piece of text to add into the MOS later. — OwenBlacker (Talk) 14:55, 25 April 2009 (UTC)