You skipped "fuck semantic HTML and CSS, we use Tailwind now."
<i> as italic was great like Tailwind is great: sometimes I just want to design something basic without having to context switch to CSS. I want italic, I use the <i> tag.
What has semantic HTML ever done for us? Does the browser care that I used <i> instead of <em> for italic? Do screen readers even care?
The accessibility aspect is huge and deserves more attention and kudos than it gets.
Also, from a purely selfish aspect, I vastly prefer documents with at least a minimum of semantic organization, even if it's just header, footer, main, and meaningful heading levels.
Without those, well, it's just a document full of undifferentiated text with various attributes to distinguish it visually or typographically, and usually inconsistently used. Is this piece of text styled "bold with font size 16" supposed to be a second- or third-level header?
What stops screen readers from treating the <i> tag the same way they treat the <em> tag? If <em> functionally ends up doing the same thing that <i> did, what's the point?
A screen reader that did this would need to heuristically determine whether the speech should be emphasized or not, because much of what is set in italics isn't actually emphasized when read aloud.
The point being, if italics were already used to add emphasis and to separate an expression from the rest of the utterance, why was a dedicated tag required to express emphasis?
(The heuristics only apply now, as there are two kinds of emphasis, which introduces some uncertainty of how things should be expressed. Moreover, as type styles have moved to CSS, there's a chance that a screen reader presentation may miss an intended separation of text.)
In terms of typography, italics are also applied for titles, for quotes (in some contexts), for foreign expressions, sometimes for dialogue, and sometimes simply because they are easier to read. For a screen reader, you don't want these use cases emphasized.
They are still used to separate that expression, to make the expression stand out from the rest of the text. Surely, if I set a quote in italics, I mean this to be read in a distinctive voice by an aural presentation. (This is the entire point of separating it by a distinctive type style. I'm actually somewhat alarmed that this isn't the case.)
Edit: There's a clear meaning to the use of distinctive type styles. I don't see an intrinsic value in pretending that there is no such intended meaning and that there should be an artificial separation, rather than generalizing on presentation styles.
Edit #2: In Antiqua, italics isn't just oblique text, it's an entirely different script. So what is the use case and the intended meaning of using a different script? Isn't this more conceptual than just using a fancy visual presentation style, which may be happily ignored in any other representation as there's probably no intended meaning to this?
<em> and <strong> have always been there as long as I can recall, and are part of the 1995 HTML 2.0 spec. Before NCSA Mosaic, your hypertext was being rendered on text-based terminals. Most of them couldn't render italics, and for emphasis tended to invert foreground and background, use underlining, or color if you were lucky. The use of italics and bold to add emphasis started when people began using the first WYSIWYG HTML editors in the late '90s: people who had never learned HTML or even written hypertext before, concerned primarily with making things pretty. That problem drove the creation of CSS, which got people to stop using tables for layout but never quite got them to understand that <i> and <b> were exclusionary, except at the universities, where they had to support braille terminals and screen readers for visually impaired students.
Yes, <em> and <i> were introduced at the same time in HTML 2 (1995), as were <b> and <strong>. But for any practical purpose <em> and <strong> were ignored until HTML 4 / XHTML. I've been coding websites professionally since 1996 and I do not recall encountering these tags in the wild. (When I started, the use of head and body elements was the dernier cri – to be coded with <i>. :-) )
Fun fact: the original HTML definition [1] has "typewriter" as the only styling element and uses surrounding underscores ("_are_") for emphasis.
> It's worth noting that the W3C specification says that a reference to a creative work, as included within a <cite> element, may include the name of the work's author. However, the WHATWG specification for <cite> says the opposite: that a person's name must never be included, under any circumstances.
I find this interesting, and wish the W3C could work this out more: In one possible future, 'cite' elements (and the absence thereof) could be used to inform/assist human and automatic fact checking.
It doesn't have clear semantics. <cite> is basically used as "I want something in italics, and it is the title of something" rather than "here is a reference to a creative work". And it gives the lie to semantic markup: we only ever have semantic markup for things we can reasonably expect to be visually set off. Something like a person's name or a cut-off phrase is never marked up semantically; there could be massive benefits from semantically attributing lists of city/state combinations in flowing text, but it just doesn't happen. What we have is semantically attributed visual markup rather than true semantic markup, and this kind of renaming of <i> is just going overboard.
There are also other things that are not emphasized, but require italics per typographical convention, <i lang="la">e.g.</i> foreign-language fragments.
Modern web developers don't use semantic HTML for screen reader support. They manually apply screen reader interpretation data the same way they manually apply Tailwind styles to every element. (That's what WAI-ARIA is for, most notably the "role" attribute.)
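For instance (a minimal sketch; the classes and URLs are made up), the semantics here live entirely in the ARIA attributes, not the tags:

    <div class="flex gap-4" role="navigation" aria-label="Main">
      <div role="link" tabindex="0" onclick="location.href='/docs'">Docs</div>
      <div role="link" tabindex="0" onclick="location.href='/blog'">Blog</div>
    </div>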
Once upon a time, HTML was a document format, and there was a dream that these documents would be interconnected with deep meaningful semantics that would be machine readable.
But then HTML became the rendering layer for a universal sandboxed VM, and all that went away.
Why, it has liberated us from European colonialism. <i> being no longer italic, you can wrap it around any script from the world and enjoy the semantics of it.
The joke is that italic can also mean Italic, the ethnolinguistic group of those who spoke the Italic languages, which is why Italy is named so.
The phrase "What has semantic HTML ever done for us?" is a riff on the quote "What have the Romans ever done for us?" from the movie Life of Brian, about a Jewish-Roman man mistaken for another Messiah.
No, that's not the joke, though for a brief split second, some moments after I wrote the comment, it occurred to me that the interpretation might arise.
Italic type is derived from a form of semi-cursive writing, and is entrenched in the European printing tradition. No punning with Italy intended.
So what does it mean if you have <i> around Chinese, or Devanagari or what have you? Do you just apply shear to make the text slanted?
I suspect that it bothered some "woke" types that HTML contains Euro-centric typographical directives, so they have been repurposed to have some sort of, culturally neutral semantics that is more inclusive of the planet's diversity (pardon me if I'm not nailing the terminology here).
In short, <i> no longer belongs to whitey and his writing system.
While Italic type is a European thing, it is a very useful thing that does have some parallels in other scripts (Consider Rashi script for Hebrew, or how Japanese uses Katakana in a similar way). No need to get all 'anti-woke' about it: if it were the case that people had a problem with 'Italic type', the phrase 'oblique type' is right there for use.
Probably there are other useful bits of HTML/CSS for Asian languages. (right to left and/or vertical writing, anyone? I always wanted to fart around with pretty-printed Chinese poems and, like, calligraphy script fonts with procedurally generated jitter.)
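(The vertical part at least exists in CSS, as far as I know; an untested sketch:)

    .poem {
      writing-mode: vertical-rl;   /* lines become columns, read right to left */
      text-orientation: upright;   /* keep the characters unrotated */
    }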
Most CSS best practices are based around semantic naming: you name your elements ".call_to_action" or ".sidebar". HTML5 introduced a lot of semantic elements such as <header> or <article>.
Whereas with Tailwind or HTML < 5 you name your elements based on what they look like: ".font-bold" or "<b>".
It definitely looks like bootstrap in syntax. A lot of "css superfans" actually really enjoy tailwind in use. It solves more problems than it causes for some people.
It’s basically a return to the days of HTML without CSS. CSS classes are not applied as an abstraction of different visual styles or components, but purely as tools to define visuals in situ.
I suppose it’s easier to reason about, but it feels like it was invented by that grug brain dude who likes solving lots of problems but hates thinking.
I wonder if AI screen readers will ever use the visual appearance of something to determine what it means. So even if you draw on a `<canvas>` it can still read the text. That would be sick.
In my view all canvas-related JS libraries should treat accessibility as a foundational requirement. It's not "too difficult" to implement.
Here's an explanation of the accessibility features I've added to my canvas library[1]. These features may not be the best approach, or even the most appropriate approach ... but at least they're there, and they can evolve into something better as/when people offer feedback.
You have a fantastic idea there, and if you'd like to attempt to make it a reality, I'm happy to brainstorm it with you. You keep full ownership of the concept; I'm happy to contribute given the people it'd help.
You can contact me directly using my handle at my domain, which is 21337.tech
z3c0, much appreciated, but I am prioritizing other things at this time. Let me hit you up at a future time if no one has built something with this feature already!
I feel that this is where accessibility needs to go. We cannot rely on toolkits to implement support for accessibility standards, because there will always be people building apps in niche toolkits / homemade GUIs which don't support accessibility, especially as wasm becomes more common on the web.
We need to treat the screen like a PNG which an AI can parse to determine what the text is, what text is relevant to read to the user, where and what the buttons are, how to emphasize reading the text based on the visuals, etc. It's the only thing that will scale and while difficult it seems that creating something like this is within reach.
No. That’s like saying the OS should use AI to try and figure out what buttons are to make them clickable, rather than developers actually building an interface for users.
All operating systems and platforms have pretty fluent and in-depth APIs for exposing an “accessibility tree” for assistive technologies to use. It’s your responsibility as a developer to build interfaces for actual users, and there’s very little reason not to.
> That’s like saying the OS should use AI to try and figure out what buttons are to make them clickable, rather than developers actually building an interface for users.
Especially now that designers are making buttons/clickable areas hard to distinguish ... I could totally use AI to highlight all the things that are actually, you know, interactive.
You don't need an AI to do this, just make your own custom build of whatever operating system you use with styles that aren't crap. It's almost certainly a more tractable task than trying to make an AI work out what the interactive bits are - considering it seems to be a business objective for many companies nowadays to make it hard to work out what the interactive bits are.
I've seen systems at Boeing that understand CAD drawings semantically, like how arrows work and what regions are alternate views of what other regions, and whether a drawing is of the nose landing gear or the flap control, all based on scanned drawings from the 50's. So ... yes. Absolutely.
It's 2059. We're using browsers that render pages created with Dreamweaver and Frontpage using advanced machine learning techniques to determine that <blockquote><font size="+1"> is an entry in the table of contents, while <p><font size=+2> is a level 1 header.
YES. I have seen layout prototypes for this kind of thinking, and they're magical. It's not trivial, but will come more and more into reach. All of visual layout delivers a spatial visual language that all humans understand. We know what headers _are_ because we know what headers _look like_. And not for a single style of article, but for all articles.
I also used to believe that rigorous semantic tagging was The Way To Do Documents, and it was certainly a useful crutch, but we absolutely need to move beyond that.
The screen reader cares about the big stuff, such as using headings, tables, and lists correctly. And of course, implementing ARIA attributes correctly. Other than that it doesn't matter too much.
Yes, e.g. what if you are quoting something with italics in the original text? For older texts in English, italics might easily indicate a word or phrase in a non-English language. Showing that with semantic "emphasis" might convey the wrong impression -- e.g. without a note such as "emphasis added".
The OP addresses "idiom in another language" as one case, but if it is within a scholarly quotation, can one change the typography to a different convention?
Then there's the matter of languages such as Japanese that use a different set of characters (katakana) both to indicate emphasis and to mark foreign loan-words. Would katakana characters still be enclosed in an <i> element? Would styling be modified accordingly?
but... what if the browser chooses to render emphasis in some fashion other than italic? How then could I force my design opinions on the reader, despite the contrary configuration of their user-agent? Surely we can't allow the end-user to control their experience when it's the designer's intention which really matters.
> okay, fine, you’re all still using “i” so we’ll put it back in the standard
Going from HTML/XHTML to HTML5/XHTML5, several elements were redefined, including <i> and <b>.
It’s nothing new.
Here’s an article from HTML5 Doctor explaining this more than 12 years ago [1]. I hope web developers serious about their craft aren’t just now discovering this.
If they wanted people to use a different tag for such text, they should have created an idiomatic tag instead of attempting to repurpose existing ones. Adding alternative elements is usually the better solution.
> If they wanted people to use a different tag for such text, they should have created an idiomatic tag instead of attempting to repurpose existing ones.
If you read the spec, it mentions several different uses for <i>, depending on the context. Otherwise, we’d need a different element for each context, bloating the spec by having special purpose elements instead of general ones.
For example, <i> is used for ship names:
<p>They came over on the <i>Mayflower</i>.</p>
It wouldn’t make a lot of sense to have an element just for names of ships in a general purpose markup language like HTML.
Turns out, if you look at the Wikipedia article about the Mayflower, the name is marked-up with the <i> element [1].
I kind of like using <i> for other users’ names that appear in a sentence. For example, if a joker calls themself “you piece of shit”, it would be kind of unfortunate for another user to open up a dialog that says: “Are you sure you want to delete the user you piece of shit?” With the user’s name in italics, this is less ambiguous.
In general I think many UXs are too shy about using italics.
MDN's "interpretation" of <i> has, over that same span, also changed between "alternate voice"(?), "interesting", and now, apparently, "idiomatic" text—I'm sure I've missed a few. I feel like it'd be a service to the world to just suck it up and let it be the "legacy italics" tag, though.
As far as I know, the spec still calls it "alternate voice or mood"; it's MDN, not the spec, that now calls it "idiomatic" text. So I stand by it being the MDN interpretation that's changed, not the spec (to pretend that it's not "we tried to remove it and failed" by making it something that at least actually starts with an I now, I guess, twice).
> The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose in a manner indicating a different quality of text, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, transliteration, a thought, or a ship name in Western texts.
The entire HTML standard process has traditionally been dominated by people with Very Strong Opinions and little sense of practicality. It's a little better now though.
The practical thing to do was to just re-define <i> and <b> to mean what <em> and <strong> were defined as, then add additional elements for when more fine-grained meaning is needed. Instead, countless hours had to be lost on s/<b>/<strong>/g.
To be fair, <em> is semantically not appropriate for the use cases for which <i> is these days recommended, such as H. sapiens and RMS Titanic. These are not about emphasis but about a typographical convention.
Do you need to update the parser, though? I kind of thought SVG elements were implicitly namespaced. For example, we have the two <a> tags: one HTMLElement <a> and another Element <a> with the namespace "http://www.w3.org/2000/svg". I don’t know of any meaningful difference between those two elements, except if you try to insert the former into an SVG DOM tree, then weird things might happen (though I haven’t tested it).
Yeah, only a terminally-dumb parser could confuse an SVG <i> tag with an HTML <i> tag. A proper HTML parser should ignore anything inside tags it doesn't understand - such as the <svg> tag.
This is quite annoying. There are a lot of web pages that use <i> to mean "italic"; possibly most of them. Many of those are not going to be updated just because of a change in the definition of the tag.
Instead, they should have introduced a new tag, e.g. <idiomatic>, while deprecating <i>.
Most internet standards are descriptive, not coercive. The semantic web's advocates have been coercive from the beginning, but despite the repeated failures of coercion, they keep on at it.
What does "idiomatic" mean anyway? If I say that idiomatic markup is a waste of time, do I have to put "waste of time" in an <idiomatic> element? How is text in "another language" idiomatic text? [Checks for another example] Oh, that's the only example they give of something that's "idiomatic".
It seems clear to me that "idiomatic" isn't what they mean at all; what they mean is some text fragment that should appear differently, because in some way the "mode of speech" is to be treated differently. It looks as if they've hijacked the <i> tag because they want it to stop meaning what it means; they've chosen a meaning that starts with the letter "i" for that reason.
The way I parse it, "idiomatic" text is any kind of text that would normally be set apart by being presented in italic. That is, it's not even semantic markup at all; it's a politically-correct gesture to the semantic markup die-hards.
Fret not. The term “idiomatic” is just MDN’s editorialised and poorly-chosen title for the underlying element, heavily incomplete and simplified past the point of conveying the meaning at all. They should have given up on choosing a single word for it (like they haven’t tried with <s>). The actual spec describes it much more reasonably:
> The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose in a manner indicating a different quality of text, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, transliteration, a thought, or a ship name in Western texts.
> Do not confuse the <b> element with the <strong>, <em>, or <mark> elements. The <strong> element represents text of certain importance, <em> puts some emphasis on the text and the <mark> element represents text of certain relevance. The <b> element doesn't convey such special semantic information; use it only when no others fit.
This is really hard to take seriously, even as someone who advocates for semantic HTML.
Surely all these nuances mean something tangible, but I don't see it.
Compared to <b> vs <strong> vs <em>, <mark> and <i> seem comparatively sensible.
I agree. I like the idea of semantic HTML with close to no classes, but the standard is really failing here. I generally just use <em>, and very rarely <strong>.
Also not to be confused with "<mark>" to indicate relevance. A use case might be if the respective part of the text is of strong relevance and you want to bring attention to it by emphasizing it.
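For what it's worth, here's a contrived example of how those distinctions are supposed to play out in markup:

    <p><strong>Warning:</strong> the term <mark>foo</mark> matches your
    search, but it does <em>not</em> mean what you think it means.</p>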
But now I'm unsure if I should consider these distinctions <bullshit> or <horsecrap>?
Thing is, when you're composing prose in your native language, you don't generally parse it and re-parse it to figure out whether some term is "relevant", or requiring attention, or in an "alternative mood". A native speaker/writer isn't normally aware of the grammatical aspect of what they write; they know how they want the text to appear.
Trying to force writers to instead overlay a layer of meta-meaning on their prose, so that the meaning is expressed both in the text and the markup, is a fool's errand. It's a bit like a programming system that requires the developer to express his meaning in both Forth and Python; it's just asking for an author to write text that directly contradicts the markup.
They are part of the changes to the WHATWG HTML specification, so it's not just Mozilla. MDN is just reflecting/documenting what is in the different standards.
> The b element represents a span of text _to which attention is being drawn_ for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood, such as key words in a document abstract, product names in a review, actionable words in interactive text-driven software, or an article lede.
> The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose in a manner indicating a different quality of text, such as a taxonomic designation, a technical term, _an idiomatic phrase_ from another language, transliteration, a thought, or a ship name in Western texts.
These elements (and other style-related elements) were deprecated in favor of CSS. For b and i, the strong and em elements were created as semantic alternatives. Now b and i have been given specific semantic usage and are no longer deprecated.
Actually no, that’s the most accurate one of them (though <b> as “bring attention to” is fairly close too). The actual spec:
> The u element represents a span of text with an unarticulated, though explicitly rendered, non-textual annotation, such as labeling the text as being a proper name in Chinese text (a Chinese proper name mark), or labeling the text as being misspelt.
> Do not confuse the <b> element with the <strong>, <em>, or <mark> elements.
> The <strong> element represents text of certain importance, <em> puts some emphasis on the text and the <mark> element represents text of certain relevance.
This really confused me. Isn't this all basically the same?
No. This is what MDN chose to call it. The actual spec uses no such silly names. As for the actual semantics, which are WHATWG’s domain, those were settled over fifteen years ago.
Mozilla stopped spending their money on making a good browser, instead blowing all their Google cash on “inclusivity” and “diversity” efforts many, many years ago, and fired technical staff who would not toe this new political line.
Looking at stuff like this, I think the results are pretty obvious. Despite the “diversity”, the value of what Mozilla offers hasn’t simply stopped increasing; it has now started decreasing, and in cases like this it actually brings negative value to the internet overall.
This is what firing/alienating the actual people who did the actual job does. And no amount of “diversity” can make up for it.
<b> is amazing if you need some titles in contexts where <h1>-<h6> doesn’t make sense (i.e. will confuse assistive technology). Pair it with aria-labelledby and you’ve created a title for things like <aside>, <nav>, etc. without confusing normal navigation.
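A minimal sketch of the pattern (ids and wording made up):

    <nav aria-labelledby="toc-title">
      <b id="toc-title">On this page</b>
      <ul>
        <li><a href="#intro">Introduction</a></li>
      </ul>
    </nav>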
The explanation for the i element has had a rough history: I remember wording such as "the i element is for content that's normally put into italics" lol, demonstrating that the whole concept of "semantic" HTML is simply flawed at a fundamental level. HTML is a vocabulary for casual academic publishing, with headings, footers, lists, etc. for structuring, and other very specific elements (such as <var>) typical of that use case. But what about other kinds of text, such as threaded discussions like this page? For some, using a vocabulary fit for a specific purpose with a mapping to a rendering vocabulary is the entire point of markup.
"Semantic" HTML seems to be pushed after the fact as a justification for CSS to have syntax completely separate from HTML, reserving attributes for "behavior". The idea that markup attributes shouldn't contain style info is laughable when they're there for the exact reason to associate typed properties with elements.
Especially if it's h1 fuchsia and red text on a yellow background blinking with bold, italic, and underline. Something like "1995-12-10 Please read my website completely before emailing me directly: bob941 (at) compuserve.com . . . . . "
Of course it is funny to look at a committee trying to invent some grand vision for something that was simply a small set of options available on desktops around 1990.
The real problem, however, is that many writing systems have never had any kind of oblique style for emphasis. Or cursive (which is a different thing). Or they have cursive, and use it all the time in print, but technically that's a different font without any connections to the main one. So any person making some kind of template should be aware that the effect of <i>-as-italic can range from “obvious thing everyone expects and understands” to “does nothing at all” depending on language and font. I think they should have got straight to that point in the article.
It is pitiful how these small, unexamined choices can put fences in one's mind. Everyone knows the font selection dialog in the browser, and probably thinks the world is made of serif, sans-serif, monospace, and some Comic Sans. But that's nonsense, even for Latin.
Even more pathetic is the inability to use basic punctuation marks. Something as barebones as the Fixedsys font had all the quotation marks and dashes since Windows 3.1, long before programs became Unicode-aware. All that time, absolutely nothing has been stopping you from pressing a key and getting the proper symbol just like you type any other… except the typewriter key labels hardware makers still use, and the default system layouts being just as stale.
> Of course it is funny to look at a committee trying to invent some grand vision for something that was simply a small set of options available on desktops around 1990.
I honestly find it embarrassing rather than funny.
"The foundations of the modern web were built by people who couldn't possibly have imagined the importance their inventions would play in the future, and the sheer breadth of applications they would be used for. One of the consequences of the web's convoluted history is that tag names like <i> are historical artifacts that don't make sense in the context of modern semantic elements."
Is it really so hard to admit that? Why make up bogus explanations?
<em> does have its uses. If you have an entire paragraph that's styled as italic and you want to emphasis something within that, usually what you do is make the emphasized span non-italic.
Italicized paragraphs or long spans are more common than you might guess. Sometimes they're used to indicate speakers in a dialog or to signify editorialized interjections. No idea if browsers are smart enough to render nested <em>'s this way now without specialized CSS.
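(As far as I know they aren't; the UA stylesheet just maps both to font-style: italic unconditionally, so you'd still need something like this sketch, which handles one level of nesting:)

    em, i { font-style: italic; }
    em em, em i, i em, i i { font-style: normal; }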
Of the available elements, which one is suitable for icons, especially when used in text? (Just yesterday I learned that emoji are even used in code, like CSS.) Clearly they should be in an element to separate them from the text, but which one is appropriate?
God, can we just kill the semantic web stuff already? It's like a bizarre religion that won't die.
By the time the web is finally "semantic" and everything is marked up appropriately, we'll have AIs that just infer the semantics and don't need all the meticulous tagging.
Doing whatever we want and just hoping AI will eventually fix the mess it is also how we got sub-par public transport but billions invested in self-driving cars.
Some people even tried RDF and triplestores to structure metadata and separate from presentation. XSLT was supposed to become an almost universal presentation layout language/transformation language. Nawh, we have REST APIs, ORMs, and HTML and JSX templates that present rendered views devoid of data and logic behind them.
I don't doubt billions have been invested in public transport. I also don't doubt those billions have been (and will continue to be) more cost-effective than the billions invested in self-driving cars.
It's overly semantic, but the core /document/ semantic outlining does seem useful to me, particularly once I started having to take accessibility into account when designing applications and pages.
I will spend time considering if something other than a <div> is more appropriate for an element. I'm not going to spend very much time pondering the difference between <b> and <mark>.
I'd rather people just collectively give up on semantic HTML. What is even the source of hope for this cause? The rest of us gave up a long time ago. I'd much rather see energy spent on expanded ARIA support, something that actually moves the needle on accessibility.
Absolutely. Leave HTML as is and focus on actual accessibility features. They could even add screen reader specific elements and indicate that all other page elements should be ignored. So, you would show one presentation for visual purposes and the other for audible purposes. (Or even braille for feedback purposes.)
Add something to the document that tells screen readers to ignore everything else, as long as a <screenreader> tag is present that puts everything in one place.
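Something close is already expressible, for what it's worth: aria-hidden="true" hides a subtree from assistive technology, and the conventional "visually hidden" recipe does the reverse (the sr-only class below is a common convention, not part of any spec):

    <div aria-hidden="true">(visual-only presentation)</div>
    <p class="sr-only">Narration intended for screen readers.</p>

    .sr-only {
      position: absolute;
      width: 1px; height: 1px;
      overflow: hidden;
      clip: rect(0 0 0 0);
      white-space: nowrap;
    }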
Documents and humanity in general are complicated and various. You're never going to be able to fully encapsulate the complexities of a document into any (structured) DSL as a result. The recommended approach, failing all other options – even to the semantic people – is to use some standby element, such as DIV or SECTION. Unless you're including people who would say "just change your document to not do anything not representable by Semantic HTML". Add to that that the list of supported elements isn't even close to approximating what you'd write in 50% of documents.
Fun fact: in manuscripts, italic is represented by a single-stroke underline. An equivalent print representation is spaced text. These forms are semantically identical.
BTW: The <u> element has been rebranded as "Unarticulated Annotation". :-)
Indeed: "<em>What is this writer talking about, anyway?</em>". I don't find that ironic, though; I think it bespeaks the carelessness with which the HTML "standard" is being jerked around by various unaccountable pressure groups.
Nonsense. This is MDN docs, not the spec. Don’t conflate or confuse them.
In this specific case, it’s probably just a bad mechanical translation, possibly from long ago as part of a batch i → em change, or possibly recent in their conversion from HTML to Markdown (might have done both em and i → *). Ill-conceived in either case, but not the end of the world.
You're right; I have conflated them, for many years. Fact is, MDN docs is a million times easier to read than the specs, which used to be clear, but are nowadays full of struckout text and references to other documents.
If MDN docs is autogenerated, that's regrettable. I swear I'm not going to fall back on W3Schools.
Incidentally, whatever happened to W3Schools? There was a time when every google search for anything remotely similar to web development came up with the top 5 links being to W3Schools. Nowadays not so much. Did google finally decide that W3Schools content amounted to disinformation?
And then there’s me who uses nothing but div elements and hacks at them with css like a drunken butcher moonlighting as a surgeon until my webpage kinda looks correct on my browser and monitor.
Using CSS you can make most html tags behave like other html tags. I’ve made all the divs look and behave like strong tags and vice versa to demonstrate to people new to CSS that the default styling of tags is mostly artificial.
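Roughly (a sketch):

    /* divs act like strong, strong acts like div */
    div    { display: inline; font-weight: bold; }
    strong { display: block;  font-weight: normal; }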
The mildly uncomfortable aspect of it is that <i lang="la">veni, vidi, vici</i> might change the font the text is rendered in, because the generic font families (e.g. serif, sans-serif, cursive, monospace) can vary by language. For me, for example, my usual serif is Equity A, but in Latin it changes to Noto Serif and I haven’t been bothered to patch this up in my Firefox config because it takes far too much effort, requiring changes in a number of languages (not such as French (fr), but yes such as Latin (la) and Māori (mi)). Or sans-serif: Concourse 4 becomes Noto Serif. This will honestly cause me to omit lang=la in some iffy cases where I would write lang=fr on a similar word or expression from French.
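(On the page-author side, the per-language patch-up would look something like the rule below, though that doesn’t help with my browser config:)

    :lang(la), :lang(mi) { font-family: "Equity A", serif; }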
In human-machine interaction "semantic" means "you know exactly what the computer will do with that". So "<i>" or "<b>" tags have very clear semantic, while "<em>" or "<strong>" are less so and things like "<article>" are totally vague.
Although this is not quite correct. "Semantic" is what makes sufficient distinctions for a given case. For example, assume I'm going to publish Ashby's "Introduction to cybernetics". It has an interesting structure: numbered chapters ("2"), smaller divisions within a chapter with a title but without a number, and yet more smaller units that are numbered like "2/17". Some of these smaller units have a title, some just a number. The point is that all this is rather unique.
So I'm about to mark up the text to indicate all this. A semantic way to do that would be to invent a notation that is as unique as this book. There will be "<ashby-chp>", "<ashby-div>", "<ashby-unit>" and such. We do not specify how this is going to be rendered, but at least we faithfully describe what we have without losing anything and without adding anything irrelevant.
Now we want to render it on a visual medium. Here we have a different notation that describes fonts, styles, spacing and so on. These are very different distinctions from the structure of the book, but are very appropriate for typesetting. We decide how we represent our "ashby" distinctions with these tools and write a transform from our notation into the visual one.
Same for a screen reader. Here we have a notation that describes pitch, speed, etc. No spacing or fonts, of course. We decide how we are going to represent our distinctions with these and come up with another transform.
At each step each element in our notations has a clear purpose. The "ashby" notation captures all the distinctions the author needs to make his point. The "visual" and "aural" notations are tools to express distinctions that can be made on a specific media. This is semantic. And this, by the way, is the original idea of XML (a multitude of notations) and XSLT (the notation transformer).
[And the description of the transforms also uses yet another notation with yet another clear purpose :)]
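[A toy fragment to make this concrete; the element names are the ones invented above, and the content is placeholder:]

    <ashby-chp n="2">
      <ashby-unit n="2/17">
        <ashby-title>Some unit title</ashby-title>
        <p>...</p>
      </ashby-unit>
    </ashby-chp>

    <!-- one possible visual transform -->
    <xsl:template match="ashby-unit">
      <section>
        <h3><xsl:value-of select="concat(@n, ' ', ashby-title)"/></h3>
        <xsl:apply-templates select="*[not(self::ashby-title)]"/>
      </section>
    </xsl:template>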
I prefer ∠this⦢ way to mark text ∠italic⦢ and ⁎this* way to mark it as ⁎bold*.
These are the symbols:
—
Italic:
∠ ANGLE
Unicode: U+2220, UTF-8: E2 88 A0
⦢ TURNED ANGLE
Unicode: U+29A2, UTF-8: E2 A6 A2
Bold:
⁎ LOW ASTERISK
Unicode: U+204E, UTF-8: E2 81 8E
* ASTERISK
Unicode: U+002A, UTF-8: 2A
—
I just made this encoding up. Ah, it would be more correct to say I just made this markup up. I made it up and nobody understands it. …Or yeah, actually most people will understand it?
Yeah, but your editorialised title suggested a recent change, whereas this change was made over fifteen years ago. (Can’t trivially find exactly when the change was made, but it’s already present in <https://web.archive.org/web/20071001102153/https://www.w3.or...>.)
Hilariously, the modern prescription for <b> ("text to which attention is being drawn for utilitarian purposes without conveying any extra importance") probably covers more uses of boldface than <strong> ("strong importance, seriousness, or urgency for its contents").
If I have a label for some form element in italics, it would not be correct to write it as <label><i>Name:</i></label>, right?
There's no surrounding text it needs to be distinguished from, and the fact that it is a label should already tell screen readers that it's a label, while applying an italic font to the label without the <i> should distinguish it visually. Is that correct?
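In other words, something like this (a sketch):

    <label for="name">Name:</label>
    <input id="name" type="text">

    label { font-style: italic; }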
Fun fact! Italic typefaces were originally designed not for emphasizing text but as a way to save paper, since the slanting allows the letters to sit tighter.
Another fun fact. In linguistics, italics is used to denote the utterance rather than its meaning. It's not emphasis, but a visual signal that says "this bit is in another language or a single syllable or whatever we're discussing".
This is useless insofar as in a rich-text editor, when the user clicks the “i” button or presses Ctrl+I, they still mean italics, and there is no separate “idiomatic” button, which if present would furthermore only confuse most users. Same for bold.
Also Markdown doesn’t support the distinction.
This is not to say that browsers should prevent styling <i> elements differently, but the intended meaning is still “whatever italics means”.
HTML is really tightly coupled to layout, even though it's a ridiculous format. And semantics isn't linear, like HTML is. If you want some vague form of semantic representation (because there's no good, universal semantic representation, and the rest is just data tagging), use something else, or add a new namespace if you want to abuse HTML even further.
I've been using <i> for things like inline simple math. I set it to display using a serif font and in italic using CSS (while the surrounding text is not in italic and uses sans-serif fonts). The result is good visually; it works. And I felt like it was a good use of this tag. I'm now even more convinced! Thanks for sharing :).
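The styling amounts to roughly this (paraphrased from memory):

    body { font-family: sans-serif; }
    i    { font-family: serif; font-style: italic; }   /* inline math */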
No. Idiomatic means "the normal way of doing things" basically. So in that context they are trying to say "et cetera" is the normal way to say "and so on" even though it's not English. It's a big stretch though. You wouldn't use "idiomatic" to describe that text unless you really wanted to find a word starting with i.
What you're saying seems to agree with my point. If something is "the normal way of doing things", why would it be "set off from the normal text"? You're saying idiomatic means normal, whereas MDN says it means "set off from the normal", a.k.a. abnormal.
Conversations around semantic HTML make me wonder: asking devs / product teams to use semantic tags, ARIA attributes, etc. doesn't seem to work; perhaps we should focus on making screen readers and other assistive tools smarter?
Put another way, where's the assistive technology equivalent of OXO Good Grips? I'd imagine a tool that is part ad-blocker, part auto-summarizer, part navigation helper, part personal curation assistant; that does the Right Thing (tm) 90-95% of the time, even on highly dynamic web applications (since that's already _way_ better than the proportion of sites / applications built with a11y in mind); that people in general find to be a superior user experience, regardless of whether they have disabilities or not.
In Unicode, Chinese, Japanese, and even Korean and Vietnamese characters are combined into something called "CJK Unified Ideographs", or just "Han" [1][2]. So they all get the same treatment for something such as italics. Not sure what HTML currently specifies for that.
However, a problem with a sarcasm tag is that it wouldn't really help accessibility compared to, say, writing "Sarcasm:" or something like "(The preceding remark was sarcastic.)".
- <i> makes text italic
- no, presentational elements are bad, use <em> instead
- okay, fine, you’re all still using “i” so we’ll put it back in the standard
- here’s a convoluted meaning to pretend that it’s in the standard for some reason other than “we tried to remove it and failed”