The Poor, Misunderstood InnerText (2015) (perfectionkills.com)
38 points by bobajeff on Nov 22, 2020 | 13 comments



I’ve been working on rich text editing ideas for the past month or so, and I have to say, innerText is probably not the solution you’re looking for in that use case. Despite what they say about standardization, there are still subtle differences between browsers in how exactly they handle certain elements and styles. The innerText property also makes a feeble attempt at representing rich text whitespace, by making table cells tab-separated and adding an extra newline of padding around p elements, which is probably not what you want. Worst of all, the algorithm relies on layout and styling information like the white-space, display and visibility properties, so it’s likely that it will never be supported in non-browser environments like JSDOM, and even where it is supported, reading it is likely to force costly layout recalculations (reflows) if you’re not careful about when you read it.

If you’re interested in getting the “text” of some DOM, you should know that none of the built-in browser APIs will get you what you want, and moreover you’re gonna want it to be extensible in some ways so that you can, for instance, replace img elements with some equivalent markup, or omit the rendering of off-screen parts of large documents via windowing/virtualization. At the end of the day, there is no canonical representation of some DOM as plain text.
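As a rough sketch of what I mean by extensible: everything here is hypothetical and only for illustration. The plain-object node shape, the `toPlainText` name, the `handlers` map, and the "one newline after block elements" policy are all my own assumptions, and a real version would walk actual DOM nodes (`childNodes`, `nodeType`) instead.

```javascript
// Hypothetical node shape, standing in for real DOM nodes:
//   { type: "text", value }
//   { type: "element", tag, attrs, children }
const BLOCK_TAGS = new Set(["p", "div", "li", "tr"]);

function toPlainText(node, handlers = {}) {
  if (node.type === "text") {
    return node.value;
  }
  // A caller-supplied handler can replace (or omit) an element entirely.
  const handler = handlers[node.tag];
  if (handler) {
    return handler(node);
  }
  const inner = (node.children || [])
    .map((child) => toPlainText(child, handlers))
    .join("");
  // Arbitrary policy for this sketch: end block-level elements with a newline.
  return BLOCK_TAGS.has(node.tag) ? inner + "\n" : inner;
}

const doc = {
  type: "element",
  tag: "div",
  attrs: {},
  children: [
    {
      type: "element",
      tag: "p",
      attrs: {},
      children: [
        { type: "text", value: "Hello " },
        { type: "element", tag: "img", attrs: { alt: "world" }, children: [] },
      ],
    },
    {
      type: "element",
      tag: "p",
      attrs: {},
      children: [{ type: "text", value: "Bye" }],
    },
  ],
};

// Replace img elements with some equivalent plain-text markup.
const text = toPlainText(doc, {
  img: (el) => `[image: ${el.attrs.alt}]`,
});
console.log(JSON.stringify(text));
```

The point is only that the caller decides how each element is rendered. Omitting off-screen content would be one more handler that returns an empty string for whatever you have chosen to skip.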


Thankfully, `innerText` has been standardized by WHATWG for a while now [1]. Usually, it accomplishes what you want to do (lay out text and have it be visible in that form) as well as or better than `textContent`. As such, I generally use it today.

1. https://html.spec.whatwg.org/multipage/dom.html#the-innertex...


And it has been supported by all the browsers for years: https://caniuse.com/?search=innerText

I'm not sure why anyone would use `textContent` anymore. I think I had even forgotten that it exists.


You say "for years" meaning a year after this blog post was originally written.


4.5 years is "for years", especially in the frontend world.


I'm reading a book by Richard Hamming and in it he discusses the methodical precision with which researchers developed the first computers and programming languages. The thought, the debate, the endless thought experiments done before a single component was built.

Then on the front page of HN today there is an article about K&R and Unix, which led to a rabbit hole of reading. Again, the thought process, the methodology, the amount of design and specification done before even beginning to build Unix.

Now compare this to the first big invention of the late 20th/early-21st century: the web browser. It was driven by a race to monopolize the internet and maximize profit, a war of standards, and the consequences of poor planning hurt us today.

`innerText` isn't just -another- quirk that wastes precious human memory; at a higher level it is also a testament to what I see as a decline in rigor in applied computer science that started in the '90s and has dominated since: Hack. Ship. Patch later (maybe). Or to quote the anti-mantra: "move fast and break things."


I think the web browser was developed in an online world of instant communication and broad participation. It was a new paradigm.

Read about how Andreessen at UIUC would get an email and just add the feature.


The decline of rigor made computing accessible. Do you really want computing to become like civil engineering or biotechnology, where changing the shape of a button would require several committee meetings followed by ethics board clearance and regulatory approval before anything gets done?


"A straw man ... is a form of argument and an informal fallacy of having the impression of refuting an argument, whereas the proper idea of argument under discussion was not addressed or properly refuted.[3] One who engages in this fallacy is said to be "attacking a straw man". "

https://en.wikipedia.org/wiki/Straw_man


I don't think it's a straw man to note that what you want (rigour) comes with a cost (accessibility of development). The decline in rigour among applied computer science corresponded with an increase in software availability, economic output, and probably also literal lives saved.

I don't deny that it's frustrating to be stuck with poor choices, ones which cause confusion and make development difficult. However, I think that this method got more software into the hands of people sooner than the counterfactual high-rigour world would have. There's definitely an argument to be made that there's an inflection point, where the rigorous method is slower up until some point where it becomes a better baseline, and from there software development accelerates and overtakes our current world. I think it's hard to be confident in that argument, though.


A must read for anyone considering writing or maintaining JavaScript text editor code. The author is a true JavaScript badass (creator of fabric.js).


tl;dr Internet Explorer introduced a nonstandard mechanism, x.innerText, that returned the 'plaintext representation' of some node tree, in contrast to x.textContent (which includes the source whitespace) or selection.toString (which takes longer). This article looks at the use case, compares browser compatibility at the time (including jQuery), and looks at some previous standardization efforts.


Firefox has supported innerText since version 45.



