Semantic Linefeeds (2012)

saagarjha · on Feb 26, 2019

Maybe I'm misunderstanding the point of this article, but I always put linefeeds where they should logically be. If people would like to view it in a certain way, it's always easy to put it through a text-processing program, which will do a much better job on semantically separated content than on content that has had formatting hard-coded into it.

(Incidentally, this is also why I use tabs for indentation, since it's a lot easier for tooling to relayout. But I'm not trying to start the holy war here.)

WorldMaker · on Feb 26, 2019

One way to look at the article: not everyone realizes that in whitespace-ignoring formats like Markdown (or HTML, LaTeX, etc.) nothing stops you from writing prose more like poetry, with linefeeds at clause level and other visually interesting places, rather than just at paragraph level or at some technically simple compromise like 72-character word wrapping.

Which is sort of the reverse inference from the one you are presuming. ("I want to format this text, so I'll use tools" versus more in this case "I'm using tools to format this text anyway, maybe I'll flow it like I want in its raw source form to be what looks interesting / dynamic / poetic.")

nemoniac · on Feb 27, 2019

Perhaps the point you're missing is that "your version-control system will love semantic linefeeds". You can have your editor reformat your text for viewing any way you like, but adding a word at the beginning of a long paragraph can lead to a big diff in line-based version control. While there are other approaches to mitigation, semantic linefeeds are simple and straightforward.
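A rough way to see the diff effect for yourself, as a minimal sketch: the sample sentences, the 40-column width, and the helper names below are all made up for illustration, and `difflib` only approximates what a real version-control diff would report.

```python
import difflib
import textwrap

# Illustrative sample text (not from the article).
BEFORE = [
    "Semantic linefeeds put each sentence on its own line.",
    "Renderers such as Markdown or LaTeX ignore single newlines.",
    "So the output looks the same either way.",
    "Only the source layout changes.",
]
# Same paragraph with one word added at the very beginning.
AFTER = ["Usually, semantic linefeeds put each sentence on its own line."] + BEFORE[1:]


def hard_wrapped(sentences, width=40):
    """Join the sentences into one paragraph and re-wrap it at a fixed width."""
    return textwrap.wrap(" ".join(sentences), width=width)


def changed_lines(before, after):
    """Count the lines a line-based comparison sees as removed or added."""
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    return sum(
        (i2 - i1) + (j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


if __name__ == "__main__":
    # One sentence per line: only the edited sentence shows up.
    print("semantic linefeeds:", changed_lines(BEFORE, AFTER))
    # Fixed-width wrapping: the reflow touches lines after the edit as well.
    print("hard-wrapped:", changed_lines(hard_wrapped(BEFORE), hard_wrapped(AFTER)))
```

With one sentence per line, the comparison sees one line removed and one added. With fixed-width wrapping, the extra word pushes text across the following line breaks, so more lines count as changed.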

emddudley · on Feb 26, 2019

Previous discussion: https://news.ycombinator.com/item?id=4642395

choeger · on Feb 26, 2019

That is actually a remarkably good idea. Now if we only had a tool that could cope with such a format and parse sentences and paragraphs out of it. When I ran my thesis through Grammarly, I had to remove all the superfluous line feeds for it to correctly determine the actual sentence structure (which it then did well enough for me to do this semi-automatically for 200 pages).

WorldMaker · on Feb 26, 2019

There's the usual Markdown/reST/et al. approach of stripping newlines unless they are doubled up at a "paragraph" boundary.
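As a minimal sketch of that rule (the function name is hypothetical, and this deliberately ignores everything else a real Markdown or reST parser handles, such as code blocks, lists, and explicit hard breaks):

```python
import re


def join_soft_newlines(source: str) -> list[str]:
    """Split on blank lines into paragraphs; turn single newlines into spaces."""
    paragraphs = []
    for block in re.split(r"\n\s*\n", source.strip()):
        # Drop the line breaks inside a paragraph, keeping one space between pieces.
        joined = " ".join(line.strip() for line in block.splitlines() if line.strip())
        if joined:
            paragraphs.append(joined)
    return paragraphs


if __name__ == "__main__":
    text = (
        "One sentence.\n"
        "Another sentence,\n"
        "split at a clause.\n"
        "\n"
        "Second paragraph.\n"
    )
    for paragraph in join_soft_newlines(text):
        print(paragraph)
```

Something like this could be run before pasting text into a grammar checker, so that each paragraph arrives as a single line.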

When converting to/from Inform 7, where newlines are sometimes syntactically important, but I want to version control it (and sometimes write/edit it) in a bit more of a "semantic newline" style, or at least "72-character wrapped", I have a tool converting "hard" newlines to/from pilcrows [0] (¶), which I think looks reasonable in source document form and is an appropriate mark for the meaning (denoting a new paragraph).

An example: https://github.com/WorldMaker/APrincessOfMoons/blob/master/_...

The bones of my converter: https://github.com/WorldMaker/APrincessOfMoons/blob/master/i...

[0] https://en.wikipedia.org/wiki/Pilcrow
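For anyone curious about the general idea, here is a minimal sketch. It is not the converter linked above: the function names, the use of `textwrap`, and the whitespace handling are all assumptions made for illustration. It also assumes the text contains no pilcrows of its own, has no significant indentation, and separates words with single spaces.

```python
import textwrap

PILCROW = "\u00b6"  # the pilcrow character


def to_wrapped(source: str, width: int = 72) -> str:
    """Mark every hard newline with a pilcrow, then wrap each line freely."""
    out = []
    for line in source.split("\n"):
        # Breaking inside words or at hyphens would not survive the round trip.
        out.extend(
            textwrap.wrap(
                line + PILCROW,
                width=width,
                break_long_words=False,
                break_on_hyphens=False,
            )
        )
    return "\n".join(out)


def from_wrapped(wrapped: str) -> str:
    """Treat plain newlines as spaces and pilcrows as the real line ends."""
    segments = wrapped.replace("\n", " ").split(PILCROW)
    # Text after the final pilcrow is empty in output produced by to_wrapped.
    return "\n".join(segment.strip() for segment in segments[:-1])


if __name__ == "__main__":
    original = " ".join(["It is a well-known room."] * 8) + "\n\nThe desk is here."
    wrapped = to_wrapped(original, width=40)
    print(wrapped)
    print(from_wrapped(wrapped) == original)
```

The wrapped form can be re-flowed or edited by hand, and only the pilcrows decide where the "hard" newlines come back.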

tannhaeuser · on Feb 27, 2019

SGML (= superset of XML and HTML from 1986) does this via "short reference delimiters", which are context-dependent rules for interpreting custom sequences of chars, such as double newlines, as e.g. end-paragraph tags.

ivan_ah · on Feb 27, 2019

I'm so happy I have a name for this concept now! I independently started using the same approach while editing my books. It looks weird, I know, but being able to move entire sentences using only line-oriented commands makes it worth it.

A side benefit of this approach is that any overly verbose sentence becomes apparent right away: why is this line so long?? Because it's a bad run-on sentence!! Fixed.
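If you'd rather have a script point at the long ones, a minimal sketch could look like this (the function name and the 120-character default are arbitrary assumptions, not part of any tool mentioned here):

```python
def long_lines(text: str, limit: int = 120) -> list[tuple[int, int]]:
    """Return (line number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


if __name__ == "__main__":
    sample = "Short.\n" + "word " * 30 + "\nAlso short."
    for number, length in long_lines(sample, limit=80):
        print(f"line {number}: {length} characters")
```

With one sentence per line, each flagged line is a candidate run-on sentence.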

Some related editor packages:

- https://github.com/Konfekt/vim-sentencewrap

- https://github.com/jlevy/atom-flowmark (not exactly the same, since it only "prefers" semantic cuts and doesn't always use them)

TOGoS · on Feb 26, 2019

I've done this for a long time, but that's because I'm extra fussy about keeping diffs as small and focused (on the thing I actually intend to change) as possible. It's the same reason I prefer always-use-curly-braces and indenting multiline parameter lists a single indent (as opposed to lining them up after the function name) in C++. And I avoid reformatting for its own sake when making changes to things. It can be hard to convince others of the utility of these practices, though. :P

ggchappell · on Feb 27, 2019

Nice name. I do something along these lines in my LaTeX source; never thought of it as a thing.

nycticorax · on Feb 27, 2019

There's an xkcd about this:

https://xkcd.com/1285/