> To dramatically reduce ambiguities, we can remove the doubled character delimiters for strong emphasis. Instead, use a single _ for regular emphasis, and a single * for strong emphasis.
I would love to see * gone but I must note that _ is annoyingly hard to type on a screen keyboard.
Back in the days of USENET one common choice was using a / to delimit /emphasis/ - the usual reading was that this indicated words that would normally be rendered as /italics/. You'd often see it used to indicate the titles of books and movies too, since the typographical convention was typically that these were italicized - note that both <em> and <cite> typically render as italics, for instance. I have always disliked Markdown's choice to use * as a delimiter for both italics and bold; / always implied italics to me, and * always implied bold.
Anyway. I propose that / would be a much better delimiter for emphasis than _. On a US keyboard, it can be typed without a shift key. And on a US iOS screen keyboard, it is a simple swipe on B, versus shifting to the numeric entry page and swiping on &.
This is a real problem, considering that some programmers are just sloppy about code-quoting paths and the like. And it's debatable (stylistically) whether you should even code-quote paths.
I will say that forward slashes are still more common in regular English text, even among non-programmers, than underscores. For example, listing options a/b/c.
I don’t really think of Notion as “markdown” but I suppose you’re right since we support a bunch of markdown conventions. Some things are different though, like `> ` is a toggle block, and `" ` a quote block. Unfortunately we already abuse / for a command menu, which is by far my least favorite feature. I want to make a setting to disable it but it goes against our anti-settings philosophy.
Using slashes for a short/small/abbreviated list is relatively common. For example, the top post on the hiring thread right now is for a "Full-stack/frontend/product engineer". There's another one describing a "M/W/Th" hybrid schedule.
Slashes aren't even that common in file paths, unless you mean directory paths. Even then, a slash is only necessary if you really want to unambiguously indicate a directory in the path itself.
No, but most people definitely use links in their texts, and those have the same problem. / is also regularly used for fractions, or/and in situations where you could use two words.
I really like your proposal, but in the days of USENET the /slashes/ weren't interpreted by the machine but simply by our minds, just like * for bold. Would there be any extra issues caused by italics being / rather than *?
I'm honestly with you on this and I'm in the middle of building a huge Markdown site where I have the freedom to change the syntax now if I want.
*this is bold*
/this is italics/
_this is underlined_
Beyond simple conventions like this, I'd just as soon drop into HTML as deal with some other markup that ends up being just as complex. We don't need to allow permutations and combinations such as bold-and-italics, double-weight bold, etc.; these never occur in normal prose typesetting, and if you need them, just use HTML for those rare cases.
Underlining is an emphasis hack for mechanical typewriters or in handwriting. There's no reason to use it typographically in something which has all the layout possibilities of a modern computer or printer.
Except to indicate something new and modern which needs its own visual distinction… like hyperlinks. Using underlining as the default for hyperlinks was genius.
> Underlining is an emphasis hack for mechanical typewriters or in handwriting. There's no reason to use it typographically in something which has all the layout possibilities of a modern computer or printer.
The argument presented by that link is valid for paragraphs and for printed content.
On a website, underlining single words or short phrases doesn't make them less readable; it draws attention to them.
Like with hyperlinks; the displayed form of `[See here](http://here.com) for more info` is undeniably better than simply `see here for more info`, which leaves the reader guessing which of those words are plain text and which are a hyperlink.
The problem is exacerbated on mobile where the reader cannot hover a mouse over the words to determine which words are a link and which are not.
If you're writing a full paragraph like this:
"Our previous stories of FooBarFactory Inc were well-received by our readers. Investigative Journalism has always been a core principle of PotatoNews. The images and video that our beloved readers shared on Twitter are only a single component in the fight against big corps polluting our environment."
In the above paragraph, "previous stories" is a link, "FooBarFactory" is a link, "well-received" is a link, "images and video" is a link, "Twitter" is a link and "polluting our environment" is a link.
The advice from Practical Typography would render that entire paragraph free of any indication that there's more for the user to read.
This is what Org mode does. It's still very tied to Emacs, but there's an effort to standardize the Org format. Hopefully this will help its adoption outside of Emacs, it's a nice markup (and a lot more).
> Back in the days of USENET one common choice was using a / to delimit /emphasis/ - the usual reading was that this indicated words that would normally be rendered as /italics/.
Fuck no. Same idiocy as turning -- into an em dash; it makes writing any technical post mighty annoying.
Get a better screen keyboard. On mine, _ doesn't require shift, and neither does *.
The `_` and `-` keys are dead simple home-row keys on Dvorak keyboards. I’ve never switched layouts because most are focused on prose, but programming demands a lot of snake & kebab casing. …not that `/` is too far away.
> While Markdown’s syntax has been influenced by several existing text-to-HTML filters — including Setext, atx, Textile, reStructuredText, Grutatext, and EtText — the single biggest source of inspiration for Markdown’s syntax is the format of plain text email.
I understand where the author is coming from and respect their contributions to CommonMark.
But...
There are tons of markup languages for prose that have well-defined specs.
So, why did Markdown win?
IMO, because it does not have a well-defined spec. It is highly tolerant of formatting errors, inconsistencies, etc. If an author makes a mistake when writing Markdown, you can always look at it in plain text.
Whereas a perfectly-spec'd markup language would probably evolve toward an unreadable-to-humans mess in the committee-driven pursuit of precision.
You see this theme in so many places in tech: "less is more", the Unix philosophy of everything-is-a-file, messy HTML5 over "XHTML", ML extraction vs. explicit semantic web, etc.
> IMO, because it does not have a well-defined spec.
Same reason that JSON won.
JSON and Markdown are base standards that grew out of a market need to simplify.
JSON won because it was not overly complex and there was some flexibility. If you need more, go YAML, or use JSON as a platform for more.
Every attempt to change JSON has been, and should be, shot down. JSON really just has the basic CS types: strings, numbers, booleans, objects, and lists. From there, any data or type can be serialized or filled in. With JSON you can do types via overloads/additional keys, you can add files by URL/URI or base64, and you can cover any additional needs using the basic JSON parts. Even large numbers can just be strings, with type defs as additional keys/patterns. Financial data can just use strings, or ints with no decimal, largely because that is the safest way to store financial data and prevent float issues.
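A minimal TypeScript sketch of those patterns (all field names here are hypothetical): money as integer cents, and a number too big for a double carried as a string next to a type-hint key.

interface Invoice {
  currency: string;
  amountCents: number;    // 1099 == $10.99; integers round-trip exactly
  bigValue: string;       // too large for a double, so kept as a string
  bigValueType: "bigint"; // the "additional key" carrying the type hint
}

const invoice: Invoice = {
  currency: "USD",
  amountCents: 1099,
  bigValue: "123456789012345678901234567890",
  bigValueType: "bigint",
};

const parsed = JSON.parse(JSON.stringify(invoice)) as Invoice;
console.log(parsed.amountCents / 100); // 10.99, computed only at the edge
console.log(BigInt(parsed.bigValue));  // exact, where Number() would round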
KISS is life and sometimes things are just done, no improvements needed. Now you can take JSON and add things on top of it if you want. Same with Markdown. The base doesn't need to change... ever.
Don't SOAP my JSON. Don't HTML my Markdown. Though you can add specs (JSON Schema/OpenAPI) and formatting tools on top in a processing step. For messaging and base content they are perfect: simple, clear, concise, and with no need to change.
I think JSON and Markdown are very different, in fact.
JSON is very strict. It won't let you have a comma after the last element of a list, for instance (which is very annoying in many cases). It won't let you add comments in any way, shape or form. It won't let you use single quotes instead of double quotes. Or omit quotes around keys. Or mess with the case of null / true / false. Or use NaN values.
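Every one of those rules is easy to demonstrate; here's a quick sketch where each string is rejected by any conforming parser:

// Each of these throws a SyntaxError in JSON.parse:
const rejects = [
  '[1, 2, 3,]',         // trailing comma
  '{"a": 1 /* hi */}',  // comment
  "{'a': 1}",           // single quotes
  '{a: 1}',             // unquoted key
  '{"a": True}',        // wrong case for true
  '{"a": NaN}',         // NaN
];
for (const bad of rejects) {
  try { JSON.parse(bad); } catch { console.log("rejected:", bad); }
}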
Markdown is ill-defined, and will happily let you do whatever the hell you want.
JSON is made for programs, and is a PITA to write as a human (for the reasons mentioned above). But a pleasure to parse and (to some extent) generate automatically. It's not very good with text.
Markdown is made for humans, and I'd hate to have to parse a markdown file and do something with its content other than basic formatting. It's bad at anything but text.
JSON won because parsing it in the browser was just a call to eval(), after which you access the object using normal JS conventions/syntax (e.g. data.foo[0].bar). Whereas XML required creating a DOM parser and document fragment, and then using cumbersome DOM methods like getElementsByTagName() to get each value (or worse, XPath). It totally sucked.
Native support for JSON parse and stringify helped when it came later. The Selectors API, which also came later, made XML parsing a little easier if you didn't want to use XPath, but by then most things were JSON anyway.
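Roughly the contrast being described, as a sketch (DOMParser is a browser API, and today you'd use JSON.parse rather than eval() on untrusted input):

const data = JSON.parse('{"foo": [{"bar": 42}]}'); // one call...
console.log(data.foo[0].bar);                      // ...then plain property access

const doc = new DOMParser().parseFromString(
  "<root><foo><bar>42</bar></foo></root>",
  "application/xml",
);
console.log(doc.getElementsByTagName("bar")[0]?.textContent);
// "42" -- as a string, and after considerably more ceremony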
> Every attempt to change JSON has and should be shot down.
I really wish JSON allowed for final trailing commas in arrays/objects.
It would make for more readable diffs, simpler text templating, easier writing/parsing for us humans, etc. I'd happily trade all of TOML, YAML, XML, and every other similar format in existence for that one change.
It makes generating from templates in certain (many!) instances needlessly difficult. I say needlessly, because the rule is seemingly arbitrary. I can't see what purpose it serves.
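The templating pain is easy to picture; a sketch:

const items = ["a", "b", "c"];

// The naive loop emits '["a","b","c",]' -- invalid JSON:
let out = "[";
for (const it of items) out += JSON.stringify(it) + ",";
out += "]";

// So everyone special-cases the last element or reaches for join():
const valid = "[" + items.map((it) => JSON.stringify(it)).join(",") + "]";
console.log(JSON.parse(valid)); // [ 'a', 'b', 'c' ]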
I completely agree. My favourite software is not just functional, it also is opinionated and expresses a philosophy on how to do something. Simply adding flexibility forever in a quest to be useful for everyone ends up making it useful for no-one.
This is perhaps correct only in that loosey-goosey proposals can spread farther because they seem simple to implement (fewer MUSTs and whatnot), and by the time you notice inconsistencies between implementations, the thing has reached a sort of critical mass already, and the implementations aren't that inconsistent, so you just shrug and say whatever.
But in the case of MarkDown the original implementation was just not that great. Which has nothing to do with being easier; MacFarlane’s Djot is an easier to implement and easier to describe language.
And of course your point about “committee-driven pursuit of precision” is just a made-up hypothetical which is not worth responding to. (The only committee has been on CommonMark, which is a definition of “MarkDown” (TM) that merely tries to deal with years of drift between different MarkDown implementations. With its famously long-winded spec-by-prose-enumeration style.)
Asciidoctor has a spec, reads pretty similarly to markdown, and is infinitely better IMO. And it (well, AsciiDoc) predated markdown!
I think markdown won because it was specifically made with HTML output in mind, instead of arbitrary output (docbook, in the case of AsciiDoc, which is pretty much infinitely malleable).
The Asciidoctor flavour of AsciiDoc doesn't have a specification. There is only a working group. The parsers are a mess composed of regular expressions.
There are in effect two different versions of AsciiDoc, because Asciidoctor people have appropriated the name while making their own changes to it and marking what they dislike as deprecated.
AsciiDoc cannot express all of DocBook, for example figures with multiple images.
While I despise Markdown, there isn't all that much to be a fanboy of. Just the syntax is overall saner.
Ah, DocBook //imageobjectco with something like calspair as well. I've been wanting it badly, but there's zero movement in the Asciidoctor group to try and tackle that beast.
With all due respect, and speaking as an amateur programmer, when it comes to lightweight markup, is there a better way to write a parser besides regular expressions? I suppose it's how the semantics are abstracted.
Asciidoc does get you conditionals and transclusion in the core spec, without needing to resort to extensions. This is what brought me over. That and the XML interoperability.
The Eclipse WG isn't published yet, but, in my opinion, it's a more stable surface to build on than the "many worlds" of Markdown.
Every time someone shows me a cool markdown trick, it requires me to pull something down from github and `npm-install` (or equivalent). But, well, that's kind of the point, isn't it? Markdown's ease of implementation allows a degree of glorious hackery that's just not possible otherwise. While Asciidoctor's great albatross - and its great asset - is Ruby... which inevitably involves Opal at some point.
You are completely right. The underlying theme here is that the requirements matter.
The requirement for Markdown is to be simple and easy. It's intended for use by people who are going to ignore whatever specs and documentation there are. They'll write a little comment, a bug ticket, or a readme and they might need things like links, bold, italic, etc. And the job is to turn that into some legible HTML. So most of its features are simple and easy to remember. Just add a blank line for a new paragraph, prefix your bullets with a -, and so on.
Markdown is undeniably simple and easy to learn. Which is why it got so popular. It has edge cases but they don't really matter. It has obscure features (e.g. tables) most people don't use, so those don't matter either. And there's a wide range of things it can't do that also don't matter. The job never was being a drop in replacement for more complex tools. It was removing the need to use those for the simple use cases and be simply good enough.
The alternatives each chase requirements that are important to their creators but not to most casual users, or indeed the people that integrate markup tools. And of course the more these alternatives differ from Markdown, the harder of a sell it becomes. And the more there are, the less likely it is for any of them to become more popular than markdown. At this point, markdown is a common default in things like issue trackers, readme's on Github/Gitlab, etc. Any tool integrating some kind of markup language support in their content management is more likely to be using markdown than anything else at this point.
The reason is simply that using anything else breaks the principle of least surprise for the user. Markdown is the largest common denominator. It's good enough and easy enough to deal with. So most new things favor it over anything else. It's a self-reinforcing thing.
This is how populist politics works. The thing that appeals to the most people isn't necessarily the thing we should be doing.
The internet and web appealed to a small percentage of people in the early 90s, and it was glorious. You had to put in effort to get anything out, which meant most people didn't bother, which meant it was a nice place. The music industry similarly had a high level of entry. Both are filled with crap now.
Elitist old man shouting at clouds? Maybe. Doesn't mean I'm wrong though.
These things don't win on engineering merits. Markdown wasn't better than others. It was like a bunch of others. It's just natural that one form of communication becomes a monopoly because people want to be able to talk to as many people as possible.
You only need to be good enough to enter this kind of competition... and win. The reasons you might win can be many arbitrary things, like someone deciding to adopt a practice in a large organization, or dedicating efforts to writing parsers in many languages etc.
> Whereas a perfectly-spec'd markup language would probably evolve toward an unreadable-to-humans mess in the committee-driven pursuit of precision.
Maybe, and I mean that sincerely...but are you just saying this must happen or can you actually point to where MacFarlane's proposals would make a significantly less pleasant language?
I couldn’t figure out what was meant by a sublist. Like any hierarchy? Or just list-in-paragraph-in-list, not list-in-list? That one could use some HTML disambiguation in the article.
> Whereas a perfectly-spec'd markup language would probably evolve toward an unreadable-to-humans mess in the committee-driven pursuit of precision.
This proposal shows us a clear step in that direction, going from something simple and easy for humans to understand, with complex implementation, to emphasize part of a word:
fan*tas*tic
To proposing a simple implementation that's... weird for humans:

fan{*tas*}tic
It seems like a minor concession since most uses of intra-word emphasis are more cutesy than communicative[1] (it is of course sometimes very useful when there is a subtle syllable emphasis, or a subtle typo that you want to point out).
[1] Maybe I’m being a hypocrite here? I definitely am in favor of a lot of “cutesy” ways to communicate (things that are more stylistic than necessary). But not intra-word emphasis, really.
I had a look at djot, which addresses all of the author's grievances, and I must say... I don't like it.
Sure, it probably is easier to parse, and maybe there are a few edge cases that it does better, but the goal of markdown is to have text that is:
A) human readable and looks good without parsing it
B) can be parsed and presented using different themes
In djot they sacrifice a lot of point A (e.g. we now have to insert empty lines in a nested list?!) for questionable gains at point B. Guess what I, as a user, care more about?
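If I'm reading djot's rule right (treat the exact syntax here as my assumption), where Markdown accepts a tight nested list:

- fruit
  - apple

djot wants a blank line before the sublist:

- fruit

  - apple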
Markdown accepting a wide range of inputs is not a mistake, it is a feature. If that makes parsing more complex that is an acceptable side effect not a mistake.
I agree that an empty line in front of a nested list is ugly. I very often make hierarchical descriptions of things like events or things to do or recipes, and that kind of thing would be annoying to have to deal with. I like my lists tight.
I would have tried harder to find some other way to make the grammar simple.
I haven’t seen anything else, though, that makes it less “human readable”.
I'd argue that it won the adoption it did in spite of parsing ambiguities and the lack of a spec. Not because of it. There are plenty of examples of well specified things that have gained mass adoption, so I think you are confusing cause and correlation here.
IDK, JSON? HTML and XML are markup languages also. There are obvious issues with markdown that were fixed/resolved in various markdown variants, and missing features as well; I don't think anyone could argue those helped adoption. Case in point: the most commonly used markdown flavor is GFM, because we all adopted GitHub and that's what it supports.
It is true that Markdown won by putting simplicity for the users in front of simplicity for the parsers. But since it became ubiquitous, there's a lot of value in codifying the standard to make sure that it doesn't diverge into different dialects.
Regarding the author's specific suggestions: he explicitly writes that he doesn't propose to implement them in the actual MD "standard", since backwards compatibility is more important. That said, there is value in making the markup less ambiguous while preserving the "writability", even if it's just a thought experiment.
It's not really a problem of being "perfectly specced" or not; it's just a matter of inertia.
If markdown had just used *bold* and _italics_ from the start, or needed a tag for HTML instead of passing it through as-is... it would be entirely fine and just as popular now. Or any other generally-agreed-upon "good" fix.
But inertia makes things like that near-impossible to change now. Only additions can sorta work, and even those are hard, as a critical mass of dialects needs to adopt them for it to work.
Nothing messy about HTML, whatever version. It just uses SGML features from a more civilized age, such as inferring tags not explicitly present when unambiguously required by the content model grammar.
Btw a large fragment of markdown can be implemented using SGML's SHORTREF feature, as can customizations such as GitHub-flavored markdown. John Gruber's markdown language is specified as a canonical rewriting into HTML with the option of inline HTML as fallback, making SGML SHORTREF a particularly fitting implementation model since it works just the same. It's quite striking how a technique for custom syntax invented in the 70's (however imperfectly specified, though not in a worse-is-better way lol) could foresee Wiki syntaxes and also determine the most commonly used markup language (HTML) fifty years later.
Agree with the gist of your post, though. As fantastic as MacFarlane's pandoc is, the idea to re-assign redundancies in markdown (e.g. interpret the minute presence/omission of space chars to mean something) was bound to fail, and that was very clear to me skimming only a few paragraphs of the CommonMark manifesto. When it was first discussed here back then, someone commented that this was bound to happen when a logician (MacFarlane) approached Wiki syntax.
If the rules are too complicated, then they are a challenge for all parties, both users and implementers. I think it is useful to be able to imagine at least on some higher level what a parser would do to the stuff I write, so everyone benefits from the ease of understanding that comes with simpler rules. The question is just how far we can simplify without reducing usability.
The rest of the article frequently takes the side of the users, and mentions how confusing certain existing rules are to them. I know I frequently don't know what to expect from Markdown in certain corner cases, and felt vindicated by the author calling them out here. Some of their ideas for simplification would surprisingly even let us do things that are currently not possible.
> If the rules are too complicated, then they are a challenge for all parties, both users and implementers
Not necessarily. Generics and/or C++ templates are a pain to parse because they're context-sensitive. But while reading/writing code it's typically obvious whether I'm writing a comparison or a generic/template.
Foo<Bar> foo;
// VS
Foo < Bar;
Likewise, in C++ you can end up with:
unordered_set<tuple<int, float>> mySet;
// >> is ambiguous here without a symbol table or context around the statement
Foo >> 5;
I think both of these are fairly obvious as a user of the language, but boy am I glad I don't have to parse that!
> If the rules are too complicated, then they are a challenge for all parties, both users and implementers.
You are still confounding rules for writing with rules for parsing. It's absolutely possible and easy to make rules that make writing easier but parsing harder.
For example, if you made a rule that formatting markers like ** and _ are order-insensitive (so **_word**_ formats the same as **_word_**), it would be much easier for the user, who no longer needs to remember the order in which the markers were opened, but harder to code (I assume).
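To make that concrete, here's a toy sketch (not any real parser's algorithm): a stack-based inline parser insists on proper nesting, which is exactly what order-insensitivity breaks.

type Mark = "**" | "_";

// True when every marker closes the most recently opened one.
function strictlyNested(tokens: Mark[]): boolean {
  const stack: Mark[] = [];
  for (const t of tokens) {
    if (stack[stack.length - 1] === t) stack.pop(); // closes the innermost open mark
    else stack.push(t);                             // opens a new one
  }
  return stack.length === 0;
}

console.log(strictlyNested(["**", "_", "_", "**"])); // **_word_** -> true
console.log(strictlyNested(["**", "_", "**", "_"])); // **_word**_ -> false
// Supporting the second form means tracking all open marks independently and
// deciding how their HTML tags should interleave -- harder to code.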
The problem is when it's too hard for the computers, then it negatively impacts the user experience.
There are cases that are 100% ambiguous in the spec, which means there can be no _right_ answer. Different users will have different (and both reasonable) expectations about what the same input will do. So, in these cases, "too hard" for the computer leads directly to a negative user experience. The language becomes more unpredictable.
I agree that we shouldn't _ever_ lose focus on the end user experience. But sometimes, you have to make the spec less ambiguous to improve the end-user experience.
I am flamfoozled by paragraph-in-list and list-indentation regularly in status quo markdown. Maybe it’s because the syntax is a little weird for the edge cases today? Or maybe I’m just a goof who needs to go read GitHub’s parser source.
There is something to be said for, not editing your old posts, but applying a preface that references later iterations on the idea. I wish I were better about that myself.
"In this article from 2017, I talk about dinglehoppers, which have since been improved by research from these three papers [1][2][3]. Here is where I revisit this topic in 2021."
After the controversy over naming CommonMark (where @jgm et al. caught flak over originally trying to name it Standard Markdown), I'm not surprised that he picked something totally unrelated. And really, it's not Markdown at all at this point, so being more clearly differentiated from Markdown / CommonMark seems like a plus to me.
Personally, I would like to see a markdown spec that eliminates parsing ambiguity by restricting the "edge-case" features that HTML is really much better at describing in a standard and structured way.
I think we could pick one way to handle emphasis, lists, and code blocks that covers a specific and predictable 80%.
Anything that becomes hard to describe without including additional notation to the grammar is probably best suited to be left as HTML, as was the intention behind markdown to begin with.
We are implementing markdown support in Zoho Writer (https://zoho.com/writer) and I can confirm how difficult it is to handle bold and italics.
It definitely is a weird choice to use *s for both bold and italics. Parsers could be implemented much more easily if the two had different delimiters, as mentioned in the post.
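To illustrate with a sketch (the delimiters here are hypothetical): if * only ever meant bold and / only ever meant italics, a pair of regexes would nearly suffice, whereas the shared * forces something like CommonMark's delimiter-stack algorithm just to decide what ***x*** means.

const strong = /\*([^*]+)\*/g; // hypothetical: * is always bold
const em = /\/([^/]+)\//g;     // hypothetical: / is always italics
const render = (s: string) =>
  s.replace(strong, "<strong>$1</strong>").replace(em, "<em>$1</em>");

console.log(render("*bold* and /italics/"));
// -> <strong>bold</strong> and <em>italics</em>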
The article reads very much like a list of problems important to an implementation author rather than a user. Except maybe the nested list thing, which does sound somewhat annoying. But also rare.
My thought is to represent complex things, use better prose or diagrams. Though José and some of his friends are warming me up to interactive tools. Livebook has some stuff I need to look at more. Currently mostly targeted at developers, of course.
I also made my own editor a long time ago, and used it for personal use and on the writing site roleplay.cloud. It had a lisp-like syntax with custom expansions. It also had some of these ideas, like reference links, and I could run code snippets with [python ...]. Normal HTML tags would also work, like [br] instead of <br>.
Asciidoc. Particularly if you 1) need XML interoperability, 2) complex print outputs, 3) complex tables, 4) transclusion (partial and otherwise) in core spec, 5) conditionals in core spec.
The AsciidocFX program is a good "starter's editor" for those unfamiliar with Asciidoc and lightweight markup in general - it includes a "boxed" DocBook-XSL pipeline as an alternative to the Ruby-based asciidoctor-pdf. For an actual production editor, Visual Studio Code with the Asciidoctor extension is very hard to beat. Github integration on top of VSC gives you some collaborative visibility, too.
On the PDF front, another interesting Asciidoc project is asciidoctor-web-pdf, which uses Paged.js and CSS to produce extremely complex PDFs using web technologies (Chromium + Puppeteer, I think). That, asciidoctor-pdf (Ruby/Prawn), and DocBook-XSL are the main PDF pipelines.
Making breaking changes to markdown is about as practical as doing it to HTML -- already existing content and mindshare give the current form massive inertia.
This is especially the case when it works for the vast majority of use cases (or can be hammered into them); ambiguities are very visible to implementers and detail-oriented folks, but most people never see these issues, or don't care about them.
And, while it sucks that it's complicated to implement, that burden is on relatively few people. See also: the HTML Priority of Constituencies.
Oh yes. I made the fun decision to write a markdown parser/contenteditable component for https://sqwok.im and ended up spending probably a month on it, largely writing endless unit tests and covering odd cases like that.
It's far from perfect and probably will still break on certain ambiguous inputs. I like his ideas for clarifying the language for the most general audience.
I have been trying to research my way out of having to write a markdown parser that disallows inline HTML, because I don't want to be a markdown parser author, but I categorically don't want people being able to inject things into the wiki(s) I need to create. In some languages it's a flag. In others, there's no flag.
This is like not using bind variables in your SQL library. I just don't understand it. I'm looking at you, Crockford.
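One common way out of the injection problem described above, assuming a JS stack in the browser (marked and DOMPurify are real packages, but treat this wiring as a sketch rather than the only option): render the markdown first, then sanitize the resulting HTML.

import { marked } from "marked";
import DOMPurify from "dompurify";

function renderWikiPage(untrustedMarkdown: string): string {
  const html = marked.parse(untrustedMarkdown) as string; // sync by default
  // Whatever raw HTML the author smuggled in is neutralized here:
  return DOMPurify.sanitize(html);
}

console.log(renderWikiPage('hi <script>alert(1)</script> *world*'));
// -> <p>hi  <em>world</em></p>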
This is why I like the way Racket does this with the Pollen language. You can use Pollen mark up and create your own tags and then decide how they are converted. It all becomes a list of X-expressions that can be manipulated in any form you like. But the tree nature of an X-expression means you don’t get issues like *strong* word*.
For example I can write ◊bold{strong* word} and it becomes (bold “strong* word”). It’s very clear how this should be rendered.
This makes parsing and rendering easier, but writing harder. Given the widespread adoption of Markdown I suspect this project to go absolutely nowhere, since it focuses on precisely the opposite thing that makes Markdown popular.
First Term
: This is the definition of the first term.
Second Term
: This is one definition of the second term.
: This is another definition of the second term.