I like the ideas behind this but I do not think this will take off as much as CommonMark has. My main reasons for thinking this:
1. Users don't really care about edge cases. CommonMark (cmark-gfm really) is "good enough". If something looks odd people go "oh, strange" and make a 3 second edit and forget about it.
2. There's already multiple parsers for every language for markdown. Here there is a single Lua implementation.
3. GitHub is the king.
If it cannot render in GitHub it's not real. If it's harder for me to write in the 99% case then it's not worth it. If it's harder for me to implement within my application then I'm too lazy to switch.
Nice things in here:
1. Less ambiguous parsing. For the parser infrastructure I support at $JOB, ambiguity makes fielding questions from users a HUGE pain in the ass.
2. AST-first design. Seems like everything appears in the AST (even references!) which is a huge win over cmark-gfm. There's also source positions included which is a MASSIVE win for tooling.
3. Custom Attributes implementation. This is really nice for extensibility. It is less readable but having it as an option is nice.
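To make the source-position point concrete, here is a minimal Python sketch of why positions in the AST help tooling. The `Node` class, its `start`/`end` fields, and the `report` helper are hypothetical names made up for illustration; this is not djot's actual AST shape or API, just the general idea that a linter can point at a line and column once every node (references included) carries a position.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """Hypothetical AST node; not the shape any real parser emits."""
    tag: str
    start: Tuple[int, int]  # (line, column), 1-based
    end: Tuple[int, int]
    children: List["Node"] = field(default_factory=list)


def find_nodes(root: Node, tag: str) -> Iterator[Node]:
    """Yield every node with the given tag, depth-first."""
    if root.tag == tag:
        yield root
    for child in root.children:
        yield from find_nodes(child, tag)


def report(root: Node, tag: str, message: str) -> List[str]:
    """Format one 'line:column: message' diagnostic per matching node."""
    return [f"{n.start[0]}:{n.start[1]}: {message}" for n in find_nodes(root, tag)]


# A tiny hand-built tree: one paragraph and one reference definition.
doc = Node("doc", (1, 1), (3, 30), [
    Node("para", (1, 1), (1, 20)),
    Node("reference", (3, 1), (3, 30)),
])

print(report(doc, "reference", "reference defined here"))
```

If references never made it into the tree, or nodes had no positions, a tool like this would have to re-scan the source text itself.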
I can’t imagine the intended audience being so wide as to include a goal of “replacing Markdown as the default on GitHub.”
Instead, this project appeals to me as someone who’s already “bought in” to the pandoc ecosystem. Pandoc makes it really easy to write filters[1] and to take the same source file to generate web pages[2], Reveal.js presentations, Beamer presentations, and long-form PDFs[3]. As someone who writes most things in Markdown compiled via pandoc, I see the cracks in the edges all too often, but I’m too stubborn to give up Markdown or any of the tools I’ve built up around pandoc and pandoc Markdown. I could absolutely see a day coming when I hit some last straw and can’t get done with pandoc Markdown what I need to get done, and djot seems like it would at least be a contender. I’m sure there are many pundits here who would chime in and say “just use Asciidoc,” but every time I look at a syntax quick reference, I get about halfway down the page before thinking “nah, this looks too foreign, I don’t want something that diverges this far from Markdown.”
Djot deviates in annoying ways from Markdown, but not as many and so it’d be an easier pill to swallow for the narrow audience of people like me who want something mostly similar to Markdown that works well with pandoc and avoids the most common syntactic oddities of Markdown.
“Sublists must have surrounding blank lines” and “HTML has to be wrapped” are enough to be annoying IMO. Again, not insurmountable but just the sort of thing that I’d find weird looking every time I authored a djot doc.
Especially the blank lines around lists restriction. The syntax reference goes on to explain how “tight” lists do not have their elements wrapped in paragraphs, while “loose” lists do. This admits that lists in the rendered output will want to use whitespace to achieve a particular layout. But for the case of a Markdown list like this, which will be rendered with no whitespace between items:
- top
  - inner
- top
  - inner
- top
  - inner
In djot this must become:
- top

  - inner

- top

  - inner

- top

  - inner
which has way more whitespace in source form than the rendered output will have. This makes it harder to judge the visual weight of a piece of text by simply glancing at the source, and would require spending more time looking at the rendered output while drafting.
And to go further, a loose list would require even more whitespace:
- top
- inner
- top
- inner
- top
- inner
If there were a way to relax this constraint about new lines for list sub elements, I might even consider switching to this for some documents today, but absent that it’ll have to be once I bang into a few more markdown ambiguity problems.
Yes, you can write READMEs in ReST or Org, but you cannot write issues, pull request descriptions, comments, or discussion posts in anything other than GFM.
On the other hand, they already migrated GFM once (to CommonMark), so it's not wildly unimaginable they might do it again. There is precedent. (Of course, it would have to take off first. Adoption is problem #1.)
I am someone who has decided to go all-in on djot for my software.
Personally, I care about the edge cases. I also don't care about multiple parsers because I want to write my own. (I currently depend on Sphinx, Breathe, and MyST, which are heavy dependencies.) And I don't store my code on GitHub because of Copilot and because of [1].
I decided to go all-in on djot for several reasons:
* I can write my own parser.
* djot can target any format, which means I can use the same docs to generate manpages, a docs website like [2], and perhaps PDFs if my docs include something like the Rust Book.
* djot's extension story [3] is the best of any format, and I need extensions that don't exist in any format for things like EBNF in a specification.
That said, I think you are correct that this will not take off as much as CommonMark has. I guess I was just sharing that I don't care and why.
> There's already multiple parsers for every language for markdown. Here there is a single Lua implementation.
Markdown started with a single implementation AFAIK. Or was there a galaxy-wide coordinated software release for Markdown parsers and I missed it?
> GitHub is the king. If it cannot render in GitHub it's not real.
This couldn't be further from the truth for me. I don't care if it's rendering in GitHub or not. If I can use it in my docs, write it fast, put it on a webpage in a nice manner, I'm done.
Markdown is not popular because it's embraced by GitHub. It's the exact opposite. If it can capture some mind share, tools and websites rendering this will proliferate.
While I also like GitHub, I make very little use of its web interface. I mostly use gh to interact with the service. But I use markdown to generate document fragments on a daily basis. Claiming a technology is going to fail because the one big player in town has not adopted it yet is... not very open, to say the least.
> If it cannot render in GitHub it's not real. If it's harder for me to write in the 99% case then it's not worth it. If it's harder for me to implement within my application then I'm too lazy to switch.
There's also on-prem Bitbucket, which supports an even more limited form of Markdown. One could always use CI to render the documentation into Confluence (because it's a safe bet someone who uses Bitbucket uses Confluence as well) using Pandoc, but it's still a downgrade in usability: you can't just review the changes to the documentation done in branch A in rendered form.
But yes, GitHub is king. I don't like mermaid.js, but when I do diagrams, I do them in mermaid.js because GH renders them.
I'm a fan of Asciidoc, and I think djot is a nice "next" markdown, but:
> If it cannot render in GitHub it's not real.
This is 100% accurate. If I limit myself to CommonMark, with maybe a few extras tossed in gently, I can be pretty sure that my meaning will be correctly rendered.
Anything else, no matter how nice the syntax is, or how powerful the expressions are, won't render for the people I want it to render for.
I recently found out that the author is John MacFarlane, a philosophy professor I have read papers from in totally unrelated contexts. I was more than surprised to see that he is the original author of pandoc. It boggles my mind how someone with an academic career in a somewhat unrelated field can have a GitHub profile like his. It's really impressive.
On topic, though, preceding sublists with empty lines is a complete non-starter for me. However, since I don't hard-wrap lines (goal 7), but use soft-wrap only, I am not in the target audience anyways.
> I recently found out that the author is John MacFarlane, a philosophy professor I have read papers from in totally unrelated contexts. I was more than surprised to see that he is the original author of pandoc.
Professor MacFarlane must have really hated writing his papers in LaTeX.
I get the thrust of "not wanting to pay for hard-wrapping" but I'm not sure I understand what you think a better design would be; does a single newline always introduce a new block element?
No. Within a paragraph it would introduce a hard line break, e.g. like you'd use for poetry. You'd use two line breaks, i.e. a blank line, to start a new paragraph.
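A rough Python sketch of the rule described above, in case it helps: blank lines separate paragraphs, and every single newline inside a paragraph becomes a hard break. The `render` function name and the HTML output shape are my own assumptions for illustration; this is not how djot or any particular Markdown parser behaves.

```python
import html
import re


def render(text: str) -> str:
    """Blank line = new paragraph; single newline = hard line break."""
    # Split on one or more blank (or whitespace-only) lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p]
    out = []
    for paragraph in paragraphs:
        lines = [html.escape(line.strip()) for line in paragraph.split("\n")]
        out.append("<p>" + "<br>\n".join(lines) + "</p>")
    return "\n".join(out)


print(render("Roses are red\nViolets are blue\n\nNext paragraph"))
```

The trade-off is the one under discussion: with this rule, hard-wrapping a paragraph in your editor changes the output, so it only suits people who soft-wrap.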
Yeah that stuck out to me as the most objectionable thing at first glance too. Otherwise it looks reasonably sane. I currently use AsciiDoc and it's ok but this looks slightly better I would say.
I understand the rationale and how CommonMark parsing is not trivial and could be simplified, but the resulting language misses, for me, the best part of Markdown: that it happens to be pretty much just what I'd write in plain text anyways.
The odd newline requirements on lists and blocks, the special syntax for raw HTML and so on makes Djot feel more artificial to me.
The fact that we cannot standardize on extensions means that Markdown feels inadequate for more technical documents. If just a few more restrictions mean I can easily add block annotations to everything, I will jump aboard immediately. That it is easier for parser writers is just a perk.
I think this is partly because AsciiDoc has broadly been tied to a single implementation, AsciiDoctor, without a spec, not even a sketch in a blog post like the original Markdown was. It's only recently that AsciiDoc has begun to think of itself as "a markup language" rather than "the markup language used by AsciiDoctor". A spec is apparently WIP.
As for why it never gained the memetic popularity of Markdown that might have led to a different trajectory, that's harder to say. The One True Markdown is fundamentally much simpler than AsciiDoc, and consequently much easier to learn, easier to implement in JavaScript for live rendering on the Web, and easier to extend with your own opinionated features. So I think it was easy and attractive for various platforms like Hacker News and GitHub to support it, and this I think had a snowballing network effect.
Personally I love AsciiDoc and I think it's the future of technical writing and publishing. It's everything I wanted out of reStructuredText but without its fussy, non-composable syntax. However I don't think that future will become reality until a spec is published that is friendly to implementers other than AsciiDoctor.
AsciiDoctor is a second implementation that doesn't even implement the original specification in a fully compatible way. The original AsciiDoc is pretty well-specified, and it's mostly the plaintext markup of stuff that was intended to go to DocBook, with very few surprises from that.
AsciiDoctor pretty much focused on a direct HTML translation and ignored the inconvenient parts. (Some of the inconvenient parts are deprecated syntax that AsciiDoc has had a replacement for, but I've written the old style for ~20 years, and when GitHub tries to render a document with AsciiDoctor, oops; sometimes I'll change the document, sometimes I'll decide rendering on GitHub isn't important.)
I suspect a lot of it's inertia from before the choice mattered or was actually reflected on (though I guess there are still plenty of projects changing).
It's easy, most projects can satisfice with it, and people on the projects that can't satisfice with it may not think about markup enough to realize they're painting themselves into a corner until they have a big ballast of existing documentation to cope with?
I've been fumbling around for how to convey signs that a project may need better tools (https://t-ravis.com/post/doc/what_color_is_your_markup/) but it's been slow-going and I'm bearish on how well ~better-practices will spread.
Because people know and like Markdown. It’s good enough so they don’t go looking for a replacement.
Markdown is used enough that you are going to need to know the syntax. So a competitor doesn’t just have to be better, it has to have enough additional merits to be worth learning in addition to Markdown.
AsciiDoctor and Org Mode both have substantial additional merits over Markdown, and have dedicated user bases. The problem nowadays is that of implementation availability.
I mean, you can already do all of this (and more) in Pandoc's Markdown dialect, which is CommonMark-compatible. The main things you're getting here are sanding down some of the sharp corners that CommonMark has, in exchange for some uglier syntax if you actually care about the use cases those sharp corners enable.
The point on standardization is sort of moot until Djot becomes big enough to be more than a single implementation. I'd be happy to see it get to that level, but I won't be waiting for it either since Pandoc Markdown is perfectly serviceable as it is.
Exactly, the sublist newline stuff is a total nonstarter for me. Sorry, I guess I'll run a markdown parser that takes an extra second or whatever to run.
If you just want the expressiveness of Markdown then that's fine, but this is targeted at the same space as AsciiDoc - writing big documents and even books. It's going to be painful doing that without the ability to add footnotes, cross references, figures, notes, etc. etc.
I mean you can do it - look at all the RFCs for example - but they must have been unpleasant to write and they're certainly unpleasant to read.
Agreed with the exception that djot also applies this to nested lists, which I think is a huge misstep. User ergonomics are more important than convenient parsing, and while many markdown parsers interpret your example in an un-ergonomic way, djot rejects this:
The motivation for this choice is not "convenient parsing" but deeper considerations of language design. As explained at https://github.com/jgm/djot#rationale , this choice follows from two desiderata: (1) "The syntax should compose uniformly, in the following sense: if a sequence of lines has a certain meaning outside a list item or block quote, it should have the same meaning inside it." (2) "The syntax should be friendly to hard-wrapping: hard-wrapping a paragraph should not lead to different interpretations, e.g. when a number followed by a period ends up at the beginning of a line." The document explains the compromise we made in commonmark to avoid the need for blank lines. Djot tries to be more principled.
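For anyone who hasn't run into the hard-wrapping hazard in desideratum (2), here is a small Python sketch of it. The `risky_lines` helper and the `LIST_MARKER` pattern are hypothetical, written only for this illustration; the pattern is a simplified stand-in for "looks like an ordered list marker" and is not the rule any real parser uses. It re-wraps a paragraph with the standard `textwrap` module and reports continuation lines that now begin with a number followed by a period.

```python
import re
import textwrap
from typing import List

# Simplified stand-in for an ordered list marker: digits, then "." or ")", then a space.
LIST_MARKER = re.compile(r"^\d+[.)]\s")


def risky_lines(paragraph: str, width: int) -> List[str]:
    """Return wrapped continuation lines that start like an ordered list item."""
    wrapped = textwrap.wrap(paragraph, width=width)
    # Skip the first line: only lines produced by wrapping are of interest.
    return [line for line in wrapped[1:] if LIST_MARKER.match(line)]


text = "The first edition of this book was published in 1986. It sold well."
print(risky_lines(text, width=47))  # the second line now starts with "1986."
print(risky_lines(text, width=80))  # fits on one line, nothing risky
```

If a list were allowed to interrupt a paragraph, the first case would change meaning purely because of where the editor broke the line, which is the ambiguity the blank-line requirement removes.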
This. The rationale behind Djot's design decisions is sane. Heck, it's even understandable that lists 'require' the blank line. Yet blank lines in lists really break the reading flow. It would be worthwhile to sacrifice a pure design decision in favor of practicability and readability.
This is so horrifying I did not believe you and had to go test to make sure. Jesus that's awful. Force a newline on me before a list? Ugh but fine. Force it for every sub bullet? No freaking way, that looks catastrophically bad.
This forcing new lines thing almost totally eliminates any desire I have to experiment with it, when I was very excited about it when I first saw it come up. Hopefully they'll change their mind. They want source code to be readable then make a decision that makes our eyes bleed.
I wonder how much of that is familiarity? I find Asciidoc to be much closer to what I would write, as well as being more full-featured. It's much worse for parsing than even CommonMark though.
> In djot, we just get rid of indented code blocks. Most people prefer fenced code blocks anyway, and we don't need two different ways of writing code blocks (goal 11).
Sensible. First, because it makes other things easier (goal 5); second, because each thing is represented in only one way; and third (least important), because indented code blocks are kind of a pain to format compared to fenced code blocks.
There are two types of fenced code blocks, and the one most people write, the one with backticks, basically can't be written in some languages because their keyboards lack a backtick key.
So this is what happened to the energy that had been going into CommonMark. I'd been wondering, because the spec releases for CommonMark have been slowing down for quite a while [1], and it felt like certain issues just never saw attention [2]. I'm glad that energy didn't just dissipate and actually went into something useful.
On a practical level, nearly everything this does is better served by Pandoc Markdown today. But maybe in some not-so-distant future I'll be able to make use of this.
This looks pretty interesting. Besides, I find it fascinating how jgm apparently is happy on both sides of the "typed language" spectrum. pandoc in Haskell, and djot in Lua. True, pandoc has acquired a lot of Lua in recent years, but still. I wonder how it feels switching from the cozy safety of a language like Haskell, back to the "good old" interpreter days. I personally have stopped writing anything larger than 1k lines in a dynamic language. Have been bitten by my own inconsistencies too much in the past.
I wonder what someone like jgm has to say about the difference of writing Haskell and Lua.
My guess about the use of Lua would be simply that he doesn't want to commit it to pandoc yet. If you push it to pandoc, it's next to impossible to remove it, or change its behavior in a backward incompatible way.
So naturally, since the embedded Lua has been expanding its functionality over the years, including the ability to write custom writers and readers, this is the way to tap into pandoc without committing. Tapping into pandoc is important here, as the community that will likely provide feedback and insights is there.
I'll bet that eventually, when it becomes mature, a native Haskell implementation will come.
Given the ubiquity of Markdown, and how painful it is to build a completely compliant parser, I really hope Djot (or something like it) would take off.
Shame that the creator of Markdown blocks any efforts to fix or standardise the format.
…and threatened legal action against anyone using the word “Markdown” in a way he did not approve of.
Jeff Atwood goes out of his way to be courteous to Gruber in this post, but frankly, I think Gruber was being a jerk here, using his claim to the name to tyrannise an open source community that he has otherwise not been involved with in the slightest the last ~17 years.
If GitHub supported it alongside Markdown, this would drive the Markdown parsers to support this as a flavor, and then the community could decide; it will be fascinating to analyze the open-source population.
I'm finding dozens of applications for a lingua franca of mildly-structured, human-readable language that transforms as easily into data as documents and UIs. I've built out 10+ Markdown applications using 3-4 parsers, and always had to reduce the effective semantics to a small subset of features, and then had to use metadata or convention to do what attributes will do much more cleanly.
As for Lua: yes it's the best for integrating into pandoc performance-wise, but ay! the table structures and metadata tables to get anything like typing are mind bending, and I am still miffed at being blocked at a critical time by a bug in pandoc's lua initializer for tables.
Aside from language/stdlib developers, jgm is my greatest benefactor, and he deserves a cadre of loyal and competent implementors.
pandoc is interesting because with Markdown it currently converts the "alt text" for figures ![alttexthere](img.jpg) into the figure captions in markdown->pdf conversion, which is cool but makes me think: it is not really alt text anymore if everyone gets to see it.
I never realized the problem with markup until that phrase "light markup". The problem is that it's designed for a human to edit it by hand with a text editor. It's a programmer designing for a programmer, rather than for a user. It's a plumber designing a sink. A mechanic designing a radio. A busboy designing tableware.
What we should have instead of markup, is a WYSIWYG with keyboard shortcuts. Confluence, for example, will convert Markdown into rich text in real time, and has keyboard shortcuts for its other layout/style options. But the point is to edit it in a GUI, see your changes live, and not need to learn a language in order to edit a document. There are so many problems you avoid by giving the user tools to make their life easier. Markup may be one tiny part of that, but it shouldn't be considered the complete solution.
I very much enjoy being able to type up my documents and notes with just a little bit of syntactic sugar in ANY text editor on any OS, instead of having to move my mouse over to the "make a bullet list" icon, or having to memorize a dozen keyboard shortcuts.
This is a popular opinion and there are tons of implementations of WYSIWYG text editors such as Typora and Obsidian that attempt to make their data model simple like Markdown. But they all kind of suck, which shows that making that kind of editor is really hard.
> say, by clicking on the display itself, clicking "Menu -> Layout -> Options" selecting the layout we want, and seeing it displayed immediately
Maybe I am misunderstanding what you are saying. Isn’t this the same as LibreOffice Writer and Microsoft Word? What is the existing problem that such an interface would solve?
As someone who's been using Markdown since before it was cool, I love it! I think writing the implementation in Lua is an interesting take since Lua out-of-the-box does not support standard regular expressions; it instead has its own pattern-matching thing which is a bit more limited. But it looks like they've embraced that limitation to force themselves to write something that doesn't need a full regex library to be sanely parsed.
My biggest complaint is that asterisks map to <strong> and underscores map to <em> (in HTML terms). This is not backwards-compatible with Markdown where (asterisk)foo(asterisk) gets you <em>foo</em>, and it feels objectively backwards, if that makes sense. I wonder if there's any chance they could reverse that.
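To spell out the difference being discussed, here is a toy Python sketch of the two delimiter conventions. The `DELIMITERS` table and `render_span` function are hypothetical and only handle a single span wrapped in one `*` or `_`; they ignore doubled delimiters like `**foo**` and everything else a real parser has to deal with.

```python
import re

# Which HTML tag a single delimiter produces under each convention.
DELIMITERS = {
    "markdown": {"*": "em", "_": "em"},
    "djot": {"*": "strong", "_": "em"},
}

# One delimiter, some text without delimiters, then the same delimiter again.
SPAN = re.compile(r"([*_])([^*_]+)\1")


def render_span(source: str, flavor: str) -> str:
    """Render a single delimited span; return anything else unchanged."""
    match = SPAN.fullmatch(source)
    if match is None:
        return source
    delimiter, inner = match.group(1), match.group(2)
    tag = DELIMITERS[flavor][delimiter]
    return f"<{tag}>{inner}</{tag}>"


for flavor in ("markdown", "djot"):
    print(flavor, render_span("*foo*", flavor), render_span("_foo_", flavor))
```

So the underscore form means the same thing in both; only the single-asterisk form changes meaning.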
> My biggest complaint is that asterisks map to <strong> and underscores map to <em> (in HTML terms). This is not backwards-compatible with Markdown where (asterisk)foo(asterisk) gets you <em>foo</em>, and it feels objectively backwards, if that makes sense. I wonder if there's any chance they could reverse that.
Interesting. I suspect something like this will always be subjective, but I find the opposite to be true. *bold* and _italics_ make the most sense to me and is always what I wished Markdown did.
Probably, this is because I was familiar with Textile[1] before I used Markdown, and this is what it does.
Today, Slack also uses this convention instead of the Markdown convention (though I believe it _used_ to use the latter).
> Interesting. I suspect something like this will always be subjective, but I find the opposite to be true. bold and _italics_ make the most sense to me and is always what I wished Markdown did.
Yeah… After posting that message, I remembered that _foo_ in Markdown also results in <em>foo</em> - so the underscores are backwards-compatible. But I've just always used asterisks so I completely forgot about it. So I guess they were bound to make some people upset no matter which one they made <em> and which one they made <strong>, and I'm on the losing side. :P
Whatever. If this ends up taking over the world as Markdown did (and I hope it does), I'll just get used to it, I suppose.
Incidentally, how did you "deactivate" HN's parsing of the asterisks in your reply?
The odds of replacing Markdown and all its issues seem slim given its ubiquity. I've run into many of those problems, but this seems just as arbitrary in many ways.
For example:
> Block-level elements can't interrupt paragraphs (or headings), because of goal 7
It then goes on to show they do interrupt paragraphs
- this then - this other thing
vs
- this then
- this other thing
The 2nd is 2 list items but it's just the first being interrupted by a block-level element.
I think you're missing the point (or I'm missing yours). I want both examples above to be one line item, not 2. djot says a blank line is required to start a new block but then sabotages itself by making an exception for list items because the author wants more compact lists. I'd prefer no compact lists rather than strange exceptions to the rules.
Verbosity is the obvious answer, but this past year I stumbled into a conclusion that wasn't obvious to me before: "semantic" HTML isn't serving authors' needs--it's serving the UA's needs.
Aside from the sibling answers, Markdown, though originally intended solely for HTML output, is useful for writing other types of documents; I've written an e-book which was eventually destined for PDF format as a series of Markdown files. I could have used HTML instead, but aside from being more difficult to write, Markdown's document orientation solves some problems with that (should the HTML-to-PDF translator handle CSS and evaluate JavaScript? How does it deal with <audio> or <video> tags?) by not making them a possibility in the first place.
Similarly, for things like comments on message boards or blogs, a user can just dump a bunch of text into the text box without knowing the first thing about Markdown and expect it to look more or less like how it was entered, with paragraph breaks and such. If you force these people to use HTML instead, you're forcing them to at least learn and use <p></p> - which is probably simple for those of us reading HN, but I don't consider it a reasonable request for the normies on Reddit.
So, sure, HTML is quite good at what it was invented for, but not everything that involves text input on the internet or elsewhere should be HTML.
Love it, will see how many places I can get this introduced. The things that pissed the author of this off are precisely the things that piss me off. Unconcerned with the newlines in list syntax; that's how I'd write them myself, anyway. I like many things about this, including the various uses of curly braces.
Wow, this is all I ever wanted. The only thing that's maybe missing is a natural way to do captions for images and tables, and a syntax for spoilers. But these can probably be built on top of the div and span syntax.