I do all of my academic writing in pandoc. As compared to LaTeX this means no bo...

icc97 · on Aug 28, 2018

Sorry to be pedantic, but I didn't think that 'pandoc' was an actual document format, just a tool for converting between formats. Do you mean that you do your writing in a kind of 'pandoc-flavoured' markdown? [0]

[0]: https://pandoc.org/MANUAL.html#pandocs-markdown

mort96 · on Aug 28, 2018

Well, between Pandoc's markdown flavor and how it has its own way of letting you insert latex code anywhere, you're not going to be able to process your document using anything that's not trying to be compatible with pandoc documents.

smohare · on Aug 28, 2018

I’ve never understood the impetus for not using full LaTeX in an academic context, given that the boilerplate is so minimal and presumably one has built up a personal template over time.

For blog posts and notes I see the appeal, since the boilerplate can be a hindrance to spontaneous writing.

CJefferson · on Aug 28, 2018

Latex can't produce web output, which is increasingly a target I want.

Also, Latex can't produce any output which is accessible to blind people (other than giving them the raw LaTeX). The PDFs latex produces are probably the least accessible format available (much worse than a Word-produced PDF, or some HTML). This matters to me, and should matter more to other people (in my opinion).

baldfat · on Aug 28, 2018

BUT that is what makes Pandoc powerful. You convert your latex or your whatever into: (Can we please add Racket's Scribble? It is by far the reason why Racket has the best documentation of any language. https://docs.racket-lang.org/scribble/)

Markdown, reStructuredText, textile, HTML, DocBook, LaTeX, MediaWiki markup, TWiki markup, TikiWiki markup, Creole 1.0, Vimwiki markup, OPML, Emacs Org-Mode, Emacs Muse, txt2tags, Microsoft Word docx, LibreOffice ODT, EPUB, or Haddock markup to

HTML formats

    XHTML, HTML5, and HTML slide shows using Slidy, reveal.js, Slideous, S5, or DZSlides

Word processor formats

    Microsoft Word docx, OpenOffice/LibreOffice ODT, OpenDocument XML, Microsoft PowerPoint.

Ebooks

    EPUB version 2 or 3, FictionBook2

Documentation formats

    DocBook version 4 or 5, TEI Simple, GNU TexInfo, Groff man, Groff ms, Haddock markup

Archival formats

    JATS

Page layout formats

    InDesign ICML

Outline formats

    OPML

TeX formats

    LaTeX, ConTeXt, LaTeX Beamer slides

PDF

    via pdflatex, xelatex, lualatex, pdfroff, wkhtml2pdf, prince, or weasyprint.

Lightweight markup formats

    Markdown (including CommonMark and GitHub-flavored Markdown), reStructuredText, AsciiDoc, Emacs Org-Mode, Emacs Muse, Textile, txt2tags, MediaWiki markup, DokuWiki markup, TikiWiki markup, TWiki markup, Vimwiki markup, and ZimWiki markup.

Custom formats

    custom writers can be written in lua.

https://pandoc.org/

CJefferson · on Aug 28, 2018

Well, except LaTeX probably isn't the best base format to write in -- Pandoc's LaTeX parser isn't very good, it doesn't parse (from a quick check) any of the papers I've written. They've tried hard, but I think it's a losing battle, particularly once people start using a large range of packages.

That's not surprising -- it's basically impossible to "parse" LaTeX, as it's defined by execution.

babahoyo · on Aug 28, 2018

iirc pandoc's markdown provides the set of functionality that one is capable of transforming back and forth. So as long as you stay within those formatting confines, you are set.

This works for everything except table notes a la ```threeparttable```

lqet · on Aug 28, 2018

What about htlatex? It is quite powerful. In most of the cases, it produces nice HTML pages out of the box, with automatic rendering of figures and mathematical equations into PNG. It is part of most LaTeX distributions. On Linux, for example, just type

  $ htlatex mydoc.tex

instead of

  $ pdflatex mydoc.tex

michaelhoffman · on Aug 28, 2018

For me, at least, htlatex never works quite right. There are a lot of edge cases where it's broken. If you want to keep the option of non-PDF output, starting in something like Pandoc Markdown is a better idea. And I do most of my documents in regular LaTeX.

voltagex_ · on Aug 28, 2018

>Also, Latex can't produce any output which is accessible to blind people

This sounds like it should definitely be a target of a grant. I guess most government organisations around the world are using Word et al, which isn't too bad these days accessibility wise (AFAIK).

Can you provide a small example of a LaTeX document that produces an inaccessible PDF?

CJefferson · on Aug 28, 2018

If you grab any academic paper (particularly a two-column one) there is a good chance getting the text out will be hard, and any part of the paper with maths or tables will be unusable. Sorry, I'm away from a computer right now, so I can't make a smaller example.

masklinn · on Aug 28, 2018

The paper "GADTs Meet Their Match" (first I had in my list) seems to work fine, but I don't know what it was generated with.

voltagex_ · on Aug 28, 2018

cairo 1.13.1 is listed as the generator.

http://checkers.eiii.eu/en/pdfcheck/?url=https%3A%2F%2Fwww.m...

The ACM template fails more! http://checkers.eiii.eu/en/pdfcheck/?url=https://www.acm.org..., and it's generated by pdfTex-1.40.15

CJefferson · on Aug 28, 2018

I'll pick on one of my own random papers:

https://www.cs.york.ac.uk/aig/projects/implied/docs/cp03.pdf

Try extracting "Theorem 2" on page 5, or any text really. I just get random noise through either a PDF reader, or something like pdf2ascii / ps2ascii.

We just made this with standard latex.

mkl · on Aug 28, 2018

Any chance you could post the source code for this? It's using bitmaps for characters instead of proper fonts, which shouldn't happen nowadays. Maybe you should put "\usepackage{lmodern}" at the start? See for example https://tex.stackexchange.com/questions/1291/why-are-bitmap-...

I work with course materials made in Latex, and students sometimes need/want to copy and paste from them, so I try to avoid these kinds of problems.

Gorgor · on Aug 28, 2018

That’s interesting. Did you \usepackage[T1]{fontenc}?
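For anyone who wants to check a pile of old sources for the two preamble lines suggested in this subthread, here is a rough sketch. The helper names (`uses_package`, `font_setup_report`) are made up for illustration, and the regex is an assumption about how the preamble is written: it ignores commented-out lines and anything loaded indirectly by a class file.

```python
import re

# Matches \usepackage[opts]{pkgs}. Commented-out lines are NOT filtered out,
# so treat the result as a hint rather than a guarantee.
USEPACKAGE = re.compile(r"\\usepackage(?:\[([^\]]*)\])?\{([^}]*)\}")


def uses_package(source, package, option=None):
    """Return True if the LaTeX source loads `package` (with `option`, if given)."""
    for match in USEPACKAGE.finditer(source):
        options = [o.strip() for o in (match.group(1) or "").split(",")]
        packages = [p.strip() for p in match.group(2).split(",")]
        if package in packages and (option is None or option in options):
            return True
    return False


def font_setup_report(source):
    """Report whether the two preamble lines mentioned above are present."""
    return {
        "lmodern": uses_package(source, "lmodern"),
        "T1 fontenc": uses_package(source, "fontenc", option="T1"),
    }


if __name__ == "__main__":
    sample = "\\documentclass{article}\n\\usepackage[T1]{fontenc}\n"
    print(font_setup_report(sample))
```

This only tells you what the source asks for; whether the resulting PDF actually has extractable text still has to be checked on the PDF itself.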

CJefferson · on Aug 28, 2018

That paper is from 2003, so I'm not sure.

This is just an example. From experience, most PDFs at conferences and journals, generated from pdf, are not accessible to varying degrees.

jimhefferon · on Aug 28, 2018

Accessibility is a big current push from the TeX Users Group. The president, Boris Veytsman, has made moving it forward a big goal. I know that a lot of people are working on aspects of that, but the name I hear the most is Ross Moore, who I have heard talk on making the output be PDF/A-3a compliant. I understood that it is a long way there.

CJefferson · on Aug 28, 2018

I hope so, because honestly, Tex generated PDFs are the single biggest problem with being a blind researcher (I'm not blind, but I know a blind researcher).

BeetleB · on Aug 28, 2018

>I’ve never understood the impetus for not using full LaTeX in an academic context, given that the boilerplate is so minimal and presumably one has built up a personal template over time.

I don't find the boilerplate minimal at all. Contrast the following:

    \begin{itemize}
     \item First
     \item Second
     \item Third
    \end{itemize}

with

     - First
     - Second
     - Third

I won't even get into the hell that is tables.

I loved LaTeX until I discovered Org Mode. Pandoc also scratches the same itch.
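To make the contrast concrete, here is a minimal sketch of the mechanical translation from the short form to the long one. The function name `bullets_to_itemize` is made up for illustration; it only handles a flat list and does not escape LaTeX special characters, so it is nowhere near what Pandoc or Org's exporter actually do.

```python
def bullets_to_itemize(markdown: str) -> str:
    """Convert a flat '- item' Markdown list into a LaTeX itemize environment.

    Assumptions: one item per line, no nesting, no escaping of
    LaTeX special characters such as % or &.
    """
    items = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            items.append(stripped[2:].strip())

    lines = [r"\begin{itemize}"]
    lines += [rf" \item {item}" for item in items]
    lines.append(r"\end{itemize}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(bullets_to_itemize("- First\n- Second\n- Third"))
```

Three lines of input become five lines of output, and the two extra lines are exactly the boilerplate I'd rather not type.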

susam · on Aug 28, 2018

I agree. If one is going to use LaTeX directly or indirectly via Pandoc, eventually one would have to build up a personal template to fine-tune the look and feel of the documents.

If one is going to write LaTeX code anyway, it seems easier and cleaner to use LaTeX all the way, move all the boilerplate along with the personal template to say, a file named preamble.tex, and \input{preamble.tex} in the documents.

However, there are situations where Pandoc can be convenient. For example, I wanted a document[1] to be written primarily as README.md (CommonMark format), so that GitHub could render it as the project README. At the same time I wanted to render a PDF output from a customized form of the content. Pandoc is convenient for cases like this although it takes a bit of work to fine-tune the formatting and customize the content for each output format.

[1]: https://github.com/susam/gitpr

[2]: https://github.com/susam/gitpr/blob/master/Makefile

BeetleB · on Aug 28, 2018

>If one is going to write LaTeX code anyway, it seems easier and cleaner to use LaTeX all the way, move all the boilerplate along with the personal template to say, a file named preamble.tex, and \input{preamble.tex} in the documents.

Not sure why you think it has to be that way. I author LaTeX documents using org mode. Org mode handles most of the boilerplate, and I can still put pretty much any custom LaTeX within the org document, wherever I want it (this includes \newcommand, etc). I lose nothing by going to org mode, and I gain much in terms of reduced boilerplate.

tonyarkles · on Aug 28, 2018

Yup. I’ve got a pandoc template for doing org-latex-pdf conversion, as well as some org templates for common documents that my clients need. Hack away on the document in org (which I’m probably going to be doing anyway, since the rest of my life is in there too), and then when it’s ready to hand off, turn it into a PDF using a shell script.

My absolute favourite moment with that flow was a client who wanted one as a docx instead of a PDF. Pandoc obliged and they commented that I must have spent a lot of time reformatting things for them :)
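Roughly what that hand-off step amounts to, as a sketch rather than my actual script: the file names, the `export_commands` helper and the template path are all hypothetical. It relies on pandoc picking the input and output formats from the file extensions, and it only builds the commands unless pandoc and the source file are really there.

```python
import shutil
import subprocess
from pathlib import Path


def export_commands(source, formats=("pdf", "docx"), template=None):
    """Build one pandoc command per output format for a single source file."""
    src = Path(source)
    commands = []
    for fmt in formats:
        cmd = ["pandoc", str(src), "-o", str(src.with_suffix("." + fmt))]
        # A LaTeX template only makes sense for the PDF (via LaTeX) target.
        if template is not None and fmt == "pdf":
            cmd.append(f"--template={template}")
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in export_commands("report.org", template="client.latex"):
        print(" ".join(cmd))
        # Only execute when pandoc is installed and the source exists.
        if shutil.which("pandoc") and Path("report.org").exists():
            subprocess.run(cmd, check=True)
```

Adding the docx target is one more entry in `formats`, which is why the client's request took seconds rather than an afternoon.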

Nullabillity · on Aug 28, 2018

Why not use Org's built-in org->latex->pdf exporter? AFAIK Pandoc isn't compatible with many of the more interesting Org features, such as Babel.

tonyarkles · on Aug 29, 2018

That's a good question! The flow started out as markdown->latex->pdf via pandoc, and then when I got back into Org, it just slid right into that workflow to replace Markdown.

I'm curious now though... maybe I'm missing out!

foo101 · on Aug 28, 2018

It isn't clear to me whether you are saying that Pandoc is necessary or if you are saying that Pandoc is unnecessary and LaTeX alone is sufficient for all purposes.

I think your parent comment was saying that LaTeX alone is sufficient. You also seem to be saying that LaTeX alone is sufficient while using Org mode. Would you please clarify if I am interpreting your comment correctly or not?

BeetleB · on Aug 28, 2018

>It isn't clear to me whether you are saying that Pandoc is necessary or if you are saying that Pandoc is unnecessary and LaTeX alone is sufficient for all purposes.

I'm not saying either. The parent said it's easier and cleaner to use LaTeX all the way. I was pointing out that it is easier to write in a format like Org mode and export to LaTeX (whether via Pandoc or Org mode's built-in exporter).

Of course LaTeX is "sufficient". It is also, IMO, painful.

brennebeck · on Aug 28, 2018

Pretty sure they are saying pandoc is unnecessary.

sevensor · on Aug 28, 2018

I wrote my dissertation using Pandoc. It might seem that the LaTeX boilerplate is minimal, but Markdown is even more minimal, and it preempts the urge to fuss with your layout. Writing in Markdown means that you can wave your hand at the document and say, "It's a draft, I'll fix the formatting once I'm sure I even want this material." Afterwards, fixing the layout is really easy because you can drop raw LaTeX in wherever you need to, and you haven't wasted countless hours laying out a float you later end up cutting.

stdbrouw · on Aug 28, 2018

Having to use `\textbf{...}` is impetus enough for writing in Markdown instead.

nanna · on Aug 28, 2018

LaTeX editors have simple keybindings for this, like ctl-b., or C-c C-f C-b in emacs, which makes this kind of thing a non-issue for me...

Anonymous4C54D6 · on Aug 28, 2018

Just reading C-c C-f C-b is an issue for me and I haven't even tried to remember and type it yet.

disgruntledphd2 · on Aug 28, 2018

C-c C-f is a prefix key, you get bold with C-b, italic with C-i and so on. At least it was last time I used AucTex :)

hatmatrix · on Aug 28, 2018

I agree - while pandoc is great, it's usually not 'one click' to any format, especially when you have HTML- or LaTeX-specific markup.

It's not for everyone, but emacs+auctex really reduces the latex boilerplate (at least writing it) that I don't really feel it's a hindrance.

neves · on Aug 28, 2018

I haven't used LaTeX for years. Is it still hell to make tables? And is it still very difficult to use templates to generate good-looking documents that don't look like an academic paper?

uvtc · on Aug 28, 2018

Yes! It's great to be able to put LaTeX-formatted equations directly into your pandoc-flavored markdown source file.

Incidentally, I really like the thoughtful syntax additions Pandoc makes over olde Markdown (eg., tables, definition lists, and span & div syntax as well). Such a great all-around doc tool.

dr_coffee · on Aug 27, 2018

What's your workflow for inserting and managing references?

tekknolagi · on Aug 28, 2018

Not OP, but I used `+citations` and `pandoc-citeproc` along with a bibtex file that I managed by hand for https://bernsteinbear.com/dat-paper/ (a small senior project paper). It worked pretty well for me.

meanmrmustard92 · on Aug 28, 2018

Add `bibliography: path/to/library.bib` (and optionally a `csl:` entry for bibliography formatting; I like econometrica) to the YAML front matter. Insert citations with @bibcitekey. Compile with `--filter pandoc-citeproc`.
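A minimal sketch of those three steps, in case it helps. The file names (`library.bib`, `econometrica.csl`, `paper.md`) and both helper functions are placeholders I made up; the front matter uses pandoc's `bibliography` and `csl` metadata keys, and the filter is passed as `--filter pandoc-citeproc`. Newer pandoc releases have citation processing built in, so check your version's manual before copying the command.

```python
import shlex


def front_matter(bib_path, csl_path=None):
    """Return a YAML front matter block pointing pandoc at the bibliography."""
    lines = ["---", f"bibliography: {bib_path}"]
    if csl_path is not None:
        lines.append(f"csl: {csl_path}")
    lines.append("---")
    return "\n".join(lines) + "\n"


def citeproc_command(source, output):
    """Build the pandoc invocation that runs the pandoc-citeproc filter."""
    return ["pandoc", source, "--filter", "pandoc-citeproc", "-o", output]


if __name__ == "__main__":
    # Hypothetical document: one sentence citing the key `bibcitekey`.
    body = "Prior work [@bibcitekey] covers this in detail.\n"
    print(front_matter("library.bib", "econometrica.csl") + "\n" + body)
    print(shlex.join(citeproc_command("paper.md", "paper.pdf")))
```

The citation key has to match an entry in the .bib file, otherwise the citation is not resolved in the output.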

sevensor · on Aug 28, 2018

It was a couple of years ago that I wrote my dissertation using Pandoc, so things may have changed. At the time, I started out using pandoc-citeproc with my BibTeX database, but eventually I needed more control over formatting and I switched to writing \cite everywhere. Even with hundreds of references, it only took an afternoon, so I'm happy I did it the way I did. My approach with Pandoc is to use it until you have to invest LaTeX-level effort into making it do what you want. At that point, swapping in LaTeX is rarely painful. Often you can get away with editing Pandoc's generated LaTeX and pasting it back in to your source.

mb2100 · on Aug 28, 2018

You can control the formatting pandoc-citeproc (which is now built in to pandoc) produces with a CSL file. That's great if your institution provides one, otherwise... you'll have to learn CSL ;-/

ryanmarsh · on Aug 28, 2018

I use pandoc-crossref.

criddell · on Aug 28, 2018

> if the publisher 'needs' a Word file, you are one click away from providing it

Once the work has moved into a Word file, isn't that where it stays? Editors and publishers often make heavy use of features like track changes and notes. Doesn't pandoc lose that information?

aquova · on Aug 28, 2018

It does. I think the assumption here is that the author is the only contributor to the document. Exporting into a Word doc would serve the same function as exporting to a .pdf, others could read it and even mark it up, but the author would have to make the noted changes in their original plain text document themselves.

mb2100 · on Aug 28, 2018

pandoc has a --track-changes option, so you can convert a docx file with its proposed changes back to, say, markdown.
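A sketch of what that looks like, assuming a reviewed file called `review.docx` (the file names and the `track_changes_command` helper are hypothetical). The option takes `accept`, `reject` or `all`; with `all`, the insertions, deletions and comments are carried over into the output as marked-up spans instead of being silently applied or dropped.

```python
VALID_MODES = ("accept", "reject", "all")


def track_changes_command(docx, output, mode="all"):
    """Build a pandoc command converting a reviewed docx back to markdown."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return ["pandoc", f"--track-changes={mode}", docx, "-t", "markdown", "-o", output]


if __name__ == "__main__":
    for mode in VALID_MODES:
        print(" ".join(track_changes_command("review.docx", f"review-{mode}.md", mode)))
```

You still have to merge the result into your original source by hand, so this recovers the editor's changes rather than giving you a true round trip.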

jonathanstrange · on Aug 28, 2018

I tried and it didn't work for me. Pandoc's conversion functionality is good but unfortunately also fails very often, at least in my experience. I suppose with custom templates and a lot of trickery I could get it working for the kind of papers I write, but I've found it easier to convert LaTeX to Word manually when needed - which is a pain in the ass, too, of course.

nanna · on Aug 28, 2018

In my experience it works so long as you keep to very vanilla LaTeX code. Pandoc's support for LaTeX packages tends to be very patchy.

baldfat · on Aug 28, 2018

I have to put in a word for Racket's Scribble. Programmatically creating documents is powerful, and this system makes it simple. You can also basically use it as a "markup-less" system.

Scribble Code Example:

    #lang scribble/base

    @title{On the Cookie-Eating Habits of Mice}

    If you give a mouse a cookie, he's going to ask for a glass of milk.

    @section{The Consequences of Milk}

    That ``squeak'' was the mouse asking for milk. Let's suppose that you give him some in a big glass.

    He's a small mouse. The glass is too big---way too big. So, he'll probably ask you for a straw. You might as well give it to him.

    @section{Not the Last Straw}

    For now, to handle the milk moustache, it's enough to give him a napkin. But it doesn't end there... oh, no.

Scribble -

Scribble is a collection of tools for creating prose documents—papers, books, library documentation, etc.—in HTML or PDF (via Latex) form. More generally, Scribble helps you write programs that are rich in textual content, whether the content is prose to be typeset or any other form of text to be generated programmatically. - https://docs.racket-lang.org/scribble/

Some languages based on Scribble

Skribilo -

Skribilo is a free document production tool that takes a structured document representation as its input and renders that document in a variety of output formats: HTML and Info for on-line browsing, and Lout and LaTeX for high-quality hard copies.

The input document can use Skribilo's markup language to provide information about the document's structure, which is similar to HTML or LaTeX and does not require expertise. Alternatively, it can use a simpler, “markup-less” format that borrows from Emacs' outline mode and from other conventions used in emails, Usenet and text. https://www.nongnu.org/skribilo/

Pollen -

Pollen is a publishing system built on top of Scribble and Racket. So far, I’ve optimized Pollen for web-based books, because that’s mainly what I use it for. But it can be used for small projects too, and non-webby things like PDF.

As a publishing system, Pollen includes:

    A programming language. The Pollen language is a variant of Scribble, with specific dialects tailored to different kinds of source files. You don’t need to use the programming features to do useful work, but they’re available when you need them.

    A set of tools & libraries. Pollen can produce output in any format, but it’s especially useful for markup-style formats like XML and HTML.

    A development environment. Pollen works with the DrRacket IDE. It also includes a project web server so you can dynamically preview and revise your publication. http://docs.racket-lang.org/pollen/Backstory.html

They are domain-specific languages that excel at outputting awesome HTML and PDF. They aren't really markup; they are a macro system built on top of a full Lisp (Racket). It is easier and much more powerful than anything I have seen with Pandoc and LaTeX (I still use LaTeX for specific targets, but not for general papers anymore).

Racket has the best documentation period and it is because the documentation