My Mathematics PhD research workflow: LaTeX notes and instant pdf referencing (castel.dev)
236 points by todsacerdoti on April 10, 2022 | 63 comments



Another tip: the vast majority of ArXiv papers have the LaTeX source available under "other formats". I used to download those instead of the pdfs for better searching. Now I don't really bother keeping a personal collection of papers, however.

Downloading the LaTeX source also has the advantage that you can see the comments the authors made that don't show in the pdf version. Sometimes appalling: "% Author 1: Is this lemma even true? This proof seems like bs to me % Author 2: Eh screw it who knows, it's probably true."
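For anyone who wants to skim those hidden comments without reading the whole source, here is a minimal sketch in Python. The function name `latex_comments` is made up for illustration, and it assumes a simplified view of TeX: a `%` starts a comment unless it is escaped as `\%`, and verbatim-style environments are not treated specially.

```python
import re

# A '%' starts a comment unless it is escaped. An even run of backslashes
# before it (e.g. "\\%") is a line break followed by a real comment.
COMMENT = re.compile(r"(?<!\\)(?:\\\\)*%(.*)")


def latex_comments(source):
    """Return (line_number, comment_text) pairs for each non-empty % comment."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = COMMENT.search(line)
        if match:
            text = match.group(1).strip()
            if text:  # skip bare '%' used only to suppress whitespace
                found.append((number, text))
    return found
```

Run over the `.tex` files unpacked from an arXiv source download, this lists the comments with their line numbers so you can jump to the surrounding passage.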


A good one I saw once (and saved me tedious image->data extraction): something like "Why are you trying to copy this image for the data, find the full data tables at..."


You just strengthened my opinion that PDF is a format of the past, considering people prefer to read the LaTeX source instead of the finished document.


Let me quote an electrical engineer who taught me about norms: "Always use PDF because it comes across more real, because it cannot be edited (!)".

This is the perception non-IT-people have of PDF in comparison to e.g. .docx or similar editable formats those people know.


Personally, I would like to see a pdf-like format with video/animations/gifs, possibly the ability to play with models and variables.


PDFs can already contain videos and JavaScript. But given the security problems, I consider it a feature of most non-Adobe PDF viewers to not support that.


Right, I hadn't thought about those issues, and it's impossible to expect that venues and conferences will be capable of vetting the documents without introducing inefficiencies.

Perhaps we should move to sandboxed webpages embedded within electron, where each paper is its own page?


AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHHHHHHHHHHH!


I take that as an emphatic no.


I'm another academic mathematician. I'm not really convinced that all this would be a time-saver for me as far as research productivity; maybe I'm just old-fashioned but I just have a few folders for paper notes and that's it. But certainly this could vary from person to person.

To me, the place for the real time savings is organizing your teaching workflow. If you're a grad student, make sure to get on top of that ASAP. A lot of it is very repetitive both within a given semester and also from year to year. I have a ton of scripts built up over the years for things like organizing lecture notes (especially if you are teaching a class you have taught before, you don't want to have to rewrite all your lectures), creating problem sets / solutions, inputting grades into the LMS,... I even have automatic solution generators for a few rote calculations that I assign many problems about (Gaussian elimination, for example).
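As a rough idea of what such a solution generator could look like (this is not the commenter's script, just a sketch under my own assumptions), the Python below row-reduces an augmented matrix with exact fractions and records each row operation as a line of text. The function name and the step wording are invented for illustration.

```python
from fractions import Fraction


def gaussian_elimination_steps(matrix):
    """Reduce an augmented matrix to reduced row echelon form.

    Returns (rows, steps): the reduced rows as Fractions and a list of
    human-readable row operations, in the order they were applied.
    """
    rows = [[Fraction(x) for x in row] for row in matrix]
    steps = []
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols - 1):  # the last column is the right-hand side
        if pivot_row == n_rows:
            break
        candidates = [r for r in range(pivot_row, n_rows) if rows[r][col] != 0]
        if not candidates:
            continue  # no pivot in this column
        r = candidates[0]
        if r != pivot_row:
            rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
            steps.append(f"swap R{pivot_row + 1} and R{r + 1}")
        pivot = rows[pivot_row][col]
        if pivot != 1:
            rows[pivot_row] = [x / pivot for x in rows[pivot_row]]
            steps.append(f"R{pivot_row + 1} -> R{pivot_row + 1} / ({pivot})")
        for other in range(n_rows):
            factor = rows[other][col]
            if other != pivot_row and factor != 0:
                rows[other] = [a - factor * b
                               for a, b in zip(rows[other], rows[pivot_row])]
                steps.append(
                    f"R{other + 1} -> R{other + 1} - ({factor}) R{pivot_row + 1}")
        pivot_row += 1
    return rows, steps
```

The recorded steps could then be pasted into a LaTeX solution template, one row operation per line.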


> maybe I'm just old-fashioned but I just have a few folders for paper notes

paper, like.. pieces of paper.. yes, that is old-fashioned! however, in deep math, maybe this is something.. I was told after attending a world-class lecture week in pure math, that it is common for PhDs and post-docs in math to have very little to talk about between each other.. since their interest and subject area is almost certainly deeply buried in some specialty. Aside from the quip "all roads lead to cryptography" perhaps just an excuse for certain people, I can see how this "different worlds" problem in post-doc math might be real..

The fellow writing this post is first-year however.. maybe its different.. Also I just cloned and built pdfgrep .. I am definitely going to try that out immediately .. thx!


For me at least my reasoning for trying new things to make notes with is that I very very rarely actually read them again, so I need to get as much learning done from writing the notes, and to do that I need to write more notes e.g. make writing the notes fun.


After I taught a few semesters ago I took the material I had prepared and turned the skeleton into a github template https://github.com/evanberkowitz/course-base

I keep questions, slides, written notes, etc. controlled and semester-specific things in a directory that isn't tracked.


Author here, if you have any questions, feel free to ask them.

It really is amazing how hard it is just to retrieve the currently opened pdf file and its page number in a pdf viewer. Some pdf viewers (Like Zathura) provide this via DBus, but even very common ones like Evince don't. I managed to find a way using gvfs, although it's a bit of a hack.

For others (e.g. Mendeley), I have no idea how to do this... Anybody have ideas? It is Qt-based, so maybe I can hook into it via some debugging tool?
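For readers wondering what the Zathura DBus route looks like in practice, here is a hedged sketch in Python that builds a `gdbus call` command for reading a property. The bus name pattern `org.pwmt.zathura.PID-<pid>`, the object path `/org/pwmt/zathura`, and the property names `filename` and `pagenumber` are stated from memory of Zathura's DBus interface and should be treated as assumptions to verify against your installed version. The helper names are made up.

```python
import subprocess

ZATHURA_INTERFACE = "org.pwmt.zathura"     # assumed interface name
ZATHURA_OBJECT_PATH = "/org/pwmt/zathura"  # assumed object path


def zathura_property_command(pid, prop):
    """Build the gdbus command that reads one property of a Zathura instance."""
    return [
        "gdbus", "call", "--session",
        "--dest", f"{ZATHURA_INTERFACE}.PID-{pid}",
        "--object-path", ZATHURA_OBJECT_PATH,
        "--method", "org.freedesktop.DBus.Properties.Get",
        ZATHURA_INTERFACE, prop,
    ]


def read_zathura_property(pid, prop):
    """Run the query. Needs a running Zathura with that pid and gdbus on PATH."""
    result = subprocess.run(
        zathura_property_command(pid, prop),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()  # raw gdbus output, left unparsed here
```

Calling `read_zathura_property(pid, "filename")` and `read_zathura_property(pid, "pagenumber")` would give the two pieces needed for a deep link; parsing gdbus's textual output is deliberately left out.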


The solution is emacs. I know you’ve got a lot invested in vim already. You can get instant LaTeX rendering with AUCTeX, view the pdf in emacs with pdf-tools, which can be used with something like org-ref (there are many bibliography packages) to query a db with the DOI of the pdf. (Try saying that two times fast). This EmacsConf talk should give you a taste of what I mean:

https://emacsconf.org/2021/talks/research/

Your old post with instantly rendering latex snippets actually inspired me to start using vim/LaTeX. This opened up a whole new world for me. I started programming, got back into my engineering degree, started using emacs/org-mode and have nearly finished the degree. Thank you, from the bottom of my heart. I don’t know what I’d be doing if I hadn’t seen your post. It revealed a way of using computers that I had never seen before.



Yes, this is the one. I couldn’t help but want the same speed and beauty as in that post.


Where are you taking your degree?


Monash university


Try emacs and org-mode sometime. Configurability is off the charts and vi keybindings are easily configurable (and so are the snippets). I loved the workflow, I believe that a mind like yours will also find emacs elisp captivating and I would love to see what you could come up with in that much more flexible ecosystem (I am a former Vim evangelist)


What does org mode have to do with the problem where you have a document open in a PDF viewer and you want to retrieve the location of the document in the file system and what page is currently showing in the viewer's window?


The fact that pdf-tools (in emacs) makes this an utterly trivial task.


If you're using Org-ref, and have used that to drill down into your pdfs then you always have the document context (and bibtex etc context) from within emacs. But I don't think that is the OP's use case, so apologies for the noise. Also careful use of unix find with pdfgrep can be handy.


I guess it will be easier to get it working with Zotero, as it's open source, and I think it even supports custom plug-ins.


Yes, Zotero does support custom plug-ins ("add-ons").


On recent-ish versions of evince it is possible to do a bit more via dbus. For example, forward and reverse synctex search between evince and vim over dbus is done at [1]. I use this when I use evince, but I acknowledge that it has stupid flaws.

[1]: https://github.com/peterbjorgensen/sved


Having written this, I am actually completely incapable of retrieving the page number over dbus. (As you had mentioned). I'm a bit surprised that forward and reverse search can still be done.


No questions. You're a pretty swell dude and I used a lot of your advice while writing my thesis. Love your work.


Have you tried contacting the developers? This seems like a feature that every pdf viewer should have.


I recommend two upgrades over what the OP recommends. For electronic handwritten math, use ZoomNotes on iPad, which has a quirky aesthetic but also a bunch of powerful features and the extremely satisfying eponymous infinite zoom.

http://www.zoom-notes.com/

For organizing a personal library of academic papers, use Zotero, which has a great interface for sorting the papers, annotating the PDFs, and exporting them as BibTeX. (It would presumably be harder to get the OP's single-click PDF links working, though.)

https://www.zotero.org/

In particular, the Scite add-on for Zotero can also tell you which other papers have cited the paper you are reading in a supporting way vs. a disputing ("contrasting") way, which is really cool.

https://medium.com/scite/introducing-the-scite-plug-in-for-z...


Specifically being able to create a deep link to a passage is a very powerful capability indeed. The friction required to look up a paper and navigate to the specific passage for a given reference can really be so high that we just end up avoiding it altogether I feel. Kudos to you for trying to solve this problem!

I personally find LaTeX a little too friction-full (is that a word?) on the input side. The output looks beautiful but the lack of feedback when writing stuff keeps me from actually adding stuff to it. Although your daily notes seems like it might help with this tendency a little bit.

This is a problem I'm currently trying to solve with my current project (https://topictrails.com/ if you're interested).


Modern tools like Overleaf are great, especially for collaboration. But it's still extremely annoying having to google every little thing like "how can I move this element on the slide a bit to the right?" or even things that you'd expect would be easier in a software designed for mathematicians like "how can I format this common type of math problem (eg an optimization problem) nicely?". Plain LaTeX's overall usability has strong 80's vibes and the output does look nice, but imho doesn't justify the effort (especially given that the competition like Word has really come of age wrt equation editing). If it wasn't for tools like Overleaf I might have ditched it, but there is also strong peer pressure in math-heavy fields to keep using it.


Maybe Word has come of age in terms of equation editing, but has it come of age in terms of usability? I haven't used it in about 10 years, but back then it crashed a lot on larger documents, and I suffered a lot from never knowing if I was inside the invisible tags or not, so I never knew when I was typing right after a list if I would start a new list item or not, or when typing next to italic text if the new text would be italic or not. I also remember that trying to move anything, like an embedded image, would move tons of other things too. I definitely found LaTeX much easier to use.

And that's excluding Word line-breaking which always looked rather poor to me: some lines were left very empty and others overcrowded. It just didn't look professional to me. Perhaps that has been fixed too.


I don't know which version you were referring to, but my guess is that you might still consider indentation and text formatting unintuitive. But it is actually easy to pick up the rules that Word uses, and there are keyboard combinations as well.

It's a personal preference, but I think it easily beats the typical LaTeX workflow for most people I know, which starts with copying an old project or some template with a ~50 line preamble and then constantly having to turn to Google for literally anything that's a bit out of the ordinary - including a lot of things that really shouldn't be extraordinary for a math software, like typesetting optimization problems, conditional expectation, argmin, table footnotes,... all of which have multiple options and most don't look great. Heck, you even need to define a theorem environment manually in the preamble to get them to look like you're used to.
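For context, the preamble definitions mentioned above look roughly like the following. This is a minimal sketch, assuming the `amsmath`, `amsthm`, and `amssymb` packages; the macro names `\argmin` and `\E` are choices made for this illustration, not standard commands.

```latex
\documentclass{article}
\usepackage{amsmath,amsthm,amssymb}

% Theorem environment: has to be declared by hand.
\newtheorem{theorem}{Theorem}

% Operators that are not built in (names chosen for this example).
\DeclareMathOperator*{\argmin}{arg\,min}
\DeclareMathOperator{\E}{\mathbb{E}}

\begin{document}

\begin{theorem}
Let $x^\star \in \argmin_{x \in \mathbb{R}^n} f(x)$. Then
$\E[\,f(X) \mid Y\,] \ge f(x^\star)$.
\end{theorem}

% One of several common ways to lay out an optimization problem.
\begin{equation*}
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & c^\top x \\
\text{subject to} \quad & Ax \le b, \\
                        & x \ge 0.
\end{aligned}
\end{equation*}

\end{document}
```

Whether about ten lines of setup counts as acceptable overhead or as evidence for the complaint is exactly the point under debate here.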

Since you complain about moving around figures, in LaTeX the workflow for most users is trial-and-error multiple option combinations to get them roughly where you want them. Just check the Google autocomplete options for latex and tell me that people aren't confused by those things.


I guess that, as with most things related to using software, it boils down to familiarity. You say you managed to figure out Word's rules for when text will inherit formatting of the text to the left of it and when it won't. I believe you, I imagine I would have learned them too if I kept using Word. On the other hand when I taught Linear Programming I typeset lots of optimization problems and didn't Google anything, because I already know plenty of LaTeX. I certainly have googled stuff for LaTeX and found it is a quick and painless way of figuring out how to do something. This seems like an advantage of LaTeX, not a disadvantage as you seem to think. What do you do in Word when you want to do something you don't know how to do? Doesn't googling also work there? And if not, that seems like a disadvantage of Word.


Googling also works for Word, but I practically never need it there. It's just so easy to find solutions to those questions that you barely perceive them as such. I even figured out the shortcuts just by trying something that seemed to make sense (fyi: Italics is Ctrl+I, bold is Ctrl+B; changing indentation levels in a list is Alt+Shift+Arrow, getting out of the list indentation requires three times backspace, which I agree doesn't make sense, but it's easy enough to figure out within seconds of the first encounter; and yeah, you can google those if you want). In LaTeX I need to switch to a different window to get to Google much more frequently, and I find that breaks my flow. I don't see how this could be framed as anything but a disadvantage.

Anything becomes easy if you practice it enough, but I find both the 'onboarding' as well as the 'steady-state' productivity higher in Word, because you spend next to no time looking for how stuff works and instead focus on your content. Yes, LaTeX formatting generally looks better (eg spacing, LaTeX even lets you adjust the spacing between individual characters), but looks come second to content imho.


Personally for feedback I use editors like Overleaf or the extension for vscode that autocompiles and displays a PDF next to the working TeX file. I prefer it over something like LyX or waiting to compile manually.


You may also want to check out sioyek which is an open source PDF viewer specifically designed for reading research papers and textbooks.

https://github.com/ahrm/sioyek

Disclaimer: I am the developer of sioyek


I just wish there was a good way to have something like Zeal[1] for PDFs. Navigating a big pile of huge datasheets and manuals (a processor TRM can be 5000 pages!) is such a huge pain and that's assuming they have decent TOCs, which many do not.

[1]: https://zealdocs.org


I use pdfgrep a lot.

Probably possible to convert pdfs to html (with the default pdf tools on nix - pdftotext pdf2txt et al) and pull them into zeal.
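A small sketch of the text-conversion idea, in Python. It assumes the PDF has already been converted with something like `pdftotext manual.pdf manual.txt`, and relies on pdftotext's default behaviour of separating pages with a form feed character. The function name `grep_pages` is invented here.

```python
import re


def grep_pages(text, pattern, flags=re.IGNORECASE):
    """Return (page_number, line) for every line matching `pattern`.

    `text` is the output of a PDF-to-text conversion in which pages are
    separated by form feed characters.
    """
    regex = re.compile(pattern, flags)
    hits = []
    for page_number, page in enumerate(text.split("\f"), start=1):
        for line in page.splitlines():
            if regex.search(line):
                hits.append((page_number, line.strip()))
    return hits
```

Keeping the page number with each hit is what makes the result usable as a jump target in a viewer, which plain grep over the text file would lose.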


I had a similar idea that I would just add daily notes to in markdown/tex to add to a knowledge graph, but then I lost motivation.

I just started using the iPad to make handwritten notes. What I really want is a way for the handwritten stuff to be integrated into a knowledge graph, or even automatically generating tex/diagrams for me.

On top of that, daily scribbles don’t necessarily go anywhere, or are even wrong, so it can generate noise anyway. What was more valuable was consolidating a few weeks+ of progress into a more complete summary or set of notes. You can still explain the reasoning of how you got to something finished, but it’s just all correct.


Does anyone have a good reference for TikZ? I feel like I'm pretty good at it (at least better than most of my peers) but I also believe it takes a significant amount of time to create interesting pictures and do even basic things that PPT or other basic tools can do easily. I want the full editing power of tikz but doing the basic parts is tough (same with manim, though lower barrier to entry).

As for writing documents with lots of math, absolutely no problem. I can churn that out faster than peers can with Word.


The best way to TikZ is to copy-paste from previous figures you've created.

The second best is to look for a similar figure on TeXample https://texample.net/tikz/examples/ or stack overflow https://tex.stackexchange.com/questions/tagged/tikz-pgf

The third best option is to use squared paper to draw by hand, then transfer hand-drawn stuff into TikZ code. It's slow as hell, but works well if you build up a collection of components you can copy-paste into other figs later.
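One way to make such a component collection reusable is TikZ's `pic` mechanism, sketched below. The component name `labelled dot` is made up for this example; the idea is only that a drawing is defined once and then placed at several coordinates.

```latex
\documentclass[tikz]{standalone}
\begin{document}
\begin{tikzpicture}[
  % A reusable component: a dot with a label above it.
  labelled dot/.pic={
    \fill (0,0) circle (2pt);
    \node[above] at (0,0) {\tikzpictext};
  },
]
  \draw[->] (0,0) -- (4,0);
  % Place the same component twice with different labels.
  \pic[pic text=$a$] at (1,0) {labelled dot};
  \pic[pic text=$b$] at (3,0) {labelled dot};
\end{tikzpicture}
\end{document}
```

Definitions like this can live in a shared file that each figure loads with `\input`, so the copy-paste step shrinks to a single line per component.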

There are also some GUI tools you could try: https://www.mathcha.io/ https://homepages.inf.ed.ac.uk/cheunen/freetikz/freetikz.htm... https://tikzit.github.io/ etc. (more links in this thread https://tex.stackexchange.com/questions/84890/does-there-exi... )


I know you're trying to be helpful, but the first two best options you suggest are already things I do. For the third, I'm not sure those are great options either, since it isn't really the translating of hand drawings to tikz that's the issue (the examples are pretty brute force too), though I can see some use cases for mathcha.

I am more looking for tips and tricks that have helped people build their skills and streamline the process.


I have the best reference. I ask Till Tantau when I have any urgent tikz questions.

The advantage of working at the university of Lübeck: his office is next to my office.


Any advice you've learned along the way that you'd care to share? Otherwise I'm confused by the comment. Are you just flexing?


I am just flexing that I am close to a famous person.

If he writes me a snippet, I just copy paste it.


Texpad (https://www.texpad.com/) is an editor with instant compile which can make TikZing faster, especially while finetuning coordinates. I'm sure you're aware that the TikZ manual is very thorough; I've recently made an HTML version that's a bit easier to navigate than the monster PDF https://tikz.dev/.


Could you share a type of math with 1 or 2 concepts that would be good for computer science or solving problems in general. Reference to something not too difficult, but a "good to know" kind of thing. For example, something from upper division math.

Let's say for undergraduate I would say: Calculus I - understand what a derivative is. Calculus II - understand what an integral is.

Math is broad, so maybe 1 or 2 takeaways from some topics in upper division math that I might find useful.


It's hard to capture an entire lecture series in one sentence, but one way is to categorise subjects by the type of problems they allow you to solve:

- Multiple linear equations over the rationals/reals => linear algebra

- Solve equations over the integers / integers mod p / factor numbers => number theory

- Solve polynomial equations / factor polynomials => algebra

- (Partial) differential equations => (partial) differential equations

- Approximate solutions to various equations over the reals => numerical analysis

- Counting the sizes of various finite sets => combinatorics

- Integrate wild functions => measure theory

- Formal understanding of real numbers => real analysis

- General framework for differential equations => functional analysis

- General framework for continuous functions and limits => point-set topology

- Prove that two elastic shapes are different => algebraic topology

- Prove that a knot in a circular piece of string cannot be untied without cutting the string => knot theory

- Determine optimum strategy in a game with incomplete information => game theory

- Describe very big sets / prove that certain things can't be proved => set theory

- etc.


Appreciate the reply jules! Thank you for this list. Just picking the brains here of those who have PhDs in these fields; sometimes providing retrospect helps us more junior people in math.


If you had to pick one, I'd say the single most important high level math topic that'd be useful for computer science is Graph theory. There are so many applications of graphs in computer science.

The thing about math is that regardless of which field you work in the treatment is the same. You start by stating a set of axioms. Based on these axioms you set up definitions and then prove lemmas, corollaries and theorems. This approach isn't really necessary or directly useful for computer science though. But it is very beautiful. For example, if you truly want to appreciate, in its full force, why a player playing a fair game against a casino will always eventually lose (theorem called Gambler's ruin) you can study stochastic processes with full rigour (this is also a very useful area). But that full rigour isn't necessary to not blow your savings on games of chance :)
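To make the Gambler's ruin claim concrete without the full rigour: in a fair game where each round wins or loses one unit, a player who starts with `stake` units and plays until reaching either 0 or `target` is ruined with probability 1 - stake/target. The Python sketch below states that formula and checks it with a small seeded simulation; the function names and parameters are chosen for illustration only.

```python
import random
from fractions import Fraction


def ruin_probability(stake, target):
    """Exact chance of hitting 0 before `target` in a fair +/-1 game."""
    return Fraction(target - stake, target)


def simulate_ruin(stake, target, trials, seed=0):
    """Estimate the same probability by playing `trials` independent games."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(trials):
        money = stake
        while 0 < money < target:
            money += rng.choice((-1, 1))  # fair coin: win or lose one unit
        ruined += (money == 0)
    return ruined / trials
```

Here `target` is the player's stake plus the casino's bankroll, so as the casino's bankroll grows the ruin probability approaches 1, which is the "always eventually lose" statement.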


I wonder if those planning to stay in academia after grad school are more willing to invest in their own tooling like this.


I looked at the math in the article and I asked myself, what compels people to choose this path for their career? Nothing about it looks appealing. It looks like a bunch of gibberish for 99% of humanity and only 1% of people willingly go to school to learn this stuff. I have a lot of respect for people who have jobs involving this level of math and physics so dummies like me have the ability to type out this comment.


This only answers a small piece of your question, but it's important to realize that the mathematical notation you see is only how mathematics is recorded and communicated. Most of what goes into doing mathematics happens in your head. "Doing" mathematics requires you to know what the object of study is, and what tools have historically been brought to bear on it, and these are communicated and taught via notation. But once you've internalized those concepts, the notation doesn't usually play a significant role. Yes, you may trial your ideas with some calculations, and once you're convinced of something you need to prove it with some degree of rigor. But you don't usually set out to prove something you don't already have some reason to believe in -- it's not like the math workbooks you may have encountered in school. (I second the sibling reply's mention of Lockhart's Lament!)


I've been trying to learn math for years but truly have no idea where to begin. I started with Kiselev's Geometry and Gelfand's Algebra. I used to hate math. I read Lockhart's Lament some years ago and it brought me to tears. Ever since then I've had a big appreciation for it but no real time to get into it. Gelfand's Algebra is really cool because it starts with the most basic problems ever that a 3rd grader can do, and works its way up. It's worth checking out!


You could say that about practically anything in academia, especially sciences and engineering. Not to mention, most code of big organizations would look like gibberish to an amateur programmer, the Linux kernel would look meaningless to someone starting to code etc.


Grad School in most STEM fields will look equally gibberish to you, so your question is almost like asking "Why do people do research?". The answer is to learn things that we didn't know before. In most STEM fields it's about how the world works, in math you discover eternal truths, which is pretty cool too.


Why do people climb mountains? "Because it's there"


Math for the sake of math can be very self fulfilling. Like an intellectual hobby (it just takes a lot of work and energy).


1% is 70 million people, so not even that level of mathematics could you comprehend.


Most people would say the exact same thing about coding.


What, no TikZ? :o



