Getting the footnotes right is going to be really tricky. Sometimes I couldn't even read the superscript numbering on the original scans. And that was after zooming in to the max.
Reliably identifying the superscript locations should be enough since they are in the same order as the footnotes.
It's a little early for feature requests... but I would love to see an EPUB edition! It shouldn't be too hard once the hard work of getting the data structured is done.
Yes. The original idea was to have an LLM place footnote references in the text, based on the content of the footnotes themselves, but as I say in the blog post, that failed spectacularly.
Now another idea is to manually put placeholders for footnote references in the text, and then number them automatically. Before that, I manually enter the number of footnotes on each page, for verification. I have already done this for the first two volumes; it's pretty fast. Having the number of footnotes on a page lets me:
- check that the number of footnotes is correct
- (and therefore) check that the footnote numbers are correct (from 1 to n, in order)
- check that the number of footnote references is correct (it should exactly match the number of footnotes)
- and finally, properly number the placeholders.
Manually inputting numbers in the main text would be very difficult and error-prone, but simply putting placeholders and checking them automatically should be much faster and safer.
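To make the checks above concrete, here is a rough sketch of what the per-page verification and numbering could look like. It is not the actual code: the `[^]` placeholder syntax, the function name `number_placeholders`, and the idea of passing in the footnote numbers as a plain list are all assumptions made up for illustration.

```python
import re

# Hypothetical placeholder marker typed by hand into the main text.
PLACEHOLDER = "[^]"


def number_placeholders(text, footnote_numbers, expected_count):
    """Verify one page's footnotes, then number its placeholders.

    text: main text of the page, containing unnumbered placeholders
    footnote_numbers: numbers read from the footnotes on that page
    expected_count: footnote count entered manually for that page
    """
    # Check 1: the number of footnotes matches the manual count.
    if len(footnote_numbers) != expected_count:
        raise ValueError(
            f"expected {expected_count} footnotes, got {len(footnote_numbers)}"
        )

    # Check 2: the footnotes are numbered 1..n, in order.
    if list(footnote_numbers) != list(range(1, expected_count + 1)):
        raise ValueError(f"footnote numbers out of sequence: {footnote_numbers}")

    # Check 3: the number of references matches the number of footnotes.
    found = text.count(PLACEHOLDER)
    if found != expected_count:
        raise ValueError(f"expected {expected_count} placeholders, got {found}")

    # Finally, number the placeholders in reading order.
    counter = iter(range(1, expected_count + 1))
    return re.sub(
        re.escape(PLACEHOLDER), lambda match: f"[^{next(counter)}]", text
    )


page = "First claim[^] and second claim[^]."
print(number_placeholders(page, [1, 2], 2))
```

Any mismatch raises an error instead of silently numbering the page, so a forgotten placeholder gets caught on the page where it happens rather than shifting every later reference.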