Something I've been wondering: why do ebooks take so long to render? My kindle seems good at it, but opening an ebook in calibre/fbreader/etc can take minutes or even fail in some readers depending on the ebook.



I would guess there are multiple potential pitfalls here. Firstly, not all ebook formats are created equal -- Storyteller only operates on EPUB files, because EPUB is an open standard and it supports Media Overlays (read-aloud) natively. I can only really speak to that format, but there are others (MOBI, PDF, etc).

An EPUB is just a ZIP archive of XML and XHTML files (plus other assets, like images). Partly, I suspect, because of the dearth of actively maintained open source projects in the space, and partly because of the nature of tech in the book publishing industry, EPUB generation software used by authors and publishers often messes up this spec, which means that EPUB readers sometimes need to have fairly complex fallback logic for trying to figure out how to render a book. Also, because EPUBs are ZIP archives, some readers may either unzip the entire book into memory or "explode" it into an unzipped directory on disk, both of which may result in some slowness, especially if the book has lots of large resources. The newest Brandon Sanderson novel, for example, is ~300MB _zipped_.

Additionally, and perhaps more importantly, EPUBs (and I believe MOBIs as well) represent content as XHTML and CSS, which means that readers very often need to use a browser or webview to actually render the book. Precisely how they deliver this content into the webview can have a huge impact on performance; most browsers don't love to be told to format an entire novel's worth of content into text columns, for example.
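
To make the ZIP point concrete, here is a minimal sketch (not what any particular reader actually does) of walking an EPUB's reading order without exploding the archive. It only decompresses `META-INF/container.xml` and the package document, and gets the size of each content file from the ZIP directory. The function name `spine_documents` is made up for illustration, and it assumes manifest `href`s are plain relative paths with no percent-encoding.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def spine_documents(epub_file):
    """Return [(archive_path, uncompressed_size)] for the spine, in reading order.

    Only container.xml and the package document are decompressed; sizes of
    the content documents come from the ZIP central directory.
    """
    with zipfile.ZipFile(epub_file) as zf:
        # container.xml points at the package document (the .opf file).
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.attrib["full-path"]

        package = ET.fromstring(zf.read(opf_path))
        base = posixpath.dirname(opf_path)

        # The manifest maps item ids to hrefs relative to the package document.
        manifest = {
            item.attrib["id"]: item.attrib["href"]
            for item in package.findall("opf:manifest/opf:item", OPF_NS)
        }

        documents = []
        for itemref in package.findall("opf:spine/opf:itemref", OPF_NS):
            href = manifest[itemref.attrib["idref"]]
            path = posixpath.normpath(posixpath.join(base, href))
            documents.append((path, zf.getinfo(path).file_size))
        return documents
```

A reader could then call `zf.open(path)` for just the chapter being displayed. The sizes also make it easy to spot a book whose whole text lives in one giant XHTML file.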


Additionally the XHTML content can just be a single large file instead of one file per chapter/section. Paginating and rendering the large single file is going to be more effort than the same on a smaller file. This is all on top of the pitfalls and variability you mention.


Yup, great point. Especially if you've used some tool to convert from another format, like PDF, into EPUB, you can easily end up with the entire book in a single XHTML file, which, again, can be pretty heavy for a browser to parse and format! I also have no idea whether Calibre et al. actually use native web views, or have their own renderers, which are almost certainly less performant than native web views!


> Additionally the XHTML content can just be a single large file instead of one file per chapter/section.

Terry Pratchett books are notorious for that. Some EPUB authoring tools artificially introduce breaks, but you can't rely on them.


I used Storyteller to align the most recent Sanderson novel with its audio and the result is 1.7GB. That's... painful. It ended up crashing the reader on a reMarkable 2 tablet.

I'm now actually working on a Calibre-Web change to strip the audio and media overlay from the books it serves via OPDS.
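
For the curious, the stripping step can be sketched roughly like this. This is not the actual Calibre-Web change, just a minimal illustration under some assumptions: the caller already knows the package document path (`opf_path`), manifest `href`s are plain relative paths, and the function name `strip_media_overlays` is made up. It drops audio and SMIL items from the manifest, removes the `media-overlay` attributes and `media:` metadata that pointed at them, and copies everything else.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

OPF = "http://www.idpf.org/2007/opf"
SMIL_TYPE = "application/smil+xml"


def strip_media_overlays(src, dst, opf_path):
    """Copy the EPUB at src to dst without audio or Media Overlay files.

    Returns the sorted archive paths that were dropped.
    """
    # Keep the OPF namespace as the default one when re-serializing.
    ET.register_namespace("", OPF)
    ET.register_namespace("dc", "http://purl.org/dc/elements/1.1/")

    with zipfile.ZipFile(src) as zin:
        package = ET.fromstring(zin.read(opf_path))
        base = posixpath.dirname(opf_path)
        dropped = set()

        manifest = package.find(f"{{{OPF}}}manifest")
        for item in list(manifest):
            media_type = item.get("media-type", "")
            if media_type.startswith("audio/") or media_type == SMIL_TYPE:
                href = item.get("href")
                dropped.add(posixpath.normpath(posixpath.join(base, href)))
                manifest.remove(item)
            else:
                # Content documents must no longer point at a removed overlay.
                item.attrib.pop("media-overlay", None)

        # Drop overlay metadata such as media:duration and media:active-class.
        metadata = package.find(f"{{{OPF}}}metadata")
        if metadata is not None:
            for meta in metadata.findall(f"{{{OPF}}}meta"):
                if meta.get("property", "").startswith("media:"):
                    metadata.remove(meta)

        new_opf = ET.tostring(package, encoding="utf-8", xml_declaration=True)

        with zipfile.ZipFile(dst, "w") as zout:
            # infolist() preserves order, so "mimetype" stays the first entry.
            for info in zin.infolist():
                if info.filename in dropped:
                    continue
                data = new_opf if info.filename == opf_path else zin.read(info.filename)
                # The mimetype entry is supposed to be stored uncompressed.
                method = zipfile.ZIP_STORED if info.filename == "mimetype" else zipfile.ZIP_DEFLATED
                zout.writestr(info.filename, data, compress_type=method)

    return sorted(dropped)
```

Both `src` and `dst` can be file paths or file-like objects, so a server could do this on the fly into a buffer rather than writing a second copy to disk.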

Then I'll need to tackle cross-device progress sync. This turned out to be surprisingly tricky.


You can’t do much better than that; that’s the size of the audiobook! For what it’s worth, I also used Storyteller on Wind and Truth, and got it down to 1.2GB by using the Opus codec with a 32 kb/s bitrate.
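
If you want to estimate what a given bitrate will cost you, the arithmetic is just bitrate times duration. A quick sketch, which ignores container overhead and the text and images in the book; the 60-hour duration is a placeholder, not the real length of any audiobook:

```python
def audio_size_bytes(duration_hours, bitrate_kbps):
    """Approximate size of a constant-bitrate audio stream, in bytes."""
    bits = duration_hours * 3600 * bitrate_kbps * 1000
    return round(bits / 8)


# 32 kb/s works out to 14.4 MB per hour of audio.
print(audio_size_bytes(1, 32))   # 14400000
# A hypothetical 60-hour audiobook at 32 kb/s.
print(audio_size_bytes(60, 32))  # 864000000
```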


Yeah. My current workaround is to create KEPUBs (Kobo-optimized epubs), but that creates an issue with cross-format reading progress sync. This is an interesting task in itself, though.

So I'm trying to design a progress sync protocol. My current idea is to just use several words from the text itself to unambiguously pinpoint the position within a section (chapter).
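
Here's a rough sketch of what I mean, nothing more than a first pass. A position becomes the shortest run of words, starting at the current spot, that occurs only once in the section. The names `make_anchor` and `find_anchor`, the word-index representation, and the 3 to 12 word limits are all placeholders I made up for illustration.

```python
import re

WORD = re.compile(r"\w+")


def words(text):
    """Lowercased word tokens, so markup, whitespace and case differences don't matter."""
    return [w.lower() for w in WORD.findall(text)]


def count_occurrences(tokens, run):
    n = len(run)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == run)


def make_anchor(section_text, word_index, min_words=3, max_words=12):
    """Shortest run of words starting at word_index that is unique in the section.

    Falls back to the longest allowed run, which may still be ambiguous.
    """
    tokens = words(section_text)
    for n in range(min_words, max_words + 1):
        run = tokens[word_index:word_index + n]
        if count_occurrences(tokens, run) == 1:
            return run
    return tokens[word_index:word_index + max_words]


def find_anchor(section_text, anchor):
    """Word index of the first match for anchor in another rendering, or None."""
    tokens = words(section_text)
    n = len(anchor)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == anchor:
            return i
    return None


epub_text = "It was the best of times, it was the worst of times, it was the age of wisdom."
kepub_text = "IT WAS THE BEST OF TIMES,\n it was the worst of times,\n it was the age of wisdom."

anchor = make_anchor(epub_text, 6)
print(anchor)                           # ['it', 'was', 'the', 'worst']
print(find_anchor(kepub_text, anchor))  # 6
```

The anchor plus a section identifier is all that would need to go over the wire, and it doesn't care which format the other device is reading.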


Is the idea that you have some devices that you want to download just the text to, but have it sync with your other devices? I think we could support that natively, honestly! Storyteller already has the input files, and it uses a text-based position system that doesn’t require the audio to exist. If you’re already doing work on this, maybe we could add it to Storyteller?


Ooh, that's an interesting idea. I only have one device where I would ever want to switch to listening to my book, but a couple others where I would like to read it.


FWIW, I wrote an EPUB (well, it was called OEBPS at the time) reader that rendered pretty much all of the format ~21 years ago (including all of XHTML and CSS) and it had very decent performance. I seem to recall that someone tried it on the One Laptop Per Child XO and it was... well, slow, but it worked.

So it's possible :)


Thank you so much! That's incredibly enlightening!


Of course! I'm hoping to have a web reader with Media Overlay support built in to Storyteller available in the next few months, along with some much needed library management tooling, so maybe that will be useful for you! I'll try to make it snappy :)


I find KOReader (Linux version) leaps and bounds faster than the calibre reader.



