
That’s a really interesting idea! The more I think about it, the more I like it.

A challenge I foresee is that the media overlays are only reusable if you have the exact same input EPUB file, and have processed it with Storyteller to mark up the sentence boundaries. EPUBs have unique identifiers, though, so maybe this would be fine! We’d need to add a new processing flow to Storyteller, but it should be doable.

Feel free to hit me up in the Storyteller chat if you want to discuss more! Thanks for sharing this idea!




It would be cool to do this with Project Gutenberg and LibriVox files, since they're all public domain works anyway.

The entire Great Books of Western Civilization are on both, and I know I'd make more progress on reading it if I could hand off between reading and listening more easily!


You could require that the input files have the same SHA-256 hash; that would presumably be more robust than trusting an ID from the file itself.
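For illustration, a minimal Python sketch of that whole-file check. The function names `file_sha256` and `same_input` are hypothetical, and this is not how Storyteller does anything today; it only shows that the comparison covers every byte of the file, metadata included.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of the entire file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so a large EPUB does not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_input(path_a: str, path_b: str) -> bool:
    """True only if the two files are byte-for-byte identical."""
    return file_sha256(path_a) == file_sha256(path_b)
```

Any change to the file, even a retagged title, produces a different digest.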


Yeah, I was toying around with that, too… but folks often mess around with metadata in tools like Calibre and Audiobookshelf in ways that wouldn’t affect Storyteller’s sync but would change the file’s hash. On the other hand, I don’t know how various publishers handle EPUB dc:identifiers, and that may not be robust enough, either. We could try something like hashing only the contents of the spine items (including their file names, since that’s how media overlays refer to content).
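A rough Python sketch of that spine-only hash, under some assumptions: the EPUB has the standard `META-INF/container.xml` pointing at an OPF package document, and every spine `itemref` resolves to a manifest `item`. The function name `spine_fingerprint` is hypothetical and this is not Storyteller code.

```python
import hashlib
import posixpath
import xml.etree.ElementTree as ET
import zipfile
from urllib.parse import unquote

CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"
OPF_NS = "{http://www.idpf.org/2007/opf}"


def spine_fingerprint(epub) -> str:
    """Hash spine item paths and bytes in reading order.

    `epub` is a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(epub) as zf:
        # container.xml tells us where the OPF package document lives.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(f".//{CONTAINER_NS}rootfile")
        opf_path = rootfile.attrib["full-path"]
        opf = ET.fromstring(zf.read(opf_path))
        opf_dir = posixpath.dirname(opf_path)

        # Manifest: item id -> href (relative to the OPF file).
        hrefs = {
            item.attrib["id"]: item.attrib["href"]
            for item in opf.iter(f"{OPF_NS}item")
        }

        digest = hashlib.sha256()
        for itemref in opf.iter(f"{OPF_NS}itemref"):
            href = unquote(hrefs[itemref.attrib["idref"]])
            name = posixpath.normpath(posixpath.join(opf_dir, href))
            data = zf.read(name)
            # Length-prefix each part so (name, data) boundaries are unambiguous.
            for part in (name.encode("utf-8"), data):
                digest.update(len(part).to_bytes(8, "big"))
                digest.update(part)
        return digest.hexdigest()
```

Because the OPF metadata itself is never hashed, editing the title or author leaves the fingerprint unchanged, while renaming or rewriting a chapter file changes it.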


I was going to suggest using the same approach as the old CD tagging systems: count the number of words in each chapter to create a "book fingerprint".

It's highly likely to be globally unique, and it can also help with the missing forewords/afterwords/bonus content sections.

You can also add fuzzy matching for the title.
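To make the idea concrete, here is a small Python sketch, assuming the chapter texts have already been extracted as plain strings. All function names are hypothetical, the word count is a crude `\w+` match, and `difflib.SequenceMatcher` is just one possible way to tolerate an extra foreword or afterword; the old CD databases used their own schemes.

```python
import re
from difflib import SequenceMatcher


def chapter_word_counts(chapters: list[str]) -> list[int]:
    """Crude per-chapter word counts; this list is the 'book fingerprint'."""
    return [len(re.findall(r"\w+", text)) for text in chapters]


def fingerprint_similarity(a: list[int], b: list[int]) -> float:
    """Similarity in [0, 1]; an inserted or missing chapter only lowers it a bit."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def title_similarity(a: str, b: str) -> float:
    """Case-insensitive fuzzy match between two titles."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()
```

Two editions would then count as a likely match when both scores exceed some threshold, which would need tuning against real books.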


I think the thing we need to account for is different publications of the same book, which would need different overlays if their chapter file paths, etc., differ. The number of words per chapter would capture that, I think.



