
I was going to suggest using the same approach as the old CD tagging systems: count the number of words in each chapter to create a "book fingerprint".

It's highly likely to be globally unique, and it can also help with editions that are missing forewords, afterwords, or bonus content sections.

In addition, you can also add fuzzy matching for the title.
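
For what it's worth, here is a rough sketch of what I mean, using only the Python standard library. The function names (`chapter_fingerprint`, `fingerprint_similarity`, `title_similarity`) are made up for illustration, and I'm assuming you already have each chapter as a plain-text string. Words are counted by splitting on whitespace, which is a simplification, and comparing counts exactly is an assumption too: editions with small text corrections would need some tolerance.

```python
import difflib


def chapter_fingerprint(chapters):
    """Return a tuple of word counts, one per chapter (whitespace-split)."""
    return tuple(len(text.split()) for text in chapters)


def fingerprint_similarity(a, b):
    """Score two fingerprints in [0, 1].

    SequenceMatcher aligns the two sequences of counts, so an extra
    foreword or afterword in one edition lowers the score only slightly
    instead of breaking the match.
    """
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


def title_similarity(a, b):
    """Fuzzy title match in [0, 1], ignoring case and outer whitespace."""
    return difflib.SequenceMatcher(
        None, a.strip().lower(), b.strip().lower()
    ).ratio()


# Toy example: the second edition adds a two-word foreword.
edition_a = ["one two three", "a b c d e", "w x y z"]
edition_b = ["hello there"] + edition_a

fp_a = chapter_fingerprint(edition_a)
fp_b = chapter_fingerprint(edition_b)
score = fingerprint_similarity(fp_a, fp_b)
```

You would then pick thresholds for both scores by testing on real books; I don't have numbers to suggest.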




I think the thing we need to account for is different publications of the same book, which would need different overlays if they have different chapter filepaths, etc. The number of words per chapter would capture this, I think.



