Show HN: Auditus – Ebook to Audiobook Conversion

visarga · on April 1, 2018

I like your project, but I'd like it more if it was accompanied by a video of the text being highlighted as it is being read (a kind of dual modality reading - visual + audio). I created such a tool for myself on MacOS with the Alex voice, that works in browsers on any page and in PDFs. I find dual modality reading to enhance my focus a lot. Btw, I use this tool to read this very thread of comments.

And of course I would want to hear the amazing Wavenet voices used in this role.

nafizh · on April 1, 2018

Do you plan to share yours? The dual-modality approach looks very interesting. I would definitely try it out.

visarga · on April 1, 2018

It's only for personal use, but if you want to try:

https://gist.github.com/Visarga/e6597edcb7ec6993521829ef5c17...

Also, it only works on MacOS because it uses the Alex voice which comes bundled with the system.

chairmankaga · on April 2, 2018

That's really cool, thanks for sharing!

I saw a site that highlighted the official audiobook/ebook of "The Lean Startup" on one page. I love this as a generic solution.

chairmankaga · on April 2, 2018

Also, whether by design or not, this links back to your personal website which leads to directly identifiable information about you.

Just FYI!

nafizh · on April 2, 2018

I just tried it; this is so awesome. Btw, so you have to manually select the text, and only then does the voice option come up?

hipjiveguy · on April 6, 2018

hey.... seems like this 404s now - do you have an updated link?

acafourek · on April 1, 2018

The DAISY markup spec is a cool project if you’re into the tech of text/audio sync.

Also check out LearningAlly.org, a non-profit that produces audiobooks with synced text highlights. They are specifically oriented toward students with reading challenges, but I believe anyone with some certification of a learning difference can join. (They require certification to avoid conflict with copyright protections.)

Immortalin · on April 1, 2018

Funnily enough, I did look into it: https://www.youtube.com/channel/UCP4mlh8yZqOaEf0x7FKncIw

Have not quite gotten the whole video thing working yet

dnprock · on April 2, 2018

I made a similar tool and host some audio here. You can tap and read: http://book.vidalab.co/

firefoxd · on April 1, 2018

I have the habit of converting everything I write into audio before I publish it. It's a good way to make your mistakes pop out.

I use fromtexttospeech[1] to convert to audio. Judging by the voices, it seems like the author is also using the same speech engine.

If I'm not mistaken, these are from Nextup's TextAloud software.

[1]:http://www.fromtexttospeech.com

Edit: Though this beats the 50k character limit

Immortalin · on April 1, 2018

AWS Polly

dsr_ · on April 1, 2018

It occurs to me that the mistakes in emphasis that I hear in these samples are the same mistakes I hear from young readers who are concentrating on decoding the words one at a time. The method that fluent text speakers use is to process the entire sentence while beginning to speak it.

I wonder if a better model of word emphasis considering whole sentences could move an automated reader out of the uncanny valley.

Immortalin · on April 1, 2018

Currently the text is run through a sentence segmenter before conversion.
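For anyone curious what segmenting before conversion can look like, here is a minimal sketch. It is not the code Auditus runs: the regex splitter, the function names, and the `max_chars` limit are all assumptions made for illustration. It splits text into sentences and then packs whole sentences into chunks, so that no request to a speech service cuts a sentence in half.

```python
import re

# Naive boundary: '.', '!' or '?' followed by whitespace. Abbreviations such
# as "Dr." will be split incorrectly; a real segmenter handles those cases.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentences on terminal punctuation followed by whitespace."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def pack_chunks(sentences, max_chars=1500):
    """Group whole sentences into chunks of at most max_chars characters.

    max_chars is an illustrative limit, not a documented service quota.
    A single sentence longer than max_chars becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent to the speech API separately and the resulting audio concatenated in order.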

danthelion · on April 1, 2018

Hey, I just created a similar application using the Cloud Text-To-Speech API from Google and the textract (https://textract.readthedocs.io) library to extract text from lots of kinds of documents:

https://github.com/danthelion/doc2audiobook

It runs inside a Docker container, so it's fairly easy to try out.

Immortalin · on April 1, 2018

Hi! I built this to have an easier way to convert ebooks to audiobooks! The backend is powered by AWS Polly. If you have any feedback or feature requests, please feel free to drop me an email at <last 3 characters of username> @ <myusername>.com

64738 · on April 1, 2018

What is the typical wait from payment to delivery? I know you can't be specific, but just ballpark it (eg. a few minutes, a few hours, etc.).

I'm not sure what the book's letter-count is, but the price was $2.82. It's only been about 10 minutes, so I'm not displeased, just curious what to expect.

Thanks!

Immortalin · on April 1, 2018

Please email me the file. It should not take more than 15 minutes, but the server's rather overloaded right now.

nirv · on April 1, 2018

It looks good! Could you add a plain-text (or even markdown) field as an input option? I'd be interested in trying this with blog posts and magazine articles.

(P.S. Kudos for the "Accelerando")

Immortalin · on April 1, 2018

Planned :)

gh02t · on April 1, 2018

This is great! I forwarded this to my girlfriend (who is blind) and she loves it.

llao · on April 1, 2018

On Android there is http://www.hyperionics.com/atVoice/ which I really like and which seems on a similar level (at least judging by the example).

Immortalin · on April 1, 2018

If you have any suggestions, do let me know! I am currently working on adding support for PDF files. I'm also considering adding support for the new Google WaveNet speech synthesis, but it is much more expensive (about 4x the cost) :(

laex · on April 1, 2018

I built something similar to listen to Paul Graham's essays. It's a console app and uses OS X's "say" command for the TTS. Contributions are welcome. https://github.com/hemantasapkota/awesome-essays

lewi · on April 1, 2018

I get a MetaMask (ETH wallet) phishing warning on this site. Has anyone else experienced this?

aik · on April 1, 2018

Interesting. The MetaMask phishing detector keeps a blacklist of URLs/domains and compares a site's domain against it using the Levenshtein distance algorithm, so it could be a false positive. After a quick check I didn't find Auditus on there:

https://github.com/MetaMask/eth-phishing-detect/blob/master/...
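To illustrate how an edit-distance check can produce a false positive, here is a rough sketch of the idea. This is not MetaMask's code: the function names, the `tolerance` value, and the example domains are hypothetical. A domain is flagged when it is within a few single-character edits of any listed domain, so an innocent name that merely resembles a listed one would be flagged too.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def near_matches(domain, listed_domains, tolerance=2):
    """Return the listed domains within `tolerance` edits of `domain`.

    tolerance=2 is an illustrative threshold, not the detector's real setting.
    """
    return [d for d in listed_domains if levenshtein(domain, d) <= tolerance]
```

For example, `near_matches("examp1e.com", ["example.com", "unrelated.org"])` returns `["example.com"]`, because the two names differ by a single substitution.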

dangoor · on April 1, 2018

This is cool and I figured we'd be heading down this path soon enough. A lot of the best audiobooks I've listened to were narrated by people that can do multiple voices well. I was thinking that being able to produce an audiobook that uses different voices for different characters would be great. Something like Narrator:

http://marinersoftware.com/products/narrator/

Narrator, though, uses Mac OS text to speech, which is nowhere near the level of Polly or Google Cloud Speech.

deepakb358 · on April 1, 2018

There are a few iOS apps that do this in real time. The best one by far is 'Voice Dream', and they use the same voices. It is basically an audiobook in your pocket anytime, anywhere, for any text file. It shows the words as it reads them back and lets you start/stop/pause, adjust speed, change voice, etc. All-around awesome. When the new Google voices or an equivalent make it to iOS, it will be almost human-like.

gnicholas · on April 1, 2018

This is a good example of a tool that was created for the accessibility community (vision impaired, dyslexic) and has subsequently been adopted by mainstream readers.

asveikau · on April 1, 2018

As a language nerd I would like to praise naming the project after a Latin past participle.

eejdoowad · on April 1, 2018

It's really cool to see the applications made possible by the high quality, reasonably priced, and fairly licensed text-to-speech APIs offered by AWS, Azure, and Google Cloud.

The most fleshed out service of this type that I've found is narro.co, which offers web/pdf/epub/video/rss/email/text to audio conversions.

ehudla · on April 1, 2018

What are the best practices for doing the reverse: taking audio and producing text? I don't mind if the transcription is rough, and the error rate can be quite high for my purposes, but I want the process to recover rather than get stuck, so that it can handle a full-length talk.

Immortalin · on April 1, 2018

From my experiments generating subtitles from TV/movie audio tracks, accuracy ranged from 75% (worst case) to 95% (best case). If you model it as a normal distribution, expect somewhere around 85-90% accuracy. Most services provide much better accuracy for things like calls or conferences, with proper microphones and minimal background noise, than for TV shows and movies. If the input audio is noisy, I would do some noise filtering before piping it into conversion.

ehudla · on April 1, 2018

Which conversion tools/services do you have in mind?

Immortalin · on April 1, 2018

Google and Azure

fipple · on April 2, 2018

As an easier problem, what I’d find useful is a way to keep a pirated audiobook and pirated e-book in sync, the way that Amazon does with WhisperSync. A single app where I upload the .epub and the mp3s and it keeps me in sync when I read in either format.

bunchjesse · on April 1, 2018

You can also do this in iBooks with any of the built-in voices available on iOS.

Just turn on Speak Screen in Settings -> General -> Accessibility -> Speech and then swipe down with two fingers while reading your book. It'll even turn the page for you.

lighttower · on April 1, 2018

This is awesome. But doesn't Amazon bill you for usage? What's keeping you afloat?

Immortalin · on April 1, 2018

Each conversion costs a couple of dollars depending on length - cheaper than most audiobooks, at the expense of human realism. You can listen to a sample of a human-read version of Accelerando: https://www.audiobooks.com/audiobook/accelerando/210129

The one generated by Auditus is too smooth and slightly unnatural.

Immortalin · on April 1, 2018

Here's a human narrated sample for comparison: https://www.audiobooks.com/audiobook/accelerando/210129

rahimnathwani · on April 1, 2018

That link returns a 404 for me.

Immortalin · on April 1, 2018

Try again?

planb · on April 1, 2018

I usually do this by hand with surprisingly good results: I use calibre to convert the ePub to txt and then fix some common problems (i.e. remove line breaks and page numbers) using regular expressions. Then I convert it to an audio file using the macOS Automator text-to-speech action (be sure to download the high quality voices first).
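A rough sketch of the regex cleanup step, in case it helps anyone. These patterns are assumptions about what the converted text tends to look like (page numbers alone on a line, hard-wrapped paragraphs separated by blank lines), not the exact expressions I use, and `clean_for_tts` is just a name picked for this example.

```python
import re


def clean_for_tts(text):
    """Tidy plain text exported from an ebook before handing it to a TTS engine."""
    # Drop lines that contain only a page number (digits with optional spaces).
    # Caveat: this also removes any legitimate line consisting solely of digits.
    text = re.sub(r"^[ \t]*\d+[ \t]*$\n?", "", text, flags=re.MULTILINE)
    # Replace single line breaks inside a paragraph with a space,
    # leaving blank-line paragraph breaks alone.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs left over from the joins.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()
```

Keeping the paragraph breaks matters because the speech engine tends to pause at them, while a line break in the middle of a sentence can cause an unwanted pause.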

Immortalin · on April 1, 2018

Update: Server's overloaded right now. Any conversion that has not been sent will be delivered by end of tomorrow.

k4ch0w · on April 1, 2018

Love the idea behind the project. I tried to upload an EPUB and got an error page. I tried 3 times with different voices. I look forward to seeing more of it and think it's an awesome idea.

Immortalin · on April 1, 2018

Send me the epub and I will convert it for free! Thanks for catching the bug, will look into it soon. Edit: Was the epub in English?

thisisit · on April 1, 2018

This doesn't seem to be working. I have tried uploading a sample epub. After the epub is uploaded it sends me to a conversions page. That page is just a copy of the homepage.

Immortalin · on April 1, 2018

Also, you need to select the file type on the left. Currently EPUB is the only option, but PDF support is planned soon.

Immortalin · on April 1, 2018

Send me the epub and I will convert it for free! Thanks for catching the bug, will look into it soon.

CNJ7654 · on April 1, 2018

Part of me would actually really enjoy having the option to use an old-school voice generator. Imagine a horror novel narrated by MS Sam.

iamjeff · on April 1, 2018

Really neat solution.

Is this in any way based on Amazon Polly?

hugozap · on April 1, 2018

Looks like it's down. I get a "The page you were looking for cannot be served." error.

archaeopteryx · on April 1, 2018

Metamask warns me that this site is on the Ethereum phishing list...