
https://alphacephei.com/vosk/install#usage-examples demonstrates the bare-bones vosk-transcriber sample, and there's also https://www.assemblyai.com/blog/getting-started-with-espnet

I wasn't able to play with https://github.com/o-oconnell/mp4grep on ARM.




Ah, vosk-transcriber looks decent, especially if you use srt output so that you get timestamps. Probably no reason to use mp4grep for this purpose, then.
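For what it's worth, once you have an srt file, the "grep with timestamps" part is easy to do yourself. Below is a rough sketch, not anything taken from vosk or mp4grep: it assumes the transcript is ordinary SubRip text (index line, `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, then the cue text), and the names `Cue`, `parse_srt` and `grep_srt` are made up for illustration.

```python
import re
from typing import Iterator, NamedTuple


class Cue(NamedTuple):
    """One subtitle cue: start/end timestamps and the spoken text."""
    start: str
    end: str
    text: str


# SubRip timing line, e.g. "00:00:02,500 --> 00:00:05,000"
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(srt: str) -> Iterator[Cue]:
    """Yield cues from SubRip text (assumed layout: index, timing, text)."""
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                # Everything after the timing line is the cue text.
                text = " ".join(lines[i + 1:]).strip()
                yield Cue(match.group(1), match.group(2), text)
                break


def grep_srt(srt: str, pattern: str) -> list[Cue]:
    """Return cues whose text matches the regex, case-insensitively."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [cue for cue in parse_srt(srt) if rx.search(cue.text)]


if __name__ == "__main__":
    sample = """1
00:00:00,000 --> 00:00:02,500
hello and welcome to the show

2
00:00:02,500 --> 00:00:05,000
today we talk about speech recognition
"""
    for cue in grep_srt(sample, r"speech"):
        print(f"{cue.start} {cue.text}")
```

To use it on a real file, read the srt that the transcriber wrote and pass its contents to `grep_srt`. A match that spans two cues will be missed, since each cue is searched on its own.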


I'm still hoping for a turn-key open source solution that includes speaker identification.



