
I've been using MacWhisper for a few months and it's fantastic.

Sometimes I'll send an MP3 or MP4 video through it and use the resulting transcript directly.

Other times I'll run a second step through https://claude.ai/ (because of its 100,000-token context) to clean it up. My prompt for that at the moment is:

> Reformat this transcript into paragraphs and sentences, fix the capitalization and make very light edits such as removing ums
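
For anyone who wants to script that second step, here is a minimal sketch of how the prompt could be assembled before pasting it into a chat window. The function names and the 4-characters-per-token estimate are assumptions made for illustration; they are not part of MacWhisper or Claude, and the estimate is only a rough guide to whether a transcript fits in a 100,000-token context.

```python
# Sketch only: the helper names and the 4-characters-per-token heuristic
# are illustrative assumptions, not part of any tool mentioned here.

CLEANUP_PROMPT = (
    "Reformat this transcript into paragraphs and sentences, "
    "fix the capitalization and make very light edits such as removing ums"
)

CONTEXT_TOKENS = 100_000  # context size quoted in the comment
CHARS_PER_TOKEN = 4       # rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def build_cleanup_prompt(transcript: str) -> str:
    """Put the cleanup instruction in front of the raw transcript."""
    return f"{CLEANUP_PROMPT}\n\n{transcript.strip()}"


def fits_in_context(prompt: str, limit: int = CONTEXT_TOKENS) -> bool:
    """Check whether the estimated prompt size is within the limit."""
    return estimate_tokens(prompt) <= limit
```

If `fits_in_context` returns False, the transcript would need to be split into pieces and cleaned up one piece at a time.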

That's often not necessary with Whisper output. It's great if you extract captions directly from YouTube, though. I wrote more about that here: https://simonwillison.net/2023/Aug/6/annotated-presentations...




This is so good! I studied English, then moved to linguistics, then lived in the UK for almost a decade and due to my accent none of the TTS tools are close to the approach you just mentioned (whisper + LLM). Thanks Simon!



