
For my sci-fi story (alpha readers wanted; see profile), I used Whisper to transcribe an interview with a Malawian president. From there, I built a vocabulary comprising only the president's words, which I used almost exclusively when writing his speech.
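For the vocabulary step, a shell pipeline along these lines works (a rough sketch; transcript.txt and vocabulary.txt are placeholder names I've chosen, not files the workflow below produces by default):

    # split on non-letters, lowercase, and keep one copy of each word
    tr -cs '[:alpha:]' '\n' < transcript.txt | tr '[:upper:]' '[:lower:]' | sort -u > vocabulary.txt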

The results from Whisper are incredible, with very few mistakes, though it did get Nelson Mandela's first name wrong (transcribed as "Nesson"). What's more, Whisper finished transcribing a 60-minute audio stream in 20 minutes on commodity hardware (an NVIDIA T1000 G8 GPU). Broadly, here are the steps I used:

* Download and install podman.

* Download and install git.

* Download and install curl.

* Open a command prompt.

* Run the following commands to containerize Whisper:

    # fetch a simple Flask wrapper around Whisper
    git clone https://github.com/lablab-ai/whisper-api-flask whisper
    cd whisper
    # podman looks for a Containerfile by default
    mv Dockerfile Containerfile
    podman build --network="host" -t whisper .
    # the API listens on port 5000
    podman run --network="host" -p 5000:5000 whisper
* Download an MP3 file (e.g., filename.mp3).

* Run the following command to produce a transcription:

    curl -F "file=@filename.mp3" http://localhost:5000/whisper
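The endpoint responds with JSON, so to keep a copy for later processing you can redirect the output (the field holding the transcript text depends on the wrapper, so inspect the response before parsing it):

    # save the raw JSON response to disk
    curl -sF "file=@filename.mp3" http://localhost:5000/whisper -o transcript.json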



Whisper is great. You can get faster results by running the tiny model. I used it for podcast transcription, and it is much faster while the quality is no worse than the medium model's; for some podcast episodes the transcriptions are identical.
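If you have the stock openai-whisper package installed, switching sizes is just a flag (a minimal example; episode.mp3 is a placeholder filename):

    # the tiny model trades a little accuracy for a large speedup
    whisper episode.mp3 --model tiny --language en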


If speed is important, you're much better off using a larger model and whisper.cpp.
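Getting whisper.cpp running looks roughly like this (a sketch; the binary name and helper scripts have moved around between versions, and clip.mp3 is a placeholder):

    git clone https://github.com/ggerganov/whisper.cpp
    cd whisper.cpp
    make
    # fetch a converted ggml model
    ./models/download-ggml-model.sh base.en
    # whisper.cpp expects 16 kHz mono WAV input, so convert first
    ffmpeg -i clip.mp3 -ar 16000 -ac 1 -c:a pcm_s16le clip.wav
    ./main -m models/ggml-base.en.bin -f clip.wav -t 8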


Wow, thank you! That's a nice speedup indeed. With whisper I get

    33,53s user 2,05s system 443% cpu 8,023 total
with the 'tiny.en' model whereas whisper.cpp gives me

    22,71s user 0,12s system 745% cpu 3,062 total
with the 'base.en' model for a 15s audio clip on an i7-3770 (8 threads).


Awesome! Thanks for posting the stats.

In my workflows I've found rare but noticeable quality differences between the model sizes, so when practical I try to use the larger ones.


Why not just run whisper from the command line directly? Why put it into a Docker container?


Why not keep everything tightly contained?


Hm, I'm on a Mac, so it takes up a bunch of RAM, and I'm not used to this workflow. Good point, though.


Unless you actually use the memory (e.g. allocate it), it won’t impact system performance, but yeah, it definitely is overhead.


Some people just love making their environments needlessly complicated.


Complexity is in the eye of the beholder; some people are comfortable enough with Docker that it adds no friction.

Now, installing the dependencies of every git repo I want to try onto my host system: that's how an environment becomes needlessly complicated.


Thank you for this.



