I tested MANY while building various audio tech, and this was by far the best (beats the shit out of all the Python Pandas Hugging Face lists etc.). Incredible ability to cut noise (though it must be noted that perceptual improvements for humans do not usually improve machine transcription, as AI models strangely seem to pull info out of the dead space between words and the noise around words, not just the voiced words themselves... the hidden vibrations, beyond our human ken, oh mere mortals unworthy of the grand perceptive machines... ugh... :p)
I contacted the authors via their FB emails but never heard back. Right now it's licensed for non-commercial use only, and I was building a commercial product.
Somewhat tangential question: have you looked for or found any audio models or tools that can automatically separate individual voices into separate audio tracks? Perhaps this is already possible with existing tools that I am uninitiated in.
I haven't tested this with multiple voices, and it sounds like you want something more specific, but it's produced 10/10 results with the couple dozen audio files I've thrown at it, so it might be of use: https://vocalremover.org/
iZotope RX Pro, which is software for cleaning and refining audio for music and audio post-production, includes 'Multiple Speaker Detection', which analyzes different voices in a recording and allows you to process them independently.
I can't speak to its effectiveness because I don't have any need for it, and RX 10 Advanced is commercial software and pretty expensive for a casual user, but the feature seems to be on the horizon for other apps.