Nvidia RTX Voice does something similar. Like other tech in this space, though, it focuses on removing background noise rather than touching the speech, and it actually works very well. It would definitely be interesting to see it filter the speech itself too. But I feel like that would be hard to do without adding latency. If someone says "umm" or some other filler before a word, you kind of need to know what the next word will be to decide whether it's filler or not. So it almost can't be done in real time, since the filter has to wait for some future speech before it can make the call.
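To make the latency point concrete, here's a rough Python sketch of a one-word lookahead filter. It has nothing to do with how RTX Voice works internally; the word lists, the `is_filler` rule, and the `(word, arrival_ms)` input format are all made up for illustration, and a real system would work on audio rather than on already-transcribed words.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Toy assumptions, not taken from any real product:
ALWAYS_FILLER = {"um", "umm", "uh"}
AMBIGUOUS = {"like"}
# Pretend "like" is a real word only when one of these follows it.
KEEPS_LIKE = {"this", "that", "the", "a", "it"}


def is_filler(word: str, next_word: Optional[str]) -> bool:
    """Hypothetical rule: decide if `word` is filler given the word after it."""
    w = word.lower()
    if w in ALWAYS_FILLER:
        return True
    if w in AMBIGUOUS:
        return next_word is None or next_word.lower() not in KEEPS_LIKE
    return False


def filter_stream(
    words: Iterable[Tuple[str, int]],
) -> Iterator[Tuple[str, int, int]]:
    """Take (word, arrival_ms) pairs, yield (word, arrival_ms, emit_ms).

    Every word is held back until the next word arrives, so emit_ms is
    the arrival time of the following word. That gap is the added latency.
    """
    pending: Optional[Tuple[str, int]] = None
    for word, t in words:
        if pending is not None:
            prev_word, prev_t = pending
            if not is_filler(prev_word, word):
                yield prev_word, prev_t, t
        pending = (word, t)
    # End of stream: nothing left to wait for, so flush the last word.
    if pending is not None:
        prev_word, prev_t = pending
        if not is_filler(prev_word, None):
            yield prev_word, prev_t, prev_t


stream = [("I", 0), ("um", 300), ("like", 600),
          ("this", 900), ("like", 1200), ("song", 1500)]
for word, arrived, emitted in filter_stream(stream):
    print(f"{word!r} arrived at {arrived} ms, emitted at {emitted} ms")
```

In this toy version every kept word comes out one word late, because the filter can't release anything until it has seen what follows. You could pass obviously non-filler words straight through to cut the delay, but anything ambiguous still has to wait.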