Could we generalize the WER weighting to optimize for the domain?
Something like
weight = w1 * WER + w2 * phonetic similarity + ...
That would also need a hyperparameter search over w1, w2, ..., though, and we are already burning a lot of GPU hours here.
I assume this is already being investigated by Google, though?
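To make the formula above concrete, here is a rough sketch of how the combined score could be computed for one reference/hypothesis pair. Everything named here is an assumption for illustration: `combined_weight`, the default values of `w1` and `w2`, and especially `surface_similarity`, which is only a character-level stand-in and not a real phonetic measure. The similarity term is flipped to `1 - similarity` so that both terms grow as the hypothesis gets worse.

```python
from difflib import SequenceMatcher


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("WER is undefined for an empty reference")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def surface_similarity(reference: str, hypothesis: str) -> float:
    """Placeholder in [0, 1]. NOT phonetic: swap in a real phonetic metric."""
    return SequenceMatcher(None, reference, hypothesis).ratio()


def combined_weight(reference: str, hypothesis: str,
                    w1: float = 0.7, w2: float = 0.3) -> float:
    """w1 * WER + w2 * (1 - similarity); w1 and w2 are made-up defaults."""
    wer = word_error_rate(reference, hypothesis)
    dissimilarity = 1.0 - surface_similarity(reference, hypothesis)
    return w1 * wer + w2 * dissimilarity
```

With this shape, w1 and w2 are exactly the knobs the hyperparameter search would have to sweep per domain.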
I wonder if you could make that parameter trainable instead of using a hyperparameter search for it.
For phonetic similarity, I've been playing with a dual-objective setup that could be promising.
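One way the trainable version could look, as a toy sketch and not something from this thread: give each term a learnable log-variance s_i and minimize sum(exp(-s_i) * L_i + s_i), in the style of the uncertainty weighting used in multi-task learning. The `+ s_i` term keeps the weights from collapsing to zero. The function names are hypothetical, and the loss values are held fixed here purely for illustration; in real training they would change every batch and the s_i would be updated together with the model.

```python
import math


def weighted_objective(losses, log_vars):
    """sum_i exp(-s_i) * L_i + s_i, with s_i the learnable log-variances."""
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))


def log_var_gradients(losses, log_vars):
    """d/ds_i of the objective: 1 - exp(-s_i) * L_i."""
    return [1.0 - math.exp(-s) * loss for loss, s in zip(losses, log_vars)]


def train_log_vars(losses, steps=2000, lr=0.05):
    """Plain gradient descent on the log-variances for fixed, positive losses."""
    log_vars = [0.0] * len(losses)
    for _ in range(steps):
        grads = log_var_gradients(losses, log_vars)
        log_vars = [s - lr * g for s, g in zip(log_vars, grads)]
    return log_vars


# Hypothetical per-term losses, e.g. a WER-like term and a phonetic term.
example_losses = [0.25, 0.6]
learned = train_log_vars(example_losses)
learned_weights = [math.exp(-s) for s in learned]
```

For fixed losses each s_i settles at log(L_i), so the effective weight exp(-s_i) ends up near 1 / L_i. That replaces the search over w1 and w2 with two extra parameters, at the cost of the weights tracking loss scale and not any notion of domain importance.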