It looks like Opus (the project linked to) goes back to at least 2003, while the coalition behind the Opus Codec chose to create a name collision around 2012.
It seems they succeeded: apart from Wikipedia, the first result Google shows me is the codec.
If we have a scarce namespace, first come, first served is not a much better way to allocate names than popularity.
People have yet to successfully stamp a trademark on words like Opus, and that is perfectly fine by me. Codecs and NLP datasets are distinct enough domains that any confusion will resolve itself quickly.
It would be pretty hard to choose that name without becoming aware of at least one of these. The creators most likely just deemed these other uses too obscure and unrelated to matter, which is fair, because it's difficult to pick a good name that hasn't already been used somewhere.