"Canonical" can't apply to music releases in an objective and definitive way.
Context certainly matters (first, modified, compilation, remaster, remix, audiophile pressing, and so on) but you can't even nail "canonical" to first release, especially for singles, because there may be early promo mixes, radio mixes, vinyl mixes, iTunes mixes, and so on - all mastered differently.
Most people's idea of "canonical" is really "The version I want to hear without having to specify other details". But that's subjective and likely to be significantly different for some non-trivial percentage of users, especially in different territories.
Spotify probably just makes an informed stab at "most popular" - which is a good heuristic and will work most of the time, but is hard to calculate when you don't have Spotify's stats.
We may never have the stats Spotify has, but we are trying to get listening information via the in-development ListenBrainz: https://listenbrainz.org
I'm not sure when/if we'll be able to tie it in with MusicBrainz directly, but for someone like exogen, ListenBrainz may be a good basis to figure out relative popularity of various Recordings/Tracks regardless.
You've described the issue pretty well, and I understand (and agree with!) all of that – like I said, I've devoted a LOT of time to solving this.
> Most people's idea of "canonical" is really "The version I want to hear without having to specify other details". But that's subjective and likely to be significantly different for some non-trivial percentage of users, especially in different territories.
Yup! You are describing the problem literally any search engine faces. And yet, Google/Bing/etc. provide pretty smart results. So, do you think the "Smells Like Teen Spirit" recording by Francis Drake is the BEST first result, as MusicBrainz says it is? Is a live bootleg recording the BEST second result? In any locale? MusicBrainz is NOT primarily a search engine, but all that data has very little value if people (and other software) can't actually find it! This absolutely harms adoption.
OK, so we might not need to nail down a "canonical" version when we live in a world with search ranking scores. I totally realize "canonical" is a bad word choice on my part – but it's really how people think of these things!
> Spotify probably just makes an informed stab at "most popular" - which is a good heuristic and will work most of the time, but is hard to calculate when you don't have Spotify's stats.
I bet they do it that way too, but I think you're throwing in the towel way too early here. :) I have a system that works amazingly well and nearly always chooses the most likely intended recording without any listen count data. MusicBrainz has a LOT of data available to it, so what kinds of heuristics might make sense here? I use a ranking system that takes all of the following factors into account and, like Lucene, assigns a score:
• Number of releases & release groups the recording appears on (the most well-known recording is more likely to appear on additional albums like compilations, and more likely to be widely released in lots of countries).
• How old the release is relative to the other search results (earlier matches are more likely to be the original).
• Whether the recording is from a release with a "single from" relation to another album (the target LP is more likely to hold the recording we want).
• Whether it's from a release typed as an Album or EP (positive weighting) or Live (negative weighting); whether the recording ONLY appears on Compilation albums (negative weighting); and whether it's from any other release type like Bootleg (strong negative weighting).
• Whether the recording has ISRCs entered for it (more well-known recordings are more likely to have ISRCs in the first place, and also more likely for people to have entered them into MusicBrainz).
• Whether MusicBrainz users have entered any tags and ratings for it (weak but positive correlation with how popular it is).
• Domain-specific string similarity metrics; essentially, query expansion that makes sense specifically for song titles & artist names. This lets certain matches remain equivalent when it makes sense (e.g. "mambo number 5", "mambo no. 5", "mambo #5", and "mambo number five" should all be exactly equivalent in terms of string matching). Lucene does some of this already of course, but not nearly enough – I have a query expander with hundreds of examples where Lucene does a worse job.
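The last heuristic could be sketched as a title normalizer that maps equivalent spellings onto one canonical form before comparison. This is a minimal illustration, not the actual query expander described above; the equivalence table here is an assumption covering only the "mambo" example:

```python
import re

# Illustrative equivalence rules for song titles (assumed, not the
# author's real rule set): applied in order via regex substitution.
EQUIVALENCES = [
    (r"\bno\.?\s*(\d+)\b", r"number \1"),  # "no. 5" / "no 5" -> "number 5"
    (r"#\s*(\d+)\b", r"number \1"),        # "#5" -> "number 5"
    (r"\bfive\b", "5"),                    # spelled-out numeral
]

def normalize_title(title: str) -> str:
    """Map equivalent song-title spellings onto one canonical form."""
    s = title.lower().strip()
    for pattern, repl in EQUIVALENCES:
        s = re.sub(pattern, repl, s)
    return re.sub(r"\s+", " ", s)
```

With these rules, `normalize_title("Mambo No. 5")`, `normalize_title("Mambo #5")`, and `normalize_title("Mambo Number Five")` all collapse to the same string, so exact matching can treat them as identical.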
I can think of more too, that my system doesn't currently use. All that's without relying on any external data source! But if you want to go one better, it's also possible to correlate results with other APIs like WikiData, DBpedia, Spotify, YouTube…
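To make the ranking idea concrete, here's a minimal sketch of a weighted linear scorer over the signals listed above. The field names, weights, and candidate structure are all illustrative assumptions, not MusicBrainz schema fields or the actual coefficients:

```python
from dataclasses import dataclass

# Hypothetical candidate record: one search result with the signals
# described in the bullet list above (all field names are assumptions).
@dataclass
class RecordingCandidate:
    title: str
    release_count: int = 0          # releases/release groups it appears on
    years_after_earliest: int = 0   # age relative to earliest search result
    on_single_from_album: bool = False
    primary_type: str = "Album"     # "Album", "EP", "Live", "Bootleg", ...
    compilation_only: bool = False
    isrc_count: int = 0
    tag_and_rating_count: int = 0

# Assumed weights: positive for Album/EP, negative for Live/Bootleg.
TYPE_WEIGHTS = {"Album": 2.0, "EP": 1.0, "Live": -2.0, "Bootleg": -5.0}

def score(c: RecordingCandidate) -> float:
    s = 0.0
    s += 1.5 * c.release_count              # widely released -> well known
    s -= 0.5 * c.years_after_earliest       # earlier is likelier the original
    s += 3.0 if c.on_single_from_album else 0.0
    s += TYPE_WEIGHTS.get(c.primary_type, -1.0)
    s -= 4.0 if c.compilation_only else 0.0
    s += 1.0 * c.isrc_count                 # ISRCs correlate with prominence
    s += 0.25 * c.tag_and_rating_count      # weak positive signal
    return s
```

Candidates are then sorted by this score, so a widely released studio album track naturally outranks a live bootleg even when both titles match the query exactly.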
In most cases, I've found that there's enough of a delta between the top score and the second-best score to determine which one is "correct". (Yes, that word, I know…)
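One way to operationalize that delta is a relative-margin check: accept the top-ranked candidate only when its score clears the runner-up by some threshold. The function and the 25% margin below are assumptions for illustration, not the actual decision rule:

```python
# Hypothetical confidence check on a list of candidate scores.
def confident_top(scores: list[float], margin: float = 0.25) -> bool:
    """True when the best score beats the second best by a relative margin."""
    if len(scores) < 2:
        return bool(scores)  # a lone candidate wins by default
    best, second = sorted(scores, reverse=True)[:2]
    if best <= 0:
        return False         # nothing scored well enough to trust
    return (best - second) / abs(best) >= margin
```

When the check fails, a client could fall back to showing the user several candidates instead of auto-picking one.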
Ideally MusicBrainz would be on par with a human expert in determining which recording you most likely meant, and I believe that it CAN do this today, but it doesn't.
Note that our current search server software is in "minimal maintenance" mode. We're working on a replacement which will hopefully allow for a lot of improvements to search rankings etc., but a lot of other things have higher priority (like actually being able to serve requests in spite of getting hammered by bots and spammers).
Of course, MusicBrainz is an open source endeavour. The old search server maintainer was a volunteer from the community. If you believe you can do a better job at running our search server, please join us in #metabrainz on Freenode and introduce yourself.
Also, note: in theory MusicBrainz already has metrics for the number of clicks, views, lookups, and edits certain entities get through their site and API. I bet these are strongly correlated with listens/popularity.
What does "in theory" mean here? Do those tables exist, in whole or in part? Is this a matter of indexing an existing data set, or of hoping some data was acquired as an accidental by-product?
Even if that data isn't being collected now, it's something they already have the ability to collect by simply flipping a switch, as opposed to spinning up a whole new ListenBrainz service and hoping it gains traction.