This data's definition of "famous" or "notable" is in the "Measuring notability" section of the linked paper:
> we build a synthetic notability index using five dimensions to figure out a ranking for this broader set of individuals. These dimensions are:
> 1. the number of Wikipedia editions of each individual; [i.e. number of languages in which this person has a Wikipedia article]
> 2. the length, i.e. total number of words found in all available biographies. […]
> 3. the average number of biography views (hits) for each individual between 2015 and 2018 in all available language editions […]
> 4. the number of non-missing items retrieved from Wikipedia or Wikidata for birth date, gender and domain of influence. The intuition here is that the more notable the individual, the more documented his/her biographies will be; [!]
> 5. the total number of external links (sources, references, etc.) from Wikidata.
> We then determine the quantile values from each dimension and add them all to define our notability measure
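To make the "quantile values, added up" step concrete, here is a minimal sketch of how such an index could be computed. It is not the authors' code: the paper excerpt doesn't say exactly how the quantiles are defined, so this assumes a plain percentile rank per dimension, and the field names and the three toy people are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical field names for the five dimensions quoted above.
DIMENSIONS = ["editions", "words", "views", "non_missing", "external_links"]

def percentile_ranks(values):
    """Fraction of observations less than or equal to each value."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]

def notability_index(people):
    """people maps a name to a dict holding the five raw dimension values."""
    names = list(people)
    scores = {name: 0.0 for name in names}
    for dim in DIMENSIONS:
        ranks = percentile_ranks([people[name][dim] for name in names])
        for name, rank in zip(names, ranks):
            scores[name] += rank  # equal weight for every dimension (assumption)
    return scores

# Toy data, invented purely for illustration.
toy = {
    "person_a": {"editions": 200, "words": 90000, "views": 500,
                 "non_missing": 3, "external_links": 40},
    "person_b": {"editions": 120, "words": 60000, "views": 9000,
                 "non_missing": 3, "external_links": 60},
    "person_c": {"editions": 5, "words": 800, "views": 20,
                 "non_missing": 1, "external_links": 2},
}
scores = notability_index(toy)
```

In this toy setup person_b has fewer editions and fewer words than person_a, but leads on page views and external links, and the two end up with the same total. With equal weights, recent traffic can fully offset breadth of coverage.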
To be fair, Wikipedia has access to a lot more primary and secondary sources for people in the 20th century than for people in the first century, so I don’t know that Wikipedia is the best metric.
I find it very hard to believe that Jesus never existed. Whether he's the son of God is debatable, but I thought his existence had been pretty concretely established time and time again.
Several prophets (likely more than one with the then-common name Jesus) had a following in Roman-occupied Palestine. The biblical figure is likely an exaggerated amalgamation.
Thanks so much for digging this out! Very useful to know. So, the notability methodology fails massively, as many have noted. Jesus and Muhammad trailing Britney Spears by a factor of 4 or so is my favorite so far ... LOL. But the question becomes: how can the notability measure be improved? Of course "AI" is probably the answer here, in the same way it is becoming the answer to so many questions/problems. (Just as the answer to every legal question is "it depends".) Two elements come to mind: (i) accessing more sources outside of the Wikipedia/Wikidata database; (ii) within the Wiki world, making associations like Jesus ~ Christian ~ bible ~ best selling books.
> Jesus and Muhammad trailing Britney Spears by a factor of 4 or so is my favorite so far
To play devil's advocate (edit: pun entirely not intended!), I'd bet that way more people today could correctly identify a photo of Britney Spears than an accurate rendition of either Jesus's or Muhammed's faces. Obviously this map isn't supposed to be most "recognizable" people, but I think there's something to be said about whether the person themselves is different from the mythos around them (which may or may not accurately describe their life).
Ah, right, I actually had known about that before, but I must have forgotten! It would be interesting to see whether the common versions today would be accurate if there weren't as much controversy around depicting Muhammed. Depicting Jesus isn't discouraged at all in Christianity, but centuries of Europeans depicting him looking like them have established a common trope of a white, fair-haired Jesus, which would not have been at all what he actually looked like given where and when he was from. Given that Islam has been much more continuously practiced in the region where Muhammed lived, I imagine that depictions of him probably wouldn't have been as egregiously inaccurate in terms of race, although it's possible other sorts of cultural expectations might have been adopted over the years.
1) Apply a discount on notability for people based on how near their birthdate is to now. If Hammurabi and Jordan Peterson have the same score, Hammurabi should win by far.
2) Use an additional book corpus. Someone mentioned in books from 1500, 1800 and 2022 should score higher than someone popular in only one era.
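Here is a rough sketch of how the two suggestions above might be combined. Everything in it is an assumption made for illustration: the exponential form of the discount, the 200-year scale, the era boundaries, and the function names. None of it comes from the paper.

```python
import math

def recency_discount(birth_year, current_year=2022, scale_years=200.0):
    """Multiplier in [0, 1) that grows with time since birth (assumed form)."""
    age = max(current_year - birth_year, 0)
    return 1.0 - math.exp(-age / scale_years)

def era_coverage(mention_years, eras):
    """Fraction of eras (start, end) that contain at least one book mention."""
    hits = sum(any(start <= y < end for y in mention_years)
               for start, end in eras)
    return hits / len(eras)

def adjusted_score(raw_score, birth_year, mention_years, eras):
    """Raw notability, discounted for recency and boosted for multi-era reach."""
    boost = 1.0 + era_coverage(mention_years, eras)
    return raw_score * recency_discount(birth_year) * boost

# Hypothetical eras loosely matching "books from 1500, 1800 and 2022".
ERAS = [(1400, 1600), (1700, 1900), (1950, 2050)]

# Two made-up people with the same raw score.
ancient = adjusted_score(4.0, birth_year=-1800,
                         mention_years=[1500, 1800, 2022], eras=ERAS)
modern = adjusted_score(4.0, birth_year=1962,
                        mention_years=[2022], eras=ERAS)
```

With identical raw scores, the ancient figure comes out well ahead here, which is the behaviour suggestion 1) asks for. The exact gap depends entirely on the arbitrary scale and era choices.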
The authors also have a table of the people this metric ranks as most "notable" in each time period: https://www.nature.com/articles/s41597-022-01369-4/tables/3 and a figure showing how the "domain" varies over time: https://www.nature.com/articles/s41597-022-01369-4/figures/2 (note the shift from Nobility and Religious in 500–1000 to Sports and Culture post-1950).