The best bit: They recursively reference the paper to provide proof that too many parents choose the same common names:
> For instance, a parent might anticipate the name “Kate” would be a pleasantly traditional yet unique name with only moderate popularity. They would be wrong [6].
Sketch of a more complete solution, excluding shortened forms and very foreign ones. Alternatives are in rough order of frequency.
vowel 0: E, Ye, Je, Ai. Optional and rare; Ai in particular is very rare.
consonant 1: C, K, G, Q. Mandatory; G and Q are rare.
vowel 1: a, aa, ai; optional h or gh. Mandatory. A few shortened forms use i instead.
consonant 2: t, tt, d. Almost mandatory, but a few r-centric variants lack it. There also seem to be a few
vowel 2: a, e. Optional, only valid if consonant 2 exists. In shortened forms, also i, ie, or y; this is the end.
consonant 3: r, l. Optional. Sometimes L starts a new word instead.
vowel 3: i, y, ee, ie, ii, e if no consonant 2, plus several rare vowel sequences. Almost mandatory (assuming consonant 3), but a few rare variants pack the r right next to the n.
consonant 4: n, nn, nh. Optional.
vowel 4: e, a, ey. Optional; ey is rare.
Some languages shove an s, c, x, t, k somewhere too (some of these are probably language-specific diminutives, but a few might be phoneme drift instead) ...
"Kaylee" and its variant "Kayla" should probably not be counted (despite almost fitting the pattern) since that's a compound of "Kay", adding the additional "Leigh" name.
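For anyone who wants to play with it, the slot list above translates fairly directly into a regular expression. This is only a rough sketch, with assumptions of my own: I allow an optional "h" after consonant 2 so that the "th" spellings match (the list does not spell that out), I treat consonant 2 as mandatory (so the r-centric variants are missed), I drop the "e if no consonant 2" case from vowel 3, and I ignore the stray s/c/x/t/k insertions. The names `KATHERINE_LIKE` and `looks_like_katherine` are made up for illustration.

```python
import re

# One group per slot in the sketch above; matching is case-insensitive.
KATHERINE_LIKE = re.compile(
    r"""
    (?:e|ye|je|ai)?                  # vowel 0: optional and rare
    [ckgq]                           # consonant 1: mandatory
    (?:aa|ai|a)(?:gh|h)?             # vowel 1: mandatory, optional h or gh
    (?:tt|t|d)h?                     # consonant 2 (the trailing h is my assumption)
    [ae]?                            # vowel 2: optional
    (?:[rl](?:ee|ie|ii|i|y)?)?       # consonant 3, then vowel 3
    (?:nn|nh|n)?                     # consonant 4: optional
    (?:ey|e|a)?                      # vowel 4: optional
    """,
    re.IGNORECASE | re.VERBOSE,
)


def looks_like_katherine(name: str) -> bool:
    """Return True if the whole name fits the slot pattern."""
    return KATHERINE_LIKE.fullmatch(name) is not None


for candidate in ["Katherine", "Kathryn", "Ekaterina", "Caitlin", "Kaylee", "Karen"]:
    print(candidate, looks_like_katherine(candidate))
```

Because every slot after consonant 2 is optional, short forms like "Kate" also match, so excluding shortened forms would need a separate check.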
It goes beyond that. Three of the authors have East Asian last names.
I understand that many people from East Asia have a given name in their native language and an English-sounding name that they often choose themselves.
If those three authors did choose their English names, then they too fall into the same category as the parents who chose a variation of Katherine.
This phenomenon also occurs in the transgender community; people put a lot of thought and intention into choosing their new names only to wind up surrounded by other people who also landed on the same name, often for similar reasons. There's even a whole subreddit specifically for transfeminine people who are named some variation of Lily: https://www.reddit.com/r/LilyIsTrans/
If they did indeed pick those names while traveling to the US, possibly as adults, I think they’d actually be really interesting cases. They’d be choosing names later than their peers, so they could see how the name game played out for their peers. Of course, they could also be peers of the parents of the other authors.
I’ve also met some folks who had English names that phonetically sounded similar to their original names. I wonder if there’s an east Asian first name that sounds like any of the versions of Katherine.
From my anecdotal experience, origin is a big factor: of the adults I have known to choose a name for themselves, Americans overwhelmingly pick unusual or ornamented names, whereas the other group (typically Asian, whose first language has a different set of basic sounds) pick stereotypically common and plain/short names. I don't really know anyone's specific thought process on the matter though; maybe I'll have to start asking for curiosity's sake.
Anecdotal, of course, but the goal can be quite different for both groups.
English people choosing English baby names often want them to be relatively unique or stand out in some way. At the very least, they don't want them to be _overly_ common.
Many people I've talked to who have chosen a name after immigrating are kind of aiming for the opposite. They want to fit in. They already feel that they stand out, and generally try to minimize that.
There's also the fact that for some groups, they're choosing the name at a time when they may not be very familiar with English names or culture, and may not have much in the way of local resources they can or feel comfortable with drawing on, so the main indication they have that they haven't chosen some absurd name is "hey, lots of people here are named that".
Not sure it's still the same scenario with today's far more connected world, but even 20 years ago you could guess with some accuracy that someone was east-Asian from their "English" name being ~50 years out of date, popularity wise.
(Para)bosons, (para)fermions, quons and other beasts in the menagerie of particle statistics, by O.W. Greenberg, D.M Greenberger, T.V. Greenbergest https://arxiv.org/abs/hep-ph/9306225 [Wally Greenberg told me that T.V. stands for 'the very']
This reminds me that I went to nautical school with someone whose last name was Schiff (German for ship) and he said that was exactly the reason he chose to go to sea.
Also remember someone a year ahead of us whose name was Dory (a small rowboat).
I'm reminded of the 'Tom Formal' that took place whilst I was at university:
> Last night, February 3rd 2011, saw 100 students and fellows all sharing the name of Tom, gather together in a record-breaking charity event in Sidney Sussex dining hall. [...] For £20, Cambridge students with the name "Tom, Thomas, Tommy (or another legitimate variation)" were able to attend a black-tie, three-course formal dinner
This is the catch: you're not naming a baby: you're naming a person; they just so happen to be a baby at the beginning when you're enjoying an early appreciation for the mel lif lu ous ness of the name ... but they're going to be an adult the vast majority of their life. Ergo, that ought to be the usage parents plan for, rather than some cute, endearing name "fitting" for an infant (for some definition of fitting).
In most cases where babies get a "cutesy" name, hopefully the parents have the self-awareness to give them the corresponding adult version as their legal name, or at least a reasonable alternative as a middle name— both give the person easier options if they want to change it up during life transitions like entering high school or going away to university.
For example, Gwyneth Paltrow's daughter is Apple Blythe Alison Martin, and she's stuck with it, being Apple Martin professionally— but it's good she had the off-ramp to be the much more conventional Blythe or Alison if she'd wanted it.
I think in the case of female names there’s so much cyclic effect to it that it’s really hard to say. I could just as easily imagine a few decades ago people saying the same thing about Melissa, Jessica, Amanda, Jennifer, Lauren, Ashley, and all the other names that were super trendy for millennial girls.
Now Ashley is 35 and pushing little Kaylee in the pram, or maybe she’s a business executive — either way, our perception has shifted that “Ashley” is now a woman not a girl. Just like how Gertrude, Beatrice, and Florence are more likely to be playing dolls at the park than bridge at the nursing home.
"we create a model which is not only tractable and clean, but also perfectly captures the real world. We then extend our investigation with numerical experiments, as well as analysis of large language model tools."
Whether family names are more differentiated depends on where you live.
The USA has a wide variety, but there are also places like Vietnam where only a handful of family names are in common use and more than 30% of people are Nguyens.
I think that makes sense from both an organizational and a cultural perspective. Context usually supplies whatever information is needed for personal names, so less disambiguation is required, and they are used much more often, so some simplicity is a useful and natural consequence of human nature. Family names are used less frequently and with less context, and frankly are how people distinguish their group from others. So yeah, I think it checks out.
> Because this paper was written in 2024, we include an obligatory section involving generative AI and LLMs.
> Another ERA is the Mayfly Parenthood Assumption, in which all parents perish immediately upon naming their child, which makes the math substantially easier.
> It is well-known that parents are always in complete agreement over the name they would prefer to pick for their newborn child.
I do think the best names are ones with the most meaning.
You name a kid Isaac, you could be naming him after Isaac Newton. It puts something on to him.
If you name a kid William, maybe you hope he will be the next Shakespeare.
Simply by naming someone something, you imprint something on to them. The history and power of a culture.
Yet for this very reason, especially when people see the culture as dark, they choose unique names, names that say you can be who you want to be.
Though I think I still prefer old names, looking at names of people who have done something, and then hoping to do something similar.
I think this is kind of why a convert to an orthodox Christianity, from some heterodoxy, or atheism, or from the religion of the “infidels”, takes a new name in baptism. They hope to live up to whoever bore it. You might take the name Theresa at baptism with a sense of obligation to love the lowly like Mother Teresa, and so on.
I named my kid Dexter. Despite my best efforts he won't wear lab coats or speak with an accent. When I try he just asks me to go buy some plastic drop cloths and goes back to sharpening the kitchen knives.
> You name a kid Isaac, you could be naming him after Isaac Newton. It puts something on to him.
My son's name. I was thinking of Isaac Asimov and I had Isaac Newton in mind as well. I know an SF writer who I worked with who named his sons Arthur and Robert, after famous SF writers obviously in his case.
>LEVITT: Yeah, one of the most predictable patterns when it comes to names is that almost every name that becomes popular starts out as a high-class name or a high-education name. So in these California data we had we could see the education level of the parents. And even the names that eventually become the, quote, “trashiest” kinds of names, so the Tiffanys and the Brittanys, and I’ll probably get myself in trouble, and the Caitlyns and things like that start at the top of the income distribution, and over the course of 20 or 30 or 40 years they migrate their way down, becoming more and more popular among the less-educated set.
What you see with Mabel in the paper is a fad name coming back. Hipsters bring it back, then upper class parents with hipster pretensions popularize it, then it spreads to the general population. The trick is to pick a name that sounds outdated or obscure but will come into popularity within the child's lifetime. If you wanted to do that now, you would pick something like Linda or Iris.
I would also be interested to see analysis on syllable counts. When will the boomer 2 syllable names come back into style?
I worked with a Harrison (born in the 70s) who commented that the name had a bathtub curve - most people with the name were either really old or really young (he knew more Harrisons in his toddler's preschool class than his own age).
Compare with a name like Michael ( https://www.wolframalpha.com/input?i=Michael ), which while it has fallen out of favor with newer names, is still the most common male name in the population - though the average age is 48 years.
My first thought is that children born in the 70s named "Harrison" owe their names to Harrison Ford, at that time wildly popular for Star Wars and Indiana Jones. "Mabel" may owe some popularity thanks to Selena Gomez's character in "Only Murders In The Building."
One of the things to take note of between those two charts is that the most popular names are less popular. Parents are choosing distinctive names rather than common names.
In 1970, the top five male names represent 2.5 million births. Michael (the most common name) was 707,377 of them.
In 2010, the most common name was Noah with 183,258 births. In 1970, a name with that much popularity would be #20.5 between Thomas and Timothy.
That 2.5 million again... in 2010s that's 19 names.
... Another visualization of the data. https://namerology.com/baby-name-grapher/ This looks at the top 200 names for boys and girls over time. However, the downward slope isn't fewer overall births but rather the reduction of popularity of the most common names.
I've always found it somewhat amusing that, at least for my age range, I have a given name that's not unique or obviously weird but pretty uncommon. At the last place I worked, that was guaranteed to--on the odd conference call--have one of the two of us sharing a given name periodically be "Why the hell is someone asking me about $THING_I_KNOW_NOTHING_ABOUT ?" While both my first and last names are northern European, they are also from different countries so as far as I know I'm unique among living people with an Internet presence which is presumably better than sharing a name with someone who is widely hated for some reason or other.
A paper by Jinseok Kim, Jenna Kim, and Jinmo Kim: "Effect of Chinese characters on machine learning for Chinese author name disambiguation: A counterfactual evaluation". Obviously the authors don't have Chinese names but I would imagine personally having names that need disambiguating might spur one's interest in this research area. (And they do mention in the paper that it's also an issue for Korean names.)
Yeah, it's interesting how the practice of only listing surnames works well in cultures where people have long and distinct surnames (and often common first names) and is just silly in cultures where surnames are short and common and most of the information content of the name is in the first name.
> The above model contains several Extremely Reasonable Assumptions (ERAs). [...] Another ERA is the Mayfly Parenthood Assumption, in which all parents perish immediately upon naming their child, which makes the math substantially easier.
I lay awake at night thinking about the baby naming problem.
I want my child to have a name in the sweet spot. Not too common, not too unique, and, crucially, not a name that is popular for only a brief period so that everyone will know about how old they are just by their given name[0].
But people thinking along these lines inevitably gravitate to the same small handful of names, causing the "too popular for a brief period of time" effect against their will. I've already failed once; my cat is named Olivia, the popular girl's name of the decade, apparently.
I was never going to have kids, but if I did, I had rules for naming.
1. Name them what you're going to call them. If you want a "Kate", don't name them "Katherine". If you want a "Sam", don't name them Samuel.
2. Don't give them the same first name as a close relative.
3. Don't give them a unique spelling of a common name. You're just giving them a life-long annoyance of having to spell their name out any time they're telling someone their name vocally.
My parents broke the first two rules when they named me and it created headaches as I transitioned into adulthood. It even caused problems when I interned at Intel, where the relative I share a name with had been working for 15 years. I got e-mails that were supposed to go to him, and vice-versa.
I’m glad my parents didn’t follow rule 1. They wanted to call me Kathy. It took me until grad school to convince everyone in my life that I was Katherine and absolutely not a Kathy, tyvm. If I’d had to run it by a judge, I’d have been pretty unhappy. As it was, I was grateful they gave me a classic name with lots of nickname options. (Too bad I didn’t know about the paper in time to join in.)
Yes to 2, BIG yes to 3, a “special” spelling is a curse, why do that to your kid?
I do prefer full versions for the name instead of shortenings or nicknames. I think it lets them feel freer, earlier, to switch to the full version if they like it better than the nickname or shortened version. More options.
One of my kids’ middle name is my grandmother’s first name. The other’s first name is my great-grandmother’s first name (which is also my mother’s middle name).
Why not let your child be a product of their times? Whatever name you pick it isn’t like your great grandchildren will think you picked a cool, relevant name for their grandmother.
Because I personally find my name being a product of its time annoying, I think it's reasonable to suspect that someone else would also find that annoying.
Especially given that the hypothetical person in question would get half their DNA from me and be raised by me it seems a pretty reasonable suspicion.
Despite the name it has a pretty large category of real world names. :)
I am being completely sincere. There are thousands of lists of baby names. Classic information overload. A randomizer lets you look at ten options and if you don't like those you can get another ten.
My suggestion is to hit up the Social Security Administration website: https://www.ssa.gov/oact/babynames/. Go back a century (or to some era of choice), walk the list, find one that you like that isn't even in the top X these days and you'll be fairly safe. You'll end up with a reasonable, not-ridiculously-unique name that this generation mostly doesn't have (the site has recent years and names don't usually go from unlisted to popular overnight).
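If you'd rather not walk the list by hand, the same idea is easy to script. A minimal sketch, assuming you have downloaded two per-year files and that each line has the form `Name,Sex,Count`; the function names, the cutoffs, and the tiny inline sample with made-up counts are all placeholders of mine, not anything the SSA site provides.

```python
import csv
import io


def ranked_names(csv_text: str, sex: str) -> list[str]:
    """Return names for one sex, most common first (assumes Name,Sex,Count rows)."""
    rows = []
    for row in csv.reader(io.StringIO(csv_text)):
        if row and row[1] == sex:
            rows.append((row[0], int(row[2])))
    rows.sort(key=lambda item: -item[1])
    return [name for name, _ in rows]


def revival_candidates(old_text, recent_text, sex, old_top=100, recent_top=1000):
    """Names in the old year's top `old_top` that are absent from the recent top `recent_top`."""
    old = ranked_names(old_text, sex)[:old_top]
    recent = set(ranked_names(recent_text, sex)[:recent_top])
    return [name for name in old if name not in recent]


# Made-up sample rows, only to show the call shape.
old_year = "Mary,F,100\nMabel,F,50\nJohn,M,80\n"
recent_year = "Olivia,F,90\nMary,F,10\n"
print(revival_candidates(old_year, recent_year, "F", old_top=2, recent_top=2))
```

With real files you would read each one into a string first and pick cutoffs to taste; a larger `recent_top` gives safer, more obscure candidates.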
> the popular girl's name of the decade, apparently.
Keep in mind, in the internet era it can actually be nice to have a bad-SEO common name (though that's often dependent upon surname too).
I like having a name like you describe. Its former popularity makes it a familiar name but I've only met a couple of other people with the same first name. Interestingly, combined with my similarly popular last name, there are on the order of 30 people with matching first and last names in the US.
> The above model contains several Extremely Reasonable Assumptions (ERAs). The first ERA is the very conservative assumption that there is only one gender, with all children and all names adhering to the same gender. Thus any child may be given any name, so long as it exists in the names list. Another ERA is the Mayfly Parenthood Assumption, in which all parents perish immediately upon naming their child, which makes the math substantially easier.
> Through making several Extremely Reasonable Assumptions (namely, that parents are myopic, perfectly knowledgeable agents who pick a name based solely on its “uniqueness”)
What a weird assumption. We named our daughter picking four names, starting respectively with the letters 'G', 'A', 'T' and 'C'.
Yeah, I was thinking the same thing - as a 1st time new parent you're famously not well plugged into the names currently being chosen by other parents, which is why our son Max was one of 3 in his class
I have only now realised the reason my name is fairly unique.
My father was a teacher, so he did know names people were using, and for any given name could probably think of a child he wouldn't want me to share a name with!
I definitely looked up what names people were naming their kids when choosing my kid's name. It allowed me to pick one that was not so unusual as to be seen as weird, but wasn't going to clash with everybody else out there either. It worked. He occasionally meets others with his name, but not very often. The only issue is that his name has about five different common spellings.
My name is Michael and I definitely used this data because I didn't want my kids to have very common names. As it turns out, the first person to compile this data in the US was an actuary for the Social Security Administration, also named Michael, who was trying to name his kid and wanted to know what the most common names were so that his kids wouldn't have names that were too common: https://nameberry.com/blog/most-popular-names-how-the-list-w...
But sticking to the familiar crowd-pleasing members of that sequence to make the joke stick, knowing that even on this site working in poor neglected uracil would be trying too hard
35 years ago I knocked up my soon to be wife. We picked out a name and opted for a home birth, confident that no other couples had made those same choices.
That birth month, Life magazine featured a full page spread of a home birth (ewwww); their newborn had the same name.
This event is on a list of stuff I/we came to on our own, at the same time as everyone else.
Most of us like to believe that we're not slaves to fashion but we often are.
One of my favorite examples (although there are many) is inline skating/rollerblading. It was all the rage in the early 90s or so. It's rare to see someone rollerblading today. I pick that example because it was somewhat related to tech that it took off. But there's no good reason for it to have pretty much died off.
The amusing part about inline skating is that when you go by kids they're like "wow look at that guy skating". And then they never skate themselves.
I suspect there are a few issues:
- If your parents don't skate you'll probably never get competent at it
- You need to be actually good to not injure yourself or on a pretty flat area. Rollerblades do not handle holes in pavement nearly as well as bicycles or shoes.
- Bicycles have actual utility like getting to work/places. For Covid a ton of people bought skates [1] but honestly I never saw that many more people skating then compared to before/after.
- They're pretty bulky. A bicycle can transport itself and be locked up, but if you skate somewhere you'll need a bag to store them.
I played ice hockey through college and beyond so rollerblading was pretty straightforward modulo rough surfaces and pavement that isn't flat. But I sort of agree. Even though it was popular at one point, it's something that has a learning curve for someone who hasn't, often painfully, learned activities that provide an on-ramp. Not that I became an expert but picking up inline skating as an adult was pretty easy for me.
And, yeah, it isn't an activity that has any real utility. I don't really bike (didn't learn as a kid because didn't have a real place to safely bike--narrow country roads). But might have done so if there was a real practical reason to do so.
Of course, inline skating was a popular activity at one time and it just fell out of fashion.
and this callout is funny because it's actually re-emerging as a popular hobby. so there is probably some subconscious influence that led you to call it out here.
If you combine Katherine, Catherine, Kat, Kate, Caty, Katy, Katie, and Katheryn (there are SO MANY variants, but most of them have never been popular), peak popularity for girls in the U.S. in the last century is in 1986 at only 1.8% of baby girls.
That's less popular than the single name Matthew for boys, or any one of Jessica, Ashley, Amanda, or Jennifer, in that same year. I expected it would be higher: my own sister is one of these, and I had a friend circle in my 20s that included a Katie, a Katherine (who went by Kat), a Caitlin, and a Kathryn.
Not going to lie the first thing I thought was, "Why is the John Green book on HN??" I mean, cool, but surprising. Then I read the end and it made more sense ha
My wife and I legitimately sat down and came up with a list of 50 names we each liked, and from the 98 we had totaled up, we applied a series of different filters to get to an answer over the span of several weeks.
First we each went through the list, and force removed half of them, each of us taking turns eliminating one at a time.
From the remaining 50 we rated them, and removed anything that scored under a 6 from either of us, or under 15 points total.
Then we had 20 left, that we talked through each fairly extensively. We covered etymology, popularity, age association, popular cultural associations, you name it. After that we each removed 5 more.
Once we were down to the top ten, armed with frankly far too much knowledge about these names at this point, we reranked them individually and tallied up the scores.
Two names stood head and shoulders above the rest, one scoring around a 19 total and the other scoring around a 17. Those became our daughter's first and middle names.
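For the spreadsheet-inclined, the rating step described above (drop anything under 6 from either of us or under 15 combined, then rank by total) could be sketched like this. It assumes each person scores on a 1 to 10 scale, which the totals of 19 and 17 suggest but I am not stating as fact, and the names in the example are placeholders.

```python
def shortlist(ratings, min_each=6, min_total=15):
    """Keep names both raters scored at least `min_each` with a combined
    score of at least `min_total`, highest combined score first.

    `ratings` maps a name to a (score_a, score_b) pair.
    """
    totals = {}
    for name, (score_a, score_b) in ratings.items():
        if score_a >= min_each and score_b >= min_each and score_a + score_b >= min_total:
            totals[name] = score_a + score_b
    return sorted(totals.items(), key=lambda item: -item[1])


# Placeholder names and scores, purely illustrative.
example = {
    "Alpha": (9, 10),   # kept, total 19
    "Beta": (8, 9),     # kept, total 17
    "Gamma": (5, 10),   # dropped: one rater scored it under 6
    "Delta": (7, 7),    # dropped: total under 15
}
print(shortlist(example))
```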
We never intended to have children, ended up with two boys.
The first was named by his mother's choice: First name after her father (a perfectly reasonable "Edward"), the middle name after my dad ("Leonard," which we never call him.)
The second was even more of a surprise than the first, but that meant it was my turn: "Jonathan," after a favorite character from a novel. The middle name was chosen by his big brother, to give a little sense of ownership or participation: "Adam."
Some parents try to have their cake and eat it too by altering the spelling or pronunciation of otherwise common names, thus ensuring their child both fits in and is unique.
Cheekiness aside, naming our children has been a fun, stressful, but ultimately rewarding endeavor and this paper was very on point.
Like any good Presbyterian, I named my son after the great Archibald Alexander, the progenitor of Princeton Theological Seminary. Myself, I am named after the great theologian John Calvin.
However, if I have a daughter, I will name her Britney - an anagram for Presbyterians
I was going to name my child Seven, Mickey Mantle's number, a great name for a boy or a girl. Then some friends overheard it and stole it for their baby.
This is kind of depressing: every time I make a somewhat-obscure sci-fi reference here, usually no one gets it (or it takes a very long time). But an obscure Seinfeld reference gets a full citation in 2 minutes.
In my whole life I was able to watch 20 minutes of Seinfeld. I feel that I must be an exceedingly weird person to find it absurdly boring and depressing when almost everyone loves it.
> 2 Related works: Surprisingly, no one has ever done any research on naming strategies (so long as you conveniently ignore [4, 5, 7, 8, 9, 10, 11, 13, 14, 15, 17, 18, 19, 20, 21, 23, 24, 25] and likely other work).
As a Nick born in the 70's, I thought my world was getting a bit weird as Nicks were popping up everywhere all of a sudden. Then I saw a video on the top ten boys' names from 1880-2020[1] and saw Nicholas pop up in the top ten in 1986, peak at #5 in 1990-1992, and drop off in 2004. I blame Nicolas Cage and Nick Nolte.
For people unfamiliar with common English names, all of the authors have first names similar to or derived from Katherine.