Names in the USA (1880-2025)

An exploration of baby name trends in the USA from 1880 to 2025

1 Introduction

Baby name rankings are very popular on the internet and on social media. A typical ranking is the top baby names for each gender in the USA in 2025 (source):

Ranking	Male	Female
1	Liam	Olivia
2	Noah	Charlotte
3	Oliver	Emma
4	Theodore	Amelia
5	Henry	Sophia
6	James	Mia
7	Elijah	Isabella
8	Mateo	Evelyn
9	William	Sofia
10	Lucas	Eliana

What these rankings often don’t show are the numbers behind them: how many babies were given each name in 2025, and how does that compare to previous years?

Luckily, the USA Social Security Administration (SSA) releases the full dataset of baby names going back to 1880. I took this data, grouped the names by ranking, summed their counts for each year, and plotted the groups against each other. See the above graph. (The rankings are inclusive. For example, Top 10 also includes Top 1.)
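The inclusive grouping can be sketched in a few lines of Julia. This is a minimal illustration rather than the code behind the graph: the helper name `topn_total` and the toy numbers are made up, and it assumes the ranks for one gender in one year have already been computed.

```julia
# Hypothetical helper: births covered by all names ranked n or better.
# counts[i] and ranks[i] describe the same name (one gender, one year).
topn_total(counts, ranks, n) = sum(counts[ranks .<= n])

toy_counts = [500, 400, 400, 100, 50]   # made-up numbers, not SSA data
toy_ranks  = [1, 2, 2, 3, 4]            # tied counts share a rank

# Top 1, Top 2 and Top 3; each group includes the groups before it
totals = [topn_total(toy_counts, toy_ranks, n) for n in (1, 2, 3)]
```

Because the groups are inclusive, the totals can only grow with N. Here Top 2 covers three names, since two of them are tied.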

From this graph we can see that the popular names are getting less popular, while the number of births is still in line with what it was 50 years ago. This is something that plain rankings obscure.

If “Liam” or “Olivia” were ranked in previous years with the same frequency as 2025, they would be ranked noticeably lower than #1. For example, in 2000 “Liam” would have been ranked 11th and “Olivia” would have been ranked 12th. In 1950, “Liam” would have been ranked 19th and “Olivia” 27th. But names are more spread out these days, so their frequencies are sufficient to be ranked #1 in 2025.
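One way to make this comparison is sketched below, assuming "frequency" means a name's share of that year's births for its gender. The helper `hypothetical_rank` and all the numbers are invented for illustration and are not taken from the SSA data.

```julia
# Hypothetical helper: the dense rank a name would get in a year if it
# made up `share` of that year's births. `counts` holds that year's
# counts for one gender.
function hypothetical_rank(counts::Vector{Int}, share::Float64)
    total = sum(counts)
    # every distinct count with a strictly larger share sits above the name
    higher = unique(filter(c -> c / total > share, counts))
    return length(higher) + 1
end

toy_recent  = vcat([10, 9, 8], fill(1, 73))   # made-up; total 100, top share 10%
toy_earlier = [30, 25, 15, 10, 10, 10]        # made-up; total 100, more concentrated

share = toy_recent[1] / sum(toy_recent)
rank_recent  = hypothetical_rank(toy_recent, share)    # rank in its own year
rank_earlier = hypothetical_rank(toy_earlier, share)   # rank in the earlier year
```

In this toy setup a 10% share is enough for #1 in the spread-out year but only #4 in the concentrated one, which is the same effect described above for "Liam" and "Olivia".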

The remaining sections present further charts and insights derived from the data.

These are the top level findings:

There has been an almost 20× increase in the number of births each year from 146 years ago.
- The 5-year average increased 18.6× from 215,000 births in 1885 to 4.02 million in 2025.
These days there are more than 15× as many names to choose from as there were 146 years ago.
- From 1000 names each for boys and girls in 1880, there are now more than 17,000 unique names for girls and more than 14,000 unique names for boys each year.
However, given all this choice, the names are still heavily skewed towards a relatively small subset.
- In 2025, the top 100 names were given to more than 35% of newborns.
- Over the whole 146 years, the top 100 names account for 46% of all names given.
That said, the top ranked names are less popular both absolutely and relatively since the 1900s, and this trend is continuing.
- The frequency of the top 100 male names has decreased absolutely by 2.5×, from covering 1.62 million births in 1956 to 650,000 births in 2025. Relatively they decreased by more than half, now accounting for 38% of all male births from a peak of 81% in 1880.
- The frequency of the top 100 female names has decreased absolutely by 2.5×, from covering 1.31 million births in 1957 to 514,000 births in 2025. Relatively they decreased by more than half, now accounting for 31% of all female births from a peak of 77% in 1880.
Girl names are consistently slightly more diverse than boy names.
- Over the whole period there were on average 40% more girl names than boy names each year. Over the last 5 years, there were 24% more girl names on average.
- Over the whole period the top 100 girl names accounted for a 10% smaller share of girls than the top 100 boy names did of boys. Over the last 5 years the top 100 girl names have accounted for 7% less on average.

2 Methodology

2.1 Data

The dataset used in this article is the “National data” released in 2026 by the USA Social Security Administration (SSA) at SSA: Beyond the Top 1000 Names. It has the following limitations:

- It does not account for every US citizen, because not every US citizen has a social security number. The social security number was only introduced in 1936, so the data before then (from 1880 to 1935) is incomplete. Non-citizens are not included.
- Only names with at least 5 births per year are reported. This is to protect identities.

The dataset consists of 146 CSV files, one for each year from 1880 to 2025. Each CSV file has three columns: name, gender (M or F) and frequency. The rows are ordered by gender (F then M), then by descending frequency, and then alphabetically for tied frequencies.

2.2 Dense Ranking

I used a dense ranking. This means that identical counts are given the same rank and there are no gaps in the rankings, so the top 100 ranks can cover more than 100 names in a given year. Ties within the top 100 are rare, but they become more common as the counts get lower. At the lowest counts there can be up to 2000 names sharing a single rank.

No single year exceeded a dense ranking of 1000 per gender despite some years having more than 20,000 unique names for a single gender.
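To make the idea concrete, here is a small dense ranking written in plain Julia. It is only a sketch for illustration: the function name `dense_rank_desc` and the counts are made up, and the snippets in the Code section use a library `denserank` function instead.

```julia
# Dense ranking: equal counts share a rank and the ranks have no gaps.
function dense_rank_desc(counts::Vector{Int})
    distinct = sort(unique(counts); rev=true)   # distinct counts, highest first
    position = Dict(c => i for (i, c) in enumerate(distinct))
    return [position[c] for c in counts]
end

toy_counts = [120, 95, 95, 40, 5, 5, 5]   # made-up numbers, not SSA data
ranks = dense_rank_desc(toy_counts)       # [1, 2, 2, 3, 4, 4, 4]
```

Seven names end up with only four ranks. This is why the highest rank in a year can stay far below the number of unique names, especially with many ties at the lowest counts.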

2.3 Code

I used Julia to analyse the data. I will not present the full code here, but here are some snippets.

Loading a single file and transforming:

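using CSV, DataFrames
using StatsBase # provides denserank
filepath = joinpath(data_dir, "yob2025.txt") # data_dir: folder containing the SSA files
# the files have no header row, so the column names are supplied here
df = CSV.read(filepath, DataFrame; header=["Name", "Gender", "Count"])
transform!(
    df,
    :Name => ByRow(first) => :FirstLetter,
    :Name => ByRow(length) => :Length,
) # name composition
# rows are already sorted by descending count within each gender,
# so the cumulative sums run from the most to the least popular name
transform!(groupby(df, :Gender), 
    :Count => cumsum => :CumulativeCount,
    :Count => (x -> x ./ sum(x)) => :Frequency,
    :Count => (x -> denserank(x, rev=true)) => :Rank,
) # gender ranks
transform!(groupby(df, :Gender), :Frequency => cumsum => :CumulativeFrequency)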

Filtering on gender:

m_df = filter(:Gender => ==("M"), df);
f_df = filter(:Gender => ==("F"), df);

Quantiles:

idx = something(
    findfirst(m_df.CumulativeFrequency .>= 0.5),
    nrow(m_df)
) # row index where the cumulative frequency first reaches 50% (median); its rank is m_df.Rank[idx]

Loading multiple files and joining into one large dataframe:

using CSV, DataFrames
using Parquet2
filepath = joinpath(data_dir, "yob1880.txt")
df = CSV.read(filepath, DataFrame; header=["Name", "Gender", "1880"])
for year in 1881:2025
    print("$(year), ") 
    filepath_next = joinpath(data_dir, "yob$year.txt")
    next_df = CSV.read(filepath_next, DataFrame; header=["Name", "Gender", "$year"])
    df = outerjoin(df, next_df, on=[:Name, :Gender])
end
year_matrix = df[:, string.(1880:2025)]
df.Total = sum(eachcol(coalesce.(year_matrix, 0))) # total births per name over all years
df.Count = sum(eachcol(.!ismissing.(year_matrix))) # number of years in which the name appears
size(df) # (117820, 150)
Parquet2.writefile("names_ssa_1880-2025.parquet", df)

Transforming the joint dataframe:

years = string.(1880:2025) # names of the yearly count columns
transform!(groupby(df, :Gender), 
    :Total => (x -> denserank(x, rev=true)) => :TotalRank,
) # gender total ranks
transform!(groupby(df, :Gender),
    [y => (x -> denserank(x, rev=true)) => "Rank$y" for y in years]...
) # gender yearly ranks
transform!(df,
    :Name => ByRow(length) => :Length,
    :Name => ByRow(first) => :FirstLetter,
); # name composition

Export data to JSON:

using JSON
out = Dict{String, Any}("years"=> 1880:2025)
names_to_save = Dict("M"=> ["John"], "F" => ["Mary"])
for gender in ["M", "F"]
    out[gender] = Dict{String, Any}()
    gender_df = gender == "M" ? m_df : f_df
    for name in names_to_save[gender]
        idx = findfirst(gender_df.Name .== name)
        out[gender][name] = Dict(
            "count" => Vector(gender_df[idx, years]),
        )
    end
end
JSON.json("output/names.json", out)

3 Top Names

The “Top 1” dataset from the Top N graph can be decomposed into the top names each year. This produces the following graphs:

These graphs show how relatively “unpopular” the most popular names are now compared to the popular names of the 1900s.

The number of names that have reached the top spot is very small. There are only 19 in total: 8 boy names and 11 girl names. Mary alone was the #1 girls’ name for 76 years, more than half the total period from 1880 to 2025.

Here is how these top yearly names are ranked across all 146 years:

Rank	M	Total Rank	F	Total Rank
1	James	1	Mary	1
2	John	2	Jennifer	4
3	Robert	3	Linda	5
4	Michael	4	Jessica	11
5	David	6	Lisa	16
6	Jacob	29	Emily	18
7	Noah	63	Ashley	20
8	Liam	97	Emma	28
9			Olivia	51
10			Sophia	82
11			Isabella	86

There are gaps here because many popular names have never been ranked #1. For example, “Elizabeth” is the overall #2 female name, but there was never a year it was ranked #1.

4 Unique Names

The number of unique names has grown from 2000 names in 1880 to over 31,000 names in 2025. Every year has seen names added to and removed from the list, with up to 4,000 removed and added each year in the 2020s. Overall there are 117,820 unique names in the dataset. Of these, 31,227 (26.5%) are represented in 2025.

The above graph shows how skewed the dataset is, with the 75% quantile line (the most popular names that together account for 75% of all births) hovering at around 4% of all names.
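A quantile line of this kind can be computed roughly as follows. This is a sketch under the assumption that the line shows the fraction of names needed to cover a given share of births; the helper `name_share_for_quantile` and the toy counts are invented for illustration.

```julia
# Hypothetical helper: fraction of names needed to cover a share q of births.
# Assumes 0 < q <= 1 and a non-empty vector of counts.
function name_share_for_quantile(counts::Vector{Int}, q::Float64)
    sorted = sort(counts; rev=true)     # most popular names first
    cumulative = cumsum(sorted)         # running total of births
    needed = findfirst(>=(q * sum(sorted)), cumulative)
    return needed / length(sorted)
end

toy_counts = [600, 150, 100, 50, 40, 20, 15, 10, 10, 5]   # made-up, total 1000
share = name_share_for_quantile(toy_counts, 0.75)
```

In this toy year the two most popular names already cover 75% of births, so the line would sit at 20% of all names. The real data is far more skewed than that.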

Girl names are consistently slightly more diverse than boy names. Over the whole period there were on average 40% more girl names than boy names each year. Over the last 5 years, there were 24% more girl names on average.

Some insight can be gained by looking at the ratio of the total count each year (the total number of births) to the count of unique names each year, bearing in mind the heavy data skew. (At the extremes, there are 100,000 births per name for the top names, and 5 births per name for the bottom names.) From this graph we can see that names were most concentrated in the 1950s, with about 470 births per name for boys and 300 births per name for girls. These ratios have since come down by a factor of 3 to 4, and now sit at 120 for boys and 91 for girls. This implies a greater diversity in naming in recent years.
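The ratio itself is a plain average, which is why the skew matters. The sketch below uses made-up counts and a hypothetical helper `births_per_name` to show how far the average can sit from the typical name.

```julia
using Statistics  # standard library, used here only for median

# Hypothetical helper: total births divided by the number of unique names.
births_per_name(counts) = sum(counts) / length(counts)

toy_counts = [1000, 200, 50, 10, 5, 5]   # made-up numbers, not SSA data
ratio   = births_per_name(toy_counts)    # pulled up by the most popular name
typical = median(toy_counts)             # what a "middle" name looks like
```

Here the ratio is above 200 births per name even though half of the names have 10 births or fewer, so the ratio is better read as a measure of concentration than as a typical count.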

5 Composition

We can also investigate the composition of the names in the dataset. Here I do so for the first letter and also for the name length.

Over the whole period the most popular first letters for boy names were “A” (Anthony, Andrew, Alexander) and “J” (James, John, Joseph), while for girl names they were also “A” (Anna, Ashley, Amanda), followed by “S” (Susan, Sarah, Sandra).

If we were to take a random person born in any year of the period, a man’s name would most likely start with a “J”, while a woman’s would most likely start with an “M” (Mary, Margaret, Michelle).
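A letter can lead in one of these views and not the other if the first view counts each distinct name once and the second weights each name by its births. That reading is an assumption on my part here, and the sketch below uses made-up names and counts with a hypothetical helper `tally_first_letters` to show the difference.

```julia
# Made-up (name, count) pairs for illustration; not SSA data.
toy = [("James", 900), ("John", 800), ("Aaron", 50), ("Adam", 40), ("Alan", 30)]

# Tally by first letter, either counting each name once (weighted=false)
# or weighting each name by its number of births (weighted=true).
function tally_first_letters(rows; weighted::Bool)
    tally = Dict{Char, Int}()
    for (name, count) in rows
        letter = first(name)
        tally[letter] = get(tally, letter, 0) + (weighted ? count : 1)
    end
    return tally
end

by_names  = tally_first_letters(toy; weighted=false)   # 'A' => 3, 'J' => 2
by_births = tally_first_letters(toy; weighted=true)    # 'A' => 120, 'J' => 1700
```

"A" wins on the number of distinct names, while "J" wins by a wide margin once each name is weighted by how many babies received it.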

The names vary in length from 2 letters (Al, Ty, Jo, Lu) to 15. (Many of the 15-letter names look like concatenations of shorter names and might be mistakes, e.g. Muhammadibrahim, Christopherjohn, Mariadelosangel.) Most names are 5 to 8 letters long.

6 Bonus

My name is one of the many rare names. It is a Hebrew name, spelt as ליאור and transliterated as “Lior” or “Leor”. It is gender neutral. There is also a female only version, ליאורה, which is transliterated as “Liora” or “Leora”.

The data shows that “Leora” has been used in the USA since at least 1880, but “Leor” was only first used in 1979. It is ever so slightly gaining in popularity, with 93 baby boys and 473 girls given a variation of the name in 2025. For girls, the “Liora” spelling recently overtook “Leora” in popularity.
