This project turns the Meertens Dutch Family Name Database into structured, machine-readable data. The database contains about 320 000 last names that were recorded in a 2007 census as well as in a 1947 census.
A basic list is available in multiple formats:
Format |
---|
CSV |
JSON |
Fusion table |
.lst (Alphabetic list of natural names) |
.lst (names with frequency >= 5) |
.lst (names with frequency < 5) |
no. | name | count in 2007 |
---|---|---|
1 | de Jong | 83782 |
2 | Jansen | 73533 |
3 | de Vries | 71065 |
4 | van den Berg | 57377 |
5 | van Dijk | 56408 |
6 | Bakker | 55273 |
7 | Janssen | 54040 |
8 | Visser | 49525 |
9 | Smit | 42275 |
10 | Meijer | 38472 |
11 | de Boer | 38191 |
12 | Mulder | 36207 |
13 | de Groot | 36032 |
14 | Bos | 35402 |
15 | Vos | 30279 |
16 | Peters | 30106 |
17 | Hendriks | 29492 |
18 | Dekker | 27946 |
19 | van Leeuwen | 27819 |
20 | Brouwer | 25419 |
21 | de Wit | 24055 |
22 | Dijkstra | 23510 |
23 | Smits | 23205 |
24 | de Graaf | 21004 |
25 | van der Meer | 20591 |
26 | Kok | 20325 |
27 | Jacobs | 20148 |
28 | van der Linden | 20132 |
29 | Vermeulen | 20110 |
30 | de Haan | 20011 |
31 | van den Heuvel | 19899 |
32 | van den Broek | 18447 |
33 | van der Veen | 18366 |
34 | de Bruin | 17593 |
35 | Schouten | 17147 |
36 | van Beek | 16708 |
37 | van der Heijden | 16663 |
38 | de Bruijn | 16562 |
39 | Willems | 16508 |
40 | van Vliet | 16346 |
41 | Maas | 15620 |
42 | Hoekstra | 15613 |
43 | Verhoeven | 15525 |
44 | Koster | 15346 |
45 | van Dam | 15288 |
46 | Prins | 14894 |
47 | Huisman | 14682 |
48 | Blom | 14679 |
49 | Peeters | 14054 |
50 | de Jonge | 13989 |
We scrape the Meertens website to generate a CSV file that contains the family name, the number of times it was counted in 2007, and the name's lemma: the 'base' form of a name that has multiple variants. For example, `Jansen` is the lemma for both `Janßen` and `Jansen`.
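To illustrate what the lemma column is useful for, here is a minimal sketch that groups name variants under their shared lemma. The sample rows are copied from examples in this README; the function name `group_by_lemma` is hypothetical and not part of the project's code.

```python
from collections import defaultdict

# Sample (name, lemma) pairs taken from examples in this README.
rows = [
    ("Jansen", "Jansen"),
    ("Janßen", "Jansen"),
    ("Trompetter", "Trompetter"),
]


def group_by_lemma(pairs):
    """Map each lemma to the list of name variants that share it."""
    variants = defaultdict(list)
    for name, lemma in pairs:
        variants[lemma].append(name)
    return dict(variants)


groups = group_by_lemma(rows)
print(groups["Jansen"])
```

Both spellings end up under the single key `Jansen`, which makes it possible to aggregate counts per lemma rather than per spelling.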
We then apply some formatting. Mainly, we add a column for the 'natural name', which transforms a name such as `Veld, in 't` to `in 't Veld`.
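The transformation can be sketched roughly as follows. This is an illustration, not the project's actual implementation: it assumes the prefix is whatever follows the last `", "` in the database name, which matches the examples in this README. The function name `to_natural_name` is hypothetical.

```python
def to_natural_name(db_name: str) -> str:
    """Move the part after the last ', ' to the front of the name.

    Names without a ', ' separator are returned unchanged.
    """
    head, sep, tail = db_name.rpartition(", ")
    if not sep:
        return db_name
    return f"{tail} {head}"


print(to_natural_name("Veld, in 't"))
print(to_natural_name("Jansen"))
```

Splitting on the last separator rather than the first keeps names that contain a comma of their own intact, such as `Oosten, zich noemende Heijkoop, van 't`.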
Note that the list still contains some strange cases, such as `in 'tVeld` (missing space) and `van 0s` (with the digit 0 instead of the letter O). We do not attempt to correct these.
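If you want to flag such cases in your own processing, a heuristic check along these lines might help. The two patterns below only cover the anomalies named above and are assumptions for illustration; the project itself performs no such check, and `find_anomalies` is a hypothetical name.

```python
import re

# Heuristic patterns for the two anomalies described above (not exhaustive).
DIGIT = re.compile(r"\d")
MISSING_SPACE = re.compile(r"'t(?=[A-Z])")  # 't directly followed by a capital


def find_anomalies(name: str) -> list[str]:
    """Return a list of human-readable issues found in a name."""
    issues = []
    if DIGIT.search(name):
        issues.append("contains digit")
    if MISSING_SPACE.search(name):
        issues.append("missing space after 't")
    return issues


for candidate in ["in 'tVeld", "van 0s", "in 't Veld"]:
    print(candidate, find_anomalies(candidate))
```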
Below are some typical and atypical examples of rows you'd find in the `family_names_in_the_netherlands_with_natural_name.csv` file.
natural name | meertens db name | href | count in 2007 | lemma |
---|---|---|---|---|
Jansen | Jansen | link | 73.533 | Jansen |
Janßen | Janßen | link | < 5 | Jansen |
Trompeter | Trompeter | link | 0 | Trompeter |
Trompetter | Trompetter | link | 457 | Trompetter |
Trompper | Trompper | link | 37 | Trompper |
Trompslager | Trompslager | link | 0 | Trompslager |
Trompé | Trompé | link | 6 | Trompe (é) |
Van 't Veld | Veld, Van 't | link | < 5 | Veld, van 't |
van 't Veld | Veld, van 't | link | 431 | Veld, van 't |
van 't Veldt | Veldt, van 't | link | 37 | Veldt, van 't |
van 't Velt | Velt, van 't | link | 17 | Velt, van 't |
van 't Oosten, zich noemende Heijkoop | Oosten, zich noemende Heijkoop, van 't | link | 6 | |
Prinses der Nederlan Hare Koninklijke Hoogheid Máxima | Hare Koninklijke Hoogheid Máxima, Prinses der Nederlan | link | 0 | Hare Koninklijke Hoogheid Máxima, Prinses der Nederlanden, Prinses van Oranje-Nassau, Mevrouw van Amsberg |
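As the rows above show, the count column is not always a plain integer: it can contain `< 5` or a value such as `73.533`. The sketch below shows one way to normalize it. It assumes the dot is a thousands separator (consistent with the count of 73533 listed for Jansen in the top-50 table) and represents `< 5` as `None`, since the exact value is not given. The function name `parse_count` is hypothetical.

```python
from typing import Optional


def parse_count(raw: str) -> Optional[int]:
    """Convert a 'count in 2007' cell to an int.

    Returns None for values such as '< 5', where the exact count is unknown.
    Assumes '.' is used as a thousands separator.
    """
    text = raw.strip()
    if text.startswith("<"):
        return None
    return int(text.replace(".", ""))


for cell in ["73.533", "< 5", "0", "457"]:
    print(repr(cell), parse_count(cell))
```

Returning `None` instead of a guessed number keeps the distinction between a count of zero and a count that is merely withheld.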
Code is available under the MIT License.
Data is available under the CC-0 License.
Source | URL |
---|---|
Dutch name (Wikipedia) | https://en.wikipedia.org/wiki/Dutch_name |
Meertens Dutch Family Name Database | http://www.meertens.knaw.nl/nfb/ |
Maarten Trompper