Who’s Got The Biggest One? Vocabulary, that is.

The other day, I had an interesting debate with some of my advanced students about who’s got the biggest one … vocabulary, that is.

English, naturally, I inform them, and I am immediately challenged on this. One of my students even had the audacity to claim that his own language (Polish) has a larger vocabulary.

Polish – the largest vocabulary? Nie!


First, allow me to contextualise this. These guys – who work in a top-end, IT-related industry, more details of which I cannot divulge – are seriously brainy. So brainy that they are headhunted from all over the globe, remote working being one of three benefits brought about by the Chinese flu. For the record, the other two were cheaper petrol (at least for a while) and social distancing. I don’t like getting too close to random people, so this has worked very well for me.

I digress.

I just about keep ahead of these guys in our lessons because a) I’m a native speaker with a fairly big one … vocabulary again, amigos, and b) I’m the teacher – ergo, I am always right. Now, you’re probably wondering why I used the word “ergo” instead of “so”, and the answer is because I can. I know what it means, and I use it with the same panache that dullards use the word “eclectic” – a word, incidentally, which I detest – at every opportunity. This is because they’ve heard it somewhere, and it fits nicely with their image of projected self-betterment. I’m going to come back to “ergo” in a minute, but let’s move on.

So … drumroll … who has got the biggest one?

Now I know this is going to disappoint you, but the consensus view is that it’s impossible to tell.

The writer of a fairly venerable but nonetheless informative article in The Economist countered a claim made by Stephen Fry that English wins the prize by a country mile. And if you’ve got time, it’s certainly worth a read.

If you haven’t, allow me to paraphrase some of it for you. Firstly, the only way to assess the size of the vocabulary of a given language is by counting dictionary entries. But it’s not as simple as that. First there are headwords; let’s take “run” as an example. Run, runs, ran and running are forms of the same lexeme, with run as the lemma by which they are indexed. Run is a noun and a verb; it is a form of escape; it is a physical activity; it can be a string of connected happenings, or it can be the method of scoring in cricket. The lemma is used to locate an entry but the gloves are off after that, and it is entirely up to the interpretation and the orientation of the lexicographer as to what is included and what is not. And that’s before we even mention derivatives, dialectal entries, or subentries.
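For the brainy IT types among you, here is a minimal sketch of why counting headwords and counting word forms give different totals. The word list is a toy example I made up, not data from any real dictionary, and the names `FORM_TO_LEMMA` and `group_by_lemma` are purely illustrative.

```python
from collections import defaultdict

# Hypothetical toy data: each word form mapped to the lemma it is indexed under.
FORM_TO_LEMMA = {
    "run": "run",
    "runs": "run",
    "ran": "run",
    "running": "run",
    "walk": "walk",
    "walked": "walk",
}


def group_by_lemma(form_to_lemma):
    """Collect the word forms that belong to each lemma (headword)."""
    lexemes = defaultdict(set)
    for form, lemma in form_to_lemma.items():
        lexemes[lemma].add(form)
    return dict(lexemes)


lexemes = group_by_lemma(FORM_TO_LEMMA)
form_count = len(FORM_TO_LEMMA)   # every inflected form counted separately
headword_count = len(lexemes)     # one entry per lemma

print(f"{form_count} word forms, {headword_count} headwords")
```

Same six words, two very different totals, and that is before the lexicographer has decided what to do about senses, derivatives and dialect.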

Now, I don’t want to get bogged down with this, so I’ll just tell you that if we follow the “no holds barred” approach of certain dictionaries, Korean wins hands down with a staggering aggregate of 1,100,373 words. But – and it is a very big but – this includes the dialects of both North and South Korea and is compiled as an online dictionary, so in my view this doesn’t count. I’ll tell you why.

Little Sung-ho (meaning successor and greatness) goes to school and teacher asks him to look up a certain word in his dictionary.

Little Sung-ho – why no dictionary?

‘I’ve not brought it,’ little Sung-ho replies.

‘And may I ask why?’ asks the teacher, smugly.

‘Because,’ replies Sung-ho, ‘it consists of 20,486 pages and weighs more than my father’s largest ox.’

Okay, so let’s narrow our search down a bit and select only dictionaries that a) are regularly updated and b) don’t include dialectal entries. Why not, you may ask? Simply because ‘wile cowl’ – an expression with extensive usage in the North-West of Northern Ireland – would have absolutely no meaning for a Cockney … or a Scouser … or a Scotsman. Get it?

So let’s focus our research on the above, and guess what? The Oxford English Dictionary (OED) wins by several furlongs. By the same yardstick, the Swedish language is the smallest, weighing in with a mere 20,000 headwords.

For the record, using similar criteria, here is the scoreboard relating to European languages:

English:  171,476
Polish:   140,435
French:   135,136
Russian:  130,782
German:   100,902
Spanish:   93,308

Ha! I was right. I must confess I was a little surprised that Polish was ranked in second place, so well done to my learned student: you were nearly right, but your language is still 31,041 words short of knocking English off its perch.
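Should any of my students wish to check my sums, the gap between English and each runner-up can be worked out from the scoreboard above. This is only a quick sketch: the figures are copied straight from the list, and the names `SCOREBOARD`, `leader` and `gaps` are my own invention.

```python
# Headword counts copied from the scoreboard above.
SCOREBOARD = {
    "English": 171_476,
    "Polish": 140_435,
    "French": 135_136,
    "Russian": 130_782,
    "German": 100_902,
    "Spanish": 93_308,
}

# The language with the highest count.
leader = max(SCOREBOARD, key=SCOREBOARD.get)

# How many words each other language would need to draw level with the leader.
gaps = {
    language: SCOREBOARD[leader] - count
    for language, count in SCOREBOARD.items()
    if language != leader
}

for language, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{language} trails {leader} by {gap:,} words")
```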

Another thing about the methodology used by the OED is that constant reviewing allows the inclusion of new words and the ejection of redundant ones, and there are currently 47,156 of the latter.

Language is constantly evolving. When Chaucer finished writing The Canterbury Tales – a tome often credited with being the cornerstone of the English language, merging Anglo-Saxon with certain derivatives of French, ancient Greek and Latin – he didn’t sit back and say: ‘there you are: English – sorted.’ Well, of course he may have, but there is no record of this.

The OED works a bit like Homer Simpson’s brain: when a new piece of information is pushed in through one ear, the piece that has been retained the longest is pushed out through the other.

In one ear, and …?

So, for example, in 2017 the word ‘selfie’ (a word which I detest as it reflects all that is bad about young people: namely their selfishness and lack of social integration with normal humans) replaced ‘charabanc’ (a word that I happen to love). And if you don’t know what a charabanc is, look it up. Duh … it’s not in the OED any more; but it is in Wikipedia. And before you say it – yes, ‘charabanc’ is a French word, ‘ergo’ has its origins in Latin, and therefore perhaps we should exclude all vocabulary which can be traced back to another language. However, I make the rules for this particular survey, so we’re not going to.

Didn’t we have a lovely time the day we went to Bangor?

Amig@s, I do hope you’ve found this to be informative and entertaining. I’ll confess that my research could have been conducted in greater depth but I’d still be right, as I’m the teacher.

However, if you want to argue the toss, do please leave a comment.

My next blog will tell you about the joy of international travel during a pandemic. I’ll bet you can barely wait.

Hasta pronto chic@s!

3 Responses to Who’s Got The Biggest One? Vocabulary, that is.

  1. Cec says:

    Good job you put an ‘e’ into ‘ergo’ ha ha
    Whilst we are on words, my pet hate is ‘obviously’… what’s obvious to one person is not obvious to another. Whoops! I’ve just used the word twice… enjoy the vino and the sunshine.

  2. David Stewart says:

    I still use ‘charabanc’ occasionally.
    Much nicer word than people carrier.

