An Interactive Analysis of Tolkien’s works

Character Co-occurrence, Lord of the Rings / (c) Emil Johansson

Everyone who has taken an interest in Tolkien fandom in recent years will have heard of Emil Johansson’s work, not only because some of it has been featured in WIRED, the Smithsonian or TIME Magazine but also because of his way of presenting Tolkien facts and related ideas with both stunning visuals and quite a bit of humour (one of my favourites is his blog post dealing with the question: What if the Eye of Sauron was a light bulb?).

Now, with his latest addition to the fold Emil has raised the bar again for real-time social media research on Tolkien’s works. Yep, sounds weird but with the data provided on The Hobbit, The Silmarillion and The Lord of the Rings we will be able to have a fresh look at some of JRRT’s writings.

You will be able to browse through a number of extremely interesting points comparing Tolkien’s most well-known writings:

  • Word count and density
  • Character mentions
  • Keyword frequency
  • Common words
  • Sentiment analysis
  • Character co-occurrence
  • Chapter lengths
  • Word appearance

I am not a statistician, so I have to base this article on the results as given. Please do keep in mind that any judgement on Tolkien’s works drawn from Emil’s analysis will probably need further scrutiny from a statistical point of view. Emil has kindly told me, however, that all data extracted are based on the eBook versions of the 2009 HarperCollins editions (as mentioned on his site) and have been implemented as well as possible. Nothing is perfect, but as I know Emil to be a perfectionist on these things, we may safely assume that the data available can be relied upon.

This is just a quick look at the most interesting points; do have a look for yourself (particularly because your point of view will be different from mine, I am sure :)).

Word count and density

The word count for the Silmarillion does not include the index in the back. The appendix to the Lord of the Rings has not been included in these numbers.

No surprises here, except maybe for one thing: the “Unique word density” value is highest for The Hobbit. Quite an intriguing piece of children’s literature.
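For readers who want to try this at home, here is a minimal sketch of how such a value could be computed. It assumes that "unique word density" means the number of distinct words divided by the total number of words; the site does not spell out its formula, so both the definition and the function name `unique_word_density` are assumptions made for illustration.

```python
import re

def unique_word_density(text):
    """Share of distinct words among all words (assumed definition)."""
    # Lower-case the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)

sample = "the road goes ever on and on"
print(unique_word_density(sample))  # 6 distinct words out of 7
```

A ratio like this tends to fall as a text gets longer, because common words keep repeating, so a shorter book has a natural advantage on this measure.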

Character mentions

Probably one of the most entertaining parts of this survey. If you have a look, for example, at The Hobbit, and enter Bilbo, Gandalf and Gollum, you will see that Gollum’s appearance is basically restricted to the chapter ‘Riddles in the Dark.’ When searching for Sauron in The Silmarillion you will find his singing contest with Finrod Felagund (protecting Beren) shown quite clearly, as well as his appearances in ‘Akallabêth’ (i.e., Second Age) and ‘Of the Rings of Power and the Third Age’, where he plays a substantial part. Apart from these, however, Sauron is hardly mentioned at all. With this feature it may well be possible to determine the relevance of certain characters in any given story.

Keyword frequency

ABOUT SEARCHING: Searching will give all results beginning with a query. Typing “Hobbit” would for example return both “Hobbit” and “Hobbits”. To search for a query as it is you must end with an extra blank space, i.e. “Hobbit “.

Fairly self-explanatory although I don’t know why Emil put in ‘Mordor’ as the default search term 🙂 And yes, he mentioned in a tweet only minutes after publishing it that someone had searched for ‘sex’, of course. Do have a look yourself for whatever you consider important. It gives a very good overview, for example, when visualising the story flow of The Lord of the Rings ‘concerning Hobbits.’
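The search rule quoted above (prefix match by default, exact match when the query ends with a space) can be sketched in a few lines. This is only an illustration of the rule as described: the function name `search` and the case-insensitive matching are assumptions, not details taken from Emil's site.

```python
def search(words, query):
    """Return distinct matching words in first-seen order.

    A query ending in a space must match a word exactly;
    any other query matches every word that begins with it.
    Matching ignores case (an assumption for this sketch).
    """
    exact = query.endswith(" ")
    term = query.strip().lower()
    matches = []
    for word in words:
        lowered = word.lower()
        hit = (lowered == term) if exact else lowered.startswith(term)
        if hit and word not in matches:
            matches.append(word)
    return matches

words = ["Hobbit", "Hobbits", "hole", "Hobbit", "Hobbiton"]
print(search(words, "Hobbit"))   # ['Hobbit', 'Hobbits', 'Hobbiton']
print(search(words, "Hobbit "))  # ['Hobbit']
```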

Common words

If we forget about the verbs things are pretty clear for The Hobbit: it is all about Bilbo, Gandalf and Thorin & Co. With the three volumes of The Lord of the Rings it becomes very clear to those who may still not believe it – Tolkien isn’t about amazing heroes and their swords, it is about the most famous of anti-heroes in literary history, Frodo and Sam. Yes, there is mighty Gandalf, the wizard, and the returned king, Aragorn, but they are nothing against those two Hobbits. Oh, and if you ever thought about Middle-earth being a place to escape to because that’s what escapists do – the words coming up in all of the four books mentioned are great, dark, far and Gandalf. If you think of him as an ‘Odinic wanderer’ (as Tolkien called him in Letters, no. 107) then you will know what a happy and welcoming place Middle-earth is to a tourist.

Sentiment analysis

These graphs show an analysis of the feeling for each page throughout Tolkien’s works. The sentiment has been analysed for each sentence and then averaged over each page. Green, yellow and red indicate positive, neutral and negative sentiments respectively.

This is probably the single most amazing feature in this long list. Yes, to define “sentiment” and to determine its relative values will be open to discussion but judging from the results available one can quickly see a few things (some of them surprising):

  • The Hobbit isn’t the happy children’s book some people would have it. Fear is a powerful emotion and that is probably the most important single emotion in this book, particularly in the chapters ‘Riddles in the Dark’, ‘Flies and Spiders’ and ‘Barrels out of Bond’.
  • The saddest stories by far in The Silmarillion are ‘Of the Darkening of Valinor’ and ‘Of the Flight of the Noldor’ as well as ‘Of Túrin Turambar’, which makes The Silmarillion probably the single most important epitome of Greek tragedy in the 20th century.
  • The Lord of the Rings isn’t all about violence and war as such (negative sentiments). Green (positive sentiments) and yellow (neutral sentiments) are overwhelmingly present with the exception of ‘The Return of the King’ with its climactic battle and Frodo’s and Sam’s journey into Mordor.
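To make the "score each sentence, then average over the page" idea a little more tangible, here is a minimal sketch. It is not Emil's method, which is not documented in this article: the tiny word list `LEXICON`, the scoring rule and the colour threshold are all hypothetical stand-ins chosen to keep the example short.

```python
import re

# Hypothetical toy lexicon; a real analysis would use a far larger one.
LEXICON = {"merry": 1.0, "glad": 1.0, "fair": 0.5,
           "dark": -0.5, "fear": -1.0, "dread": -1.0}

def sentence_score(sentence):
    """Mean lexicon value of the scored words in one sentence (0.0 if none)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def page_sentiment(page_text):
    """Average the sentence scores over one page."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", page_text.strip()) if s]
    if not sentences:
        return 0.0
    return sum(sentence_score(s) for s in sentences) / len(sentences)

def colour(score, threshold=0.2):
    """Map a page score to green, yellow or red (threshold is an assumption)."""
    if score > threshold:
        return "green"
    if score < -threshold:
        return "red"
    return "yellow"

page = "The hobbits were merry and glad. Fear and dread filled the dark tunnel. They walked on."
print(colour(page_sentiment(page)))  # yellow
```

The toy page shows why averaging matters: one cheerful sentence and one fearful sentence largely cancel out, so the page as a whole comes out neutral.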

Character co-occurrence

A visual representation you have to look at for yourself 🙂 But wow, does this tell you a lot about the importance and networking aspects of the characters.

  • The Silmarillion. Melkor is the man. I mean, really!
  • The Hobbit. Bilbo is the hobbit. I mean, really!
  • The Lord of the Rings. Frodo and Sam. I mean, really!
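For the curious, a co-occurrence count of this kind can be sketched as follows. The assumption here is that two characters "co-occur" when both names appear in the same passage; what unit the site actually uses (sentence, paragraph, page) is not stated, and the function name `co_occurrences` and the sample passages are made up for illustration.

```python
from collections import Counter
from itertools import combinations

def co_occurrences(passages, characters):
    """Count how often each pair of characters shares a passage."""
    pairs = Counter()
    for text in passages:
        # Simple substring check; sorting keeps each pair in a stable order.
        present = sorted(name for name in characters if name in text)
        pairs.update(combinations(present, 2))
    return pairs

# Toy passages, not quotations from the books.
passages = [
    "Frodo and Sam walked on.",
    "Sam looked at Frodo, and Gandalf laughed.",
    "Gandalf rode alone.",
]
counts = co_occurrences(passages, ["Frodo", "Sam", "Gandalf"])
print(counts[("Frodo", "Sam")])      # 2
print(counts[("Frodo", "Gandalf")])  # 1
```

The pair counts are what a network graph would then draw as edges: the higher the count, the thicker the line between two characters.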

Chapter lengths

Go for the vertical bars, directly; I find them most useful. Again, for a quick perusal of chapter length as an indication of importance to the story line this is a very neat overview. This has to be seen, of course, in comparison to the rest of the story (‘The Council of Elrond’ as a major exposition chapter has to be long!)

Appearance of new words

There are a lot of words, characters and locations in Tolkien’s works. This is an attempt to visualize where in the books new terms and words appear.

Tolkien as a master craftsman in words. It would be most wonderful to have a comparison on this with J.K. Rowling, George R.R. Martin and Terry Pratchett. I’d love to see how they are holding up (my guess: Pratchett, Martin, Rowling, in descending order.) Or maybe I am completely mistaken 🙂 and Tolkien didn’t use as many different words as they have. Only a comparison would give us certainty.
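Anyone wanting to attempt such a comparison could start from something like the sketch below, which counts how many never-before-seen words each chapter introduces. It is an illustration only: the function name `new_words_per_chapter`, the simple word-splitting rule and the toy chapters are assumptions, not the site's implementation.

```python
import re

def new_words_per_chapter(chapters):
    """For each chapter, count the words that have not appeared earlier."""
    seen = set()
    counts = []
    for text in chapters:
        words = set(re.findall(r"[a-z']+", text.lower()))
        fresh = words - seen   # words making their first appearance here
        counts.append(len(fresh))
        seen |= fresh
    return counts

# Toy "chapters" for illustration only.
chapters = [
    "In a hole in the ground there lived a hobbit",
    "The hobbit lived in a hole",
    "A wizard came to the hole",
]
print(new_words_per_chapter(chapters))  # [8, 0, 3]
```

Run over full texts by several authors, the same counts would give a first rough basis for the comparison suggested above.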

The impact of visualising data in Tolkien’s works

Emil (and possibly others, but he is the prime example to me) has started a new trend in Tolkien fandom, fueled by the simple fact that the digital revolution, software knowledge and an eye for truly interesting questions may lead you to run a website that is not only full of geeky humour but also tremendously insightful. Now, statistics aren’t everything in life and any data presented will always see at least two different interpretations (you know, global warming isn’t real, depending on who interprets the data available): the Smithsonian simply stated that the characters in Tolkien’s works are 81% male; others called Middle-earth a sausage-fest. However, these huge amounts of data are still valid for research and they may provide new insights we would otherwise never have come up with, even if only because they are visualisations. An image often provides a different angle on supposedly well-known facts, thereby opening up the way to closer scrutiny and improvement in research.

I am not quite sure yet where this is leading us but I am quite convinced visualisations such as those provided by Emil will offer us even more interesting and delightful discussions.






Marcel R. Bülles

Marcel R. Bülles is the author of a specialist blog centering on worldwide Tolkien fandom, geekdom and research. He works as a freelance translator, journalist and writer and is the founder of the German Tolkien Society as well as a co-founder of RingCon, formerly Europe's biggest fantasy film convention. You can find him in cafés all over the world sipping an espresso while blogging, writing and reading. At one point he was married to an extremely lovely French lady by the nickname of Sauron. Yes, that Sauron. He is also active with the International Tolkien Fellowship on Facebook and the Tolkien Folk on Instagram.