Sentiment Analysis & Book Publishing

Posted: August 25, 2017  |  Updated: December 9, 2022


A few weeks ago, I talked to our developers about a phrase I heard them throwing around a lot: Sentiment Analysis. Once they finished their explanation, I immediately asked them to do a write-up for the blog. This is fascinating! The people must know!

And…they gave me a blank stare, a kind smile, and then promptly went back to work (as they should). So, with their guidance and fact-checking, I’ve tried to translate their detailed, data-rich reports and updates for your reading pleasure.

So, to start off, what exactly is Sentiment Analysis and why are we talking about it? Simply put, sentiment analysis is one of the analytic methods we use to examine manuscripts. Technology can actually interpret the very life and breath within a manuscript.

How does that work?  

Sentiment Analysis 101

At a glance, sentiment analysis is fairly straightforward: text is analyzed and, using natural language processing, each part of the text is categorized as positive (happy statements) or negative (sad or angry statements). Within each language, words can be classified as positive (elated, kiss, jump), negative (smash, kill, cry), or neutral (the, a, road). Take these all together and graph the results, and you can see the emotional arc of a text mapped out in visual form, called a sentiment map.
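For the curious, here is a minimal sketch of that idea in Python. It is not our developers' actual method, just an illustration under simple assumptions: the word list is a tiny hypothetical lexicon built from the example words above, each positive word counts as +1 and each negative word as -1, and the names `LEXICON`, `score_segment`, and `sentiment_map` are made up for this example.

```python
import re

# Tiny illustrative lexicon using the example words from this post.
# Assumption: +1 for positive, -1 for negative; any other word is neutral (0).
LEXICON = {
    "elated": 1, "kiss": 1, "jump": 1,
    "smash": -1, "kill": -1, "cry": -1,
}

def score_segment(text):
    """Average sentiment of the words in one chunk of text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(LEXICON.get(w, 0) for w in words) / len(words)

def sentiment_map(text, n_segments=4):
    """Split the text into up to n_segments chunks of words and score each."""
    words = text.split()
    size = max(1, -(-len(words) // n_segments))  # ceiling division
    chunks = [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
    return [score_segment(chunk) for chunk in chunks]

sample = "They kiss and jump. Then they cry. They smash the road. Elated, they kiss."
arc = sentiment_map(sample, n_segments=4)
print(arc)
```

Plotting the numbers in `arc` from first chunk to last would give a very rough sentiment map. A real system would likely use a much larger lexicon or a trained model, and would smooth the curve across many more segments.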

Recent Sentiment Analysis Study: Best Sellers

If you’re still with me, maybe you’re thinking: what can the sentiment map tell us about plot? We mapped out three extremely popular best sellers (Fifty Shades of Grey, The Girl With the Dragon Tattoo, and Gone Girl) to illustrate how these novels are constructed and why some books are said to be more surprising than others.

All three of these books buck the conventional trend of a high, happy opening and a high, happy, neatly wrapped-up ending: they all have endings that are dramatically lower in sentiment than the highest point of the book or even the beginning. And, when you look at the plot points for each, the graphs make sense.

*Spoiler Alert for all three novels*

1. Fifty Shades of Grey by E. L. James

Unlike most romances, the end of Fifty Shades is not happy.  You can see the sentiment taking a sharp decline in the last few pages as the book ends with the main character crying, swearing she never wants to see her lover again.

2. The Girl With the Dragon Tattoo by Stieg Larsson

The major dip around three-quarters of the way through The Girl With the Dragon Tattoo is when Mikael is trapped by Martin Vanger and almost dies. The sentiment rises as Lisbeth frees him, and continues to rise as Mikael publishes the exposé on Wennerström. The sentiment heads back downward through the end of the book as Lisbeth goes to tell Mikael she loves him, only to find him with Erika Berger. (Poor Lisbeth. Hasn’t she been through enough?!)

3. Gone Girl by Gillian Flynn

Gone Girl is arguably the progenitor of a recent trend in unreliable, potentially psychotic female central characters, and its graph is easy to follow. Consider: in the first half, Nick is looking for his missing, possibly dead wife Amy. During that time it’s revealed how much Nick cheated on her (a LOT) and how seriously fatal the state of their marriage is. Meanwhile, Nick is named the number one suspect in her disappearance. So, the general downward spiral, errr slope, of the graph makes sense. The brief, sharp climb around point 60 is when Amy comes back and it turns out Nick won’t be going to jail, but the sentiment just plummets right on down because Nick is still unhappy, Amy is still terrifying and dangerous, and life does not seem to actually be getting better.

The final, brief peak at the end is when Amy tells Nick she’s pregnant, but even that is barely above neutral. In this world, news of Amy and Nick spawning is not really good news, and the sentiment analysis confirms that. This bit of information may also give us insight into why so many people read this book and thought “WTF IS UP WITH THAT ENDING?!” Most novels don’t end on a fairly neutral note, so the reader is left feeling unsettled, as if there is something missing that they can never recoup.

Had I convinced my technology-minded colleagues to write this blog, it might have ended with a discussion of linear regression, decision boundaries, sentiment vectors, and macro-arcs. But since my background is in publishing, I will end more philosophically. Thinking about editing and reviewing, this kind of technology is so exciting. It gives an editor another way to explain the pacing of a book to their writer. It gives a publisher the ability to look at the breadth of their work to see what kind of brand and niche they are establishing and where the holes may be that they could fill. It gives writers an ability to objectively see if the impact they are trying to achieve is actually coming across. And these are just a few of the possibilities.

Sentiment Analysis and other machine learning actions are just additional tools in the hands of smart literary professionals. It’s a way to analyze across books and within books. And a way to look at thousands of books in the same time you or I can review one or two.