Presidential Politics Data Series

Does Size Matter? How Accounting for Tweet Volume Impacts Sentiment Analysis and What That May Mean for the Election

by Sarah Ammar

While the debate about the size of Donald Trump’s hands is in the not-so-distant past of the primaries, tweet volume is on our minds at the Applied Policy Research Institute the day after the first debate of this general election.

If you followed our live tweets, or those of our friends at Kno.e.sis or Cognovi Labs, you saw some interesting real-time data output about how Twitter users were responding to the candidates during last night’s debate.

As a refresher, Cognovi Labs pulls in tweets about the candidates and assigns them a “sentiment score.” So if a tweet discusses a candidate, it’s counted in that candidate’s sentiment score. If the tweet says something negative about the candidate, it’s given a score of “-1,” and if the tweet says something positive, it’s assigned a “+1” sentiment score. Those candidate-related tweets are then aggregated, and each candidate is assigned an average sentiment score. Think of it as a sliding scale of favorability (or unfavorability).
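
To make the arithmetic concrete, here is a minimal Python sketch of that averaging step. It is our own illustration rather than Cognovi’s code: the function name `average_sentiment` and the sample tweets are hypothetical, and we assume each tweet has already been labeled +1 or -1.

```python
from collections import defaultdict


def average_sentiment(scored_tweets):
    """Return the mean sentiment per candidate.

    scored_tweets is an iterable of (candidate, score) pairs, where score
    is +1 for a positive tweet and -1 for a negative one (an assumption
    taken from the description above, not from Cognovi's implementation).
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for candidate, score in scored_tweets:
        if score not in (-1, 1):
            raise ValueError(f"expected +1 or -1, got {score!r}")
        totals[candidate] += score
        counts[candidate] += 1
    # Dividing by the tweet count puts every candidate on a -1..+1 scale.
    return {name: totals[name] / counts[name] for name in totals}


# Hypothetical sample: four tweets about one candidate, two about the other.
sample = [
    ("Trump", 1), ("Trump", -1), ("Trump", 1), ("Trump", 1),
    ("Clinton", -1), ("Clinton", 1),
]
averages = average_sentiment(sample)
print(averages)
```

An average near +1 means the conversation is almost entirely favorable, an average near -1 means it is almost entirely unfavorable, and an average near zero means the two roughly cancel out.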

Using the geospatial output interface of Cognovi’s Twitris tool, we watched the sentiment across the US ebb and flow for the candidates. At about midday on the day of the debate, it appeared Trump was doing significantly better from a Twitter sentiment perspective. Nationally, he was holding strong positive sentiment in a number of states, though Ohio was a holdout, with comparably negative sentiment for both candidates.

Midday.jpg

And even minutes before the debate, although some of his stronger favorability had tapered off, Trump was still leading in national sentiment, and even seemed to be pulling ahead in Ohio.

Pre debate.jpg

Once the debate kicked off, the map started to warm very gradually, though not by much for Clinton, and Trump began to lose the favorability he’d held most of the day.

minutesin.jpg
 

This trend continued until about 10:14pm, just following comments by Clinton that she had prepared not only for the debate, but for the office of the President. At that point, a sudden drop in sentiment for Clinton occurred, and the trend reversed. Although we cannot attribute this drop to those comments, it does correlate strongly with the Twitter reactions that followed them.

Once that change occurred, the map began to warm back up in Trump’s favor. For a substantial portion of the debate, both candidates held overall negative sentiment, with Trump being the candidate with “less negative sentiment,” which underscores the “lesser of the two evils” narrative heard in popular media. By the end of the debate, however, Trump’s map included some states with clearly positive sentiment, and several that were neutral. Clinton, however, showed negative to very negative sentiment nearly across the board.
 

postdebate.jpg

Accounting for Volume

But we also recognized another factor: tweets about Donald Trump outnumbered tweets about Hillary Clinton more than two to one. During our live “war room” analysis with Cognovi and Kno.e.sis, we discussed controlling for volume, which statistically adjusts for things like retweets and other factors, and “equalizes” the volume for each candidate. Once that was done, Cognovi’s trendline for the duration of the debate showed that Clinton outperformed Trump on sentiment.
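
We don’t have the details of the exact adjustment Cognovi applied, so the Python sketch below shows only one plausible way to “equalize” volume, under our own assumptions: count each distinct tweet text once (so retweets don’t pile up), then randomly downsample every candidate to the size of the smallest pool. The names `equalize_volume` and `mean_score`, and the sample tweets, are hypothetical.

```python
import random


def equalize_volume(tweets_by_candidate, seed=0):
    """Collapse retweets, then downsample each candidate to the smallest volume.

    tweets_by_candidate maps a candidate name to a list of (text, score)
    pairs. This is an illustrative assumption, not Cognovi's actual method.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # Keep one copy of each distinct text so retweets count only once.
    deduped = {
        name: list(dict(tweets).items())
        for name, tweets in tweets_by_candidate.items()
    }
    target = min(len(tweets) for tweets in deduped.values())
    return {name: rng.sample(tweets, target) for name, tweets in deduped.items()}


def mean_score(tweets):
    """Average the +1/-1 scores in a list of (text, score) pairs."""
    return sum(score for _, score in tweets) / len(tweets)


# Hypothetical sample with a two-to-one volume gap, partly due to retweets.
raw = {
    "Trump": [
        ("great night", 1), ("great night", 1), ("great night", 1),
        ("weak answer", -1), ("rambling", -1), ("strong close", 1),
    ],
    "Clinton": [("well prepared", 1), ("too scripted", -1), ("solid", 1)],
}
balanced = equalize_volume(raw)
for name, tweets in balanced.items():
    print(name, len(tweets), round(mean_score(tweets), 2))
```

After this step, each candidate contributes the same number of tweets, so a heavily retweeted message no longer dominates the comparison. Other choices, such as weighting tweets instead of discarding them, would be equally defensible.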

postdebatelines.jpg

So the question we’ll be investigating further is this: does the magnitude of the Twitter conversation about Donald Trump, coupled with the sentiment scores, amount to a mobilized base of actual voters? Or does it simply mean more people like to tweet about the candidate than about his opponent? If the answer to the former is yes, then raw sentiment scoring is the best indicator of debate performance and possibly an indicator of the election outcome.

If not, normalizing for volume will be critical. As you can see in the normalized trend lines, Clinton is the clear debate winner, which is consistent with most post-debate polls.

However, the curious data point comes just after the debate’s conclusion, when Trump regains the sentiment lead. This indicates that, even when controlling for volume, the needle was unmoved once the debate was over.

Either way, volume is an important factor we’ll be watching in upcoming debates. Stay tuned, and don’t forget to follow us on Twitter at @apri_wsu, as well as our friends @CognoviLabs and @Knoesis, for more.


Sarah Ammar is the Senior Strategist for the Applied Policy Research Institute.