I think now is a good time to bring up this follow-up post to the recent Guerrilla White Paper posting. I’m sure for some, opening the paper meant an immediate shock of words and numbers! I hope this condensed version makes what the paper tried to get across a bit clearer.
So what’s this paper about? Social media measurement, and the common use of machine sentiment engines as a way of measuring effectiveness… or rather, how ineffective this is based on the research we’re presenting. Getting right down to the point: language is COMPLEX! That is to say, the words and messages we share with our friends are full of double meanings, slang and, one of my personal frustrations, a lack of full context. I’m sure most people have had online conversations where you’ve had to defend what you said because it was read the wrong way.
Knowing that online messaging is really complex: in measuring the success or failure of social media initiatives, firms often use a sentiment engine to rate mentions of their brand on a scale of Negative, Neutral and Positive. With this in mind, we on the Syncapse Measurement Science team looked to see whether this is a reliable measure.
What did we find? That, again, language is complex and no two people will perceive a set of messages the same way.
This research was based on an earlier survey we released on Twitter, where we asked people to rate a set of 20 messages (all taken from a Twitter search for ‘books’). The most significant finding was the lack of agreement across the survey as a whole: even with such a simple answer set, not a single pair of respondents (out of 102) agreed on the sentiment of all of these messages.
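To give a feel for what that agreement check involves, here’s a short Python sketch. The responses here are randomly generated stand-ins, not the actual survey data, and the names are mine; the point is just how one would look for a pair of respondents who match on every message.

```python
import random
from itertools import combinations

random.seed(7)

NUM_RESPONDENTS = 102
NUM_MESSAGES = 20
LABELS = ["Negative", "Neutral", "Positive"]

# Hypothetical stand-in for the survey data: each respondent rates
# all 20 messages on the three-point scale.
responses = [
    [random.choice(LABELS) for _ in range(NUM_MESSAGES)]
    for _ in range(NUM_RESPONDENTS)
]

# Count pairs of respondents who gave identical ratings on every message.
agreeing_pairs = sum(1 for a, b in combinations(responses, 2) if a == b)

print(agreeing_pairs)
```

With 20 messages and three labels there are over three billion possible rating patterns, so even among all ~5,151 pairs of 102 respondents, a perfect match is vanishingly unlikely unless people genuinely converge on the same reading.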
To test our survey further, we also threw a wrench in the works with a trick question: we repeated one of the questions. While a majority of people rated the sentiment of this question the same both times, 19% didn’t! So in the span of a few minutes, a person’s feelings about a message can change [meaning you have to be particularly careful in how you time your emails!]
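The repeat-question check itself is simple arithmetic: compare each respondent’s two answers to the same message and report the share who changed. A minimal sketch, using made-up ratings arranged so that 19 of 102 respondents flip (the real survey data is not reproduced here):

```python
# Hypothetical illustration of the repeat-question consistency check:
# 83 respondents answer the same both times, 19 change their rating.
first_pass = ["Positive"] * 83 + ["Negative"] * 19
second_pass = ["Positive"] * 83 + ["Neutral"] * 19

changed = sum(a != b for a, b in zip(first_pass, second_pass))
inconsistency_rate = changed / len(first_pass)

print(f"{inconsistency_rate:.0%} changed their rating")  # 19/102 rounds to 19%
```

If one person can disagree with themselves minutes apart, expecting a machine to land on the “right” score for everyone is a tall order.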
What, in the end, do the findings of this paper mean? Well, to quote directly from the paper:
“If a group of 102 humans could not agree, how can a single machine output a single score that everybody would agree with?”
The sentiment scoring systems currently in use aren’t producing information that should be depended upon to make real business decisions. Language is just too complicated for a human-programmed machine to define the sentiment of social media messages.
But WAIT, there’s more! If current sentiment scores should not be used in decision making, then what should be put in their place? Well, I know a team that does some mighty fine work in social media measurement… Ok, so there’s A LOT to measure in regards to social media, and no one has yet created the gold standard. As an emerging industry, measurement is still playing catch-up to everything else currently happening (which keeps things interesting). What we could agree on, with things as they currently stand, is a more pragmatic approach: looking at the measures and feedback that are important to a brand (or ‘Brand Health’). This means looking at the overall online communications about a firm and determining which aspects of these activities are most important to understand.
Have any questions or comments? Contact me at kevin@kevrichard.com or send me a Twitter message.
PS: I know I’ve been on a massive blog hiatus, I’m looking to change this ASAP!
Tags: Measurement, Sentiment, Social Media, Syncapse