The Data Of Deceit

About this year’s Datafication Project: Social data exposes lies.

Truth, Lies and Trust on social media.

Let’s face it, we all lie on social media to some degree: sometimes to make the story funnier, sometimes to make ourselves look good, or merely to airbrush out the negatives. Marketers get hot and sweaty about building positive sentiment for their brand on social media; they REALLY want you to like and trust their brand. Some marketers fear social media, so this is our small contribution to taking away that fear. If you understand why people are deceitful on social media, some of that trepidation goes away.

Working with Dr. Suresh Sood from UTS we set about focusing Datafication, our ongoing big data analytics project, on the thorny topic of truth, lies and building brand trust in social media.

First of all we started to look at the science of lie detection in the written words we all post every day on social media.

One of the leading experts in this area is James Pennebaker, Professor in the Department of Psychology at the University of Texas at Austin, who demonstrated the relationship between the language we use and our state of mind and behaviour. Interestingly, we are pretty bad at detecting deceit and lies, with humans picking up only 54% of the untruths that are served up each day. In fact, according to the University of Portsmouth, we are lied to as many as 200 times a day. We also start young with our deceit: research shows that 6-month-old babies develop deception skills (crying, laughing, etc.) to get what they want.

In social media we don’t have access to physical ‘tells’ of deceit like eye contact, fidgeting, etc. Instead we use words and symbols to communicate, so the data analytics team at The Works decided to develop a deceit algorithm that takes the science and applies it to large social media data sets in order to understand more about the nature of deceit in social media. Our words are very revealing: for example, if we use the pronouns ‘I’ or ‘me’ we are less likely to be lying, because when we lie we subconsciously distance ourselves from what we know to be untrue.
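To make the pronoun idea concrete, here is a minimal Python sketch that measures how often a post uses ‘I’ or ‘me’. It is our illustration only, not the project's actual code: the function name first_person_rate and the simple letter-based tokenising are assumptions, and a low rate is at best a weak hint, never proof of a lie.

```python
import re

# The two pronouns mentioned in the text. A fuller list would be an assumption.
FIRST_PERSON = {"i", "me"}

def first_person_rate(post: str) -> float:
    """Return the share of words in a post that are 'I' or 'me'.

    Hypothetical helper: words are runs of letters, so "I'm" is
    counted as the two tokens "i" and "m".
    """
    words = re.findall(r"[a-z]+", post.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in FIRST_PERSON)
    return hits / len(words)

if __name__ == "__main__":
    print(first_person_rate("I told me"))
    print(first_person_rate("They went home"))
```

A post with no first-person pronouns at all returns 0.0, which on its own says very little; the signal only becomes interesting when compared across many posts.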

The deceit algorithm takes 4553 such indicators (including, for the first time, emoticons), classifies them into categories and allows a score to be appended to any social media post. This score is indicative of the presence of deceit, and by aggregating these scores we are able to look at facts, trends and comparisons across different social platforms. We overlaid other data points like gender, location and nationality to deepen the insights.
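The scoring and aggregation steps can be sketched roughly as follows in Python. This is a toy under stated assumptions, not the deceit algorithm itself: the five indicators, their category names and their weights are placeholders we made up for illustration (the real set has 4553 indicators), and the platform labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Placeholder lexicon: indicator -> (category, weight).
# Every entry and weight here is an assumption for illustration only.
INDICATORS = {
    "honestly": ("emphasis", 1.0),
    "never": ("negation", 0.5),
    ":)": ("emoticon", 0.25),
    "i": ("self_reference", -0.5),
    "me": ("self_reference", -0.5),
}

def score_post(text):
    """Return (total score, per-category scores) for one post."""
    by_category = defaultdict(float)
    for raw in text.lower().split():
        token = raw.strip(".,!?\"'")  # drop surrounding punctuation
        if token in INDICATORS:
            category, weight = INDICATORS[token]
            by_category[category] += weight
    return sum(by_category.values()), dict(by_category)

def mean_score_by_platform(posts):
    """Aggregate post scores into a mean score per platform.

    `posts` is an iterable of (platform, text) pairs.
    """
    grouped = defaultdict(list)
    for platform, text in posts:
        total, _ = score_post(text)
        grouped[platform].append(total)
    return {platform: mean(scores) for platform, scores in grouped.items()}

if __name__ == "__main__":
    sample = [
        ("platform_a", "Honestly, I never lie :)"),
        ("platform_b", "Tell me more"),
    ]
    print(mean_score_by_platform(sample))
```

Overlaying gender, location or nationality would follow the same pattern: group the scored posts by that attribute instead of by platform.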

There are two types of lies in the analysis: White Lies (embellishment) and True Lies (deceit). We looked at both.

This Datafication website will continue to share learnings and insights from the project.

The navigation at the top of this blog will lead you to our previous years of insights.