Author | Marketer | Speaker

I help companies turn data, ideas and relationships into reach and influence. 

With noise in the numbers, how can brands find social signal strength?

When people talk about social data, they usually focus on two dimensions that are relatively easy to measure and articulate: volume and source growth. More data coming from more places. But something else is happening that makes a lot of social data hard to draw conclusions from on its own: noise is making it increasingly difficult to achieve true signal strength. A lot of companies may be experiencing this without even realizing it.

Let’s say I’m a corporate user of the average listening tool, and I’m trying to trend some of the standard social metrics over time. Specifically, I’m looking at number of mentions, follower/fan/subscriber numbers, and sentiment for Twitter, Facebook and blogs.


Imagine that hundreds or thousands of tweets and blog posts mention my brand every week, and that number went up this week. In raw form, what does that number tell me? On the surface, it tells me we’re doing something right…right? Not necessarily. How many of the accounts that tweeted about it are actually human, controlled by humans and followed by other humans? Estimates of the prevalence of Twitter spam accounts are difficult to come by, but some are as high as 48% and even 57%. It’s clear that spam mentions shouldn’t factor into your assessment of your brand’s social performance, but that’s just the tip of the iceberg. Redundancy matters, too. If two accounts for the same entity tweet the same positive thing about your brand, for instance, that doesn’t mean you have two advocates. The best you can hope for is that different sets of real people follow them, and that the same message thus reaches more people.
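To make that concrete, here is a minimal Python sketch of the difference between a raw mention count and a count of distinct, non-spam advocates. The field names (`account`, `entity`, `is_spam`) and the sample mentions are hypothetical; how you flag spam or map accounts to the entity behind them depends on your own tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    account: str   # handle that posted the mention
    entity: str    # person or organization behind the account (assumed known)
    text: str
    is_spam: bool  # assumed to come from some upstream spam check


def count_distinct_advocates(mentions):
    """Count the unique entities behind the non-spam mentions."""
    return len({m.entity for m in mentions if not m.is_spam})


mentions = [
    Mention("@acme_us", "Acme", "Love this brand", False),
    Mention("@acme_uk", "Acme", "Love this brand", False),  # same entity, same message
    Mention("@jane", "Jane", "Great support today", False),
    Mention("@bot123", "bot123", "Buy followers now", True),
]

raw_count = len(mentions)
advocates = count_distinct_advocates(mentions)
print(f"raw mentions: {raw_count}, distinct advocates: {advocates}")
```

Here the raw count is four, but only two distinct entities are actually speaking.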

Within that spike of activity, you notice that blog mentions of your brand have gone up. You dig a little deeper and see that the blogs are actually saying the exact same thing, down to the letter. It’s scraped content that has been duplicated over and over across the web, usually without the original author’s permission. Maybe a keyword in the text triggered it, or maybe your own content has been added to a feed that disseminates it to hundreds of nearly identical (and totally useless) scraping sites. If only one of these mentions is original content and 100 of them are scraped copies, the raw data tells you that your presence on blogs has increased a hundred times over. It hasn’t. In fact, if it’s your content, you’re likely being hurt, because duplicate content can hurt your search rankings.
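One rough way to collapse verbatim copies is to fingerprint each post after normalizing whitespace and case, then count unique fingerprints. This is only a sketch with made-up sample text: it catches exact copies, and scrapers that alter the wording would need a fuzzier comparison than the one shown here.

```python
import hashlib
import re


def fingerprint(text):
    """Hash a post after collapsing whitespace and lowercasing it."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def count_original_posts(posts):
    """Count posts that differ after normalization."""
    return len({fingerprint(p) for p in posts})


# Hypothetical data: one original post and 100 scraped copies of it.
original = "Our review of the new Acme widget: it works."
scraped = "  Our review of the NEW Acme widget:\nit works. "
posts = [original] + [scraped] * 100

raw_mentions = len(posts)
unique_mentions = count_original_posts(posts)
print(f"raw: {raw_mentions}, unique: {unique_mentions}")
```

The raw count says 101 mentions; the deduplicated count says one.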


Now let’s talk sentiment. Most tools for assessing sentiment from unstructured social data aren’t very accurate, but that is a limitation of the tools, not of the data. The data has a bigger problem of its own: most companies want to know how people feel about their brand and products, and raw, unstructured social data is full of posts from non-people, like automated RSS feeds. For example, if my company puts out a press release (which of course will contain a lot of positive text) and it’s picked up by 10 different automatic Twitter or blog feeds that post things from the various press release wires, this tells me absolutely nothing about how people feel about my brand. If an actual human reads the release and posts something negative about it, my aggregate sentiment data is going to reflect something completely false: that positive sentiment is 10 times higher than negative sentiment.
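The press release example works out as follows in a small Python sketch. The `automated` flag is an assumption: it stands in for whatever method you use to tell feed accounts from people, and the sentiment labels are taken as given rather than computed.

```python
from collections import Counter


def sentiment_counts(posts, humans_only=False):
    """Tally sentiment labels, optionally skipping automated posts."""
    counts = Counter()
    for post in posts:
        if humans_only and post["automated"]:
            continue
        counts[post["sentiment"]] += 1
    return counts


# Ten automated feeds repost the press release; one person reacts negatively.
posts = [{"sentiment": "positive", "automated": True} for _ in range(10)]
posts.append({"sentiment": "negative", "automated": False})

raw = sentiment_counts(posts)
filtered = sentiment_counts(posts, humans_only=True)
print("raw:", dict(raw))
print("humans only:", dict(filtered))
```

Unfiltered, positive outnumbers negative ten to one. With the automated posts removed, the only human opinion on record is negative.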


On to follower/fan/subscriber numbers. Would you rather have 100 followers/fans/subscribers who never interact with you in any way, or one follower who does? The only thing those 100 followers can do for you is provide a tiny amount of social proof by making you look more important to people who use follower count as a proxy for importance. But if your single follower actually pays attention to you, responds to your calls to action, or shares your content with their followers on occasion, he or she is far more valuable to your brand, and that kind of attention has to be earned.

What’s a brand to do?

Tom Foremski outlines the problem well:

“Accurate data on social media users is essential. It’s the foundation of all successful social media marketing and advertising campaigns: the precise targeting of related groups of users with their interests.”

The best solution to this problem has three parts.

  1. Raw, unstructured social data needs to be processed, filtered and cleaned up before it means much of anything.
  2. Once signal is separated from noise, it should be paired with reliable data from other sources to create a more accurate, holistic view of your customers. For example, you can match your social data to your CRM records.
  3. Look for direct results, not proxies. Are visitors from Facebook converting at higher or lower rates than other visitors? How much did revenue increase after a product change was made based on your analysis of social feedback?
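As an illustration of the third point, a direct result such as conversion rate by traffic source takes very little code once visits are tagged with where they came from. The sketch below uses invented visit data and a hypothetical `(source, converted)` record shape; real numbers would come from your analytics or CRM export.

```python
from collections import defaultdict


def conversion_rates(visits):
    """Return the share of converting visits for each traffic source."""
    totals = defaultdict(int)
    conversions = defaultdict(int)
    for source, converted in visits:
        totals[source] += 1
        if converted:
            conversions[source] += 1
    return {source: conversions[source] / totals[source] for source in totals}


# Hypothetical visits: (traffic source, whether the visit converted)
visits = [
    ("facebook", True), ("facebook", False), ("facebook", False), ("facebook", False),
    ("search", True), ("search", True), ("search", False), ("search", False),
]

rates = conversion_rates(visits)
for source, rate in sorted(rates.items()):
    print(f"{source}: {rate:.0%}")
```

In this made-up sample, Facebook visitors convert at half the rate of search visitors, which is a more useful fact than a fan count.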

None of these things are particularly easy. All of them are totally worth it.


© 2016 Ian Greenleigh