What do Seinfeld and Big Data have in common?

By Dan Grady | July 30, 2014
in WebFOCUS, Technology, Best Practices, Big Data, WebFOCUS overview

Word Frequency Analysis of the conversation on the Seinfeld Facebook page
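(For the curious: the figure above is a simple word-frequency count of comments from the page. A minimal sketch of that kind of analysis in Python, assuming you've exported the comments to a plain-text file with one comment per line, might look something like the following. The file name and stop-word list are illustrative assumptions, not part of any product.)

```python
# Minimal word-frequency sketch (hypothetical file name and stop-word list):
# count the most common words in an exported file of page comments,
# one comment per line.
from collections import Counter
import re

STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
              "that", "this", "for", "on", "with", "was", "i", "you"}

def word_frequencies(path, top_n=25):
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Lowercase, keep only alphabetic tokens, drop common stop words
            words = re.findall(r"[a-z']+", line.lower())
            counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

if __name__ == "__main__":
    for word, count in word_frequencies("seinfeld_page_comments.txt"):
        print(f"{word}: {count}")
```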

The recent celebration of the 25th anniversary of the hit series Seinfeld got me thinking: what can a self-proclaimed "show about nothing" have in common with the all-knowing force that is "Big Data"?

As I see it, there are several analogies that can be applied.

Seinfeld was originally conceived as a show about how a stand-up comedian gathers events (data) from his "everyday" life and turns them into material for his act. Most episodes consisted of several story threads converging in an interwoven and ironic fashion to create the desired outcome: 30 minutes of laughter.

Week after week, the distinct characters found that their individual life events always wound up impacting one another. Whether it was George being forced to reveal his very secretive ATM code, "BOSCO," to save a man from a burning building, or Kramer dropping a Junior Mint into a patient's body during surgery, which ultimately affected the value of a painting, the twisting plots of these characters managed to all fit together in the end.

Similarly, at the core of many "big data" projects, the analyst is tasked with pulling together data from disparate sources, creatively merging it all, and then telling a data-driven story. The difference is that the desired outcome is business impact, not laughter. Regardless, if someone is laughing at your analysis, channel your inner George Costanza and start laughing as well to play it off as a joke.

Much of the data that makes up these projects is based on the analysis of everyday events. I like to call this "all day, every day" data. By now most of us have heard the stat that 90 percent of all the data in the world has been generated over the last two years. The majority of this data exists because almost everything in our daily lives is now digitized.

"Not that there’s anything wrong with that."

These data-driven stories are only as good as the underlying data.

In the end, what the executive or management level wants, and often what gets presented, is only the "top of the muffin" analysis, but it's important not to disregard or throw away the supposed "stumps." It's the "stumps," the processes the analyst has to go through to acquire and massage meaningful data, that often hold many big data projects back and that hold the key to the project's overall value.

The challenge is twofold.

The mighty "Internet of Things" is producing data at an unbelievable clip. Keeping up with the volume and variety of these "all day, every day" sources of data may seem overwhelming, but it's these sources that add a different dimension to "big data" projects and often drive their insights and value.

As if dealing with those third-party external data sources weren't challenging enough, many analysts face unnecessary hurdles when trying to access internal data. Many I've spoken to have indicated that they have their own version of the "Soup Nazi" at their organizations, and when they go looking for data, the response is often "No Soup (data) for You!"

Once past the data access hurdles, assembling all of this diverse data into insights can also become cumbersome for the analyst. A lot of manual data massaging goes on when analysts don't have access to sufficient tools, and in some cases less sophisticated tools offer inadequate data "blending" techniques.
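To make that blending step concrete, here is a minimal sketch in Python with pandas of the kind of join an analyst might otherwise do by hand: merging an external, third-party feed with an internal extract on a shared key. The file names and column names are illustrative assumptions, not a description of any particular tool or product.

```python
# Minimal data-blending sketch (hypothetical file and column names):
# join an external third-party feed to an internal extract on shared keys,
# the kind of step that becomes painful when done manually in spreadsheets.
import pandas as pd

# Internal data: sales by store, pulled from a governed source
internal = pd.read_csv("internal_sales.csv")          # columns: store_id, week, revenue

# External "all day, every day" data: e.g. weekly foot-traffic counts
external = pd.read_csv("external_foot_traffic.csv")   # columns: store_id, week, visits

# Blend the two on the shared keys; keep every internal row even when the
# external feed is missing a week (left join)
blended = internal.merge(external, on=["store_id", "week"], how="left")

# A simple derived measure for the data-driven story
blended["revenue_per_visit"] = blended["revenue"] / blended["visits"]

print(blended.head())
```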

So while what gets presented in the application or visualization may look "real and spectacular," when questions like "How did you come to this conclusion?" start getting asked, you get a lot of "Well, I pulled in this data and then, yada, yada, yada, here is a map with bubbles on it." Or, since we are talking about bubbles, the conversation may turn into a debate more pointless than "Moops vs. Moors," and if the data didn't come from a managed and agreed-upon source, there would be no way to settle it.

At Information Builders, our customers get the benefit of data integration and data integrity functionality being part of our overall platform-based approach. The analyst has a streamlined process for accessing data from a variety of sources and can be confident that the data is being managed properly and is trusted by everyone within the organization. Layer our world-class intelligence capabilities on top of that, and the analyst has a multitude of ways to let others within the organization consume the results, whether through advanced visualizations, easy-to-use InfoApps, or even search-based applications.

Having this flexibility, not only from a data access and integration perspective but also from a consumption perspective, is key to a Big Data project's success.

Not everyone wants a "Puffy Shirt" or is capable of "reading lips" from across the room to understand what the data is saying.

What made Seinfeld such a huge success was that it wasn't a show about "nothing"; it was a show about "everything" we experience, narrowed down and connected in a different way, giving us a different perspective in consumable (and enjoyable) 30-minute episodes.

If more organizations approached their "big data" projects in the same fashion, they'd get better ratings. Narrow the scope, and create a data-driven story from a different perspective that offers some insight toward a more informed business decision than before. If that decision has a business impact, it will be a hit, regardless of its size. Because sometimes, a series of "little" things can have a "Big" impact.

If you are interested in reading more about Information Builders' approach to Big Data, take a look at this white paper.

Trademark owned by Castle Rock Entertainment and Sony Pictures