Big Data: Big Challenges and Big Opportunities

Posted Sep 27, 2012 by chenpr

Last week, I attended Mass TLC’s seminar, “Big Data: What Does All This Data Mean,” at the IBM Innovation Center in Waltham, MA. The discussion was led by moderator Richard Dale, Managing Director of the new Big Data Boston Ventures. A panel discussion followed among an impressive group of industry professionals:

• Martin Leach, Chief Information Officer at the Broad Institute of MIT and Harvard
• Andrew Pandre, Principal at Sears Holdings Corporation
• Irene Greif, Chief Scientist for Social Business at IBM

Some key takeaways from the seminar include:
• Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools.
• Data visualization plays an essential role in extracting value from the data, but it is not always an accurate representation of the data.
• Data visualization carries ethical implications and dangers.

In his opening remarks, Dale first defined big data as “a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools.” He then followed with a personal definition, saying that “the data becomes big because it is easier and cheaper to collect than it is to analyze.” This definition seemed more effective in sparking a discussion focused on what to do with all the data. And while a broad range of topics was broached, Pandre, who is responsible for Data Visualization Architecture, led a conversation focused on big data visualization. Data visualization, a term new to me, is defined as the study of the visual representation of data, expressed through graphical means and combining aesthetics and functionality to maximize effectiveness.

In his own opening remarks, Pandre explained that data visualization is essential because it allows people to extract value from the data. Once the floor was opened up for discussion, the other panelists echoed this notion. “Visualization lets you drill down to what is meaningful,” Leach said. Greif pointed to IBM’s Many Eyes as an example of a data visualization tool and argued that the same techniques could and should be used by many companies. From here, the focus became how to effectively and ethically represent big data.

In discussing the dangers of big data, the panelists agreed that data visualization is not always an accurate representation of the data. Depending on the data and the way it is displayed, a visualization can distort what the data truly represents. This risk underscores the importance of ethics in relation to big data. According to Leach, visualization is valuable because it creates a talking point, but viewers also need to understand what led to the visualization.

Data visualization only scratches the surface of big data. Big data challenges facing companies include the capture, storage, and analysis of data. One CHEN client, Sonian Inc., is immersed in the storage discipline, helping provide email archiving and search in the cloud for a variety of customers. At the end of the seminar, I was hoping Dale would give a summation of the points made by each panelist and attempt to provide an answer to the question posed in the seminar subhead, “What Does All This Data Mean.” Instead, Dale concluded with a comment that kept with the tone of the panel, saying that as much as we would like to make big data simple, we’re not quite there yet. Big data is a big topic, and as the moderator and panelists recognized, everyone is really just starting the hike up the mountain of big data.

By Nick Rossetti