The volume associated with the Big Data phenomena brings along new challenges for data centers trying to deal with it: its Variety. With the explosion of sensors, and smart devices, as well as social collaboration technologies, data in an enterprise has become complex, because it includes not only traditional relational data, but also raw, semistructured, and unstructured data from webpages, web log files (including click-stream data), search indexes, social media forums, e-mail, documents, sensor data from active and passive systems, and so on. What’ s more, traditional systems can struggle to store and perform the required analytics to gain understanding from the contents of these logs because much of the information being generated doesn’t lend itself to traditional database technologies.
Quite simply, variety represents all types of data—a fundamental shift in analysis requirements from traditional structured data to include raw, semistructured, and unstructured data as part of the decision making and insight process. Traditional analytic platforms can’t handle variety. However, an organization’s success will rely on its ability to draw insights from the various kinds of data available to it, which includes both traditional and nontraditional.
Just 20 percent of the data is of the relational kind that’ s neatly formatted and sits ever so nicely into strict schemas. But something like 80 percent of the world’s data (and more and more of this data is responsible for setting new velocity and volume records) is unstructured, or semistructured at best. If we look at a Twitter feed, we’ll see structure in its JSON format, but the actual text is not structured, and understanding that can be rewarding. Video and picture images aren’t easily or efficiently stored in a relational database; certain event information can dynamically change (such as weather patterns), which isn’t well suited for strict schemas, and more. To capitalize on the Big Data opportunity, enterprises must be able to analyze all types of data, both relational and non relational : text, sensor data, audio, video, transactional, and more.