What is Big Data? (Part 1)

In my last post, I talked about Business Intelligence. One might think that once we have a good handle on our data, we are all set to make wonderful business decisions and live happily ever after. If only the world were so fair!

Hmm, what do you mean? What happened?
Before we could munch the data (or byte the data :) ), the volume of data increased.

How much? Gigabytes? Terabytes?

Data is pouring in from every conceivable direction: from operational and transactional systems, from scanning and facilities management systems, from inbound and outbound customer contact points, from mobile media and the Web. The following facts will help you understand what I am talking about:

  • Wal-Mart handles more than a million customer transactions each day and imports those into databases estimated to contain more than 2.5 petabytes of data.
  • Radio frequency identification (RFID) systems used by retailers and others can generate 100 to 1,000 times the data of conventional bar code systems.
  • Facebook handles more than 250 million photo uploads and the interactions of 800 million active users with more than 900 million objects (pages, groups, etc.), each day.
  • More than 5 billion people are calling, texting, tweeting and browsing on mobile phones worldwide.
  • The Large Hadron Collider at CERN, the European Organization for Nuclear Research, can generate 40 terabytes every second during experiments.

We have officially entered the Big Data era of computing. The hopeful vision of big data is that organizations will be able to harvest and harness every byte of relevant data and use it to make the best decisions. Big data technologies should support not only the ability to collect large amounts of data but, more importantly, the ability to understand that data and take advantage of its full value.

Defining big data
Let’s define big data now. Big data is broadly defined as the capture, management, and analysis of data that goes beyond the typical structured data that relational database management systems can query. It often extends to unstructured files, digital video, images, sensor data, log files, and really any data not contained in records with distinct searchable fields. In some sense, the unstructured data is the interesting data, but it is difficult to fold into BI or draw conclusions from unless it can be correlated to structured data.
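
To make the idea of correlating unstructured data with structured data a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the customer records, the log lines, the `customer=<id>` convention inside the log text, and the `correlate` helper. It is not how any particular big data product works; it only shows that free-form text becomes useful for BI once some key can be pulled out of it and matched against records with searchable fields.

```python
import re

# Structured data: customer records with distinct, searchable fields
# (hypothetical sample rows, standing in for a relational table).
customers = {
    101: {"name": "Asha", "segment": "retail"},
    102: {"name": "Ben", "segment": "wholesale"},
}

# Unstructured data: free-form log lines (invented for illustration).
log_lines = [
    "2012-01-05 10:02:11 customer=101 viewed product page /shoes",
    "2012-01-05 10:03:40 customer=102 search 'bulk paper'",
    "2012-01-05 10:05:02 anonymous visitor hit /home",
    "2012-01-05 10:07:19 customer=101 added item to cart",
]

# Assumed convention: a log line may mention "customer=<numeric id>".
CUSTOMER_ID = re.compile(r"customer=(\d+)")


def correlate(lines, records):
    """Count log events per customer segment by joining on the customer id."""
    events_per_segment = {}
    for line in lines:
        match = CUSTOMER_ID.search(line)
        if match is None:
            continue  # no key to join on, so this line cannot be correlated
        record = records.get(int(match.group(1)))
        if record is None:
            continue  # id is not present in the structured data
        segment = record["segment"]
        events_per_segment[segment] = events_per_segment.get(segment, 0) + 1
    return events_per_segment


if __name__ == "__main__":
    print(correlate(log_lines, customers))  # {'retail': 2, 'wholesale': 1}
```

Notice that the anonymous log line is simply dropped: without something that ties it back to a structured record, it cannot contribute to the per-segment view.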

4 thoughts on “What is Big Data? (Part 1)”

  1. I have always had this perception: data is huge, but probably not all of it is useful. Is the software industry looking at ways to filter the data at the source itself to populate ‘useful data-sinks’? Or would you say that even the filtered data is ‘Big’?

    The ‘Big’ data that is being buzzed about all over seems to refer mostly to unstructured information like Tweets, Facebook updates, Google searches, etc. Do you think even transactional data is being considered ‘Big’? My personal perception is that transactional data, being structured in nature, can always be subjected to better filtering and can always be limited to ‘practical’ sizes within current computing limits.

    And by the way, have you seen any references to applying Big Data computing to problems in agriculture and the like? For weather, I certainly know they use models computed by supercomputers for forecasting… but the weather always plays jokes on the supercomputers :-)
