Data Science Vs BI & Predictive Analytics

Business intelligence (BI) has been evolving for decades as data has become cheaper, easier to access, and easier to share. BI analysts take historical data, perform queries, and summarize findings in static reports that often include charts. The outputs of business intelligence are “known knowns” that are manifested in stand-alone reports examined by a single business analyst or shared among a few managers. For example, a BI report might identify which clients are probably high-net-worth and therefore good candidates for a premium bank account, based on criteria such as average account balance.
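As a rough illustration of that kind of BI query, here is a minimal Python sketch. The client names, balances, and the 100,000 cut-off are all made up for the example; a real bank would pull the history from its warehouse and set its own criteria.

```python
from statistics import mean

# Hypothetical month-end balances per client (illustrative data only).
balances = {
    "client_a": [250_000, 260_000, 270_000],
    "client_b": [4_000, 3_500, 5_000],
    "client_c": [120_000, 90_000, 150_000],
}

# Assumed cut-off for "high net worth"; not taken from any real policy.
THRESHOLD = 100_000


def high_net_worth(balances, threshold):
    """Return clients whose average balance meets the threshold, sorted by name."""
    return sorted(
        name
        for name, history in balances.items()
        if mean(history) >= threshold
    )


candidates = high_net_worth(balances, THRESHOLD)
print(candidates)
```

The result is a static list built from historical data, which is exactly the "known knowns" style of answer described above.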

Predictive analytics has been unfolding on a parallel track to business intelligence. With predictive analytics, numerous tools allow analysts to gain insight into “known unknowns”. These tools track trends and make predictions, but are often limited to specialized programs. In the previous example, predictive analytics could reveal that the spouse of an existing high-net-worth client is also a probable high-net-worth client.

Data science, on the other hand, is an interdisciplinary field that combines machine learning, statistics, advanced analysis, high-performance computing and visualization. It is a new form of art that draws out hidden insights and puts data to work in the cognitive era. The tools of data science originated in the scientific community, where researchers used them to test and verify hypotheses that include “unknown unknowns”. Here are some examples of what data science can do:

  • Uncover totally unanticipated relationships and changes in markets or other patterns. For example, the price of a house may turn out to depend on its nearness to high-voltage power lines or on whether it has a brick exterior.
  • Handle streams of data. In fact, some embedded intelligent services make decisions and carry out those decisions automatically in microseconds. For example, a service can analyze a user's click pattern to dynamically propose a product or promotion that attracts the customer.
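To make the house-price example a little more concrete, here is a minimal sketch of one simple way to screen for relationships: rank candidate features by how strongly they correlate with price. The listings below are invented, and Pearson correlation is only a first-pass screen chosen for this illustration; real data science work would use far more data and more robust techniques.

```python
from math import sqrt

# Hypothetical listings (illustrative only):
# (distance to power line in km, brick exterior 0/1, price in thousands)
listings = [
    (0.2, 0, 210),
    (0.5, 1, 260),
    (1.0, 0, 270),
    (1.5, 1, 320),
    (2.0, 0, 330),
    (3.0, 1, 400),
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


prices = [row[2] for row in listings]
features = {
    "distance_km": [row[0] for row in listings],
    "brick": [row[1] for row in listings],
}

# Score every candidate feature, then rank by strength of the relationship.
scores = {name: pearson(values, prices) for name, values in features.items()}
ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.2f}")
```

In this toy data set, distance from the power line ranks above brick exterior. The point is only that the ranking is driven by the data rather than by a question an analyst thought to ask in advance.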

As discussed, data science differs from traditional business intelligence and predictive analytics in the following ways:

  • It brings in data that is orders of magnitude larger than what previous generations of data warehouses could store, and it even works on streaming data sources.
  • The analytical tools used in data science are also increasingly powerful, using artificial intelligence techniques to identify hidden patterns in data and pull new insights out of it.
  • The visualization tools used in data science leverage modern web technologies to deliver interactive browser-based applications. Not only are these applications visually stunning, they also provide rich context and relevance to their consumers.

Data science enriches the value of data, going beyond what the data says to what it means for your organization—in other words, it turns raw data into intelligence that empowers everyone in your organization to discover new innovations, increase sales, and become more cost-efficient. Data science is not just about the algorithm, but about deriving value.

 

Disclaimer: The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.

 

The Best Data Science Platform

Data science platforms are engines for creating machine-learning solutions. Innovation in this market focuses on cloud, Apache Spark, automation, collaboration and artificial-intelligence capabilities. When choosing the best one, organizations often rely on the Gartner Magic Quadrant, which aims to provide a qualitative analysis of a market and its direction, maturity and participants. Gartner previously called these platforms “advanced analytics platforms”. Because the platforms are primarily used by data scientists, this year the report has been renamed the Magic Quadrant for Data Science Platforms.

This Magic Quadrant evaluates vendors of data science platforms. These are products that organizations use to build machine-learning solutions themselves, as opposed to outsourcing their creation or buying ready-made solutions. Data scientists use these platforms for demand prediction, failure prediction, determination of customers’ propensity to buy or churn, and fraud detection.

The report ranks the platforms on two dimensions: ability to execute and completeness of vision. The Magic Quadrant is divided into four parts:

  • Niche Players
  • Challengers
  • Visionaries
  • Leaders

    [Figure: Gartner Magic Quadrant for Data Science Platforms]
    Source: Gartner (February 2017)

Adoption of open-source platforms and diversity of tools are important characteristics of this market. IBM’s mission is to make data simple and accessible to the world, and its commitment to open source and to numerous open-source ecosystem providers has, in my view, made it the most attractive platform for data science. A data scientist needs the following to be successful, all of which IBM Data Science Experience (DSX) provides:

  • Community: A data scientist needs to stay up to date with the latest news from the data science community. New open-source packages, libraries, techniques and tutorials become available every day. A good data scientist follows the most important sources and shares opinions and experiments with the community. IBM brings this into the DSX user interface.
  • Open Source: Today there are companies that rely on open source for data science. Open source has become so mature that it competes directly with commercial offerings. IBM provides the best of open source within DSX, such as RStudio and Jupyter.
  • IBM Value Add: DSX improves on open source by adding capabilities from IBM. Data shaping, for example, takes 80% of a data scientist's time, and IBM provides tools with a visual GUI to help users perform this task more efficiently. You can also execute Spark jobs on the managed Spark service in Bluemix from within DSX.