The Best Data Science Platform

Data science platforms are engines for creating machine-learning solutions. Innovation in this market focuses on cloud, Apache Spark, automation, collaboration and artificial-intelligence capabilities. When choosing a platform, organizations often rely on the Gartner Magic Quadrant, which aims to provide a qualitative analysis of a market and its direction, maturity and participants. Gartner previously called these platforms “advanced analytics platforms”. Because they are used primarily by data scientists, the report has been renamed this year to the Magic Quadrant for Data Science Platforms.

This Magic Quadrant evaluates vendors of data science platforms. These are products that organizations use to build machine-learning solutions themselves, as opposed to outsourcing their creation or buying a ready-made solution. Data scientists use these platforms for demand prediction, failure prediction, determining customers’ propensity to buy or churn, and fraud detection.

The report aims to rank the platforms on ability to execute and completeness of vision. The Magic Quadrant is divided into four parts:

  • Niche Players
  • Challengers
  • Visionaries
  • Leaders

    Source: Gartner (February 2017)

Adoption of open-source platforms and diversity of tools are important characteristics of this market. IBM’s mission is to make data simple and accessible to the world, and its commitment to open source and to numerous open-source ecosystem providers has made it the most attractive platform for data science. A data scientist needs the following to be successful, all of which IBM Data Science Experience (DSX) provides:

  • Community: A data scientist needs to stay up to date with the latest news from the data science community. New open-source packages, libraries, techniques and tutorials become available every day. A good data scientist follows the most important sources and shares opinions and experiments with the community. IBM brings this into the DSX user interface.
  • Open Source: Many companies today rely on open source for data science. Open source has become so mature that it competes directly with commercial offerings. IBM provides the best of open source within DSX, such as RStudio and Jupyter.
  • IBM Value Add: DSX improves on open source by adding capabilities from IBM. Data shaping, for example, takes 80% of a data scientist’s time, and IBM offers tools with a visual GUI to help users perform this task better. You can also execute Spark jobs on the managed Spark service in Bluemix from within DSX.

IBM and Informatica Lead Gartner Magic Quadrant for Data Integration Tools 2013

Gartner Magic Quadrants are based on product research, interactive briefings with vendors, web-based surveys of customers, Gartner customer inquiries, and market share estimates.

When evaluating a data integration suite, Gartner looks at connectivity, modes of interaction (CDC, bulk, event), data transformation capabilities, metadata and modelling, design and development, data governance, deployment options, operations and administration, architecture, and service enablement.

I am happy to share the Gartner Magic Quadrant for Data Integration Tools 2013. IBM and Informatica are the leaders in the field of enterprise data integration.

IBM’s Strengths

Since I work on IBM InfoSphere, I am focusing on IBM’s strengths.

  1. Breadth of functionality: ETL, CDC, propagation, data replication, federation, and virtualization are all listed.
  2. Installed base and diversity of usage: often cited as the enterprise standard for data integration.
  3. Alignment with EIM trends: on the cutting edge of metadata management, Hadoop support, self-service data preparation and deeper cloud support.

For more details, you can read the Magic Quadrant for Data Integration Tools.

InfoSphere Information Server #1 in market share for data integration tools

I have been working on the InfoSphere Information Server development team since December 2005 (it was known as DataStage in those days). Gartner recently announced that InfoSphere is the leader in every sub-category of Information Integration and Governance: Data Integration, Data Quality, Master Data Management, Database Archiving, and Database Security. In a recent publication, the analyst firm cited IBM, with InfoSphere Information Server, as number one in market share for data integration tools. Last month, Gartner also cited InfoSphere MDM as the leader in the customer market with a 40 percent share, twice that of the nearest competitor.

Finally, with an incredible 76 percent market share, IBM InfoSphere Optim was cited as number one in the database archiving market in a new Gartner publication, “Market Trends: World, Database Archiving Market Continues Rapid Growth” and a companion report, “Solid Vendor Solutions Bolster Database Archiving Market.”

According to Gartner, the database archiving market is projected to grow at a compound annual growth rate (CAGR) of more than 20 percent from 2009 to 2014, proving that IBM’s Information Integration and Governance sales plays are on target for our clients! In the reports, IBM dwarfed the three other leading vendors in data archiving market share, with Informatica ranking second at 16.2 percent, Solix third at 3.5 percent, and HP fourth at 2.6 percent. IBM was also ahead in share of total customers, at 71 percent, with Informatica at 14.2 percent, HP at 8.4 percent, and Solix at 6.4 percent of the total customer base.
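To put the growth projection in perspective, here is a minimal sketch of what a 20 percent CAGR over the five years from 2009 to 2014 compounds to. The function names are my own, the 2009 market size is normalized to 1.0 because the post quotes no absolute figures, and 20 percent is only the lower bound of Gartner's "more than 20 percent".

```python
def compound_growth(start_value: float, rate: float, years: int) -> float:
    """Return the value after compounding `rate` once a year for `years` years."""
    return start_value * (1 + rate) ** years


def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate implied by a start and end value."""
    return (end_value / start_value) ** (1 / years) - 1


# Assumption: market size in 2009 is normalized to 1.0 (no absolute figure is given).
# Assumption: exactly 20 percent per year, the lower bound of the projection.
size_2014 = compound_growth(1.0, 0.20, 2014 - 2009)
print(f"Relative market size in 2014: {size_2014:.2f}x")

# Going the other way recovers the annual rate from the two endpoints.
print(f"Implied CAGR: {cagr(1.0, size_2014, 5):.1%}")
```

Under these assumptions, the market would be roughly two and a half times its 2009 size by 2014.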

Closing Thought: IBM differentiates itself from the competition in the field of information integration. Why do we need information integration after all? Will it solve my real-life problems? Will Harry marry Sally? Stay tuned… 🙂

Disclaimer: The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.