The Best Data Science Platform

Data science platforms are engines for creating machine-learning solutions. Innovation in this market focuses on cloud, Apache Spark, automation, collaboration and artificial-intelligence capabilities. When choosing the best one, organizations often rely on the Gartner Magic Quadrants, which aim to provide a qualitative analysis of a market and its direction, maturity and participants. Gartner previously called these platforms “advanced analytics platforms”. Because they are used primarily by data scientists, this year the report has been renamed the Magic Quadrant for Data Science Platforms.

This Magic Quadrant evaluates vendors of data science platforms. These are products that organizations use to build machine-learning solutions themselves, as opposed to outsourcing their creation or buying ready-made solutions. Data scientists use these platforms for demand prediction, failure prediction, determination of customers’ propensity to buy or churn, and fraud detection.

The report aims to rank these platforms on ability to execute and completeness of vision. The Magic Quadrant is divided into four parts:

  • Niche Players
  • Challengers
  • Visionaries
  • Leaders

    Figure: Gartner Magic Quadrant for Data Science Platforms
    Source: Gartner (February 2017)

Adoption of open-source platforms and diversity of tools are important characteristics of this market. IBM’s mission to make data simple and accessible to the world, together with its commitment to open source and to numerous open-source ecosystem providers, has made it the most attractive platform for data science. To be successful, a data scientist needs the following, all of which the IBM Data Science Experience (DSX) provides:

  • Community: A data scientist needs to stay up to date with the latest news from the data science community. New open-source packages, libraries, techniques and tutorials become available every day. A good data scientist follows the most important sources and shares opinions and experiments with the community. IBM brings this into the DSX user interface.
  • Open Source: Many companies today rely on open source for data science. Open source has become so mature that it competes directly with commercial offerings. IBM provides the best of open source within DSX, such as RStudio and Jupyter.
  • IBM Value Add: DSX improves on open source by adding capabilities from IBM. Data shaping, for example, takes 80% of a data scientist’s time, and IBM provides tools with a visual GUI to help users perform this task better. You can also execute Spark jobs on the managed Spark service in Bluemix from within DSX.

How Bluemix can help in a Natural Disaster

A few minutes ago, a news headline read “A powerful earthquake has struck south Asia, with tremors felt in northern Pakistan, India and Afghanistan”. Natural disasters are becoming commonplace. Technology can help predict such disasters and can also help in the relief effort afterwards. Based on my involvement in the Uttarakhand and Nepal disaster relief efforts, I want to share how technology can help in post-disaster relief.

Why Cloud?

A cloud-based solution is the natural choice for the following reasons:

  • Location – the cloud datacenters are physically distant from the area of the natural disaster, so applications can keep running even when local power and telecommunications are disrupted.
  • Autoscaling – applications designed for the cloud can scale up automatically to accommodate the sudden spike in usage when a disaster strikes.
  • Support for distributed team development – you won’t be tied to inaccessible physical build and deployment servers if you hit a bug at exactly the wrong time.
  • On-demand pricing – you pay for infrastructure only when it is required, which reduces the cost of the solution. There is no need to keep infrastructure ready, waiting for a disaster to strike.

 

Why Bluemix?

Bluemix offers many out-of-the-box services that can help in this effort, so there is no need to build applications from scratch. A catalog of IBM, third-party and open-source services allows the developer to stitch an application together quickly.

  • Many Node.js libraries are available for implementing pop-up sites such as wikis
  • Language translation with Watson can be helpful for displaced persons whose first language is not English
  • Twilio can integrate with SMS messaging and VoIP phone networks

How can an ETL tool like Information Server help?

The following capabilities of InfoSphere Information Server can be used in disaster management:

  • Data Standardization: A lot of data about locations and disaster victims is passed around. It comes from various sources and can be dirty or unusable. A data standardization service can cleanse the data to remove noise and make it usable.
  • Data Matching: Victim information needs to flow dynamically between the disaster relief team and the victim’s friends and relatives. These two sources need to find each other and exchange information. Probabilistic matching algorithms are essential for bringing the two together.
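
The standardization and matching steps above can be illustrated with a minimal Python sketch. It is not how InfoSphere Information Server works internally: it uses weighted fuzzy string similarity from the standard library as a simplified stand-in for probabilistic matching, and the field names, weights, threshold and sample records are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def standardize(value: str) -> str:
    """Lower-case the text, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", value.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two standardized strings."""
    return SequenceMatcher(None, standardize(a), standardize(b)).ratio()


def match_score(record_a: dict, record_b: dict, weights: dict) -> float:
    """Weighted average of the per-field similarities."""
    total = sum(weights.values())
    return sum(
        weight * similarity(record_a.get(field, ""), record_b.get(field, ""))
        for field, weight in weights.items()
    ) / total


def find_matches(reports, enquiries, weights, threshold=0.8):
    """Return (report, enquiry, score) triples scoring at or above the threshold."""
    matches = []
    for report in reports:
        for enquiry in enquiries:
            score = match_score(report, enquiry, weights)
            if score >= threshold:
                matches.append((report, enquiry, round(score, 2)))
    return matches


# Hypothetical weights: the name counts for more than the village.
WEIGHTS = {"name": 0.6, "village": 0.4}

# Hypothetical sample data: one noisy field report, two enquiries from relatives.
field_reports = [{"name": "  RAMESH  Kumar.", "village": "Kedarnath"}]
family_enquiries = [
    {"name": "Ramesh Kumaar", "village": "kedarnath"},
    {"name": "Sita Devi", "village": "Pokhara"},
]

if __name__ == "__main__":
    for report, enquiry, score in find_matches(field_reports, family_enquiries, WEIGHTS):
        print(report["name"].strip(), "<->", enquiry["name"], score)
```

Comparing every report with every enquiry is fine for a handful of records; a real deployment would need to group candidate records first and tune the threshold against known matches.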

These are some of my thoughts. Please share yours so that others can learn and benefit …

Data Governance: And the winner is…

When an organization runs a strong Information Governance program, it helps ensure that information used for critical decisions in the organization is trusted, particularly from such a central hub as the information warehouse. The information must come from an authoritative source and be known to be complete, timely, and relevant to the people and systems that are involved in making the decision. It must be managed by an Information Steward who can communicate to others about its purpose, usage, and quality. Through communication of Information Governance policy and rules, business terms, and their relationship to the information assets, the information can be clearly understood across the organization.

I was going through The Forrester Wave™: Data Governance Tools, Q2 2014.  IBM has been named a leader and has earned the highest scores for both strategy and market presence.

IBM was judged the Leader based on an evaluation across the following five domains of data governance management:

  1. quality
  2. reference
  3. life-cycle management
  4. security/privacy
  5. metadata

These are the products in the Information Governance story of IBM (with links to my previous blogs on these topics)

 

 

IBM – 21 years of patent leadership

IBMers were granted a record 6,809 U.S. patents in 2013, the 21st consecutive year IBM has led in U.S. patent issuances and the third year in a row with more than 6,000. This year’s total is more than the combined totals of Amazon, Google, EMC, HP, Intel, Oracle/SUN and Symantec. Many of these patents are in strategic areas such as IBM’s Watson, cloud computing, Big Data analytics and the new cognitive computing era.

Top Ten 2013 U.S. Patent Leaders

Further Reading:
IBM sets a new Patenting record in 2012 

IBM and Informatica Lead Gartner Magic Quadrant for Data Integration Tools 2013

Gartner Magic Quadrants are based on product research, interactive briefings with vendors, web-based surveys of customers, Gartner customer enquiries and market share estimates.

When evaluating a Data Integration Suite, Gartner looks at connectivity, modes of interaction (CDC, bulk event), data transformation capabilities, metadata and modelling, design and development, data governance, deployment options, operations and administration, architecture, and service enablement.

I am happy to share the Gartner Magic Quadrant for Data Integration Tools 2013. IBM and Informatica are the leaders in the field of enterprise data integration.
Figure: Gartner Magic Quadrant for Data Integration Tools 2013

IBM’s Strengths

Since I work on IBM InfoSphere, I am focusing on IBM’s strengths.

  1. Breadth of Functionality: ETL, CDC, propagation, data replication, federation, virtualization are all listed.
  2. Installed base and diversity of usage, often cited as the enterprise standard for data integration.
  3. Alignment with EIM trends: on the cutting edge of metadata management, Hadoop support, self-service data preparation and deeper cloud support.

For more details you can read Magic Quadrant for Data Integration Tools.

IBM sets a new Patenting record in 2012

2012 marked 20 consecutive years of patent leadership for IBM.

IBMers earned a record 6,478 U.S. patents in 2012, the 20th consecutive year IBM has led in U.S. patent issuances — and more than the combined totals of Accenture, Amazon, Apple, EMC, HP, Intel, Oracle/SUN and Symantec.

Not only did 2012 mark 20 years of patent leadership, it was also the fifth year in a row IBM had broken its own patent record. Here is a sample of these inventions…

U.S. Patent #8,275,803: System and method for providing answers to questions

This patented invention was implemented in the IBM Watson system and describes a technique that enables a computer to take a question expressed in natural language, understand it in detail, and deliver a precise answer to the question. Read more about the current work in healthcare between IBM Watson and the Cleveland Clinic.

U.S. Patent #8,250,010: Electronic learning synapse with spike-timing dependent plasticity using unipolar memory-switching elements

This patent relates to algorithms and circuits for efficiently mimicking the learning function of the brain’s synapses and lays the foundation for a non-von Neumann computer architecture. IBM is working on a cognitive computing project called Systems of Neuromorphic Adaptive Plastic Scalable Electronics (SyNAPSE), which aims to emulate the brain’s abilities for perception, action and cognition while consuming orders of magnitude less power and volume without being programmed.

U.S. Patent #8,341,441: Reducing energy consumption in a cloud computing environment

This patented invention describes a technique that enables more efficient and effective use of cloud computing resources, thereby reducing energy consumption.

U.S. Patent #8,121,741: Intelligent monitoring of an electrical utility grid

This patent describes a method that uses sensors and intelligent electricity meters to remotely monitor, manage and adjust power usage across an electric grid.

Other notable patents

1911: IBM’s first patent, #998,631, was issued to John Pierce for a Perforating Machine.

1968: IBM Fellow Dr. Bob Dennard was issued patent #3,387,286 for DRAM. He was awarded the US National Medal of Technology in 1988 for the invention.

1981: Gerd Binnig and Heinrich Rohrer were issued patent #4,343,993 for the Scanning Tunneling Microscope, which could visualize individual atoms. They earned the Nobel Prize in 1986 for their breakthrough.

1985: IBM Fellow Dr. Mark Dean was issued patent #4,528,626 for “microcomputer system with bus control means for peripheral processing devices” (the IBM PC).

1998: Master Inventor Dr. Dimitri Kanevsky was issued patent #6,236,968 – “a system capable of keeping a driver awake.” Read more about Dimitri’s work on People for a Smarter Planet.

2010: Bob Friedlander and Jim Kraemer were awarded patent #7,693,663 for a “system and method for detection of earthquakes and tsunamis,” that interfaced with warning systems.

2012 U.S. Patent Leaders*

  1. IBM 6,478
  2. Samsung 5,081
  3. Canon 3,174
  4. Sony 3,032
  5. Panasonic 2,769
  6. Microsoft 2,613
  7. Toshiba 2,447
  8. Hon Hai 2,013
  9. General Electric 1,652
  10. LG Electronics 1,624

*Data provided by IFI CLAIMS Patent Services

You can watch a video on IBM Patent Leadership

IBM Ranks #1 in Social Business

According to IDC, the enterprise social platforms market is expected to reach $4.5 billion by 2016, representing growth of 43 percent over the next four years. Social networking is popular because organizations are looking for ways to adopt social business practices to:

  • Integrate global teams
  • Drive innovation
  • Increase productivity
  • Better reach customers and partners

Organizations are looking for ways to embrace social capabilities to transform their business operations, from marketing to research innovation and human resources. They require tools to gain insight into the enormous stream of information and use it in a meaningful way.

IBM’s social networking platform, IBM Connections, allows for instant collaboration with one simple click and the ability to build social communities both inside and outside the organization to increase customer loyalty and speed business results. IBM Connections is available both on premise and in the cloud.

Due to IBM’s strong presence and wide acceptance, IDC recently ranked IBM number one in worldwide market share for enterprise social software. According to IDC’s analysis of 2011 revenue, IBM grew faster than its competitors and nearly two times faster than the overall market, which grew approximately 40 percent. Read the IDC Report for details.

Some highlights about IBM’s social software:

  • 8 out of the top 10 retailers and banks use IBM social business software
  • More than 1/3 of Fortune 100 companies are working with IBM to become a social business.
  • IDC indicates that IBM is nearly double the size of Jive, our nearest competitor.

For more information, visit www.ibm.com/socialbusiness.