Watson Analytics

Need for Watson Analytics
If an organization is good at analyzing data and extracting relevant insights from it, decision makers can make more informed, and therefore better, decisions. In practice, however, decision makers are often forced to decide with incomplete information. The reason? Decision makers (citizen analysts) are, for the most part, consumers of analytics: they rely on more skilled resources in the organization (such as data engineers, data scientists and application developers) to provide data-driven answers to their questions. Moreover, the answer to one question is often just the start of another; think of a detective interrogating a suspect. This consumer/builder model is hardly conducive to the iterative nature of data analysis. As a result, answers take far too long to reach decision makers, and many questions go unanswered every day.

Watson Analytics
So a logical solution is to provide an easier-to-use analytics offering. Watson Analytics provides that value, so that more people can leverage data and analytics to drive better decision making.

When we think of Watson, we think of cognitive computing. And when we think of analytics, we think of traditional analytics (querying, dashboarding) along with some more advanced capabilities (data mining and social media analytics). Watson Analytics is a cloud-based offering that brings these together and can make analytics child’s play even for a non-skilled user.

Watson Analytics helps users understand their data in a guided way, using a natural language interface to ask a series of business questions. For example, a user can ask “What is the trend of revenue over years?” and get a visualization in response. So, instead of first choosing a visualization and working backwards to try to answer the business question, you describe your intent in natural language and Watson Analytics chooses a suitable visualization for you. Even better, Watson Analytics suggests an initial set of questions, which you can keep refining.
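
To make the idea concrete, here is a minimal Python sketch of the kind of aggregation that a question like “What is the trend of revenue over years?” implies. This is not how Watson Analytics works internally; the `sales` records and the `revenue_by_year` helper are made up purely for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (year, revenue). Not real data.
sales = [
    (2014, 120.0),
    (2014, 80.0),
    (2015, 150.0),
    (2015, 95.0),
    (2016, 210.0),
]


def revenue_by_year(records):
    """Sum revenue per year and return (year, total) pairs sorted by year."""
    totals = defaultdict(float)
    for year, revenue in records:
        totals[year] += revenue
    return sorted(totals.items())


trend = revenue_by_year(sales)
for year, total in trend:
    # A crude text "chart": one '#' per 25 units of revenue.
    print(year, "#" * int(total // 25), total)
```

The point of a natural language interface is that the user never has to write this grouping step or pick the chart type by hand.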

Watson Analytics for Social Media
Watson Analytics can work on social media data to take the pulse of an audience, spotting trends and identifying new insights and relationships across multiple social channels, which gives greater visibility into a given topic or market. It combines structured and unstructured self-service analysis to enrich your social media analytics experience. All on the cloud!

Summary of Steps:
Watson Analytics does the following to surface insights hidden in your big data:

  • Import data from a robust set of data sources (on cloud and on premises), with the option to prepare and cleanse it via IBM Bluemix Data Connect.
  • Answering What: identifying issues, detecting problems early, finding anomalies or exceptions, and challenging conventional wisdom or the status quo.
  • Answering Why: understanding or explaining outcomes, that is, why something happened.
  • Dashboarding to share results.

Match and Manage your Data on Cloud

We left the last blog with two questions.

A few weeks back I wrote on IBM Bluemix Data Connect. If you missed it, then watch this video on how you can put data to work with IBM Bluemix Data Connect.

Now, business analysts can leverage entity matching technology using Data Connect. The Match and Manage (BETA) operation on Data Connect identifies possible matches and relationships (across a plethora of data sets, including master data and non-master data sets) to create a unified view of your data. It also provides a visualization of the relationships between entities in the unified data set.

For example, suppose you have two sets of data: one containing customer profile information and the other containing a list of prospects. A business analyst can now use an intuitive UI to run the Match and Manage operation on these two data sets and get answers to questions such as:

  • Are there duplicates in the prospect list?
  • How many of the prospects are already existing customers?
  • Are there non-obvious relationships among prospects and customers that can be explored?
  • Are there other sources of information that could provide better insights if brought together?
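
The first two questions can be illustrated with a toy Python sketch. It uses simple string similarity from the standard library, which is only a stand-in and not the matching technology inside Data Connect. The sample names, the `normalize`, `similarity`, `existing_customers` and `duplicate_pairs` helpers, and the 0.7 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical sample data, for illustration only.
customers = ["Acme Corporation", "Globex Inc", "Initech LLC"]
prospects = ["ACME Corp.", "Umbrella Group", "Globex Inc", "acme corporation"]

THRESHOLD = 0.7  # assumed cut-off; a real matcher would tune this


def normalize(name):
    """Lowercase, drop punctuation and collapse whitespace."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def existing_customers(prospect_names, customer_names, threshold=THRESHOLD):
    """Prospects whose name closely matches a known customer."""
    return [
        p for p in prospect_names
        if any(similarity(p, c) >= threshold for c in customer_names)
    ]


def duplicate_pairs(names, threshold=THRESHOLD):
    """Pairs of entries in one list that look like the same entity."""
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if similarity(first, second) >= threshold:
                pairs.append((first, second))
    return pairs


print("Already customers:", existing_customers(prospects, customers))
print("Likely duplicates:", duplicate_pairs(prospects))
```

Even this toy version shows why tuning matters: the threshold decides whether “ACME Corp.” and “Acme Corporation” count as the same entity.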

The two data sets are matched using cognitive capabilities, which allow the MDM matching technology to be auto-configured and tuned to match intelligently across different data sets.

A business analyst can understand the de-duplicated data sets by navigating through a relationship graph that shows how the entities are related across the entire data set, and can discover non-obvious relationships within the data that were previously undiscoverable. The generated canvas lets them explore relationships between entities interactively.

The example above illustrates how clients can now easily understand the data they hold within their MDM repositories, and how they can match that MDM data with other data sources not included in the MDM system. This simplifies the analytical MDM experience: MDM technologies become accessible to everyone, without waiting for data engineers to transform the data into a format that can be matched or relying on MDM ninjas to configure matching algorithms.

Summary:

IBM Bluemix Data Connect provides a seamless, integrated self-service experience for data preparation. With the addition of entity analytics capability, business users are empowered to gain insight from data that wasn’t previously available to them. Now organizations can extract further value from their MDM data by ensuring it is used across the organization to provide accurate analytics. Entity analytics within Data Connect is now available in beta. Go ahead and experience the next evolution of MDM.

The 4 Personas for Data Analytics

Due to new modernization strategies, data analytics is architected from the top down, that is, through the lens of the consumers of the data. In this blog, I will describe the four roles that are integral to the data lifecycle. These are the personas who interact with data, uncovering and deploying insights as they explore organizational data.

Citizen analysts/knowledge workers

A knowledge worker is primarily a subject-matter expert (SME) in a specific area of business: for example, a business analyst focused on risk or fraud, a marketing analyst aiming to build out new offers, or someone who works to drive efficiencies into the supply chain. These users do not know where or how data is stored, or how to build an ETL flow or a machine learning algorithm. They simply want to access information on demand, drive analysis from their base of expertise, and create visualizations. They are the users of offerings like Watson Analytics.

Data scientists

Data scientists can do more sophisticated analysis, find the root cause of a problem, and develop a solution based on the insights they discover. They can use SPSS, SAS, etc., or open source tools with built-in data shaping and point-and-click machine learning, to manipulate large amounts of data.

Data engineers

Data engineers focus on enabling data integrations, connections (plumbing) and data quality. They do the underlying enablement that data scientists and citizen analysts depend on. They typically rely on solutions like DataWorks Forge to access multiple data sources and to transform them within a fully managed service.

Application developers

Application developers are responsible for making analytics algorithms actionable within a business process, generally supported by a production system. Beginning with the analytics algorithms built by citizen analysts or data scientists, they work with the final data model representation created by data engineers, building an application that ties into the overall business process. They use something like the Bluemix development platform and the APIs for the individual data and analytics services.

Putting it all together

Imagine a scenario where a citizen analyst notices (from a dashboard) that retail sales are down for the quarter. She pulls up Watson Analytics and uses it to discover that the underlying problem is specific to one category of goods and services in stores in a specific region. But she needs more help to find the exact cause and a remedy.

She engages her data scientist and data engineer. They discuss the need to pull in more data than just the transactional data the citizen analyst already has access to, specifically weather, social, and IoT data from the stores. The data engineer helps create the necessary access; the data scientist can then form and test various hypotheses using different analytic models.

Once the data scientist determines the root cause, he shares the model with the developer, who can then use it to make the company’s mobile apps and websites more responsive in real time to address the issue. The citizen analyst also shares the insight with the marketing department so they can take corrective action.


Lift your Data to Cloud

To stay competitive and reduce cost, several enterprises are realizing the merits of moving their data to the cloud. Thanks to their economies of scale, cloud storage vendors can achieve lower costs. Enterprises also escape the drudgery of capacity planning and of buying, commissioning, provisioning and maintaining storage systems. Data is even protected by replication to multiple data centers, which many cloud vendors provide by default. You can read this blog listing the various advantages of moving data to the cloud.

But now the big challenge is to securely migrate terabytes of enterprise data to the cloud. Months can be spent coming up with an airtight migration plan that does not disrupt your business. And the final migration may also take a long time, adversely impacting the users, applications or customers using the source database.

Innovative data migration

In short, database migration can end up being a miserable experience. IBM Bluemix Lift is a self-service, ground-to-cloud database migration offering from IBM to take care of the above listed needs. Using Bluemix Lift, database migration becomes fast, reliable and secure. Here’s what it offers:

  • Blazing-fast speed: Bluemix Lift helps accelerate data transfer by embedding IBM Aspera technology. Aspera’s patented and highly efficient bulk data transport protocol allows Bluemix Lift to achieve transport speeds much faster than FTP and HTTP. Moving 10 TB of data can take a little over a day, depending on your network connection.
  • Zero downtime: Bluemix Lift can eliminate the downtime associated with database migrations. An efficient change capture technology tracks incremental changes to your source database and replays them to your target database. As a result, any applications using the source database can keep running uninterrupted while the database migration is in progress.
  • Secure: Any data movement across the Internet requires strong encryption so that the data is never compromised. Bluemix Lift encrypts data as it travels across the web on its way to an IBM cloud data property.
  • Easy to use: Set up the source data connection, provide credentials to the target database, verify schema compatibility with the target database engine and hit run. That’s all it takes to kick off a database migration with Bluemix Lift.
  • Reliable: The Bluemix Lift service automatically recovers from problems encountered during data extract, transport and load. If your migration is interrupted because of a drop in network connectivity, Bluemix Lift automatically resumes once connectivity returns. In other words, you can kick off a large database migration and walk away knowing that Bluemix Lift is on the job.
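
To put the “10 TB in a little over a day” figure from the speed bullet in perspective, here is a back-of-the-envelope calculation in Python. It is plain arithmetic, not a Bluemix Lift measurement: it assumes decimal terabytes (10^12 bytes), ignores protocol overhead unless you pass an `efficiency` factor, and both helper functions are hypothetical names introduced for this sketch.

```python
def transfer_hours(size_tb, link_mbps, efficiency=1.0):
    """Hours needed to move size_tb terabytes (10**12 bytes each)
    over a link of link_mbps megabits per second.

    efficiency is the assumed fraction of the link actually used.
    """
    bits = size_tb * 10**12 * 8
    seconds = bits / (link_mbps * 10**6 * efficiency)
    return seconds / 3600


def required_mbps(size_tb, hours):
    """Sustained megabits per second needed to move size_tb in the given hours."""
    bits = size_tb * 10**12 * 8
    return bits / (hours * 3600) / 10**6


for mbps in (100, 1000, 10000):
    print(f"{mbps:>6} Mbit/s -> {transfer_hours(10, mbps):7.1f} hours")

print(f"10 TB in 24 hours needs about {required_mbps(10, 24):.0f} Mbit/s sustained")
```

In other words, moving 10 TB in about a day means keeping roughly a 1 Gbit/s connection close to fully utilized, which is why the quoted time depends so heavily on your network connection.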

Speed, zero downtime, security, ease of use and reliability: these are the hallmarks of a great database migration service, and Bluemix Lift can deliver on all of them. Bluemix Lift makes getting data into a cloud database as easy as selecting Save As –> Cloud. It also provides a jumping-off point for new capabilities that are planned for the future, such as new source and target databases, enhanced automation and additional use cases. Take a look at IBM Bluemix Lift and give it a go.

IBM Bluemix Data Connect

I have been tracking the development of IBM Bluemix Data Connect quite closely. One of the reasons is that I was a key developer on one of the first few services that it launched almost two years back under the name DataWorks. Two weeks back I attended a session on Data Connect by the architect and saw a demo. I am impressed by the way it has evolved since then. Therefore I am planning to revisit DataWorks, now as IBM Bluemix Data Connect. In this blog I will look at the role that IBM Bluemix Data Connect plays in the era of cloud computing, big data and the Internet of Things.

Research from Forrester found that 68 percent of simple BI requests take weeks, months or longer for IT to fulfill due to a lack of technical resources. This means that enterprises must find ways to transform line-of-business professionals into skilled data workers, taking some of the burden off IT. Business users should be empowered to work with data from many sources, both on premises and in the cloud, without requiring the deep technical expertise of a database administrator or data scientist.

This is where cloud services like IBM Bluemix Data Connect come into the picture. It enables both technical and non-technical business users to derive useful insights from data with point-and-click access, whether it’s a few Excel sheets stored locally or a massive database hosted in the cloud.

Data Connect is a fully managed data preparation and movement service that enables users to put data to work through a simple yet powerful cloud-based interface. The design team has taken great pains to keep the solution as simple as possible, so that a basic user can quickly get started with it. Data Connect empowers the business analyst to discover, cleanse, standardize, transform and move data in support of application development and analytics use cases.

Through its integration with cloud data services like IBM Watson Analytics, Data Connect is a seamless tool for preparing and moving data from on premises and off premises to an analytics cloud ecosystem where it can be quickly analyzed and visualized. Furthermore, Data Connect is backed by continuous delivery, which adds robust new features and functionality on a regular basis. Its processing engine is built on Apache Spark, the leading open source analytics project, with a large and continuously growing development community. The result is a best-of-breed solution that can keep up with the rapid pace of innovation in big data and cloud computing.

So here are highlights of IBM Bluemix Data Connect:

  • Allow technical and non-technical users to draw value from data quickly and easily.
  • Ensure data quality with simple data preparation and movement services in the cloud.
  • Integrate with leading cloud data services to create a seamless data management platform.
  • Continuous inflow of new and robust features
  • Best-of-breed ETL solution available on Bluemix – IBM’s Next-Generation Cloud App Development Platform

Cognitive 5 – IBM Watson as a cloud service

After its success on Jeopardy!, IBM Research worked to make this technology open for public use. Later, IBM established a separate business unit for Watson called The Watson Group, with a dedicated workforce to continuously improve Watson’s capabilities. The aim is to bring the power of Watson and cognitive computing to market using cloud delivery models. As a result of those efforts, some of the Watson capabilities are now available on IBM Bluemix. These are available “as a service,” meaning you can use them in your own applications and services, embed them anywhere using Watson APIs and enhance your application capabilities dramatically. This also means that soon you might be able to do all the magic behind the Jeopardy! challenge within your application, with just a couple of clicks!

We can open the Bluemix dashboard and start using these services. Here they are:


AlchemyAPI
Using AlchemyAPI, developers can perform tasks such as extracting the people, places, companies, and other entities mentioned in any publicly-accessible webpage, posted HTML/text document, or a predefined corpus of news articles.

Conversation
Using Conversation APIs, you can add a natural language interface to your application to automate interactions with your end users. Common applications include virtual agents and chat bots that can integrate and communicate on any channel or device.

Document conversion
The IBM Watson Document conversion service converts a single HTML, PDF, or Microsoft Word™ document into a normalized HTML, plain text, or a set of JSON-formatted Answer units that can be used with other Watson services.

Language Translation
It dynamically translates news, patents, or conversational documents and can instantly publish content in multiple languages. As a result, French-speaking staff, for example, can instantly send emails in English.

Watson Personality Insights
Personality Insights derives insights from transactional and social media data to identify psychological traits that determine purchase decisions, intent and behavioral traits. This can be used to improve conversion rates.

Retrieve and Rank service
Watson Retrieve and Rank service helps users find the most relevant information for their query by using a combination of search and machine learning algorithms to detect “signals” in the data. The Retrieve and Rank Service can be applied to a number of information retrieval scenarios. For example, an experienced technician who is going onsite and requires help troubleshooting a problem can use this.

Text to Speech and Speech to Text
The Text to Speech service processes text and natural language to generate synthesized audio output, complete with appropriate cadence and intonation (multiple languages are supported). The Speech to Text service converts the human voice into the written word.

Tone Analyzer
This API leverages cognitive linguistic analysis to identify a variety of tones at both the sentence and document level. It detects three types of tones from text: emotion (anger, disgust, fear, joy and sadness), social propensities (openness, conscientiousness, extroversion, agreeableness, and emotional range), and language styles (analytical, confident and tentative). This insight can then be used to refine and improve communications.

Tradeoff Analytics
This API helps people make better choices while taking into account multiple, often conflicting, goals that matter when making that choice. The service can be used to help make complex decisions like what mortgage to take, and also for helping with more everyday ones like which laptop to purchase.
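
One common way to reason about conflicting goals is to keep only the options that no other option beats on every goal (the Pareto-optimal set). The Python sketch below shows that idea for the laptop example. It does not call the Tradeoff Analytics API and is not claimed to be how the service is implemented; the laptop data and the helper names are invented for illustration.

```python
# Hypothetical laptops: (name, price in dollars, weight in kg, battery hours).
laptops = [
    ("A", 900, 1.8, 8),
    ("B", 1200, 1.2, 10),
    ("C", 1000, 2.0, 7),
    ("D", 1500, 1.3, 9),
    ("E", 700, 2.4, 6),
]


def objectives(laptop):
    """Express a laptop as a tuple in which smaller is always better."""
    _, price, weight, battery = laptop
    return (price, weight, -battery)  # negate battery: more hours is better


def dominates(a, b):
    """True if a is at least as good as b on every goal and strictly better on one."""
    oa, ob = objectives(a), objectives(b)
    no_worse = all(x <= y for x, y in zip(oa, ob))
    strictly_better = any(x < y for x, y in zip(oa, ob))
    return no_worse and strictly_better


def pareto_front(options):
    """Keep only the options that no other option dominates."""
    return [o for o in options if not any(dominates(other, o) for other in options)]


front = pareto_front(laptops)
print([name for name, *_ in front])
```

Laptops C and D drop out because A and B respectively are cheaper, lighter and longer-lasting. Choosing among the remaining three is a genuine trade-off that depends on what matters most to the buyer.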

Using these Watson APIs now you can build cognitive apps that help enhance, scale, and accelerate human expertise. In our next blog, we will explore some of these cognitive apps. Stay tuned.

How Bluemix can help in a Natural Disaster

A few minutes back, the news headline read “A powerful earthquake has struck south Asia, with tremors felt in northern Pakistan, India and Afghanistan”. Natural disasters are becoming commonplace. Technology can help in predicting such disasters and also in the relief effort after a disaster. Based on my involvement in the Uttarakhand and Nepal disaster relief efforts, I want to share how technology can help in post-disaster relief.

Why Cloud?

A solution on the cloud makes sense for the following reasons:

  • Location – the cloud data centers are physically distant from the area of the natural disaster, so applications can keep running even when power and telecommunications in the affected area are disrupted.
  • Autoscaling – applications designed for the cloud can scale up automatically to accommodate the sudden spike in usage in the event of a disaster.
  • Support for distributed team development – you won’t be tied to inaccessible physical build and deployment servers if you hit a bug at exactly the wrong time.
  • On-demand pricing – the infrastructure is used only when it is required, which reduces the cost of the solution. There is no need to keep infrastructure ready, waiting for a disaster to strike.

 

Why Bluemix?

Bluemix offers many out-of-the-box services that can help in this effort, so one need not create applications afresh. A catalog of IBM, third-party, and open source services allows the developer to stitch an application together quickly.

  • Lots of libraries are available for Node.js for implementing pop-up sites like wikis
  • Language translation with Watson can be helpful for displaced persons whose first language is not English
  • Twilio can integrate into SMS messaging and VOIP phone networks

How can an ETL tool like Information Server help?

We can use the following functionalities of InfoSphere Information Server in disaster management:

  • Data Standardization: A lot of data about locations and disaster victims is passed around. It comes from various sources and can be dirty or unusable. A data standardization service can cleanse the data to remove noise and make it usable.
  • Data Matching: Victim information needs to be communicated dynamically between the disaster relief team and the friends and relatives of the victim. These two different sources need to find each other and exchange information. Probabilistic matching algorithms become essential to bring the two together.
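
The two bullets above can be sketched together in a few lines of Python. The sketch standardizes each record and then scores a pair of records with agreement weights in the style of the classical Fellegi-Sunter approach to probabilistic record linkage. It is not the InfoSphere Information Server API or its algorithm; the field names, the m/u probabilities and the sample records are all assumptions chosen for illustration.

```python
import math

# Assumed per-field probabilities (illustrative, not measured):
#   m = P(field agrees | both records describe the same person)
#   u = P(field agrees | records describe different people)
FIELD_PARAMS = {
    "name": {"m": 0.90, "u": 0.05},
    "town": {"m": 0.85, "u": 0.20},
    "age": {"m": 0.80, "u": 0.10},
}


def standardize(record):
    """Trim, lowercase and collapse whitespace in every field."""
    return {key: " ".join(str(value).lower().split()) for key, value in record.items()}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum of log2 likelihood ratios over the compared fields.

    Agreement on a field adds log2(m/u); disagreement adds
    log2((1-m)/(1-u)), which is negative.
    """
    a, b = standardize(rec_a), standardize(rec_b)
    total = 0.0
    for field, p in params.items():
        if a.get(field) == b.get(field):
            total += math.log2(p["m"] / p["u"])
        else:
            total += math.log2((1 - p["m"]) / (1 - p["u"]))
    return total


# Made-up records: one reported by a relative, two registered at a relief camp.
reported_missing = {"name": "  Asha   Rao", "town": "Riverside", "age": 34}
camp_entry = {"name": "asha rao", "town": "RIVERSIDE ", "age": "34"}
other_entry = {"name": "Meera Das", "town": "Riverside", "age": 52}

print("same person? ", round(match_weight(reported_missing, camp_entry), 2))
print("someone else?", round(match_weight(reported_missing, other_entry), 2))
```

A high positive weight suggests the two records describe the same person, while a negative weight suggests they do not. Without the standardization step, the first pair would disagree on every field because of spacing and capitalization alone.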

These are some of my thoughts. Please share yours so that others can learn and benefit …