What is Business Intelligence?

In my last set of posts, I described the journey of data to a Data Warehouse. It started with an ETL job that can extract data from different types of sources (z/OS, SAP or custom) and, on the way, transform and cleanse it so that it lands in a final store as “Trusted Information” (which by definition means accurate, complete, insightful and real-time). So why was all this effort made? One answer I already gave is compliance: that much effort is needed, and more or less sufficient (at least from an IT perspective), to ensure that our records are proper. But beyond that, such data can often be used to gain further valuable insights. This is where BI (pronounced “Bee Eye”), or Business Intelligence, comes into the picture.

What is BI?
Business intelligence (BI) is defined as the ability of an organization to take all its capabilities and convert them into knowledge: ultimately, getting the right information to the right people, at the right time, to make the right decisions. These decisions drive organizations. Making a good decision at a critical moment may lead to a more efficient operation, a more profitable enterprise, or perhaps a more satisfied customer. BI tools and processes working on trusted data provide a safer basis for a decision than a “gut feeling”.

Where does BI Apply?

  • BI can be used to segment customers and identify the best ones. We can analyze data to understand customer behaviors, predict their wants and needs, and offer fitting products and promotions. Finally, we can identify the customers at greatest risk of attrition, and intervene to try to keep them.
  • The human resources department can learn which individuals in an organization are high performers and then hire, train, and reward other employees to become similar high performers.
  • Inventory managers can segment their inventory items by cost and velocity, build key facilities in the best locations, and ensure that the right products are available in the right quantities.
  • Production can minimize its costs by setting up activity-based costing programs.
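As a toy illustration of the customer-attrition point above, here is a minimal sketch in Python. The field names and thresholds are invented for illustration only, not taken from any BI product:

```python
# Toy sketch of flagging customers at risk of attrition.
# Field names and thresholds are hypothetical.

def attrition_risk(days_since_last_purchase, purchases_last_year):
    """Classify a customer's attrition risk from simple activity signals."""
    if days_since_last_purchase > 180 and purchases_last_year < 3:
        return "high"
    if days_since_last_purchase > 90:
        return "medium"
    return "low"

customers = [
    {"id": "C1", "days_quiet": 200, "purchases": 1},
    {"id": "C2", "days_quiet": 30, "purchases": 12},
    {"id": "C3", "days_quiet": 120, "purchases": 5},
]

# Customers we should intervene with before they leave
at_risk = [c["id"] for c in customers
           if attrition_risk(c["days_quiet"], c["purchases"]) == "high"]
print(at_risk)  # → ['C1']
```

In practice the segmentation would come from a statistical or machine-learning model over the warehouse data, but the shape of the answer is the same: a ranked list of customers to intervene with.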

Can BI make every business decision accurate?
BI only assists in making a proper decision; in places, intuition may still be required. What if we do not have enough time to run our tools and get a report before making a decision? What if we have no precedent data to base a decision on, or that history is misleading?

Further…
So does BI just munch the trusted data and give you some gyan (a Sanskrit word for insight)? Not really. There are two additional things it should do: measure the results against predetermined metrics, and feed the lessons from one decision into the next.

Finally…
These are tidbits gathered from my reading about BI from various sources. I welcome readers to share their own understanding or point me to more interesting reads in this upcoming area.

Content Analytics

Content analytics is an emerging field of analytics that enables companies to unlock the insights contained in unstructured content. This unstructured content can include forms, documents, comment fields in databases, web pages, customer correspondence, and other information that is not stored within structured data fields. Content analytics offers the ability to access, sort, and analyze content, and then combine it with structured data and other existing information resources and applications, for reporting and analytics.

Content analytics is a natural extension of business intelligence. Many organizations already use business intelligence for “data-driven decision-making.” This decision-making process is based on insights gleaned from records of past transactions and other structured information typically housed in data warehouses. Organizations can supplement these business intelligence methods with content techniques, which can be used to expose trends within unstructured content. For example, organizations can analyze content to address key business problems such as the following:

  • Identify fraudulent claims based on the content of insurance claims forms.
  • Measure and monitor customer-service metrics based on an analysis of text in call-center records.
  • Plan product-release priorities based on an analysis of warranty records.
  • Develop a winning competitive-selling strategy based on an analysis of text in competitor filings and win/loss data.
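To make the first of these concrete, here is a minimal sketch of combining unstructured claim text with a structured amount to flag claims for review. The phrases and threshold are invented examples; real content analytics engines are far more sophisticated:

```python
# Minimal sketch: flag insurance claims whose free-text description
# contains suspect phrases AND whose (structured) amount is high.
# Phrases and threshold are invented for illustration.

SUSPECT_PHRASES = {"total loss", "no witnesses", "cash settlement"}

def flag_claim(description, amount):
    """True when the text contains a suspect phrase and the amount is large."""
    text = description.lower()
    return any(p in text for p in SUSPECT_PHRASES) and amount > 10000

claims = [
    ("Rear-ended at a stop light, police report filed", 4200),
    ("Vehicle was a total loss, no witnesses present", 15500),
]
flagged = [amt for desc, amt in claims if flag_claim(desc, amt)]
print(flagged)  # → [15500]
```

The key idea is the joining of the two worlds: a text-derived signal sits alongside a structured field, and the decision uses both.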

Providing relevant Social Media Analytics

Social media is slowly becoming a very prevalent place where individuals, or groups of individuals, express their opinions on the whole gamut of tools, services, latest gizmos and ad campaigns, or on an organization as a whole. An organization should be aware of what consumers, customers and competitors are saying about it, so it can decide its future course of action and remain competitive in the marketplace. For example, suppose negative chatter starts on the launch of a new product. It could be misinformation, but the chatter has the potential to damage the product’s sales, and many consumers may form their opinions from it. So it would be useful for an organization to scan social media and get answers to questions like the following:

  • How do consumers feel about our new product or ad campaign?
  • What are consumers hearing about our brand [brand reputation]?
  • What are the most talked-about product attributes in my product category [for example, for my smartphone, are people talking about the screen, battery life or camera]? Is the talk good or bad?
  • What is my competitor doing to excite the market [competitive analysis]?
  • Are my business partners helping or hurting my reputation?
  • Is there a negative chatter that my PR team should respond to?

Cognos Consumer Insight (CCI) does exactly this. Typically it does not crawl the data itself; it uses the service(s) of known social media crawlers (like BoardReader) to fetch social media content. This data arrives in JSON format, and CCI processes it using Hadoop from BigInsights. It applies sentiment analysis (using SystemT) and does the following:

  • Performs pattern matching, based on the input keywords, and then looks for sentiments
  • Checks for positive, negative and neutral words/phrases (handling grammar, slang, typos, synonyms, etc.)
  • Saves the results for further processing by the visualization/search engine.
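CCI relies on SystemT for the actual sentiment extraction; purely to illustrate the idea of scoring text as positive, negative or neutral, here is a naive word-list sketch (the word lists are invented, and a real engine handles grammar, slang, typos and synonyms as listed above):

```python
# Naive word-list sentiment scoring, only to illustrate the idea.
# Real engines (like SystemT) are far more linguistically aware.

POSITIVE = {"great", "love", "excellent", "awesome"}
NEGATIVE = {"terrible", "hate", "broken", "awful"}

def sentiment(text):
    """Score a snippet as positive, negative or neutral by word counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love the battery life"))  # → positive
print(sentiment("the camera is terrible"))   # → negative
```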

Finally, it presents the sentiment analysis through the Visualization Engine. CCI’s job is done at that point, but the story does not end there: we now know the customer sentiment as of today. Organizations may also need the ability to understand the key factors driving that sentiment, and IBM SPSS is a tool for that. Once there is a fair prediction, we need to act on it and monitor the results; IBM Coremetrics and Unica are the tools used for that.

Data Governance – IV (Achieving Data Privacy using Data Masking)

Data Privacy Threat in non-production environments
As I mentioned in my initial posting, security and privacy are among the goals of Data Governance. Data privacy protection is a tremendous focus for the IT community today. Organizations are making great strides to protect sensitive data in live application environments, but the “untold story” of implementing protection strategies in non-production (testing, development and training) environments remains a critical risk. As data breach headlines continue to mount, organizations must begin to address the most vulnerable areas of the IT infrastructure: non-production environments.

So, what makes non-production environments so unique?
The answer lies in the methods used to create non-production databases. Commonly, live production systems are cloned (copied) to a test environment, confidential data and all. Developers and QA testers find it easy to work with live data because it produces test results that everyone can understand, but this poses a great threat to data privacy. What if a developer or tester accidentally shares this data with another customer while trying to reproduce a scenario?

Solution – Data Masking
The solution lies in understanding that non-production environments do not actually require live data. Realistic data is essential to testing, but the live data values themselves are not. Capabilities for “de-identifying” or masking production data offer a best-practice approach for protecting sensitive data while supporting the testing process.

Data masking is the process of systematically transforming confidential data elements such as trade secrets and personally identifying information (PII) into realistic but fictionalized values. Data that has been scrubbed or cleansed in such a manner is considered acceptable to use in non-production environments. Masking enables developers and QA testers to use “production-like” data and produce valid test results, while still complying with privacy protection rules.
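To illustrate the concept, here is a minimal sketch of deterministic masking: the same real value always maps to the same fictional value, so referential integrity across tables is preserved. The helper names and format rules are my own inventions; InfoSphere Optim provides far richer, format-aware routines:

```python
# Deterministic masking sketch: the same input always yields the same
# fictional value, preserving joins across tables in the test environment.
# Helper names and formats are invented for illustration.
import hashlib

def _digest(value, salt="demo-salt"):
    """Salted hash used as a repeatable source of fictional values."""
    return hashlib.sha256((salt + value).encode()).hexdigest()

def mask_name(name):
    """Replace a real name with a repeatable fictional alias."""
    return "USER_" + _digest(name)[:8].upper()

def mask_ssn(ssn):
    """Produce a fictional SSN in the same NNN-NN-NNNN format."""
    d = str(int(_digest(ssn), 16))[:9]
    return f"{d[0:3]}-{d[3:5]}-{d[5:9]}"

alias = mask_name("Alice Smith")
print(alias, mask_ssn("123-45-6789"))
```

Because the mapping is deterministic, a customer masked in one table matches the same masked customer in another, so test queries and joins still behave realistically.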

Challenges in Data Masking
Data masking represents a simple concept, but it is technically challenging to execute. Most organizations operate within complex, heterogeneous IT environments consisting of multiple, interrelated applications, databases and platforms. IT managers do not always know where confidential data is stored or how it is related across disparate systems. The ideal solution must both discover sensitive data across related data stores and mask it effectively.

The IBM® InfoSphere™ Optim™ Data Masking solution provides comprehensive capabilities for masking sensitive data effectively across non-production environments, while still providing realistic data for use in development, testing or training. When you use InfoSphere Optim to mask confidential data, you protect privacy and safeguard shareholder value.

Optim data masking capability is available through IBM InfoSphere Information Server. As of v9.1, InfoSphere Information Server ships with a Data Masking stage (which includes the pieces required for integrating with Optim Data Privacy).

Further Information:
Installing and configuring Optim Data Privacy Providers for IBM InfoSphere DataStage Data Masking stage

Data Governance – III (The need for Data Governance)

In my previous blogs, I mentioned Data Governance. In this blog, I will share my understanding of why Data Governance is required.

A goal of data governance is to have fewer negative events as a result of poor data. Poor information quality can lead to adverse events both inside and outside the walls of the company, even when those events seem unrelated. For example:

  • Marketing becomes counterproductive. Marketing may inadvertently barrage certain customers with multiple mailers because of near-duplicate customer records.
  • Inability to ship products on time due to incorrect inventory levels. The inventory data is unstructured and not standardized, so one may not know the exact inventory level: too high and you’re stuck with extra inventory; too low and you’re not able to deliver products/services on time.
  • Your company buys equipment that you already have in inventory. Because of data inconsistencies, a simple equipment check showed no available transformers, which appeared in the system as “trnsfmr”. Spending money when you don’t have to is bad enough.
  • The IT team spends millions on a new ERP or CRM system. However, because of poor data governance and poor data quality, the system is unusable. Business users can’t get the information they need and lose faith in the value of the IT team.
  • Lack of understanding of your biggest customer causes you to lose that customer.
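The “trnsfmr” example above is essentially a fuzzy string-matching problem. As a small sketch of how such abbreviated item names can be caught before a duplicate purchase, Python’s standard-library difflib is enough (the catalog and similarity cutoff here are invented):

```python
# Fuzzy matching can catch abbreviated item names like "trnsfmr"
# before a duplicate purchase. difflib is in the standard library.
import difflib

inventory = ["transformer", "circuit breaker", "insulator"]

def find_item(query, catalog, cutoff=0.6):
    """Return the closest catalog entry to a (possibly garbled) query."""
    matches = difflib.get_close_matches(query.lower(), catalog, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(find_item("trnsfmr", inventory))  # → transformer
print(find_item("xyz", inventory))      # → None
```

Production data quality tools use far more robust matching and standardization, but the principle is the same: normalize and compare names before concluding that an item does not exist.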

But with proper Data Governance in place, the following can be achieved to a much greater extent:

  • Improve the level of trust that users have in reports
  • Ensure consistency of data across multiple reports from different parts of the organization
  • Ensure appropriate safeguards over corporate information to satisfy the demands of auditors and regulators
  • Improve the level of customer insight to drive marketing initiatives
  • Directly impact the three factors an organization most cares about: increasing revenue, lowering costs, and reducing risk
  • Designate accountability for information quality

Data Governance – I (Basics)

Data governance is a set of processes that ensures that important data assets are formally managed throughout the enterprise. Data governance ensures that data can be trusted and that people can be held accountable for any adverse event that happens because of poor data quality. So Data Governance is about putting people in charge of fixing and preventing issues with data, so that the enterprise can become more efficient.

Data governance encompasses the people, policies, and technology required to create consistent and proper handling of an organization’s data across the business enterprise.

  • People – Effective enterprise data governance requires executive sponsorship as well as a firm commitment from both business and IT staffs.
  • Policies – A data governance program must create – and enforce – what is considered “acceptable” data through the use of business policies that guide the collection and management of data.
  • Technology – Beyond data quality and data integration functionality, an effective data governance program uses data synchronization technology, data models, collaboration tools and other components that help create a coherent enterprise view.

The benefits of a holistic approach are obvious: better data drives more effective decisions across every level of the organization. With a more unified view of the enterprise, managers and executives can create strategies that make the company more profitable.

Cloud Computing – II (Use Cases)

In my last blog, I explained what Cloud computing is. Now one may say: it sounds fine, and yes, there is a lot of hype around it, but what real-world problem does it solve? So in this blog I will share some of my understanding of what Cloud computing can do. This is definitely not an exhaustive list; it is just my understanding of the benefits that moving to the Cloud may bring.

  • Imagine that a big vendor is selling their services through a Web portal. They need access to servers and infrastructure to support the biggest peaks in demand (say, income tax filing, which happens just once a year), but most of the time they can manage with a smaller capacity. In a typical scenario, they would have to buy the full capacity needed during peak times. With cloud computing, they pay for peak capacity only when they use it. Cloud computing makes it possible to scale the resources available to the application.
  • A start-up business needs to put out an advertising campaign. They are not sure how people will respond to it. What if it works a bit too well and jams the servers? Cloud computing helps here: they do not have to know (and buy) the full capacity they might need at peak time.
  • What if a Web site gets a lot of visitors five days a week, but is almost dead on weekends? Or the service infrastructure is used during work hours and sits idle the rest of the day? With cloud computing, customers can pay for the capacity they need (and use) only on weekdays.
  • There is a lot of talk about going “Green”. How is that achieved through the Cloud? Will the Cloud bring rain to make things green? Not quite. But because the data centers that run the services are huge, and share resources among a large group of users, the infrastructure costs (electricity, buildings, and so on) are lower.
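A back-of-the-envelope sketch of the pay-per-use point above, using invented hourly rates and an invented traffic profile:

```python
# Compare provisioning for the peak year-round versus paying per
# server-hour actually used. All numbers are invented for illustration.

HOURS_PER_YEAR = 365 * 24  # 8760

def fixed_cost(peak_servers, rate_per_server_hour):
    """Own enough servers for the peak and run them all year."""
    return peak_servers * rate_per_server_hour * HOURS_PER_YEAR

def cloud_cost(usage_profile, rate_per_server_hour):
    """Pay only for the server-hours actually consumed."""
    return sum(servers * hours for servers, hours in usage_profile) * rate_per_server_hour

# 100 servers for a two-week filing peak, 10 servers the rest of the year
profile = [(100, 14 * 24), (10, HOURS_PER_YEAR - 14 * 24)]
print(fixed_cost(100, 0.10))      # cost of provisioning for the peak
print(cloud_cost(profile, 0.10))  # pay-per-use cost
```

With these made-up numbers the pay-per-use bill is a fraction of the provision-for-peak bill; the exact ratio depends entirely on how spiky the demand really is.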
Disclaimer: The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions

Cloud Computing – I (Fundamentals)

What is cloud computing?

Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources, such as networks, servers, storage, applications, and services, that can be rapidly provisioned and released with minimal management effort or service provider interaction.

The flexibility of cloud computing is a function of the allocation of resources on demand. This facilitates the use of the system’s cumulative resources, negating the need to assign specific hardware to a task. Before cloud computing, websites and server-based applications were executed on a specific system. With the advent of cloud computing, resources are used as an aggregated virtual computer. This amalgamated configuration provides an environment where applications execute independently without regard for any particular configuration.

Advantages of being on the cloud

There are valid and significant business and IT reasons for the cloud computing paradigm shift. The fundamentals of outsourcing as a solution apply.

  • Reduced cost: Cloud computing can reduce both capital expense (CapEx) and operating expense (OpEx) costs because resources are only acquired when needed and are only paid for when used.
  • Refined usage of personnel: Using cloud computing frees valuable personnel allowing them to focus on delivering value rather than maintaining hardware and software.
  • Robust scalability: Cloud computing allows for immediate scaling, either up or down, at any time without long-term commitment.

Cloud formations
There are three types of cloud formations: private (on premise), public, and hybrid.

  • Public clouds are available to the general public or a large industry group and are owned and provisioned by an organization selling cloud services. A public cloud is what is thought of as the cloud in the usual sense; that is, resources dynamically provisioned over the Internet using web applications from an off-site third-party provider that supplies shared resources and bills on a utility computing basis.
  • Private clouds exist within your company’s firewall and are managed by your organization. They are cloud services you create and control within your enterprise. Private clouds offer many of the same benefits as public clouds, the major distinction being that your organization is in charge of setting up and maintaining the cloud. According to an SFGate article, IBM’s cloud offerings ranked first among developers targeting private clouds.
  • Hybrid clouds are a combination of the public and the private cloud using services that are in both the public and private space. Management responsibilities are divided between the public cloud provider and the business itself. Using a hybrid cloud, organizations can determine the objectives and requirements of the services to be created and obtain them based on the most suitable alternative.


Types of Cloud Computing


IaaS enables IT infrastructure (servers, virtual machines, storage, networks, and operating systems) to be leased from a cloud provider with a payment system as clients use it.

PaaS is a cloud computing service that delivers an on-demand environment for the development, testing, delivery, and management of software applications.

SaaS is a method of delivering software for applications through the Internet on demand, and usually on a subscription basis.


To be considered a cloud, a technology model must possess these five characteristics:

  • On-demand self-service, meaning that anyone with a browser can subscribe to the service.
  • Measured service, meaning that monitoring capabilities allow providers to offer service by subscription, pay-per-use, or other pricing models.
  • Elastic scalability, which means that cloud subscribers can adjust computing resources as they see fit.
  • Resource pooling, which means that virtualized storage, servers, and networks are pooled together at a single location or across many locations to create a virtually infinite supply of resources.
  • Broad network access.

The cloud can be deployed in one of three models: the public cloud, which allows you to pay only for the resources you use; the private cloud, which runs on dedicated IT only; and the hybrid cloud, which combines the scalability of the public model with the security of the private model.

A lot of channels on YouTube are worth exploring:

  • The IBM Cloud channel on YouTube has a short series of videos for getting started with cloud computing.
  • The IBM WebSphere Education channel on YouTube provides a sampling of content from their course catalog, as taught by IBM instructors. There are over 25 videos available covering WebSphere Application Server system administration, SOA, BPM, and more.
  • IBM Education Assistant also has its own YouTube channel.

Understanding Business Analytics through Use Cases

In my last post, I mentioned Data Integration. Once we have this data, what are some of the powerful things we can do with it? Let me share what I have been hearing about customers doing with this Big Data.

Predictive analytics makes it possible to understand what has happened in the past, anticipate what may happen next, and then take appropriate and timely action.

Security: Clever new data analytics software can sift through vast quantities of data to anticipate security threats and breaches, such as an employee trying to access unauthorized information.

Education: Analytics is helping educators measure and monitor student success, report results and keep students on the right track.

Doctors: Doctors use analytics to make better diagnosis and treatment decisions, develop new drugs, and predict health issues before they happen, by crunching data in days and weeks instead of months and years.

Insurance Companies: Insurance Companies use analytics to see patterns in billions of claims, and identify the few that are fraudulent.

Customer Sentiment: Businesses are discovering the value in performing text analytics on data and online conversations derived from Twitter, Facebook and other social forums to determine how companies or their products are faring among consumers.

Police: Everyone knows good police work relies on good information. The NYPD is using business intelligence to access and analyze billions of records and zero in on criminal suspects within minutes. This allows the police to consistently make the best possible decisions, leading to more accurate arrests and less crime.

Smarter Energy: For Vestas Wind Systems, choosing the best location for its wind turbines so that they can produce enough electricity is a critical task. Now, thanks to analytics that can quickly sift through weather reports, moon and tidal phases, sensor data, satellite images and other data, Vestas is able to pinpoint the best location and meet all of its energy goals.

How does it work?
The underlying technology works in three stages.

  • First, advanced analytics algorithms using search and index technologies begin sifting through the different pieces of information for gems of intelligence.
  • Second, the information is correlated and analyzed for patterns more than 200 times a second – faster than a hummingbird can flap its wings!
  • Third, this advanced analysis is quickly turned into insight that is used to determine which actions will drive the best results!

For more information:
From Information to Analytics: The IBM Story
