What's new in IBM InfoSphere Information Server 11.7 – Part 3

In my last blog, we discussed Information Governance Catalog (IGC). In this blog, I want to touch upon some new Information Governance features that were introduced along with the new look and feel in IBM InfoSphere Information Server version 11.7.

Enterprise Search

Social Collaboration
InfoSphere Information Server also brings social collaboration to the domain of Information Governance. When you browse your data, you sometimes want to know what other experts think about critical assets such as reports, source files, and more. Now this is possible: you can rate an asset on a scale of one to five stars and leave a short comment. This enables all members of your organization to collaborate and share their expertise right where it’s needed. Also remember that the more popular an asset is, the higher its position on the search results list.

Searching for assets
With 11.7, searching for assets has become very easy. You don’t need to know anything about the data in your enterprise to explore it. Let’s assume that you want to find information about bank accounts. Simply type ‘bank account’ in the search field in enterprise search, and that’s it. The search engine looks for the information in all asset types. It takes into account factors like text match, related assets, ratings and comments, modification date, quality score, and usage. If you are already familiar with your organization and are looking for something more specific, just open the catalog with your data and select the asset types that you want to browse. To narrow down search results, apply advanced filters like creation and modification dates, stewards, labels, or custom attributes.
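To make the idea of combining several ranking factors concrete, here is a minimal sketch in Python. It is not IBM's ranking algorithm, which this post does not describe. The weights, the `Asset` fields, and the sample assets are all assumptions chosen for illustration, and only four of the factors listed above are modelled.

```python
from dataclasses import dataclass

# Hypothetical weights: the real ranking formula is not documented in this post.
WEIGHTS = {"text_match": 0.5, "rating": 0.2, "quality": 0.2, "usage": 0.1}


@dataclass
class Asset:
    name: str
    text_match: float  # 0.0 to 1.0, how well the query matches the asset
    avg_rating: float  # average star rating, 1 to 5
    quality: float     # quality score, 0.0 to 1.0
    usage: float       # relative usage, 0.0 to 1.0


def score(asset: Asset) -> float:
    """Combine the factors into one number; star ratings are scaled to 0..1."""
    rating = (asset.avg_rating - 1) / 4
    return (WEIGHTS["text_match"] * asset.text_match
            + WEIGHTS["rating"] * rating
            + WEIGHTS["quality"] * asset.quality
            + WEIGHTS["usage"] * asset.usage)


def rank(assets: list[Asset]) -> list[Asset]:
    """Return the assets ordered from most to least relevant."""
    return sorted(assets, key=score, reverse=True)


# Two made-up assets that match the query equally well but differ in rating.
results = rank([
    Asset("BANK_ACCOUNT table", 0.9, 2.0, 0.6, 0.3),
    Asset("Bank Account term", 0.9, 5.0, 0.6, 0.3),
])
```

In this sketch the better-rated asset comes first even though both match the text equally well, which mirrors the point that popular assets move up the results list.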

Unstructured data sources
The data in your enterprise consists of databases, tables, columns, and other sources of structured data. What about email messages, word-processing documents, audio or video files, collaboration software, or instant messages? They are also a very valuable source of information. To support a unified approach to enterprise information management, IBM StoredIQ can now be set up to synchronize data with IBM Information Governance Catalog. So now you can classify such information in IGC too.

Exploring Relationships
Data in large organizations can be very complex, and assets can be related to one another in multiple ways. To understand these complex relations better, explore them in graphical form by using the graph explorer. By default, this view displays all relationships of the one asset that you select. But this is just the starting point, as you can further expand the relationships of the related assets in the same view. Having all this information in one place in a graphical format makes it a lot easier to dig into the structure of your data. Each relationship has a direction and a name. You’ll be surprised when you discover how assets are connected!
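The expand-as-you-go behaviour can be pictured as a breadth-first walk over named, directed relationships. The sketch below is only an illustration of that idea, not how the graph explorer is implemented. The asset names, relationship names, and the `explore` function are made up.

```python
from collections import deque

# Each relationship is (source, name, target): it has a direction and a name.
# All asset and relationship names below are invented for illustration.
RELATIONSHIPS = [
    ("Customer term", "is assigned to", "CUSTOMER table"),
    ("CUSTOMER table", "contains", "CUST_ID column"),
    ("Load job", "writes to", "CUSTOMER table"),
    ("Sales report", "reads from", "CUSTOMER table"),
]


def neighbors(asset):
    """Return the relationships that touch the asset, in either direction."""
    return [r for r in RELATIONSHIPS if asset in (r[0], r[2])]


def explore(start, depth):
    """Breadth-first expansion: collect relationships within `depth` hops."""
    seen_assets = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        asset, hops = queue.popleft()
        if hops == depth:
            continue  # do not expand beyond the requested depth
        for rel in neighbors(asset):
            if rel not in found:
                found.append(rel)
            other = rel[2] if rel[0] == asset else rel[0]
            if other not in seen_assets:
                seen_assets.add(other)
                queue.append((other, hops + 1))
    return found
```

Calling `explore("Customer term", 1)` corresponds to the default view of one asset, and raising `depth` corresponds to expanding the relationships of its neighbours.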

To have a look at the new Information Governance Catalog, view this video.



Information Governance – Revisited

It has been more than 5 years since I wrote about Information Governance. Over the last 5 years, some areas of Information Governance have become more mature, so I thought of revisiting this topic. In a simple analogy, what a library does for books, data governance does for data. It organizes data, makes it simple to access, gives the means to check its validity and accuracy, and makes it understandable to all who need it. With Information Governance in place, organizations can use data to generate insights, and they are also equipped for regulatory mandates (like GDPR).

There are six sets of capabilities that make up the Information Management & Governance component:

1. Data Lifecycle Management is a discipline that applies not only to analytical data but also to operational, master and reference data within the enterprise.  It involves defining and implementing policies on the creation, storage, transmission, usage and eventual disposal of data, in order to ensure that it is handled in such a way as to comply with business requirements and regulatory mandates.

2. MDM: Master and Entity Data acts as the ‘single source of the truth’ for entities – customers, suppliers, employees, contracts etc.  Such data is typically stored outside the analytics environment in a Master Data Management (MDM) system, and the analytics environment then accesses the MDM system when performing tasks such as data integration.

3. Reference Data is similar in concept to Master and Entity Data, but pertains to common data elements such as location codes, currency exchange rates etc., which are used by multiple groups or lines of business within the enterprise.  Like Master and Entity Data, Reference data is typically leveraged by operational as well as analytical systems.  It is therefore typically stored outside the analytics environment and accessed when required for data integration or analysis.

4. Data Catalog is a repository that contains metadata relating to the data stored in the Analytical Data Lake Storage repositories.  The catalog maintains the location, meaning and lineage of data elements, the relationships between them and the policies and rules relating to their security and management.  The catalog is critical for enabling effective information governance and for supporting self-service access to data for exploration and analysis.

5. Data Models provide a consistent representation of data elements and their relationships across the enterprise.  An effective Enterprise Data Model facilitates consistent representation of entities and relationships, simplifying management of and access to data.

6. Data Quality Rules describe the quality requirements for each data set within the Analytical Data Lake Storage component, and provide measures of data quality that can be used by potential consumers of data to determine whether a data set is suitable for a particular purpose.  For example, data sets obtained from social media sources are often sparse and therefore ‘low quality’, but that does not necessarily disqualify a data set from being used.  Provided a user of the data knows about its quality, they can use that knowledge to determine what kinds of algorithms can best be applied to that data.
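As a rough illustration of what a data quality measure might look like, here is a small Python sketch that computes completeness for a sparse data set and checks it against a threshold. The rule, the threshold, the field names, and the sample rows are all assumptions for this example and do not come from any particular product.

```python
def completeness(rows, fields):
    """Fraction of expected field values that are present (not None or empty)."""
    total = len(rows) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(1 for row in rows for f in fields if row.get(f) not in (None, ""))
    return filled / total


def check_rule(rows, fields, minimum):
    """Report the measure together with the pass/fail result of the rule."""
    value = completeness(rows, fields)
    return {"completeness": value, "passes": value >= minimum}


# Made-up social media records with many missing values.
posts = [
    {"user": "a1", "text": "great service", "location": None},
    {"user": "b2", "text": "", "location": None},
]
report = check_rule(posts, ["user", "text", "location"], minimum=0.8)
```

The report keeps the measured value alongside the pass/fail flag, so a consumer can still decide that a sparse data set is good enough for their purpose.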


InfoSphere DataStage – XVI (Business Glossary)

Let’s now talk about why an enterprise would need a Business Glossary.

As I mentioned in my previous blog, a business glossary is a repository used to communicate and govern the enterprise’s business terms, along with the associated definitions and the relationships between those terms.
In summary:
  • Business Glossary brings understanding, consistency, and trust in information to any application or context.
  • This authoritative source of information promotes better communication among business and technical teams and aligns cross-team efforts.
  • The line of business uses this centralized information source as a gateway to all information assets to support data governance initiatives.
  • It can automatically associate key business concepts with a vast array of heterogeneous source systems, ETL processes, BI reports, data models, business rules, and more.

Now to IBM InfoSphere Business Glossary. IBM InfoSphere Business Glossary is an interactive, web-based tool that enables users to create, manage, and share controlled vocabulary and information governance controls in a repository called a business glossary. The vocabulary and governance controls define business semantics and enable business leaders and IT professionals to manage enterprise-wide information according to defined regulatory or operational business requirements. IBM InfoSphere Business Glossary Anywhere, its companion module, augments InfoSphere Business Glossary with more ease-of-use and extensibility features.

Business Glossary, Business Glossary browser, and Business Glossary Anywhere support complex enterprise development environments with a unique set of the following capabilities:

Manage business terms and categories
Business Glossary provides a dedicated, web-based user interface for creating, managing, and sharing a controlled vocabulary, including batch editing capabilities. Terms represent the major information concepts in your enterprise, and categories are used to organize them into hierarchies.

Manage stewardship
Stewards are people or organizations with the responsibility for a given information asset. By using Business Glossary, administrators can import steward profiles from external sources, generate and edit profiles in the web interface, and create relationships of responsibility between stewards and business terms or any of the artifacts that are managed by Information Server.

Customize and extend

The needs around business metadata tend to differ from one enterprise to the next. For this reason, there is no “one-size-fits-all” meta-model. In addition to the ability to customize the entry page to the application, administrators can extend the application with custom attributes on business categories and business terms.

It is not enough to simply document business metadata. This information is active in the enterprise with open access to all members of business and development teams. IBM InfoSphere Business Glossary provides a collaborative environment in which users can evolve this important information asset as the business changes and adapts to market conditions, shifting customer needs and competitive threats.

Contextual search and visibility of business term definitions
Business Glossary Anywhere is an application-independent search window that can be called from any application (such as Microsoft Excel, data modeling tools, reporting applications, and Microsoft Word) and that provides instant access to Business Glossary terms, taxonomies, and stewards.

Simply Browse
Business Glossary browser is an intuitive, read-only web-based interface that requires no training to use. Business users can search and explore the common controlled vocabulary and relationships, identify stewards that are responsible for assets and provide direct feedback.

Business Glossary – A use case

As I promised some time back, here is a blog that will explain the business value of Business Glossary through a simple use case.

Business analysts and subject matter experts can use InfoSphere™ Business Glossary to create and manage a controlled vocabulary and classification system. Such a system enables them to build a common language between business and information technology.

InfoSphere Business Glossary provides a web interface where you can manage the important business aspects of your assets from any computer. With the business glossary, you can create categories and terms, define custom attributes and values, search the glossary, and assign a steward to assets.

The IBM® InfoSphere Information Server metadata repository stores metadata about tools, processes, and data sources. Individual instances of metadata are called “information assets”, or just “assets”. Examples of assets are implemented data resources such as tables and columns, ETL jobs, profiling processes, routines, and functions. “External assets” are assets that are not stored in the metadata repository. External assets can include such items as business process models, Web services, or reports that are in an external asset management system.

The business glossary organizes your metadata into categories that contain terms. Terms can relate to the assets that are stored in the metadata repository or to external assets according to the standards and practices of your enterprise. You can also designate specific users or user groups as stewards who are responsible for particular assets.

For example, suppose you need to work with business analysts to provide information about the purchase patterns of customers in Europe. You use the business glossary to look up the category named European Sales, which contains a term named Customers. That term is related to assets such as multiple database tables that are associated with the customers of the European sales operation. Then, business analysts and subject matter experts can:

  • browse the European Sales category
  • view its contained terms
  • browse the Customers term to see which database tables or other assets are related
  • see the steward who is responsible for that information
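The use case above can be sketched as a small data model in Python: categories contain terms, terms point to related assets, and each term has a steward. This is only a toy model to show the relationships, not the InfoSphere Business Glossary API. The table names, the steward name, and the `look_up` helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str
    steward: str                                     # person or group responsible
    assets: list[str] = field(default_factory=list)  # related tables or other assets


@dataclass
class Category:
    name: str
    terms: dict[str, Term] = field(default_factory=dict)


# Hypothetical glossary content for the European Sales example.
glossary = {
    "European Sales": Category("European Sales", {
        "Customers": Term(
            "Customers",
            steward="hypothetical.steward",
            assets=["EU_CUSTOMER", "EU_CUSTOMER_ADDRESS"],
        ),
    }),
}


def look_up(category_name, term_name):
    """Browse a category, open a term, and return its assets and steward."""
    term = glossary[category_name].terms[term_name]
    return term.assets, term.steward


assets, steward = look_up("European Sales", "Customers")
```

Here `assets` holds the related database tables and `steward` names who to contact about them, which are the last two steps in the list above.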