In one of my previous blogs, I mentioned that a data lake is a set of one or more data repositories created to support data discovery, analytics, ad hoc investigations, and reporting. Some enterprises have invested money in creating a data lake but are not sure how to begin using their data. IA Thin Client gives the business user or analyst a first grip on that data. By extending the capabilities of Information Analyzer to Hadoop and providing a user-friendly thin client, it helps enterprises get to know their data. Here are a few of its capabilities:
1. Customers can see a listing of all the data they have in their HDFS file system, preview it, and select a handful of interesting data sets.
2. They can group these interesting data sets into workspaces, for example Customer related, Employee related, Finance related, and so on.
3. IA Thin Client gives them a dashboard where they can see the overall picture of the data in a particular workspace.
4. From a workspace, they can drill into the details of one of these interesting structured or semi-structured data sets and run data analysis to find out more about it. This detailed analysis gives insight into the data in an easily understandable way: What is the quality of the data? What is the format of the data? Can the data be classified into one of several known data classifications? Users can also see detailed information for each column of the data (format, any data quality problems observed, data type, min-max values, classification, frequent values, a sampling of actual values, and so on).
5. Using the tool, users can suggest changes to the metadata of the data. For example, after looking at the analysis they may feel that some data formats do not look correct, that the minimum value should have been something else, or that an identified data quality problem can be ignored. These edits are also reflected in the overall data quality score.
6. The tool lets users add a note to a data set or link an interesting data set to the existing data governance catalog.
7. The tool lets users apply existing data rules to the data and see how the data performs against them.
8. Moreover, all of this is done in a simple, intuitive, easy-to-use thin client, so a non-technical person can easily navigate through the data.
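To make the column-level analysis in items 4 and 5 a little more concrete, here is a minimal Python sketch of the kind of per-column profile described above: inferred data type, format masks, min-max values, frequent values, and a simple quality score. This is only an illustration and not how Information Analyzer is implemented. The function names, the format mask convention (letters become `A`, digits become `9`), and the score formula (the share of values that are non-empty and match the dominant type) are all assumptions made for this example.

```python
from collections import Counter
import re


def infer_type(value: str) -> str:
    """Classify a raw string value as 'int', 'float', or 'string'."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        float(value)
        return "float"
    except ValueError:
        return "string"


def value_format(value: str) -> str:
    """Build a format mask: letters become 'A', digits become '9'."""
    mask = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", mask)


def profile_column(values: list[str]) -> dict:
    """Summarize one column of raw string values (hypothetical helper)."""
    non_empty = [v for v in values if v.strip()]
    types = Counter(infer_type(v) for v in non_empty)
    dominant = types.most_common(1)[0][0] if types else "unknown"

    # Assumption: a value is a quality problem if it is empty or
    # does not match the dominant data type of the column.
    problems = sum(
        1 for v in values if not v.strip() or infer_type(v) != dominant
    )

    if dominant in ("int", "float"):
        # Compare numerically, ignoring values that are not numbers.
        comparable = [float(v) for v in non_empty if infer_type(v) != "string"]
    else:
        # Fall back to lexical ordering for string columns.
        comparable = non_empty

    total = len(values)
    return {
        "data_type": dominant,
        "formats": dict(Counter(value_format(v) for v in non_empty)),
        "min": min(comparable) if comparable else None,
        "max": max(comparable) if comparable else None,
        "frequent_values": Counter(non_empty).most_common(3),
        "problem_count": problems,
        # Illustrative score only: fraction of values without a problem.
        "quality_score": (total - problems) / total if total else 1.0,
    }


if __name__ == "__main__":
    # A made-up "age" column with one empty and one non-numeric value.
    for key, value in profile_column(["42", "17", "", "abc", "8"]).items():
        print(f"{key}: {value}")
```

In this sketch, marking a problem as ignorable or correcting an expected format, as item 5 describes, would amount to changing which values count as problems, which is why such edits would feed back into the score.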
You can watch a 4-minute video to get a first-hand experience of the tool.
Or see the InfoSphere Information Analyzer thin client presentation, which provides a comprehensive overview of the thin client.