The rapidly growing Indian economy, increasing affluence, communications revolution and innovative technologies are transforming the retail, FMCG, travel and transportation industries. In this article, I would like to highlight how smarter software can help companies become more agile and effective in utilizing newer and more effective distribution channels, and creating competitive advantage through service differentiation.
Current BI tools provide reports and analytics on warehoused enterprise data. As a result, there is often a significant delay between the creation and the use of insights, so actions are not taken in a timely fashion. These solutions become inadequate as businesses increasingly want to include operational data in insight generation, that is, to capture an event as soon as it occurs.
A Dynamic Warehouse can deliver the right information at the right time to make a winning decision. It collates real-time data from various applications regardless of platform, programming language and vendor. With accurate data in place, clients base their decisions not on intuition or suspicion but on facts: real information.
Note: Dynamic Warehousing also includes handling unstructured data. IBM Information Server (IIS) can be extended to provide such handling, but I will not cover that aspect in this post. It will be one of the coming attractions.
Need for SOA in implementing Dynamic Warehouse:
SOA is described in Wikipedia as: “A software architecture that defines use of loosely coupled software services to support the requirements of business processes and software users. The resources on a network in an SOA environment are made available as independent services that can be accessed without knowledge of their underlying platform implementation.”
While not mandatory at first, SOA becomes increasingly important as BI gets closer to real-time analytics. This is because the information should flow as freely as possible. To put it succinctly, the IT infrastructure should be to information as a power grid is to electricity.
An SOA does the following:
- Breaks the application into component services
- Packages information as a service to business processes
- Decouples application logic from the data model
- Optimizes the data services infrastructure
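To make the "information as a service" and "decoupling" points concrete, here is a minimal Python sketch. Every name in it (`SalesInfoService`, `InMemorySalesService`, `needs_restock`) is hypothetical and chosen only for illustration. The business logic depends on a small service contract, not on how or where the data is stored.

```python
from typing import Protocol


class SalesInfoService(Protocol):
    """Hypothetical service contract: what business logic may ask for."""

    def units_sold(self, region: str) -> int: ...


class InMemorySalesService:
    """One possible provider; a warehouse-backed one could replace it."""

    def __init__(self, rows):
        # rows: iterable of (region, units) pairs
        self._rows = list(rows)

    def units_sold(self, region: str) -> int:
        return sum(units for r, units in self._rows if r == region)


def needs_restock(service: SalesInfoService, region: str, threshold: int) -> bool:
    """Business logic that knows only the service contract, not the data model."""
    return service.units_sold(region) >= threshold


service = InMemorySalesService([("north", 40), ("south", 10), ("north", 25)])
print(needs_restock(service, "north", 50))
```

Swapping the in-memory provider for one that queries a warehouse would not require any change to `needs_restock`, which is the kind of decoupling the list above describes.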
Who needs a Dynamic Warehouse anyway?
Suppose a Media and Entertainment solution provider (that warehouses digital assets) intermittently receives data from various service providers. This data should be made available immediately to the users of its Data Warehouse. If the Warehouse is updated in batches, the information becomes available only after a delay. Such a customer may therefore want to build a Dynamic Warehouse for this metadata (about video clips, pictures, etc.) so that all new information is reflected in the Data Warehouse as soon as it is generated.
As another example, say an e-commerce site wants to know how customers are reacting to a recently launched ad campaign. During a festival sale, what are the buying patterns of customers in different regions? Which products are performing best? Having such key information at hand at the right time helps the client make correct business decisions.
But generating this information takes time, as shown below, and we lose the value of the information…
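A rough way to put numbers on this delay is to compute how long each event waits for the next batch load. The sketch below is only an illustration: the arrival times are made up, batches are assumed to run at fixed multiples of the interval, and processing time is ignored.

```python
from __future__ import annotations


def batch_delay_minutes(event_minute: int, batch_interval: int) -> int:
    """Minutes an event waits until the next scheduled batch load.

    Assumes batches run at minutes 0, batch_interval, 2 * batch_interval, ...
    """
    # Ceiling division gives the index of the first batch at or after the event.
    next_batch = -(-event_minute // batch_interval) * batch_interval
    return next_batch - event_minute


def average_batch_delay(event_minutes: list[int], batch_interval: int) -> float:
    """Average wait, in minutes, across all events."""
    delays = [batch_delay_minutes(m, batch_interval) for m in event_minutes]
    return sum(delays) / len(delays)


# Hypothetical arrival times, in minutes after midnight.
events = [5, 130, 700, 1435]

print("nightly batch:", average_batch_delay(events, 1440), "minutes on average")
print("hourly batch: ", average_batch_delay(events, 60), "minutes on average")
```

With these assumed arrivals, a nightly load leaves data waiting for hours on average, an hourly load for roughly half an hour, and an event-driven load only for the time it takes to process a single record.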
Is there an ETL tool that can help me in easily creating a dynamic warehouse?
The role of SOA in building a Dynamic Warehouse was discussed above. Traditional approaches to building such a solution require extensive custom programming and may not scale. I tried IBM Information Server. It was the natural choice because it is the product that I work on :). I found the following advantages with it:
- Broad range of integration functionality: Including federation, ETL, in-line transformation, replication, and event publishing
- SOA support: Integration logic built within IBM Information Server can easily be deployed and managed as a shared service within an SOA.
- Less time to deployment: The same tooling is used for creating, managing and exposing services for many sources of data
- Ease of use: Service creation and deployment is an intuitive and interactive process that allows data experts to expose higher-level services without knowing how to code
- Data governance: Consistent security, service monitoring and logging
- Deployment: Infrastructure for load balancing, failover and scaling by adding additional servers
The implementation boils down to dynamically updating the Data Warehouse (DB2) as various independent service providers pump in data at run time. We can create IIS job(s) that take the input, transform it and store it in the Warehouse. Finally, we can expose the job as a service so that it can be invoked using SOAP messages.
Stepwise implementation:
Step 1: Install IBM Information Server (IIS) with ISD (InfoSphere Information Services Director). IIS is supported on multiple operating systems.
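To give a feel for what invoking such a service with a SOAP message involves, here is a minimal Python sketch that builds a SOAP 1.1 request envelope using only the standard library. The service namespace, the operation name `loadAssetMetadata` and the field names are all hypothetical; the real names come from the WSDL of the deployed service.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.com/warehouse"  # hypothetical service namespace


def build_soap_request(operation: str, fields: dict[str, str]) -> str:
    """Build a SOAP 1.1 envelope whose body carries one operation element."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in fields.items():
        # Each input field becomes a child element of the operation.
        ET.SubElement(op, f"{{{SVC_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and payload, for illustration only.
message = build_soap_request(
    "loadAssetMetadata",
    {"assetId": "V100", "title": "Festival Parade"},
)
print(message)
```

A service provider would send this message over HTTP to the endpoint of the deployed service, and each such call would push one new record through the job and into the Warehouse.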
Step 2: Create table definitions on the chosen Data Warehouse. IIS connects to almost all prominent databases.
Step 3: Create a DataStage job to collect the incoming data, transform it and load it into the Warehouse. This is done using the easy-to-use drag-and-drop functionality of the IIS job designer. The job can be set to run in parallel on a multi-node configuration. A simple job looks like the following:
Step 4: Deploy the job as a service on the WebSphere Application Server that comes with the IIS installation. ISD allows any of these jobs to be easily deployed as Web services or EJBs, in minutes, without any hand-coding.
Step 5: Test the service and fine-tune it for performance. ISD load-balances service requests across multiple Information Server nodes to absorb load spikes smoothly and to ensure fault tolerance and high availability.
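The job itself is built graphically in the designer and involves no hand-written code. Purely to illustrate the collect, transform and load flow of Steps 2 and 3, here is a minimal Python sketch. It uses an in-memory SQLite database as a stand-in for DB2, and the table name, column names and transformation rules are assumptions made up for this example.

```python
import sqlite3


def transform(record: dict) -> tuple:
    """Normalize one incoming record into a row for the warehouse table."""
    return (
        record["asset_id"].strip(),
        record["title"].strip().title(),
        record["provider"].strip().upper(),
    )


def load(conn: sqlite3.Connection, records: list) -> int:
    """Transform the records and write them in a single transaction."""
    rows = [transform(r) for r in records]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO asset_metadata (asset_id, title, provider) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


# Step 2 stand-in: the table definition (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE asset_metadata ("
    "asset_id TEXT PRIMARY KEY, title TEXT NOT NULL, provider TEXT NOT NULL)"
)

# Step 3 stand-in: one incoming record is transformed and loaded.
incoming = [{"asset_id": " V100 ", "title": "festival parade", "provider": "acme"}]
load(conn, incoming)
print(conn.execute("SELECT * FROM asset_metadata").fetchall())
```

In the real job, the designer's stages take the place of `transform` and `load`, and the parallel engine can spread the work across nodes.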
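The idea behind load balancing with failover can be sketched in a few lines of Python. This is not how ISD implements it; it is only a simplified round-robin illustration, and the class name, node names and `send` callback are hypothetical.

```python
from itertools import cycle


class RoundRobinDispatcher:
    """Rotate requests across nodes, skipping a node that fails to respond."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._ring = cycle(self._nodes)

    def dispatch(self, request, send):
        """Try each node at most once, starting with the next one in rotation."""
        last_error = None
        for _ in range(len(self._nodes)):
            node = next(self._ring)
            try:
                return send(node, request)
            except ConnectionError as exc:
                last_error = exc  # remember the failure and try the next node
        raise RuntimeError("no node available") from last_error


def demo_send(node, request):
    """Hypothetical transport: pretend node-b is down."""
    if node == "node-b":
        raise ConnectionError(f"{node} is unreachable")
    return f"{node} handled {request}"


dispatcher = RoundRobinDispatcher(["node-a", "node-b", "node-c"])
for req in ["r1", "r2", "r3"]:
    print(dispatcher.dispatch(req, demo_send))
```

Requests rotate across the nodes, and a request that lands on the unreachable node is retried on the next one, which is the behaviour to look for when testing the deployed service under load.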
In today’s competitive business environment, access to timely and accurate data has become a critical element of success. For this we need to adopt the concept of dynamic warehousing. With correct data in place, employees will not base their decisions on intuition or suspicion but rather on facts — real information. They can put together a complete picture that contains all pertinent data points, empowering them to make the right decision faster and with confidence. What’s more, an SOA is a good foundation from which to build.
InfoSphere Information Server (IIS) provides a parallel processing infrastructure, connectivity to nearly any data or content source, and the ability to deliver information through a variety of mechanisms. Underlying these functions is a unified metadata management foundation that provides seamless sharing of knowledge throughout a project lifecycle, along with a detailed understanding of what information means, where it came from, and how it is related to information in other systems. Integration logic built within IIS can easily be deployed and managed as a shared service within an SOA, which makes it a perfect choice for building a Dynamic Warehouse.