An IBM InfoSphere DataStage job consists of individual stages that are linked together and describe the flow of data from a data source to a data target. Balanced Optimization lets you maximize job performance and optimize resource usage by balancing the workload across source and target systems. As a result, Information Server can support not only the Extract-Transform-Load (ETL) paradigm but also alternatives such as Extract-Load-Transform (ELT), where transformation tasks are performed on the target system, for example an IBM PureData System for Analytics data warehousing appliance.
Balanced Optimization helps improve the performance of InfoSphere DataStage job designs that use connectors to read or write data. You design your job and then use Balanced Optimization to redesign it automatically according to your stated preferences.
For example, you can maximize performance by minimizing the amount of input and output (I/O) that is used and by balancing the processing across source, intermediate, and target environments. You can then examine the optimized job design and save it as a new job. Your original (root) job design remains unchanged.
You can use the Balanced Optimization features of InfoSphere DataStage to push sets of data integration processing and related data I/O into database management systems (such as an IBM PureData System for Analytics warehousing appliance) or into a Hadoop cluster.
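The idea of pushing processing into the database can be illustrated with a small sketch. This is not DataStage code and does not show how Balanced Optimization rewrites a job. It uses Python's built-in `sqlite3` module as a stand-in for the database, and the table names (`orders`, `totals_etl`, `totals_elt`) and function names are hypothetical. For simplicity, the source and target tables live in the same in-memory database. The first function follows the Extract-Transform-Load pattern, pulling every row out and aggregating in the client; the second follows the Extract-Load-Transform pattern, expressing the same aggregation as one SQL statement so that no rows leave the database.

```python
import sqlite3


def build_source(conn):
    """Create a hypothetical source table and two empty target tables."""
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "east", 10.0), (2, "west", 5.0), (3, "east", 2.5), (4, "west", 1.0)],
    )
    conn.execute("CREATE TABLE totals_etl (region TEXT, total REAL)")
    conn.execute("CREATE TABLE totals_elt (region TEXT, total REAL)")


def run_etl(conn):
    """ETL style: read every row into the client, aggregate, write results back.

    Returns the number of rows that crossed the client/database boundary.
    """
    totals = {}
    rows_moved = 0
    for region, amount in conn.execute("SELECT region, amount FROM orders"):
        totals[region] = totals.get(region, 0.0) + amount
        rows_moved += 1  # one row read out of the database
    results = sorted(totals.items())
    conn.executemany("INSERT INTO totals_etl VALUES (?, ?)", results)
    return rows_moved + len(results)  # rows read plus rows written


def run_elt(conn):
    """ELT (pushdown) style: the database performs the aggregation itself.

    Only the SQL statement is sent; no data rows pass through the client.
    """
    conn.execute(
        "INSERT INTO totals_elt "
        "SELECT region, SUM(amount) FROM orders GROUP BY region"
    )
    return 0


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_source(conn)
    print("rows moved (ETL):", run_etl(conn))
    print("rows moved (ELT):", run_elt(conn))
```

Both functions produce the same totals; the difference lies in where the work happens and how much data I/O passes through the client. That trade-off is the one the optimization preferences described above are meant to control.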