In an ETL tool, we may want to invoke external Java code for intermediate processing of data. Information Server provides a stage called “Java Integration Stage” for this purpose.
Java Integration Stage:
The Java Integration Stage lets you invoke Java code that interfaces with InfoSphere DataStage and QualityStage parallel jobs. Users can use the stage to integrate their own Java code into their job designs.
- Provides functionality to:
  - Produce (write) rows that are used within the job
  - Consume (read) rows that are supplied on input links
  - Process rows from an input link and generate rows on the output link
  - Query column and stage metadata
- Supports JavaBeans for simplicity, and so that a user’s existing Java code can be invoked from the Java Integration Stage.
- Supports a ‘column-based’ mode for querying metadata dynamically at runtime, and for dynamic access of column data.
- Provides a discovery interface that allows a user’s code to learn about the calling environment, and for the framework to learn about the user’s code capabilities.
- Supports any number of inputs and outputs
- Supports reject links, and the ability to transfer records from an input to an output
- Addresses design issues in the current Java Pack API (for example, the links’ column metadata can be obtained in initialize() without having to create an input row).
- Supports sending end-of-wave markers to output links
- Supports Runtime Column Propagation (RCP)
- Supports Automatic Column Transfer
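To make the read, process, write, and reject flow listed above more concrete, here is a minimal, self-contained sketch in plain Java. It does not use the stage's actual API: `RowProcessor`, `UppercaseName`, `run`, and the `id` and `name` columns are all hypothetical stand-ins chosen for illustration, and rows are modelled as simple maps instead of the stage's own record objects. It only shows the general shape of such code: column metadata is inspected once in an initialize step, each input row is then either transformed and sent to the output or passed to a reject list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative only: every type here is a hypothetical stand-in,
// not the real Java Integration Stage API.
public class JavaStageSketch {

    /** Hypothetical lifecycle: initialize once, then process each row. */
    interface RowProcessor {
        void initialize(List<String> inputColumns);

        void process(Map<String, Object> row,
                     List<Map<String, Object>> output,
                     List<Map<String, Object>> reject);
    }

    /** Upper-cases the "name" column; rows without a usable name are rejected. */
    static class UppercaseName implements RowProcessor {
        private boolean hasNameColumn;

        @Override
        public void initialize(List<String> inputColumns) {
            // Column metadata is inspected here, before any row has been read.
            hasNameColumn = inputColumns.contains("name");
        }

        @Override
        public void process(Map<String, Object> row,
                            List<Map<String, Object>> output,
                            List<Map<String, Object>> reject) {
            Object name = hasNameColumn ? row.get("name") : null;
            if (!(name instanceof String) || ((String) name).isEmpty()) {
                reject.add(row); // send the unmodified row to the reject list
                return;
            }
            // Copy the row so all other columns are carried through unchanged.
            Map<String, Object> out = new LinkedHashMap<>(row);
            out.put("name", ((String) name).toUpperCase(Locale.ROOT));
            output.add(out);
        }
    }

    /** Stand-in for the framework: drives a processor over in-memory rows. */
    static void run(RowProcessor processor,
                    List<String> columns,
                    List<Map<String, Object>> input,
                    List<Map<String, Object>> output,
                    List<Map<String, Object>> reject) {
        processor.initialize(columns);
        for (Map<String, Object> row : input) {
            processor.process(row, output, reject);
        }
    }

    /** Small helper that builds a two-column row. */
    static Map<String, Object> row(int id, String name) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("id", id);
        r.put("name", name);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> input = new ArrayList<>();
        input.add(row(1, "alice"));
        input.add(row(2, "")); // empty name, so this row is rejected

        List<Map<String, Object>> output = new ArrayList<>();
        List<Map<String, Object>> reject = new ArrayList<>();
        run(new UppercaseName(), Arrays.asList("id", "name"), input, output, reject);

        System.out.println("output: " + output);
        System.out.println("reject: " + reject);
    }
}
```

In a real job, the stage's framework would play the role of `run`, supplying rows from the input links and delivering written rows to the output and reject links; consult the product's API documentation for the actual class and method names.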