In one of my previous blogs I mentioned ETL jobs, and in my last couple of blogs I wrote about the role of an ETL tool. Now suppose you have created an ETL batch job and you want to let the user call it dynamically, passing the names of the input and output files at run time.
Here is the use case in detail.
1. The user has a bunch of input files on a server.
2. The user deploys the job as a service that takes the name of the input file and the name of the output file as its inputs. (Just the names, since the files sit at predefined locations.)
3. Based on the name of the input file, a particular file is selected and processed, and the processed output is written to the output file.
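To make the use case concrete, here is a minimal sketch of what such a job does, assuming hypothetical predefined directories `/data/input` and `/data/output` and a trivial stand-in for the real ETL logic:

```python
from pathlib import Path

# Hypothetical predefined locations -- only the file *names* travel over the service.
IN_DIR = Path("/data/input")
OUT_DIR = Path("/data/output")

def standardize(records):
    """Stand-in for the real validation/standardization logic."""
    return [rec.strip().upper() for rec in records]

def run_job(input_name: str, output_name: str) -> None:
    """Select the named input file, process it, and write the named output file."""
    records = (IN_DIR / input_name).read_text().splitlines()
    (OUT_DIR / output_name).write_text("\n".join(standardize(records)))
```

The point is that `run_job` needs nothing beyond the two names; everything else about the files' locations is fixed in advance.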
So is it possible? I was not sure until this morning, and I assume some readers would not be either. But it is possible.
Deploying the Batch Job
A user can deploy a batch job and then start it on demand. Each service request starts one instance of the job, which runs to completion. Such a job typically initiates a batch process from a real-time process that does not need direct feedback on the results. It is tailored for processing bulk data sets and can accept job parameters as input arguments.
These jobs are called Topology 1 jobs and have the following characteristics:
- Start and stop times
- The elapsed time for starting and stopping a batch job, also known as latency, is high. This factor contributes to a low throughput rate in communication with the service client.
- Job instances
- The Information Service Framework (ISF) agent starts job instances on demand to process service requests, up to a maximum that you configure. For load balancing, you can run the jobs on multiple InfoSphere DataStage servers.
- Input and output
- An information service that is based on a batch job can use job parameters as input arguments. This type of service returns no output. When you design the information service, you can set values for the job parameters. If the job ends abnormally, the service client receives an exception.
For example, the following job takes a bunch of addresses and validates/standardizes them:
The user can parameterize the names of the input and output files in the job, which signals the engine that these values will be provided at run time. We then deploy the job using ISD. The parameters appear as input parameters of the deployed service, as if by magic🙂. If you have selected REST bindings, you can invoke the job by appending the parameters to the end of the URL.
For example, suppose the following is your REST URL: https://Server:port/wisd-rest2/USAC/Addr_Validation/newOperation
This service can then be invoked simply with https://Server:port/wisd-rest2/USAC/Addr_Validation/newOperation1?input=CASSIN&output=CASSOUT
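A client can build and fire that request from any language; here is a minimal Python sketch, assuming the server, port, and operation path shown above (substitute the values from your own ISD console):

```python
from urllib.parse import urlencode

# Hypothetical endpoint -- replace Server, port, and the operation path
# with the values your ISD deployment reports.
BASE = "https://Server:port/wisd-rest2/USAC/Addr_Validation/newOperation1"

def build_service_url(input_name: str, output_name: str) -> str:
    """Append the job parameters as query-string arguments."""
    return BASE + "?" + urlencode({"input": input_name, "output": output_name})

url = build_service_url("CASSIN", "CASSOUT")
# An HTTP GET on this URL (with urllib.request.urlopen, curl, a browser, ...)
# starts one instance of the batch job, which runs to completion.
print(url)
```

Each such request starts a fresh job instance, up to the maximum you configured for the ISF agent.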
And with that you have successfully transformed your job into a REST service (or SOAP, or EJB). Feel free to comment if you want some of the missing details.