Making DataStage jobs Cloud ready using S3 connector

In my last blog, I described how a user can trigger a batch ETL job dynamically using the power of ISD (InfoSphere Information Services Director), and how the user can dynamically choose the input and output file. But there was a limitation: those files had to be available on the server where InfoSphere Information Server is installed. Can we modify the behavior so that the input and output files live on the cloud? If so, we have brought the ETL batch job a step closer to being Cloud ready. Let's explore this.

In the last blog we saw the following job, which reads its input from a file and writes its output to a file.

[Image: the original job design, with Sequential File stages feeding a CASS stage]

In my earlier blog I mentioned that the Information Server 11.3 release included some Cloud additions, among them an S3 connector. We can now replace the input and output of the above DataStage job with the S3 connector (a connector that reads and writes the storage available on Amazon directly). This is as simple as deleting the Sequential File input and output stages and replacing them with S3 connectors. The job now looks like the following.

[Image: the same job with Amazon S3 connector stages as input and output]

To access a file on S3, we need four parameters: the access key, the secret key, the bucket name, and the file name. Double-click the Amazon_S3 stage to edit its properties and parameterize these values for both the input and the output stage.

[Image: Amazon S3 stage properties]
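To see what these four parameters mean outside DataStage, here is a minimal boto3 sketch that reads and writes an object using the same credentials and names. This only illustrates the underlying S3 API, not the connector's internal implementation, and the bucket and file names are hypothetical.

```python
import boto3

# The same four values the S3 connector stage asks for
# (bucket and file names here are hypothetical examples).
ACCESS_KEY = "YOUR_ACCESS_KEY"
SECRET_KEY = "YOUR_SECRET_KEY"
BUCKET = "my-input-bucket"
FILE_NAME = "customers.csv"

s3 = boto3.client(
    "s3",
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
)

# Read the input file, as the source S3 connector stage would
obj = s3.get_object(Bucket=BUCKET, Key=FILE_NAME)
data = obj["Body"].read()

# Write the processed result back, as the target stage would
s3.put_object(Bucket="my-output-bucket", Key="customers_out.csv", Body=data)
```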

In another minute we can compile and deploy this DataStage job as a SOAP or REST service. Now we can dynamically pick a file from the cloud, process it, and put the result back on the cloud! We do not even have to write an app to test it, as there are many free plugins that can verify the new REST service. I typically use the Poster plugin for the Firefox web browser to compose the REST request and view the response.
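If you prefer a script to a browser plugin, a few lines of Python can exercise the service the same way. The endpoint URL and parameter names below are hypothetical placeholders; substitute whatever your ISD deployment actually exposes.

```python
import requests

# Hypothetical ISD REST endpoint and job parameters; replace with the
# URL and parameter names of your own deployed service.
url = "https://isd-server:9443/services/CloudJob/process"
params = {
    "accessKey": "YOUR_ACCESS_KEY",
    "secretKey": "YOUR_SECRET_KEY",
    "bucket": "my-input-bucket",
    "fileName": "customers.csv",
}

# verify=False only because self-signed certs are common on test servers
response = requests.get(url, params=params, verify=False)
print(response.status_code)
print(response.text)
```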

5 thoughts on “Making DataStage jobs Cloud ready using S3 connector”

    • Hi Ruchi, that is a good question.
      Here is my understanding (and I may be wrong). The best way to do something like what you mentioned requires the following:
      1. Have an Information Server instance deployed with one of the cloud providers, such as SoftLayer.
      2. Deploy the job as a service on SoftLayer.
      3. Expose this service in Bluemix as a user-provided service (a sketch of how an app discovers such a service follows this comment). More detail on this can be found here.
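
For step 3, a Bluemix (Cloud Foundry) app typically discovers a user-provided service through the VCAP_SERVICES environment variable. Here is a minimal Python sketch of that lookup, assuming the service was created with a credentials block containing a "url" field; the service name and field name are hypothetical.

```python
import json
import os

# Cloud Foundry injects bound services into VCAP_SERVICES; user-provided
# services appear under the "user-provided" key. The service name and
# the "url" credential below are hypothetical.
vcap = json.loads(os.environ["VCAP_SERVICES"])

for service in vcap.get("user-provided", []):
    if service["name"] == "isd-datastage-service":  # hypothetical name
        endpoint = service["credentials"]["url"]
        print("ISD service endpoint:", endpoint)
```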

  1. Hi — Within an Amazon S3 bucket, if I have to read a file from a particular location, say Bucket/folder1/folder2/file, where do I mention the path “/folder1/folder2/” in the stage?
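
The thread leaves this question unanswered, but one detail is certain from the S3 API itself: S3 has no real folders, and “/folder1/folder2/” is simply a prefix that forms part of the object key. A minimal boto3 sketch of that idea (the connector's own property layout may differ; bucket and key names are hypothetical):

```python
import boto3

s3 = boto3.client("s3")  # credentials resolved from the environment here

# In the S3 API there are no real directories: "folder1/folder2/" is
# just a prefix embedded in the object key, so the whole path travels
# in the Key argument.
obj = s3.get_object(Bucket="my-bucket", Key="folder1/folder2/file")
print(obj["Body"].read()[:100])
```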
