InfoSphere Quality Stage – VI (India Address Standardization)

IBM’s InfoSphere Quality Stage can standardize Indian addresses. In this post I will highlight some of its capabilities through a few examples. It can do the following:
  • Standardizes Indian addresses (urban, rural, military, etc.) and provides a high degree of consistency in token definitions, which produces high-quality standardized output
  • Can be invoked by other applications in the enterprise, so that data standardization can be handled in real time
Here is one example of an input address and the standardized output generated for it:
[Image: Standardize India Address]
As we can see, the Standardize stage split the input into tokens (individual words) and assigned each token to an appropriate output column.
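To make the idea of tokens and output columns concrete, here is a toy sketch in Python. It is not how Quality Stage is implemented and it does not use the real India rule sets: the classification table, the column names (UnitType, StreetName, Locality and so on) and the sample address are all assumptions I made up for illustration.

```python
import re

# Hypothetical classification table: token -> (class, standard form).
# Illustrative entries only; NOT taken from the India rule sets.
CLASSIFICATIONS = {
    "FLAT": ("UNIT_TYPE", "FLAT"),
    "FLT": ("UNIT_TYPE", "FLAT"),
    "RD": ("STREET_TYPE", "ROAD"),
    "ROAD": ("STREET_TYPE", "ROAD"),
    "NAGAR": ("LOCALITY_TYPE", "NAGAR"),
}


def tokenize(address):
    """Split an address into upper-case word and number tokens."""
    return re.findall(r"[A-Za-z]+|\d+", address.upper())


def standardize(address):
    """Assign each token to a (hypothetical) output column."""
    tokens = tokenize(address)
    columns = {}
    pending = []  # unclassified tokens waiting for a type token
    i = 0
    while i < len(tokens):
        token = tokens[i]
        token_class, standard = CLASSIFICATIONS.get(token, (None, token))
        if token.isdigit() and len(token) == 6:
            # Indian PIN codes are six digits long.
            columns["PinCode"] = token
        elif (token_class == "UNIT_TYPE" and i + 1 < len(tokens)
              and tokens[i + 1].isdigit()):
            columns["UnitType"] = standard
            columns["UnitValue"] = tokens[i + 1]
            i += 1  # the unit number has been consumed as well
        elif token_class == "STREET_TYPE":
            columns["StreetName"] = " ".join(pending)
            columns["StreetType"] = standard
            pending = []
        elif token_class == "LOCALITY_TYPE":
            columns["Locality"] = " ".join(pending + [standard])
            pending = []
        else:
            pending.append(token)
        i += 1
    if pending:
        # In this toy version, whatever is left over is treated as the city.
        columns["City"] = " ".join(pending)
    return columns


result = standardize("Flt 12, MG Rd, Shanti Nagar, Pune 411001")
for column, value in result.items():
    print(f"{column}: {value}")
```

The real rule sets handle far more variation than this lookup-table approach; the sketch only shows the general shape of the problem, where abbreviations such as "Flt" and "Rd" are mapped to a standard form and each token lands in a named column.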
We typically use the IndiaAddressSharedContainer shared container in a job that standardizes Indian address and area data. The shared container is imported along with the Indian address rule sets, and it standardizes any Indian address passed to it. Here are some more samples of input addresses versus standardized addresses.
As I understand it, cleansing Indian addresses with Quality Stage has been validated by several Indian customers with a strong rural presence.
If you want to know the details of how the India rule set works, please check out my developerWorks article.
