Task Factory - Jay Cole
"I purchased Task Factory over a year ago, and let me tell you, IT WORKS FLAWLESSLY. I haven’t had a single problem with it, and I used to HATE having to make changes with the version you’re using. My advice (for what it’s worth) is to pay the money!"

Jay Cole

Task Factory - Mark Marinovic
"We bought Task Factory primarily for the SCD Type 2 task, as it enabled us to do several things the built-in SSIS SCD task did not do. Task Factory comes highly recommended from me as our ROI was reached in just days."

Mark Marinovic

Task Factory - Urban Outfitters
"We have found that Task Factory tremendously speeds up the development process and improves performance by means of the high-performance components."

Tim Harris

BI Data Engineer

Urban Outfitters Inc.

Task Factory
"I have found Task Factory to be a huge time saver with SSIS development. The SalesForce components alone saved close to two months of work on a major integration project. I highly recommend!"

Alan Rubel

Database Administrator

Joint Commission Resources

Task Factory - Loews Corporation
"We bought the product mostly because of the company's reputation for the service they provide and the specific need of a secure SFTP task."

Matt Cushing

Application Systems Analyst/Developer

Loews Corporation

Greater New York City Area




Task Factory Address Parse Transform

Address data elements include AddressLine1, AddressLine2, City, State, PostalCode, StreetNumber, StreetDirection and more.

Feature Highlights

  • Parses unformatted address data into USPS standardized address data
  • Two outputs can be used to detect valid and invalid addresses
  • Granular, detailed data about each address is parsed for use in the data flow
  • Provides a clean-up solution for address validity and parsing
Address Parse Transform

Address Parsing Transform- Step 1

The address parse transform is used to parse unformatted address data and transform it into USPS standardized address data.

For example, the address 10 HAMILTON STREET SUITE 5T ORLANDO FL 32801 is turned into:

 Address Line 1  10 Hamilton St
 Address Line 2  Ste 5T
 City  Orlando
 State  FL
 Zip Code  32801
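
The shape of this input and output can be illustrated with a toy sketch in Python. This is not how the transform is implemented: the regular expression, the two-entry abbreviation table and the function names are all assumptions made for illustration, and the pattern only handles the single layout used in the example above.

```python
import re

# Toy subset of USPS-style abbreviations (assumption: a real parser
# relies on a much larger lookup table).
ABBREVIATIONS = {"STREET": "ST", "SUITE": "STE"}

# Assumes the layout "<street> SUITE <unit> <city> <state> <zip5>",
# which is the layout of the example address only.
PATTERN = re.compile(
    r"^(?P<line1>.+?)\s+(?P<line2>SUITE\s+\S+)\s+"
    r"(?P<city>.+?)\s+(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})$"
)


def abbreviate(text: str) -> str:
    """Replace each known word with its abbreviation."""
    return " ".join(ABBREVIATIONS.get(word, word) for word in text.split())


def parse_address(raw: str) -> dict[str, str]:
    """Split one unformatted address string into standardized parts."""
    match = PATTERN.match(raw.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized address layout: {raw!r}")
    return {
        "Address Line 1": abbreviate(match["line1"]).title(),
        "Address Line 2": abbreviate(match["line2"]).title(),
        "City": match["city"].title(),
        "State": match["state"],
        "Zip Code": match["zip"],
    }
```

Calling parse_address("10 HAMILTON STREET SUITE 5T ORLANDO FL 32801") returns the five values listed in the table above.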

Address Parsing Transform- Step 2

Address Data Input Tab

The address data input tab is used to map the source input columns to the appropriate address elements in the address parse transform.

There are two options when mapping the source columns.

Address Parsing Transform- Step 3

Single column contains all address data

Use this option when all of the address data from your source is contained in a single column. In this screenshot, all of the data is contained in a column named 'AddressLine1'.

Address Parsing Transform- Step 4

Address data is spread across multiple columns

Use this option when the address data from your source is not contained in a single column. For each address element needed by the transform, you choose the source column that contains it.

Address Parsing Transform- Step 5

Your source columns do not need a one-to-one relationship with the address elements in the address parse transform. You can map a single column from your source to multiple address elements in the transform. For instance, if the street address is in a column named 'Address' and the city, state and zip are in a column named 'CityStateZip', your mappings would be as follows:

 AddressLine1 column:  Address
 AddressLine2 column:  Address
 City column:  CityStateZip
 State column:  CityStateZip
 PostalCode column:  CityStateZip
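
The same mapping can be written down as a small data structure. The Python sketch below is only an illustration of the many-to-one relationship; the names COLUMN_MAPPINGS, elements_by_source and inputs_for_row are hypothetical and are not part of Task Factory.

```python
from collections import defaultdict

# Address element -> source column, copied from the mapping above.
COLUMN_MAPPINGS = {
    "AddressLine1": "Address",
    "AddressLine2": "Address",
    "City": "CityStateZip",
    "State": "CityStateZip",
    "PostalCode": "CityStateZip",
}


def elements_by_source(mappings: dict[str, str]) -> dict[str, list[str]]:
    """Group address elements by the source column that feeds them."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for element, source_column in mappings.items():
        grouped[source_column].append(element)
    return dict(grouped)


def inputs_for_row(row: dict[str, str], mappings: dict[str, str]) -> dict[str, str]:
    """Return the raw source text handed to the parser for each element."""
    return {element: row[column] for element, column in mappings.items()}
```

Grouping the mapping by source column shows that 'Address' feeds two elements and 'CityStateZip' feeds three, so each element receives the full text of its source column and the parse step is what extracts the relevant part.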

Address Parsing Transform- Step 6

Address Quality Tab

The address quality tab is used to tell the transform what should happen to low quality addresses. 

Address Quality

The address parse assigns a level of quality to two sections of the address.

  1. Address Line 1 and Address Line 2
  2. City, State and Zip

There are three levels of quality that can be assigned to the two sections of an address:

  1. High
    1. For address line 1 and address line 2, high quality means the address is technically complete.
    2. For city, state and zip, high quality means the city, state and zip are technically correct.
  2. Medium
    1. For address line 1 and address line 2, medium quality means the address is complete but missing a part of the address, like an apartment number or street suffix.
  3. Low
    1. For address line 1 and address line 2, low quality means the address is not verifiable.
    2. For city, state and zip, low quality means the city, state or zip code is missing.

When the address parse runs, it parses the data and assigns the level of quality based on whether it can find a valid address and a valid city, state and zip according to the rules above.

You can either include all of the addresses as part of the output from the transform or you can choose "Include only high quality parsed addresses".

Selecting "Include only high quality parsed addresses" means the output named "Parsed Address Output" (see output descriptions below) will contain only addresses that are considered high quality.

You must make a selection in the "What should happen to non-parseable (low quality) addresses?" drop-down.

  1. Ignore failures (low quality addresses will be skipped) - This option will skip addresses marked as low quality.
  2. Redirect errors to error output - This option will redirect the rows to the error output (see output descriptions below).
  3. Fail component - This option will fail the component when the first invalid address is found.
  4. Redirect errors to non parseable output - This option will redirect the rows to the "Non-Parsed Address Output" output.
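
The four drop-down options can be pictured as a routing step over rows that already carry quality ratings. The Python sketch below is a rough model only, not the component's code. It assumes a row is low quality when either its Quality or CSZQuality value is "low", and the enum, function names and the "Error Output" label are hypothetical.

```python
from enum import Enum


class LowQualityAction(Enum):
    IGNORE = "Ignore failures"
    REDIRECT_TO_ERROR = "Redirect errors to error output"
    FAIL_COMPONENT = "Fail component"
    REDIRECT_TO_NON_PARSED = "Redirect errors to non parseable output"


def is_low_quality(row: dict) -> bool:
    # Assumption: a row is low quality if either section is rated "low".
    return "low" in (row["Quality"], row["CSZQuality"])


def route_rows(rows: list[dict], action: LowQualityAction) -> dict[str, list[dict]]:
    """Distribute rows across outputs according to the selected option."""
    outputs: dict[str, list[dict]] = {
        "Parsed Address Output": [],
        "Error Output": [],  # hypothetical label for the error output
        "Non-Parsed Address Output": [],
    }
    for row in rows:
        if not is_low_quality(row):
            outputs["Parsed Address Output"].append(row)
        elif action is LowQualityAction.IGNORE:
            continue  # low quality row is skipped
        elif action is LowQualityAction.REDIRECT_TO_ERROR:
            outputs["Error Output"].append(row)
        elif action is LowQualityAction.REDIRECT_TO_NON_PARSED:
            outputs["Non-Parsed Address Output"].append(row)
        else:
            # Fail component: stop at the first invalid address.
            raise RuntimeError(f"low quality address encountered: {row!r}")
    return outputs
```

The sketch leaves out the "Include only high quality parsed addresses" checkbox, because the text above does not say where medium quality rows go when it is selected.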

Address Parsing Transform- Step 7

Parsed Output Tab

The parsed output tab defines how the parsed address data is output from the transform. By default, a new column for each address element and quality element is added to the "Parsed Address Output".

Address Parsing Transform- Step 8

Output Columns Grid

Address Data Column 

The Address Data Column defines the data that will be output from the transform. The following parsed data is output:

AddressLine1 - Contains the street address
AddressLine2 - Contains the suite, apartment or secondary address data
City - Contains the city of the address
State - Contains the state of the address
PostalCode - Contains the postal or zip code of the address
Country - Contains the country of the address
Quality - Contains the quality (high, medium or low) for the address section (AddressLine1 and AddressLine2) of the parsed address
CSZQuality - Contains the quality (high, medium or low) for the city, state and zip section of the parsed address

Action Column

The action column defines whether the output will contain a new column or replace the data in an existing column. There are two options to select as an action:

Add New Column - This will tell the transform to add a new column to the output (default behavior). The columns to the right of the action column are used to define the properties of the output column. The data for the respective Address Data Column will be contained in the output column defined in the Output Column Name column.

If you use "Add New Column", you can verify the output columns were added by looking at the metadata. In the screenshot below, all columns have been selected to use "Add New Column".

Address Parsing Transform- Step 9

Replace Column - This will tell the transform to replace the data in a source column instead of using an output column. You will need to select the name of the source column to replace in the Output Column Name column. If this option is selected, all of the columns to the right of Output Column Name become read-only.

In the screenshot below, the AddressLine1 Action has been changed to "Replace Column".


Output Column Name

The Output Column Name defines either the name of the output column if "Add New Column" is selected in the Action column or the name of the source column to replace if "Replace Column" is selected.
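
The difference between the two actions can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the transform's implementation: the GridRow class and apply_grid function are hypothetical, and rows are modelled as plain dictionaries.

```python
from dataclasses import dataclass


@dataclass
class GridRow:
    address_data: str        # parsed element, e.g. "AddressLine1"
    action: str              # "Add New Column" or "Replace Column"
    output_column_name: str  # new column name, or source column to replace


def apply_grid(source_row: dict, parsed: dict, grid: list[GridRow]) -> dict:
    """Write parsed values into a copy of the row as the grid dictates."""
    result = dict(source_row)
    for entry in grid:
        if entry.action == "Replace Column":
            # The target must be an existing source column.
            if entry.output_column_name not in source_row:
                raise KeyError(f"no source column {entry.output_column_name!r}")
        elif entry.action != "Add New Column":
            raise ValueError(f"unknown action {entry.action!r}")
        result[entry.output_column_name] = parsed[entry.address_data]
    return result
```

With "Replace Column" the row keeps its original set of columns and only the value changes, while "Add New Column" widens the row by one column per grid entry.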

 


Address Parsing Transform- Step 10

Output Case Formatting

Output Case Formatting tells the transform how the parsed output should be formatted. There are three options.

  1. Proper Case - This will transform all of the data in the parsed address to proper case (e.g. turns "10 HIGHLAND STREET" into "10 Highland St")
  2. Upper Case - This will transform all of the data in the parsed address to UPPER case
  3. No case formatting - This will not transform the case of the data.
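
As a rough illustration of the three options, the Python sketch below applies each one to an already parsed value. The function name format_case is hypothetical, and Python's str.title() is only an approximation of proper casing: it would, for example, turn a state code such as "FL" into "Fl", which a real implementation would need to handle separately.

```python
def format_case(value: str, mode: str) -> str:
    """Apply one of the three case formatting options to a parsed value."""
    if mode == "Proper Case":
        return value.title()  # approximation of proper case
    if mode == "Upper Case":
        return value.upper()
    if mode == "No case formatting":
        return value  # data is passed through unchanged
    raise ValueError(f"unknown case formatting option: {mode!r}")
```

Note that the abbreviation of "STREET" to "St" in the Proper Case example comes from the address standardization itself, so the sketch starts from a value that has already been abbreviated.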

 

Copyright 2014 by Pragmatic Works