Thursday 20 March 2014

SQL Transformation -Practical Usage

The SQL transformation processes SQL queries midstream in a pipeline. You can insert, delete, update, and retrieve rows from a database. You can pass the database connection information to the SQL transformation as input data at run time. The transformation processes external SQL scripts or SQL queries that you create in an SQL editor. The SQL transformation processes the query and returns rows and database errors.

Script mode : The SQL transformation runs ANSI SQL scripts that are externally located. You pass a script name to the transformation with each input row. The SQL transformation outputs one row for each input row.
Query mode : The SQL transformation executes a query that you define in a query editor. You can pass strings or parameters to the query to define dynamic queries or change the selection parameters. You can output multiple rows when the query has a SELECT statement.

Main Properties

Passive or active transformation : The SQL transformation is an active transformation by default. You can configure it as a passive transformation when you create the transformation.
Database type: The type of database the SQL transformation connects to.
Connection type. Pass database connection information to the SQL transformation or use a connection object.

There is lot of theoretical document about this transformation about this on internet and help files I will try to show how we can use it in different practical scenarios

  1. Remove duplicate records
  2. Dynamic Query in SQL Transformation
  3. Capture Database errors using SQL Transformation
  4. SQL Transformation with Non-Equi Join
  5. Dynamic DB Connections using SQL Transformation
  6. One to Many Records SQLTx
  7. NonEql Join MultMatch Lkup
  8. Dynamic Split Data

Remove duplicate records

We can remove duplicate records by using SQL Transformation by using below method.
First use aggregator transformation to count of duplicate rows then use same variable in SQL transformation using parameter substituting
Use below in SQL transformation after creating sql transformation in query mode
delete test_dup where PRODUCT_ID=~PRODUCT_ID~ and SUBPRODUCT_ID=~SUBPRODUCT_ID~ and rownum <~count_dup~
TEST_DUP is table and we want to find duplicate data on basis of product id and sub product id.
We have used string substitution feature of SQL transformation here i.e. use of  ~

Dynamic Query in SQL Transformation :

In most of datawarehouse projects we encountered error ORA-02149 Specified partition does not exist as sometime we don’t have partition created for new data or data comes with unexpected value which does not have corresponding partition.
Though it can be overcome by using interval partitioning feature of oracle but that creates partition name with name like SYS*
We can use SQL Transformation to achieve this.
Suppose we have sales table created with list partitioning

create table sales (
        product_id      number,
        trans_amt       number,
        sales_dt        date,
        state_code      varchar2(2)
partition by list (state_code)
partition ct    values ('CT'),
partition ca    values ('CA')
Suppose we get every day data in sales file which get loaded into sales table but if data comes with other value than CT or CA for state code then load will fail with error error ORA-02149 Specified partition does not exist
We can create a SQL Transformation to achieve this

It can be achieve in below steps
Source is sales file and create a look up on all_tab_partitions view

Call this unconnected lookups from expression using below syntax
Chk_partition_exist = :lkp.LKP_ALL_TAB_PARTITIONS('SALES',state_code)

create a filter transformation with iif(isnull(Chk_partition_exist),TRUE,FALSE) so that data for   which partition does not exist will pass

Create SQL transformation like below

If  input data comes with new values of state code then it will automatically create partitions so that we will never see error ORA-02149 Specified partition does not exist again.

We will continue to cover more practical usages of SQL transformation in next article.


  1. Informatica Data Quality Online Training
    Informatica Online Training
    Informatica MDM Online Training
    . If you are seeking training and support you can reach me on 91-9000444287.
    Introduction and Defining Data Model
    Introduction to Informatica MDM Hub
    Master Data
    Master Data Management
    A Reliable Foundation of Master Reference Data
    Components of MDM Hub
    Application Server Tier
    Database Server Tier
    Batch Data Process Flow
    Trust Framework
    Consolidation Flag