The SQL transformation processes
SQL queries midstream in a pipeline. You can insert, delete, update, and
retrieve rows from a database. You can pass the database connection information
to the SQL transformation as input data at run time. The transformation
processes external SQL scripts or SQL queries that you create in an SQL editor.
The SQL transformation processes the query and returns rows and database
errors.
Mode
Script mode : The SQL
transformation runs ANSI SQL scripts that are externally located. You pass a
script name to the transformation with each input row. The SQL transformation
outputs one row for each input row.
Query mode : The SQL
transformation executes a query that you define in a query editor. You can pass
strings or parameters to the query to define dynamic queries or change the
selection parameters. You can output multiple rows when the query has a SELECT
statement.
Main Properties
Passive or active transformation : The SQL
transformation is an active transformation by default. You can configure it as
a passive transformation when you create the transformation.
Database type: The type of database the SQL
transformation connects to.
Connection type. Pass database
connection information to the SQL transformation or use a connection object.
There is lot of theoretical
document about this transformation about this on internet and help files I will
try to show how we can use it in different practical scenarios
- Remove duplicate records
- Dynamic Query in SQL Transformation
- Capture Database errors using SQL Transformation
- SQL Transformation with Non-Equi Join
- Dynamic DB Connections using SQL Transformation
- One to Many Records SQLTx
- NonEql Join MultMatch Lkup
- Dynamic Split Data
Remove duplicate records
We can remove duplicate records
by using SQL Transformation by using below method.
First use aggregator
transformation to count of duplicate rows then use same variable in SQL
transformation using parameter substituting
Use below in SQL transformation
after creating sql transformation in query mode
delete test_dup where
PRODUCT_ID=~PRODUCT_ID~ and SUBPRODUCT_ID=~SUBPRODUCT_ID~ and rownum
<~count_dup~
TEST_DUP is table and we want to
find duplicate data on basis of product id and sub product id.
We have used string substitution
feature of SQL transformation here i.e. use of
~
Dynamic Query in SQL
Transformation :
In most of datawarehouse
projects we encountered error ORA-02149
Specified partition does not exist as
sometime we don’t have partition created for new data or data comes with
unexpected value which does not have corresponding partition.
Though it can be
overcome by using interval partitioning feature of oracle but that creates
partition name with name like SYS*
We can use SQL
Transformation to achieve this.
Suppose we have
sales table created with list partitioning
create table sales (
product_id number,
trans_amt number,
sales_dt date,
state_code varchar2(2)
)
partition by list (state_code)
(
partition ct values ('CT'),
partition ca values ('CA')
);
Suppose we get
every day data in sales file which get loaded into sales table but if data
comes with other value than CT or CA for state code then load will fail with
error error ORA-02149
Specified partition does not exist
We can create a SQL
Transformation to achieve this
It can be achieve in below steps
Source is
sales file and create a look up on all_tab_partitions view
Call this
unconnected lookups from expression using below syntax
Chk_partition_exist
= :lkp.LKP_ALL_TAB_PARTITIONS('SALES',state_code)
create a filter
transformation with iif(isnull(Chk_partition_exist),TRUE,FALSE) so that data
for which partition does not exist will
pass
Create SQL
transformation like below
If input data comes with new values of state
code then it will automatically create partitions so that we will never see
error ORA-02149 Specified partition does not exist again.
We will
continue to cover more practical usages of SQL transformation in next article.