Datwarehousing/Oracle/Informatica/IDQ/MDM/Unix Plus interview questions

Sunday, 22 February 2009

Unix 2/22/2009 05:18:00 am

Unix Interview Questions (Part1)

Following are some unix commonly asked interview questions

Q 1 What is command to check space in Unix

Ans : df -k

Q 2 If a file has permission 000 then who can access the File

Ans : System Administrator .

Q 3 What is command to kill last background Job

Ans : kill $!

Q 4 How you will list all Hidden files

Ans : ls -la|grep ^[.]

Q 5 What is command to create Zero Byte File

Ans : touch filename

Q 6 What is difference between diff and cmp command

Ans : cmp -It compares two files byte by byte and displays first mismatch.

diff -It displays all changes required to make files identical.

Q 7 What does $# stands for

Ans : It will return the number of parameters passed as command line argument.

Q 8 How many prompts are availaible in Unix System

Ans : PS1 the default prompt

PS2 Multiline Prompt

Q 9 How to kill a process forcibily

Ans : kill -9 PID

PID (unique identifier of process)

Q 10 How to create a hidden file

Ans : While creating file put . DOT in front of file name

If you need more interview questions please mail us at support@itnirvanas.com.

My Profile

Unknown 4

Wednesday, 18 February 2009

Informatica Metadata Queries 2/18/2009 11:55:00 am

Informatica Metadata Queries (Part1)

Informatica Metadata Queries

In Informatica normally we come across situation like we need to find all mapping which use a particular table as source or something similar .It is very difficult to find it through Informatica tools.In such a situation Repository tables are very useful.

Repository tables name start with OPB and are continuously updated when we make changes.Along with OPB tables REP views are also present.

It is very dangerous to modify these tables so due care should be taken while dealing with OPB Tables.

In this article we will focus on some useful meta data queries

Purpose : Search for a table in Source Qualifiers Sql Override:

Query :

select distinct SUB.subj_name, MAP.mapping_name

from opb_widget_attr WID, opb_mapping MAP, opb_widget_inst WIDINST,

opb_subject SUB

where WID.widget_id = WIDINST.widget_id

and WID.widget_type = WIDINST.widget_type

and WID.widget_type = 3

and WID.attr_id = 1

and WIDINST.mapping_id = MAP.mapping_id

and MAP.subject_id = SUB.subj_id

and upper(WID.attr_value) like '%TNAME%' ;

Description : This query will give list of all mappings where a particular table is being used in sql override.

Purpose : Search for a table in Sources and Targets :

Query :

select SUB.subj_name, decode(WIDG.widget_type,1,'Source',2,'Target'),

MAP.mapping_name,

WIDG.instance_name

from opb_widget_inst WIDG, opb_mapping MAP, opb_subject SUB

where SUB.subj_id = MAP.subject_id

and WIDG.mapping_id = MAP.mapping_id

and WIDG.widget_type in (1,2)

and WIDG.instance_name like '%TNAME_%'

and SUB.subJ_NAME='YOUR_FOLDER_NAME'

Description : This query will give list of all folders,mappings where a particular table is being used as source or target instance.

Purpose : Query to give lookup information

Query :

Select distinct wid.WIDGET_ID, all_map.mapping_name, wid.INSTANCE_NAME Lkp_name, Decode(widat.attr_id,2,widat.attr_value) Table_name,

decode (widat.attr_id,6,widat.attr_value) src_tgt

FROM rep_all_mappings ALL_MAP, rep_widget_inst wid, OPB_WIDGET_ATTR widat

where all_map.mapping_id=wid.mapping_id

and wid.WIDGET_ID=widat.WIDGET_ID

and all_map.subject_area='DCM_SPP_UPL_DEVT'

and wid.WIDGET_TYPE=11

and widat.WIDGET_TYPE=11

and widat.ATTR_ID in (2,6)

Description : This query will give information about lookup transformations like lookup name,Tablename ,Mapping name etc.

Purpose : Query to give Invalid workflows

Query :

select opb_subject.subj_name, opb_task.task_name

from opb_task, opb_subject

where task_type = 71

and is_valid = 0

and opb_subject.subj_id = opb_task.subject_id

and UPPER(opb_subject.SUBJ_NAME) like UPPER('YOUR_FOLDER_NAME')

Description : This query will list of all invalid

If you have need some information and need query about something different related to metadata query please mail us at support@itnirvanas.com.

Unknown 4

Sunday, 15 February 2009

Oracle 2/15/2009 08:50:00 am

Oracle Deferred Constraints

Oracle Deferred Constraints :
Recently we came across a unique situation of updating a primary key of master table which being referenced by child table.Problem with updating primary key of master table was that it would have made child table orphan (ora-02292 child found)and we could not even update child table as it would have violated referential integrity.

To overcome this we used the deferrable constraint introduced by oracle.A deferrable constraint is only checked at transaction commit time.We altered the foreign key on child table to make constraint deferrable .After altering the foreign key constraint on child table we were able to update master table primary key with new values as constraint checking was deferred till commit point.

Now we will explain the deferrable constraints in detail.

Constraint are of three types
1) Non deferrable (Check constraint at time of commit and can not be deferred)
2) Deferrable-Initially immediate (Check constraint at time of statement execution )
3) Deferrable-Initially deferred (Check constraint at time of commit)

SQL> create table test_def (a varchar2(1) constraint chk_a check ( a in ('X','Y','Z') )
2 deferrable
3* initially immediate);
Table created.
SQL> insert into test_def values ('C');insert into test_def values ('C');

*ERROR at line 1:ORA-02290: check constraint (ITN.CHK_A) violated

Error as constraint is being checked at time of statement execution. Now we will change constraint to initially deferred.

SQL> alter table test_def modify constraint chk_a initially deferred;
Table altered.
SQL> insert into test_def values ('C');
1 row created.
SQL> commit ;
*ERROR at line 1:ORA-02091: transaction rolled back ORA-02290: check constraint (ITN.CHK_A) violated

As constraint is initially deferred it is being checked at commit type.
When should we use deferrable constraint :
Deferred integrity constraints should only be used when absolutely necessary but following are some scenario where deferrable constraint are used

1) In case of inserting large amounts of data into a table in a data warehousing environment lot of time can be saved.
2) Design issues (Updating Primary Key of Parent table in Parent Child table scenario).
Which constraints are deferrable? Only constraint created with the deferrable option can be deferred.by default constraints created are non-deferrable.

How to check constraint is deferrable ?
SELECT constraint_name , deferrable , deferred
FROM user_constraints
WHERE constraint_name like 'test_def%';

This article will not be complete without mention of option No Validate with constraints.

No Validate option with constraint : No Validate option will direct Oracle to enable the constraint but not to check existing data for constraint violation.This is particular useful where we have existing data that violates the constraint but urgent business requirement to enable constraints to avoid any future data with constraint violation.

If you have any question please contact us at support@ITNirvanas.com My Profile

Unknown 3

Saturday, 7 February 2009

Datawarehousing 2/07/2009 03:10:00 am

Kimball Vs inmon

There are two major design methodologies followed in data warehousing Ralph Kimball and Bill Inmon.We will discuss about both of these in detail.

Bill Inmon Approach : According to Bill Inmon Data warehouse need to fulfill need of all category of users .In an organization there are different type of user like

· Marketing
· Supply Change Management
· Operations

Each department has its different way of interpreting data so Data warehouse should be able to answer each department queries. This can be achieved by designing tables in 3NF form. According to him data in Datawarehouse should be in 3NF and lowest granularity level. The data should be accessible at detailed atomic levels by drilling down or at summarized levels by drilling up.

He stressed that data should be organized into subject oriented, integrated, non volatile and time variant structures. According to him an organization have one Data warehouse and Data mart source there information from Data warehouse. Inmon Approach is also called Top Down approach .

In this methodology data is brought into staging area from OLTP system or ODS (Operational Data store) and then summarized and aggregated. After this process data mart will source their data from data warehouse and will apply new set of transformation and aggregation according to their need.

Key points to be noted about this approach

1.Data should be organized into subject oriented, integrated, non volatile and time variant structures
2.Data in 3rd Normalization form
3.Top to down approach

4.Data Mart source from Datawarehouse

Pro’s of Bill Inmon approach

1. Easy to maintain
2. Well integrated

Cons of Bill Inmon approach
Difficult to implement

Ralph Kimball Approach :

Kimbell views Data warehouse as combination of data marts connected to Data warehouse bus structure.Data marts are focused on delivering business objectives of different departments and Data warehouse bus consists of conformed dimension, measures defined for whole organization. User can query all data marts together using conformed dimensions.

In this approach the data warehouse is not a physical storage of the data as in the Inmon approach. It is “virtual.” It is a combination of data marts, each having a star schema design .
In this approach data is always stored in dimensional model.

Key points to be noted about this approach are

1. Data is always stored in the dimensional model.
2. Bottoms Approach
3. Data ware house is Virtual
4. Bottoms's up approach

Pro’s of Ralph Kimball approach Fast to build

Cons of Ralph Kimball approach Difficult to maintain because of redundancy of data across data marts

	Inmon	Kimball
Source Required	ü	ü
Staging	ü	ü
ETL	ü	ü
Data Marts	ü	ü
Business Requirements	ü	ü
Time attribute of data	ü	ü
Enterprise DW	ü	X
Dimensional Tools	X	ü
Relational Tool	ü	X
Process Oriented	X	ü
Normalised data Model	ü	X
Complex to Design	ü	X
Continous and Discrete Time Frame	ü	X
Slowly changing Time Frame	X	ü

Inmon vs. Kimball: Same or different?

"You can catch all the minnows in the ocean and stack them together and they still do not make a whale." ~Inmon

“The data warehouse is nothing more than the union of all the data marts" ~Kimball

Conclusion : In reality there is no right or wrong between these two approaches. In reality actual methodology implemented is combination of both.

My Profile

Unknown 2

Friday, 6 February 2009

2/06/2009 03:14:00 am

About me

I have 14+ years of IT experience.I love to write technical articles on all technologies related too Dataware housing especially informatica.I am trying to share all my learing and knowledge with everyone.I am keen to help you with any issue related to informatica.My solution have also been published in Informatica Market place.You can reach me at lalits77@gmail.com.I can also guide people who are new to informatica .I can help you in prepration for interviews and certification.

My articles have been published in Informatica market place
Please leave your comments below if you have any query/questions

https://community.informatica.com/solutions/dynamically_generate_workflow_parameter_files

https://community.informatica.com/solutions/mapping_email_formatting_errors

My linkedin profile is

https://www.linkedin.com/profile/view?id=30747327&trk=spm_pic

Please leave your comments below if you have any query/questions

Unknown 0

Tuesday, 3 February 2009

Informatica Best Practices 2/03/2009 11:12:00 am

Best Practices in Informatica (Part1)

We have tried to come up with some of best practices in informatica

Always try to add expression transformation after source qualifier and before Target. If source or target definition changes it is easier to reconnect the ports
Always use Cobol File for normaliser in binary format otherwise there are lot of issues specially with comp-3 fields
Remove unused ports, though unused ports do not have any effect on performance bit it is always better to remove them for more visibility
If possible try to do calculation in Output Ports instead of variable ports as variable ports are need to assign/reassign each time and it can slow down the performance
Try to avoid complex filter expression instead of that try to evaluate filter expression in upstream expression transformation and pass it to filter transformation. If you use too many complex calculations in filter condition expression it can slow down performance.
In workflow Source/Target directory Property take advantage of Unix links. Instead of hard coding path in source/target directory specify path with Unix link i.e. suppose in devt environment you are specifying Source directory path as /devserver/team/source and in prod server you specify it as /prodserver/team/source .You can get link created in $PMRootDir in as src_file_dir pointing to /devserver/team/source in dev server and /prodserver/team/source in prod server and in your source/Target file directory you can put path as $PMRootDir/src_file_dir In this case there is no need to change Source/Target directory every time you move between production and dev and testing
In sequence generator do not connect current value port to downstream (unless required) transformation as when we connect current value port from sequence generator transformation Informatica Server processes one row in each block. We can optimize performance by connecting only the NEXTVAL port in a mapping .
Improve lookup performance by putting all conditions that use the equality operator ‘=’ first in the list of conditions under the condition tab.
Always remember rule not to cache look tables having more than 550000 rows (Assuming row size 1024) .If your row size is less than or more than 1024 then adjust number of rows accordingly .
Avoid calculating same value again and again. Instead of that store it in a variable use it several times.
Limit the number of rows as early in the mapping as possible (SQ, aggregators, joiners, filters) - the more you limit at the beginning of the data flow, the less rows will be processed by all downstream transformations.
For generating IDs use IDENTITY columns or DB triggers
There should be no unused ports in any transformation (except Source Qualifier) - Source Qualifier should match Source Definition 1-to-1. But for any other transformation there should be no unused ports. Why fetch and process the data for half of the data flow and use extra memory, if you don’t need it?
Avoid implicit conversion, as it is slower and might lead to errors - There are number of issues with implicit conversion (e.g. when you link SQ string port to decimal port in the following expression):
Remove all the 'Error (Transformation error)' messages from default values in Expression transformations - you will avoid getting unwanted error messages in logs. It will not be checked at session initialization
Consider adding Expression transformation right after each SQ and right before each Target - pass-through expressions do not cause additional computation. But I couldn’t count how many times I had to make “little change” before or after the whole data flow in an existing mapping. Having this dummy expression transformation helps a lot. It’s also very useful when relinking all ports after some changes (using Autolink by name).

If you have further queries then please mail to support@itnirvanas.com

Unknown 7

Monday, 2 February 2009

Unix 2/02/2009 08:36:00 am

Unix Script to Automate FTP Process

Unix Script to Automate FTP Process
Normally we come across the situation when we have to FTP file daily.Doing this process daily for large no of files can be cumbersome.We can create Unix script to automate FTP Process

Currently we need to last day files from production to dev box for testing,so we created below mentioned script to FTP files daily

b=`TZ=CST+24 date +%y%m%d` ###To get last date
cd /export/home/mydir
ftp -v -n FTP_HOST_NAME EOF ##Please put two less than sign before EOF
user Userid Pwd
bin
cd /export/home/source_dir
get FILE_NAME.`echo $b`
bye
EOF

Please note replace FTP_HOST_NAME,Userid,Pwd with your actual userid,password and ftp host name.

Please put two less than sign before EOF in third line of script as i am not able to display those characters due to some restrictions.

If you have further queries then please mail to support@itnirvanas.com

Unknown 2

Sunday, 22 February 2009

Unix Interview Questions (Part1)

Wednesday, 18 February 2009

Informatica Metadata Queries (Part1)

Sunday, 15 February 2009

Oracle Deferred Constraints

Saturday, 7 February 2009

Kimball Vs inmon

Friday, 6 February 2009

About me

Tuesday, 3 February 2009

Best Practices in Informatica (Part1)

Monday, 2 February 2009

Unix Script to Automate FTP Process

Popular Posts

Popular

Comments

Contact Form

About me

Statcounter

Tags

Follow Us