Thursday 10 December 2015

Control M Character in Files: Everything Explained

This is a problem that every IT person faces when dealing with files across platforms. If you haven't faced it yet, I can guarantee you will one day.



There is a lot of information on the internet about this, but it is rarely explained clearly. I will try to put all the information about the Control M character together in one place.

First of all it is important to understand

"Line Endings" in different operating systems 

A line ending is the character sequence that marks the end of a line, that is, what gets written to the file when you press the Enter key.

The major operating systems represent the line ending differently:

OS                              Line Ending
Windows                         CR LF (\r\n)
Unix / Linux / macOS (OS X)     LF (\n)
Classic Mac OS (9 and older)    CR (\r)

CR = \r = Carriage Return
LF = \n = Line Feed
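
As a rough illustration of these conventions, the sketch below writes the same two lines three times, once per line-ending style, and dumps the bytes. The file names are made up for this example.

```shell
# Write the same two-line text with each line-ending convention.
# (File names are hypothetical, chosen only for this demo.)
printf 'one\r\ntwo\r\n' > sample_windows.txt      # CR LF after each line
printf 'one\ntwo\n'     > sample_unix.txt         # LF after each line
printf 'one\rtwo\r'     > sample_classic_mac.txt  # CR after each line

# Dump each file character by character; CR appears as \r and LF as \n.
for f in sample_windows.txt sample_unix.txt sample_classic_mac.txt; do
    echo "== $f"
    od -c "$f"
done
```

The Windows-style file is two bytes larger than the other two, because every line carries one extra character.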


Why Does the Control M Character Occur

Suppose we open, on Unix, a text file that was created under Windows. The text characters in the first line are all displayed correctly. At the end of the line there is a CR. This has no special meaning to Unix, so Unix attempts to display the character. CR is a non-printable character, and under Unix many non-printable characters are shown as control characters. In the case of CR, the displayable equivalent is Control-M (CR is ASCII code 13, and M is the 13th letter of the alphabet). Most editors display it as ^M.
The next character is an LF. Unix is fine with it, as it is the standard line-ending character. The editor therefore moves to the start of the next line on the display and starts to process the next line.
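
One way to see that Unix treats the CR as ordinary data is to read a Windows-style line in the shell and measure its length. This is only a small sketch; the file name is invented for the demo.

```shell
# A single line with a Windows line ending: "hello" followed by CR LF.
# (crlf_line.txt is a hypothetical file name used only here.)
printf 'hello\r\n' > crlf_line.txt

# read stops at the LF, but the CR stays in the variable as a normal character.
IFS= read -r line < crlf_line.txt

echo "${#line}"   # prints 6: five letters plus the CR

# Because of the hidden CR, a plain string comparison fails.
if [ "$line" = "hello" ]; then
    echo "match"
else
    echo "no match"   # this branch runs
fi
```

This is why scripts that parse Windows files often misbehave even when the file looks fine on screen.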

How to View the Control M Character in Unix

The command below dumps the file character by character and shows tabs, vertical tabs, carriage returns, line feeds and other non-printable characters in backslash notation (\t, \v, \r, \n).
od -c Yourfile.txt

If you only want to see the Control M characters (shown as ^M at the end of each line):

cat -v filename.txt
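
To try this without a real Windows file, a sample can be generated with printf. The sketch below also counts the affected lines with grep, which is an addition of this example and not one of the commands above. The file name is made up.

```shell
# Build a small file with Windows line endings (hypothetical name).
printf 'one\r\ntwo\r\n' > dos_sample.txt

# cat -v makes the carriage returns visible.
cat -v dos_sample.txt
# prints:
# one^M
# two^M

# Store a literal CR in a variable, then count the lines that contain one.
cr=$(printf '\r')
grep -c "$cr" dos_sample.txt   # prints 2
```

A count of 0 from grep means the file has no Control M characters.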


Possible Solutions

  1. Remove the character in the vi editor: open the file in vi and run :%s/^M//g (equivalently :1,$s/^M//g). To type the ^M character, press Ctrl-V and then Ctrl-M. Typing a caret followed by the letter M will not work.
  2. Use a Unix command: several commands are available to remove the Control M character.
       1) tr -d '\r' < infile.txt > outfile.txt
       2) dos2unix < DOSfile.txt > Unixfile.txt
  3. Fix it during FTP: the Control M character will not appear if the file is transferred in ASCII mode, because ASCII mode converts line endings to the receiving system's convention.
If you need to process such a file in Informatica, you can call the above commands in a pre-session command to remove the Control M characters.
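
For a pre-session step, the tr command can be wrapped in a small helper that cleans a file in place. The sketch below is one possible way to do it, not an Informatica feature: the function name strip_cr and the file name session_input.csv are hypothetical.

```shell
# Hypothetical helper: delete every CR from the file given as $1,
# writing the result back to the same file.
strip_cr() {
    src=$1
    tmp=$(mktemp) || return 1
    # tr -d '\r' removes all carriage returns, not only those before an LF.
    if tr -d '\r' < "$src" > "$tmp"; then
        # Copy back with cat so the original file keeps its permissions.
        cat "$tmp" > "$src"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Demo on a made-up input file with Windows line endings.
printf 'id,name\r\n1,alice\r\n' > session_input.csv
strip_cr session_input.csv
cat -v session_input.csv   # no ^M should remain
```

Writing to a temporary file first avoids truncating the input while tr is still reading it.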

I hope you are now clear about why the Control M character occurs and what the possible solutions are.



