UNIX Hints & Hacks
Chapter 6: File Management
Here is a simple way to get rid of all the ^Ms at the end of lines.
Flavors: AT&T, BSD
Syntax:
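tr -d string < infile > outfile
sed 's/[regular expression]/[replacement]/[flags]' infile > outfile
Files transferred from PCs running DOS often show a Ctrl-M (^M) at the end of every line when viewed in an editor. DOS ends each line with a carriage return followed by a line feed, while UNIX uses a line feed alone, so the leftover carriage return is displayed as ^M.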
# vi /tmp/hosts.dos
206.19.11.10 pluto pluto.foo.com^M
206.19.11.203 star star.foo.com^M
206.19.11.161 moon moon.foo.com^M
206.16.11.201 mars mars.foo.com^M
There are a couple of ways to strip the Ctrl-M (^M) out of the file. The first uses the tr command, which translates or deletes characters. You can tell tr to delete every Ctrl-M character.
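The carriage returns can also be confirmed from the shell without opening an editor. The following is a minimal sketch rather than part of the original tip: it builds a small sample file (the path /tmp/hosts.sample and the variable names are made up for illustration), counts the lines that end in a carriage return, and dumps the first line with od so that the trailing \r is visible.

```shell
#!/bin/sh
# Hypothetical sample file; substitute your own path.
sample=/tmp/hosts.sample

# Create two DOS-style lines (each ends in CR LF).
printf '206.19.11.10 pluto pluto.foo.com\r\n206.19.11.203 star star.foo.com\r\n' > "$sample"

# Hold a literal carriage return (octal 015) in a variable.
cr=$(printf '\r')

# Count the lines that end in a carriage return.
crlf_lines=$(grep -c "${cr}\$" "$sample")
echo "lines ending in CR: $crlf_lines"

# Show the raw bytes of the first line; the CR appears as \r before \n.
head -n 1 "$sample" | od -c
```

A count of zero suggests the file already uses UNIX line endings and needs no conversion.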
# tr -d "\015" < /tmp/hosts.dos > /tmp/hosts.unix
In this tr command, the -d option deletes all occurrences of the \015 character (the octal code for Ctrl-M, the carriage return) from the file /tmp/hosts.dos, and the output is written to the file /tmp/hosts.unix. Another way to strip the Ctrl-M character is to pass the file through sed and have it perform a substitution:
# sed 's/^V^M//g' /tmp/hosts.dos > /tmp/hosts.unix
In this version, sed processes the file /tmp/hosts.dos, searching for all occurrences of the Ctrl-M character. The ^V^M in the command is typed as Ctrl-V followed by Ctrl-M; Ctrl-V typically tells the terminal to take the next keystroke literally. Each Ctrl-M that sed finds is replaced with nothing (an empty string), and the results are written to the /tmp/hosts.unix file.
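Entering a literal Ctrl-M is awkward inside a shell script. As a hedged alternative, the sketch below generates the carriage return with printf and removes it only where it ends a line. It assumes a POSIX shell; the file names and variable names are illustrative and not from the original tip.

```shell
#!/bin/sh
# Hypothetical file names for illustration.
dos_file=/tmp/hosts.sample.dos
unix_file=/tmp/hosts.sample.unix

# Create two DOS-style lines (each ends in CR LF).
printf '206.19.11.10 pluto pluto.foo.com\r\n206.19.11.203 star star.foo.com\r\n' > "$dos_file"

# Generate the carriage return (octal 015) instead of typing Ctrl-V Ctrl-M.
cr=$(printf '\r')

# Delete a carriage return only where it ends a line.
sed "s/${cr}\$//" "$dos_file" > "$unix_file"

cat "$unix_file"
```

Anchoring the pattern with $ leaves any carriage return in the middle of a line untouched, which differs from the g form shown earlier; drop the anchor and add the g flag to remove every occurrence.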
The same substitution can also be performed from within the vi editor. The vi editor has two modes: insert mode and command mode. From command mode, the colon prompt accepts a substitute command that uses the same syntax as sed.
# vi /tmp/hosts.dos
When inside the vi editor, make sure you are in command mode by pressing the Esc key. Pressing the colon (:) key then lets you enter the substitute command.
:%s/^V^M//g
This command behaves the same as the sed command run from a UNIX shell. The % applies the substitution to every line in the file, and each Ctrl-M that is found is replaced with nothing. You can then continue working in the file if you have more changes to make.
206.19.11.10 pluto pluto.foo.com
206.19.11.203 star star.foo.com
206.19.11.161 moon moon.foo.com
206.16.11.201 mars mars.foo.com
The result is a clean file with no Ctrl-M characters anywhere in it.
Some databases and applications require data from outside the UNIX world. These files can be imported from DOS, but they need to be clean, plain text with no special control characters embedded in them.
Some newer flavors of UNIX provide tools for handling cases such as these. These tools, named to_dos, to_unix, unix2dos, and dos2unix, provide extended options. Check whether your flavor supports these commands.
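Because these tools are not present everywhere, a script can test for one and fall back to tr. The following is only a sketch: strip_cr is a hypothetical helper name, and it assumes that the installed dos2unix acts as a filter (standard input to standard output) when given no file names, which should be checked against the man page for your flavor.

```shell
#!/bin/sh
# strip_cr: hypothetical helper name, not a standard command.
# Reads DOS text on standard input and writes UNIX text on standard output.
strip_cr() {
    if command -v dos2unix >/dev/null 2>&1; then
        # Assumption: this dos2unix filters stdin to stdout with no file names.
        dos2unix
    else
        # Fallback: delete every carriage return (octal 015) with tr.
        tr -d '\015'
    fi
}

# Convert one DOS-style line and dump the bytes to confirm the CR is gone.
printf 'moon moon.foo.com\r\n' | strip_cr | od -c
```

A typical call would be strip_cr < /tmp/hosts.dos > /tmp/hosts.unix, mirroring the tr command shown earlier.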
Man pages:
sed, tr, to_dos, dos2unix
© Copyright Macmillan USA. All rights reserved.