UNIX Hints & Hacks

Chapter 6: File Management

6.11 Splitting Files

6.11.1 Description

Here is what to do when you need to split files across floppy disks.

Example One: Splitting Files for Floppies

Flavor: AT&T

Syntax:

split -b n[k|m] [file [outfile]]
split -line_count [file [outfile]]

The split command breaks a file into pieces of a predefined size, given as a number of bytes, kilobytes, or megabytes. Every piece is exactly the size you pass to split, except the last one, which holds whatever remains.

rocket 64% ls -al samba.tar
-rw-r--r--    1 cvalenz user     4945920 Nov 27 15:40 samba.tar
rocket 68% file samba.tar
samba.tar:      tar

To split a 4.9MB tar-formatted file so it fits on four 3.5-inch floppy disks, use the split command with 1400000 as the split size of the file:

rocket 65% split -b 1400000 samba.tar
rocket 66% ls -al xa*
-rw-r--r--    1 cvalenz user     1400000 Nov 27 15:41 xaa
-rw-r--r--    1 cvalenz user     1400000 Nov 27 15:41 xab
-rw-r--r--    1 cvalenz user     1400000 Nov 27 15:41 xac
-rw-r--r--    1 cvalenz user      745920 Nov 27 15:41 xad
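The number of pieces and the size of the last one follow from integer arithmetic on the two sizes. The following is only a sketch in Bourne-style shell syntax, using the byte counts from the listing above; the variable names are illustrative.

```shell
# Sizes in bytes, taken from the listing above.
size=4945920
piece=1400000

# Number of pieces: integer division, rounded up.
pieces=$(( (size + piece - 1) / piece ))

# Size of the last piece: the remainder, or a full piece
# when the file divides evenly.
last=$(( size % piece ))
if [ "$last" -eq 0 ]; then
    last=$piece
fi

echo "$pieces pieces, last piece $last bytes"
```

For samba.tar this reports 4 pieces with a last piece of 745920 bytes, which matches the size of xad.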

The file is split into four pieces. By default, split names the pieces xaa, xab, xac, and so on. Each piece can now be copied to its own floppy disk, and every piece but the last fills its disk nearly to capacity. Now that the file is split, how does it get put back together?

rocket 67% cat xa* > samba.T

The file can be put back together by concatenating the pieces with the cat command. Because the shell expands xa* in alphabetical order, the pieces are joined in the same order in which they were split.

rocket 64% ls -al samba.T
-rw-r--r--    1 cvalenz user     4945920 Nov 27 15:47 samba.T
rocket 68% file samba.T
samba.T:        tar
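The file command only confirms that the result still looks like a tar archive. A stricter check is a byte-for-byte comparison against the original with cmp. The sketch below rehearses the whole round trip on a scratch file; the file names and the 1400-byte piece size are made up for the illustration, and mktemp is assumed to be available on your system.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Build a small sample file (a stand-in for samba.tar).
awk 'BEGIN { for (i = 1; i <= 1000; i++) print "line", i }' > sample.txt

# Split it into 1400-byte pieces named part.aa, part.ab, ...
split -b 1400 sample.txt part.

# Join the pieces; the shell expands part.* in alphabetical order.
cat part.* > rejoined.txt

# cmp is silent and returns 0 only when the files are identical.
if cmp sample.txt rejoined.txt; then
    echo "identical"
fi
```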

split can be used on ASCII, binary, tar, dump, and even compressed files. The sequential order of the pieces is essential to joining them back together. Apart from a missing or damaged piece, a wrong order is the only thing that keeps the rejoined file from working.

Example Two: Splitting Log Files

There are various ways to split log files, as seen in section 4.4, "Cut the Log in Half." Here is a way to split the log files in half using the split command.

# ls -al SYSLOG
-rw-r--r--    1 cvalenz user    6945302 Nov 28 11:04 SYSLOG

If you have a log file that is almost 7MB in size and you want to cut it in half using split, take half the number of lines in the file and pass it through split.

# wc -l SYSLOG
       614768 SYSLOG

Find the total number of lines in the log file.

# expr 614768 / 2
307384

Take half of the total number of lines in the log file.

# split -307384 SYSLOG syslog.

Pass half the number of lines to split to cut the log file in half. If you add the name of an output file, place a period (.) at the end of it. split appends its standard suffixes (aa, ab, ac, and so on) to that name, so the period makes the resulting filenames easier to read and easier to parse within a script.

# ls -al syslog.*
-rw-r--r--    1 cvalenz user    3338254 Nov 28 11:05 syslog.aa
-rw-r--r--    1 cvalenz user    3607046 Nov 28 11:05 syslog.ab

Even though the file sizes are different, the number of lines in each file is the same. The sizes differ because the number of bytes on a line varies from line to line.

# wc -l syslog.*
       307384 syslog.aa
       307384 syslog.ab
       614768 total
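The three steps above (count the lines, halve the count, split) can be combined in a few lines of shell. This is a sketch, not the only way to do it: it builds a small stand-in SYSLOG in a scratch directory, assumes mktemp is available, and uses -l, the POSIX spelling of the older -line_count form. The count is rounded up because rounding down on an odd line count would leave a third file holding the one leftover line.

```shell
# Work in a scratch directory so no real log is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Stand-in log with an odd number of lines (7) to show the rounding.
awk 'BEGIN { for (i = 1; i <= 7; i++) print "entry", i }' > SYSLOG

# Count the lines and halve the count, rounding up.
# Reading from stdin keeps the file name out of wc's output.
half=$(( ( $(wc -l < SYSLOG) + 1 ) / 2 ))

# Split into syslog.aa and syslog.ab.
split -l "$half" SYSLOG syslog.

wc -l syslog.*
```

With the seven-line sample, syslog.aa receives four lines and syslog.ab receives three.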

Reason

When files need splitting, whether because disk space is short or because a file has to fit on smaller media, split is a useful tool that easily cuts a file in half or into pieces of a specific size.

Real World Experience

When tape devices are not available, you are stuck on an isolated network, and the only form of removable media is a floppy disk, the split command is a useful tool to have around. A file can be split onto multiple floppy disks, taken to another system, and joined back together. Although this works on a file of any size, it is not recommended for files larger than 10MB. Not only will you be carrying a lot of disks, but the more disks you use, the greater the chance that one of them will be corrupt.
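One way to catch a corrupt disk before joining is to carry a checksum list along with the pieces. The sketch below uses the standard cksum command on a stand-in file; the file names, the piece size, and the manifest name are made up for the illustration, and mktemp is assumed to be available. Both halves run in one directory here, whereas in practice the second half would run on the destination system.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Stand-in for the file to transport.
awk 'BEGIN { for (i = 1; i <= 1000; i++) print "record", i }' > data.txt
split -b 4000 data.txt piece.

# On the source system: record a checksum and byte count for every piece.
cksum piece.* > manifest.cksum

# On the destination system, after copying the pieces and the manifest:
# recompute the checksums and compare them with the list.
# diff returns 0 only when every line matches.
if cksum piece.* | diff - manifest.cksum > /dev/null; then
    cat piece.* > data.rejoined
    echo "all pieces intact"
else
    echo "at least one piece is damaged; recopy it before joining" >&2
fi
```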

Other Resources

Man pages:

cat, file, split, wc

© Copyright Macmillan USA. All rights reserved.