UNIX Hints & Hacks
Chapter 6: File Management
You can use grep to locate files faster than find can walk the filesystem.
Flavors: AT&T, BSD
Syntax:
find [dir] -print | grep [pattern]
This is one of the simplest techniques, and it is often overlooked in large installations. Searching for files becomes a necessity: there are always cases where you or a user forget the location of a file. Using the find command to search for a filename is simple. The traditional way is to give find the complete name:
rocket 36% find /disk2/data -name rout -print
There is a problem if no file or directory is named exactly rout. Used this way, find -name matches only the complete name, so you must know the entire word (unless you supply shell wildcards in the pattern). If there is a file somewhere on the system that contains vital routing information, and all you know is that at least part of the word rout is in the filename, you can grep the output of the find command for any part of the word rout.
rocket 37% find /disk2/data -print | grep rout
/disk2/data/admin/route.gz
/disk2/data/configs/routing.txt
/disk2/data/docs/route.ps
This method is slower than giving find the complete name. Sometimes, though, the entire name of the file or directory is unknown, and you need the flexibility to search on only part of the name.
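POSIX find accepts shell-style wildcards in the -name pattern, so the partial match can also be done by find itself. The following is a minimal sketch rather than a definitive recipe: it runs against a scratch directory created with mktemp, and the file names in it are made up for the demonstration.

```shell
#!/bin/sh
# Build a scratch tree with hypothetical file names for the demonstration.
DEMO=$(mktemp -d)
mkdir -p "$DEMO/admin" "$DEMO/docs"
touch "$DEMO/admin/route.gz" "$DEMO/docs/Router.ps" "$DEMO/docs/notes.txt"

# Approach from the text: list every path, then filter with grep.
# -i makes the match case-insensitive, so Router.ps is found as well.
BY_GREP=$(cd "$DEMO" && find . -print | grep -i rout | sort)

# Alternative: let find match the partial name itself.
# The quotes keep the shell from expanding the wildcards first.
BY_NAME=$(cd "$DEMO" && find . -name '*[Rr]out*' -print | sort)

echo "$BY_GREP"
echo "$BY_NAME"

rm -rf "$DEMO"
```

Both commands list the same two files here. Note that -name tests only the last component of each path, while grep sees the whole path and also matches directory names.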
Flavors: AT&T, BSD
Syntax:
find [dir] -print > [file]
grep [-i] [pattern] [file]
This method is for servers and systems that maintain hundreds of thousands of files. When the filesystems on a machine run many levels deep, the best approach is to take a nightly snapshot of every file the system contains: have a program write each file's full path and name into a file that can be searched.
# find / -print > /disk2/ADMIN/filelist.txt
Start by getting every file on the system. This find command starts at the root level and redirects (>) all the output into the filelist.txt file. The output consists of the full path and filename of every file on the system.
All that has to be done now is to grep through the large list of files for the file you are looking for:
rocket 38% grep -i rout /disk2/ADMIN/filelist.txt
/disk2/data/admin/route.gz
/disk2/data/configs/routing.txt
/disk2/data/docs/Router.ps
/disk2/data/docs/route.ps
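Because grep sees the whole path, a pattern also matches directory names anywhere in the path. If only the final component should match, the pattern can be anchored to the end of the line. This is a sketch under assumptions: the list is a temporary stand-in for filelist.txt, and its entries are invented for the example.

```shell
#!/bin/sh
# Hypothetical stand-in for /disk2/ADMIN/filelist.txt.
LIST=$(mktemp)
cat > "$LIST" <<'EOF'
/disk2/data/admin/route.gz
/disk2/data/routing/README
/disk2/data/docs/Router.ps
/disk2/data/docs/notes.txt
EOF

# Plain search: matches "rout" anywhere in the path, directories included.
ANYWHERE=$(grep -i rout "$LIST")

# Anchored search: "rout" must appear after the last slash.
# [^/]* matches any run of characters that are not a slash.
BASENAME_ONLY=$(grep -i '/[^/]*rout[^/]*$' "$LIST")

echo "$ANYWHERE"
echo "---"
echo "$BASENAME_ONLY"

rm -f "$LIST"
```

The plain search also reports the README file, because its parent directory is named routing; the anchored search leaves it out.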
What would take a minimum of several minutes can now be done in several seconds. One problem that you face is that files change on a daily basis. What is not there one day might be there the next. To solve this problem, all you have to do is to make a crontab entry for the find command to execute in the early morning before anyone comes in to work.
30 2 * * * find / -print > /disk2/ADMIN/filelist.txt
Now every day at 2:30 a.m., a fresh list of the files currently on the system is written to filelist.txt. With this in place to run nightly, all that remains is to make the list easy for users to search. This can take the form of a simple script. Write the script ffind to search for a pattern that is passed to it, or to prompt for a pattern if none is passed.
# vi /usr/local/ffind
#! /bin/sh
FILELIST="/disk2/ADMIN/filelist.txt"
PATTERN="$1"
if [ -z "$PATTERN" ]; then
    echo -n "Search: "
    read PATTERN
fi
grep -i "$PATTERN" "$FILELIST"
Line 1: Define the shell.
Line 2: Define the variable for the file that will be searched.
Line 3: Store the first argument passed to the script, if any, as the search pattern.
Line 4: Test to see whether a search pattern was passed to the script.
Line 5: If no search pattern was passed to the script, notify the user to enter a pattern to search for.
Line 6: If no search pattern was passed to the script, accept input from the user for the pattern to search.
Line 7: End the test and continue with the rest of the script.
Line 8: The filelist.txt file is searched, ignoring case, for anything matching the string in the variable PATTERN. Every matching line is sent to STDOUT and displayed to the user.
rocket 39% ffind rout
/disk2/data/admin/route.gz
/disk2/data/configs/routing.txt
/disk2/data/docs/Router.ps
/disk2/data/docs/route.ps
rocket 40% ffind
Search: rout
/disk2/data/admin/route.gz
/disk2/data/configs/routing.txt
/disk2/data/docs/Router.ps
/disk2/data/docs/route.ps
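Not every /bin/sh handles echo -n the same way, so a variant of the script might print the prompt with printf instead. The sketch below wraps the same logic in a function so that it can be tried against a temporary list. The FILELIST override and the sample entries are assumptions made for the demonstration, not part of the original script.

```shell
#!/bin/sh
# ffind logic as a function; FILELIST may be set beforehand to override
# the default list location.
ffind() {
    filelist=${FILELIST:-/disk2/ADMIN/filelist.txt}
    pattern=$1
    if [ -z "$pattern" ]; then
        printf 'Search: '          # portable replacement for echo -n
        read -r pattern
    fi
    # -e protects patterns that begin with a dash.
    grep -i -e "$pattern" "$filelist"
}

# Demonstration against a throwaway list with made-up entries.
FILELIST=$(mktemp)
printf '%s\n' /disk2/data/admin/route.gz /disk2/data/docs/Router.ps \
    /disk2/data/docs/notes.txt > "$FILELIST"

FOUND=$(ffind rout)
echo "$FOUND"

rm -f "$FILELIST"
```

Like grep itself, the function returns a nonzero status when nothing matches, which lets a calling script test whether the search found anything.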
Even fast high-end servers can take a while to process hundreds of thousands of files. With over 500,000 files, a dual-CPU server with an attached RAID array can take anywhere from 3 to 8 minutes to scan the filesystem and search for the files that match the pattern you grep for.
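Since a full scan can run for minutes, anyone who searches the list while the nightly job is rewriting it sees an empty or partial file. One way around that is to write to a temporary file and rename it once find has finished. The following is only a sketch: the snapshot function and its arguments are hypothetical, and the demonstration scans a scratch directory rather than /.

```shell
#!/bin/sh
# snapshot DIR LIST: write the paths under DIR to LIST without ever
# exposing a half-written list to readers.
snapshot() {
    dir=$1
    list=$2
    tmp="$list.$$"                 # temporary file beside the target
    find "$dir" -print > "$tmp"
    # Replace the old list only if the scan produced some output.
    if [ -s "$tmp" ]; then
        mv "$tmp" "$list"          # rename is atomic within one filesystem
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demonstration on a scratch tree with made-up file names.
DEMO=$(mktemp -d)
touch "$DEMO/route.gz" "$DEMO/notes.txt"
OUT=$(mktemp)
snapshot "$DEMO" "$OUT"
COUNT=$(grep -c . "$OUT")          # number of paths in the new list
echo "$COUNT"

rm -rf "$DEMO" "$OUT"
```

The temporary file is created next to the final list so that both sit on the same filesystem, which is what makes the rename atomic. The nightly cron job would then run a script built around this function instead of redirecting find directly into the list.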
Man pages:
find, grep, cron, crontab
© Copyright Macmillan USA. All rights reserved.