1. Shell Basics

  1. What is a Shell?
  2. Shell Types
  3. Grouping Commands
  4. I/O
  5. Regular Expressions
  6. Quoting

1.1 What is a Shell?

Simply stated, a shell provides an interface to the operating system facilities including files, printing, hardware devices, and applications.  This is a very broad definition indeed.  It applies from something as simple as a telnet interface to a complex windowing system such as the Common Desktop Environment (CDE).  It also transcends operating system brands.  For example, a ksh session on an HP UNIX workstation is as much a shell as a DOS window is to a Microsoft Windows 95 PC or a DEC windows session is to a VAX/VMS desktop server.  In each case, there exist methods for reading files, queuing print jobs, or launching programs.  On a Windows 95 PC, a user might read a file by invoking the Notepad program from the Start Menu.  An operator could use the PRINT command in VMS to view the current jobs queued for a line printer.  A programmer might open a terminal window from the CDE control panel in order to begin compiling and linking a program.  In each case, the interface gives the user some way to command the operating system.

Another way to view a shell is to consider it in the context of a metaphor using its own name.  Consider an egg shell.  An egg shell is a hard, protective wrapper that encases a growing embryo.  Needless to say, UNIX is hardly like a developing fetus, but the wrapper part fits.  Figure 1.1-1 depicts the relationship.  The UNIX shell can be treated as an operating system wrapper.  It encapsulates the devices, applications, files, and communications that in turn access the kernel to perform all of the functions expected of computers.  Once again, this is a simplistic explanation of how a user interacts with a computing system, but it should help demystify the concept of a shell.

Figure 1.1-1. A Shell as an Operating System Wrapper


1.2 Shell Types

Naturally, this book deals with a very specific shell, namely, the UNIX Bourne shell.  There are a number of similar shells worth mentioning, but first, it is important to note that each of these shells shares the following qualities.  The traditional UNIX shells are command line driven.  A user enters textual input consisting of operating system commands with options and switches.  The computer in turn provides textual output according to the command given, followed by a prompt for the next command.  The richness of an application's options combined with the built-in interpreted programming language of the shell allows a user to easily command the operating system to perform very specific functions.  In a graphical environment, this same flexibility is lost whenever a program forces a user to work through check boxes, radio buttons, or selection lists.  It is much more expedient to include all the options on one command line than it is to point-and-click through a pop-up window.  On the other hand, this flexibility sacrifices the ease of use that GUIs provide.  A GUI can very easily force a user to review all necessary options before running a program; a command line driven system requires the user to research the command thoroughly before issuing it.  The syntax may be restrictive, but the fact that these same commands can be easily stored in an executable file and replayed at a later date once again points to the shell's customizability.  Moreover, the branching and looping facilities of the shell allow it to make programmed choices and to repeat commands as necessary.  This is an invaluable quality lost in the point-and-click world of a windowing system.
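
As a brief preview of this replay capability, consider a hypothetical script.  The name cleanup.sh and the directory /tmp/scratch are invented for the illustration, and the for and if constructs it uses are covered in later chapters.  Commands stored this way can be rerun at any time with a single invocation:

$ cat cleanup.sh
#!/bin/sh
# Remove any stray core files left in the scratch area
for f in /tmp/scratch/core*
do
    if [ -f "$f" ]; then
        rm "$f"
        echo "removed $f"
    fi
done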

As mentioned previously, there are a number of shells to choose from, but there are only two basic types.  The first is the Bourne shell.  Its syntax is different enough from traditional programming languages to require concerted study by those willing to learn it, but it does have some very familiar constructs that keep it from being an entirely foreign language.  The second type is based upon the C shell, so named because its syntax derives from the C programming language.  It is in fact similar enough syntactically to allow C programmers to quickly use the branching, looping, and assignment mechanisms.  One might easily be able to write if statements and while loops, but the environment is different enough to cause problems if not treated carefully; a user might easily mistake a string for a variable dereference.
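
To make the contrast concrete, the sketch below writes the same simple test in each syntax; the variable name answer is arbitrary, and the fragments are meant only to show the two styles side by side:

# Bourne shell
if [ "$answer" = "yes" ]; then
    echo "proceeding"
fi

# C shell
if ( "$answer" == "yes" ) then
    echo "proceeding"
endif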

From these two shell types came a number of highly interactive shells.  The Korn shell, ksh, and the Bourne Again shell, bash, use Bourne shell syntax.  In general, any command usable by the Bourne shell may be used by these others.  The C shell's cousin is tcsh.  These derivatives are more user-friendly than their parents.  They usually provide hot-keys for conveniences like file name completion, directory searching, in-line editing, and an extended environment.  In some cases, such as the Korn shell, they also allow complex data structures such as arrays.  These shells are definitely preferable for daily use.  It is much nicer to hit the tab key and have the shell complete a directory name than it is to have to type the whole thing, especially when the exact name is unclear.  And the beautiful part is that when something like a loop is needed to repeat a command, the facilities are right there.  Plus, there is often a history buffer that caches recently issued commands so the user does not have to reenter or restructure them.  But it is the basics that are explained herein.

This book presents the syntax and use of the Bourne shell.  It does so for a very good reason; namely, it is the most widely available of any shell.  It is the first shell that came with UNIX.  Consequently, it has survived and is distributed with every flavor of UNIX produced today.  It can be found in every UNIX as /bin/sh.  Because of its availability, any program written that uses it is nearly guaranteed to run on any other UNIX.  Of course portability carries only as far as the actual operating system commands allow, but at least the basic programming tools and syntax of the Bourne shell are consistent.  From this point forward, any mention of the shell refers to the Bourne shell unless explicitly stated otherwise.
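
One practical consequence of this availability is the familiar first line of most scripts.  Naming the interpreter on line one tells the system to run the file with /bin/sh when the script is executed directly, so it behaves consistently on any UNIX that ships the Bourne shell there.  The script name greet.sh below is simply an illustration; chmod marks the file executable:

$ cat greet.sh
#!/bin/sh
echo "Hello from the Bourne shell"
$ chmod +x greet.sh
$ ./greet.sh
Hello from the Bourne shell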

Before diving right into the usual programming bag o' tricks such as variables, functions, if statements, loops, and other fun stuff, this chapter concludes with some very important syntactic features.  Without a thorough understanding of them, it is extremely difficult to read and write shell scripts.  For those unfamiliar with the shell or UNIX in general, return to this chapter occasionally and reconsider these features in the context of commands given at the prompt or within other shell scripts encountered.


1.3 Grouping Commands

Commands passed to the shell may be grouped in five ways with a semicolon (;), parentheses (()), curly braces ({}), double ampersands (&&), or double vertical bars (||). The first three may be simply viewed as punctuation for combining multiple commands on a single line. Of course there are subtle differences between them that will be discussed shortly, but in general, they are to the shell as periods are to prose. The last two, the double ampersand and vertical bars, are conditional conjunctives.

The semicolon allows users to string commands together on a single line as if one command was being issued. Each command within the semicolon separated list is executed by the shell in succession. For example, the line below has two commands joined with a semicolon:

$ cd /var/yp; make
When entered, the shell parses the command line up to the first semicolon. The words read to that point are executed. In the example, the current directory is changed to the /var/yp directory. The shell then continues parsing the command line until it reaches the end of the line. The rest of the command is then executed, and as every good UNIX administrator knows, the NIS maps are built as a result of the make that is issued. But this example is not a lesson in system administration. It is a demonstration of how the shell executes each semicolon separated command in the order it is listed after the prompt as if the user had typed the same set of commands on separate lines.

The parentheses, when grouped around a command, cause the command to be executed in a subshell. Any environmental changes occurring within the subshell do not persist beyond its execution; in other words, changes made to the working environment by the commands issued within the parentheses do not affect the working shell. Consider the following example:

$ cd; ls
bin etc include man tmp
$ cd bin; ls; ls
bash gunzip gzip zcat
bash gunzip gzip zcat
$ cd
$ (cd bin; ls); ls
bash gunzip gzip zcat
bin etc include man tmp
The first command ensures that the home directory is the current working directory and then lists its contents. This is followed by changing directories into bin and then listing its contents twice with the ls command. The next command then returns the shell to the home directory. The last command shows how parentheses behave. Two commands are issued within the scope of the working shell. The first is a compound command delimited by the parentheses. The compound command executes in a subshell. Within this subshell, the current directory is once again set to bin, and its contents are listed. As soon as the subshell terminates, the parent shell resumes control. It lists the contents of the working directory, and as is demonstrated by the resulting output, the shell's current working directory has not been modified by the subshell's directory change.

Like parentheses, curly braces can be used to group a set of commands; however, the commands execute within the context of the current shell. Continuing with the previous example, suppose the last command is reissued with curly braces instead of parentheses.

$ { cd bin; ls; }; ls
bash gunzip gzip zcat
bash gunzip gzip zcat
Notice that this time the bin directory is listed twice. Obviously, the directory change persisted. The shell executed the compound command itself instead of deferring it to a subshell as it did when parentheses were used. Note also the construction of commands with curly braces. White space precedes and follows the enclosed commands, and a semicolon terminates the last command. The terminating semicolon is required. Certainly, the use of curly braces can be unwieldy:
$ { cd bin; ls }; ls
> ^C
$ { cd bin; ls }; ls }
> ^C
$ { cd bin; ls }; ls; }
}: No such file or directory
bash gunzip gzip zcat
The first command is entered the same way it would be issued if parentheses were used. When entered, the shell responds with the secondary prompt meaning that it expects more input. The job is killed with a control-c, and the command is reentered with a different format. Here, an extra closing brace is appended to the command. The user assumes that the shell was expecting a terminating brace at the end of the line. Unfortunately, the results are the same. After killing the second attempt, the user tries a third time and appends a semicolon to the final ls command. As can be seen, the command becomes syntactically correct, but the results are probably not what the user desires. Focusing on the semicolons reveals what happened. The shell changes directories to bin. Then it tries to list the } directory. The shell interprets the closing brace to be an argument to the ls command. Since no such file exists, it complains. It finishes by listing the current directory.

Hopefully, from this example, the point is clear. Curly braces are, frankly, difficult to use for grouping commands, and when curly braces are also used for accessing variable values, code becomes very difficult to follow. It is generally good practice not to use curly braces for command grouping; reserve them for variables instead.
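
The variable form of curly braces, covered in later chapters, simply marks where a variable name ends so that adjoining text is not read as part of the name. In the hypothetical sketch below, the variable LOG and the suffix file are invented for the illustration; without the braces, the shell would look for a variable named LOGfile, which does not exist:

$ LOG=backup
$ echo ${LOG}file
backupfile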

The last two conjunctions, && and ||, combine commands conditionally. When the shell encounters either of these, it always executes the first half of the command. Then depending upon its results, the second half may be run. In the case of the double ampersand, the shell handles the second part if and only if the first part exited with a status of zero. This indicates that the first half executed with no errors.

$ cd /usr/bogus && ls
/usr/bogus: does not exist
$ cd /var/spool && ls
calendar cron locks lp mqueue pkg
Since /usr/bogus is not a valid directory, the cd command informs the user and returns with a non-zero status. The shell reads the status and, seeing it is non-zero, does not execute a directory listing. On the other hand, /var/spool is a real directory. The cd command, in this case, performs the change and returns with a zero exit status. Again, the shell reads the status, and because it is zero, the shell proceeds with the following command, the ls.
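
A common idiom built on this behavior is to chain steps that only make sense if the previous step succeeded. In the hypothetical line below, the directory name and log files are invented; the copy runs only if the directory is created successfully:

$ mkdir /tmp/reports && cp *.log /tmp/reports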

The double vertical bars function the same except that the second half executes if and only if the first half exits with a non-zero status.

$ ls -l
total 4
drwxr-xr-x 2 rsayle fvsw 512 Jul 15 14:51 bin
drwxr-xr-x 2 rsayle fvsw 512 Jul 15 14:51 man
-rw-r--r-- 1 rsayle fvsw   0 Jul 15 14:51 profile
$ (ls -l | grep "root") || echo "No files owned by root"
No files owned by root
The first listing above shows that all files in the current directory are owned by rsayle. Hence, the grep for root fails. A non-zero exit code is returned by the subshell that executed the ls and the grep. The parent shell, having encountered ||, acknowledges the result and prints the informational message accordingly.

1.4 I/O

The UNIX Bourne shell provides two basic methods for manipulating the input and output of commands. The first method employs pipes. Pipes chain commands together so that each command sends its output to the input of the command that follows. The second method is known as redirection, and it provides the ability to pull input from, or place output into, sources other than the shell window. But before introducing these mechanisms, a general understanding of the standard I/O facilities must be reached.

The input and output displayed on the console comes from three buffers: stdin, stdout, and stderr (pronounced "standard in," "standard out," and "standard error," respectively). These devices are collectively referred to as the standard I/O. Stdin is the terminal's default input. The shell reads commands and answers to prompts from stdin. The default output is stdout. The console prints anything it receives from stdout. In general, this includes print statements from the programs run at the command line and warning messages broadcast by system administrators. Like stdout, the shell also displays the error messages received from stderr on the console, but it is imperative to keep in mind that stderr is not the same as stdout. A good way to remember this is to think of stdin, stdout, and stderr as three different files. In fact, this is exactly what they are. Stdin is file descriptor zero; stdout, one; and stderr, two. As such, each may be referenced by its file descriptor number. Section 8.1, Advanced Shell Topics: More I/O, shows how to manipulate the standard I/O using the file descriptor numbers.

A pipe, represented by a single vertical bar (|), connects the output of its left operand to the input of its right operand. The operands of pipes may be any legal shell command. The goal of piping is to pass output through a series of commands that massage the data into a more meaningful representation. Normally, the shell parses stdin and passes the input to the command given. The command operates on the input and, in turn, presents the output to stdout. The shell handles pipes as shown in the figure below. Here, the shell begins as it always does: it parses stdin, passing arguments to the command issued. The program runs and provides output. Seeing a pipe, the shell takes the resulting output and passes it as the input to the next command. The next command works on the new input and creates a new output, which is passed on to the following command in the pipe chain. This continues until the shell reaches the final command. The last command receives the massaged data from the previous command, runs, and then places the result into stdout. The shell then displays stdout to the user.

Figure 1.4-1. A Graphical Depiction of How Pipes Combine Multiple Commands

The most common use of pipes is for filtering. Typically when filtering, a program prints text to stdout which is piped into a utility such as grep, sed, or awk. The utility selects data from the input according to a pattern specified as an argument. The result is usually posted to the screen for visual inspection, but it is possible to save the results in a file, store it as the value of a variable, or repipe it into another command.

$ ypcat passwd | grep "^[a-zA-Z0-9]*::" | cut -f1 -d:
jbeadle
build35
cwoodrum
dborwin
jbell
dfellbaum
The example above shows how grep and pipes can be used as a filter. The command is actually three programs separated by pipes. The first command, ypcat passwd, displays the network information services (NIS) map for the password database. The shell, as directed by the pipe, sends the map to grep. Grep takes the map and searches for all accounts that do not have a password; this is the filtering operation. Since the goal is to list the accounts, some further processing must be completed so the output is passed to cut. Cut collects the data and trims everything from each line except the account name. The final output is displayed as shown. The figure below depicts the example graphically.

Figure 1.4-2. A Graphical Depiction of a Pipe Chain

Whereas piping permits the chaining of commands such that the output of one serves as the input of another, redirection can be used to take the input for a program from a file or to place the output of a program into a file. Not surprisingly, these two techniques are known as input redirection and output redirection. Input redirection takes the form: command [input redirect symbol] input. It is equivalent to saying, "Execute a command using the data that follows." Redirecting the output says, "Execute a command and place the printed results into this file," and it has the form: command [output redirect symbol] destination.

$ cat < /etc/defaultdomain
fvo.arinc.com
$ ypcat hosts > hosts.bak
$ cat hosts.bak
127.0.0.1 localhost
144.243.92.13 blatz
The previous commands demonstrate input and output redirection. The first cat command reads the file /etc/defaultdomain, but instead of being named as an argument, the file is supplied to cat via input redirection, the less-than sign (<). Output redirection is shown with ypcat. The ypcat command, when applied to the NIS hosts map, lists the IP address and name of each node in the NIS domain. Here, the output is redirected into the file hosts.bak.

Two types of input redirection exist. The first uses a single less-than sign (<). It redirects standard input so that it is taken from the file specified. For example, if an administrator wanted to know how many accounts were on a particular machine, input redirection of the /etc/passwd file into wc could be used.

$ wc -l < /etc/passwd
      12
The second type employs a double less-than sign (<<). Instead of taking the input from a file, the input is taken from the command line between two delimiters.
$ wall << WARNING
> "itchy will reboot at 6PM this evening"
> WARNING
Broadcast Message from rsayle (pts/0) on itchy Wed Jul 24 14:06:57...
"itchy will reboot at 6PM this evening"
The delimiter may be any word; in this case, the word WARNING is used. The user here issues the warn-all-users command, wall, and instructs the shell to use the lines that fall between the two WARNING delimiters as the text of the message.
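
The same construction is handy inside scripts for feeding several lines of fixed text to a command at once. In the sketch below, the file name notes.txt and its contents are invented for the illustration; cat reads the lines between the EOF delimiters, and output redirection places them in the file:

$ cat > notes.txt << EOF
> Backups run nightly at 2AM.
> Contact the operator before rebooting.
> EOF
$ cat notes.txt
Backups run nightly at 2AM.
Contact the operator before rebooting.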

Similarly, there are two types of output redirection. The symbols used are simply the inverses of the input redirection symbols: greater-than signs are used instead of less-than signs. And of course, the semantics are nearly opposite. With a single greater-than sign, stdout is redirected to a file. The file is created if it does not exist, or it is overwritten if a file by that name already exists. Below, the shell copies the file combo to the file combo.bak using output redirection. The output from the second cat command is redirected into the new file combo.bak.

$ cat combo
10 23 5
$ cat combo > combo.bak
$ cat combo.bak
10 23 5
A double greater-than sign has the same effect as the single one except that, if the file already exists, the output is appended to the file instead of overwriting it.
$ cat combo >> combo.bak
$ cat combo.bak
10 23 5
10 23 5
Continuing with the previous example, if the user instructs the shell to once again display the contents of the file combo but uses the double greater-than sign, then combo.bak ends up with two lines, each being the single line from combo.

As a final I/O basics topic, it is helpful to mention that there are a few generic devices available for redirection besides the standard I/O. Some flavors of UNIX provide devices for the terminal, /dev/tty, the console, /dev/console, and the great bit bucket in the sky, /dev/null. All output sent to /dev/null is simply discarded. It is useful in shell scripts when output should be hidden from the user but should not be stored. There is a subtle difference between the console and the terminal. In a windowed environment, a tty terminal is associated with each shell session; in other words, each command line window such as an xterm is a separate terminal. Redirection to /dev/tty sends output to the active window. The console, on the other hand, is the screen. It is the monitor in general. Most windowing programs provide a special switch to the shell windows that will link /dev/console to the window instead of writing output directly on the screen. In any event, output can be redirected to these devices quite easily.

$ cat /etc/motd >>/dev/tty
Sun Microsystems Inc.   SunOS 5.4       Generic July 1994
$ cat /etc/motd >>/dev/null
$
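
Because stderr is a separate file descriptor, error messages can be steered independently of normal output. As a small preview of the descriptor-number technique covered in Section 8.1, the command below reuses the nonexistent /usr/bogus directory from earlier; the complaint about it goes to stderr, which is discarded, while the listing of /etc/passwd still appears:

$ ls /etc/passwd /usr/bogus 2>/dev/null
/etc/passwd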

1.5 Regular Expressions

Regular expressions are one of the keys to UNIX. To master them is to open the door to a new universe of commands. A user who has a firm grasp of regular expressions can perform multiple tasks with one command, filter text for processing, and edit multiple files nearly instantaneously. And these are but a few examples of the power regular expressions provide.

A regular expression is a sequence of characters specifying a textual pattern. Regular expressions instruct programs to process only those items that match the pattern. Most regular expressions use metacharacters to express repetition, existence, or ranges of character patterns. It can be very difficult at first to decipher a regular expression that uses metacharacters. Nevertheless, it is extremely important to practice creating and reading regular expressions. It is truly the difference between writing robust scripts and simply browsing around through UNIX.

^$
[a-z][0-1][a-z0-1]*
^<tab>[A-Z][a-zA-Z0-9 ]*$
10\{3,6\}
At first glance, to the novice user, the examples above look like gibberish, but a patient and disciplined user can understand what each pattern means. In the first line, the caret represents the beginning of a line and the dollar sign the end of a line. Together, with nothing in between, they form a pattern describing an empty line. The second example shows how character classes can be used in regular expressions. The pattern represents a line that contains a word beginning with a lower case letter, followed by a zero or a one, and ending with any number of lower case letters, zeros, or ones. The square brackets denote character classes, and the asterisk is a wildcard matching any number of occurrences, including zero, of the preceding pattern. The third example is even more restrictive than the second. It matches patterns beginning with a tab (the tab normally is just typed; the <tab> is shown here for illustrative purposes), followed by an upper case letter, followed by any number of lower case letters, upper case letters, digits, or spaces, and terminated by the end of the line. It is certainly a complex example, but it shows how, by examining the pattern piece by piece, the meaning becomes clear. The last expression matches a one followed by three to six zeros, that is, the numbers 1000, 10000, 100000, and 1000000; the \{3,6\} construct specifies a repetition range for the character that immediately precedes it, the zero.
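
A quick way to experiment with such patterns is to hand them to grep, which prints only the lines that match. The file name report.txt and the count shown below are invented for the illustration; the command counts the empty lines in the file by combining the first pattern above with wc:

$ grep "^$" report.txt | wc -l
       4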

A list of the metacharacters and their usage is given in the table below. The five metacharacters at the end are part of an extended set. They only have meaning in certain programs.

Table 1.5-1. Metacharacters 
Metacharacter  Usage  Used by
.
Matches any single character except newline.  All utilities
*
Matches zero or more occurrences of the single character that immediately precedes it. The character may be specified by a regular expression.  All utilities
[]
Matches any one of the class of characters enclosed between the brackets. If a caret (^) is the first character inside the brackets, then the match is reversed. A hyphen is used to specify a range. In order to match a close bracket, it must be the first character in the list. Other metacharacters lose their special meaning when enclosed within brackets so that they may be matched literally.  All utilities
^
As the first character of a regular expression, matches the beginning of a line.  All utilities
$
As the last character of a regular expression, matches the end of a line.  All utilities
\
Escapes the metacharacter which immediately follows.  All utilities
\{m,n\}
Matches a range of occurrences of the single character that immediately precedes the expression. \{m\} matches exactly m repetitions; \{m,\} matches at least m occurrences; and \{m,n\} matches any number of repetitions between m and n, inclusive.  sed, grep, egrep, awk
+
Matches one or more of the preceding regular expression.  egrep, awk
?
Matches zero or one instance of the preceding regular expression.  egrep, awk
|
Matches either the preceding or following regular expression.  egrep, awk
()
Groups regular expressions.  egrep, awk
There are a number of utilities available that use regular expressions. In particular, it is useful to study the one built into the Bourne shell: filename expansion, often called globbing. Filename expansion assists in locating files or lists of files. It uses its own set of metacharacters: *, which matches zero or more characters; ?, which matches any single character; [], which matches any single character listed (a leading ! reverses the match); and \, which removes the meaning of any special character, including and especially the end of a line. * is typically called the wildcard, and \ is referred to as the escape. Filename expansion works by causing the shell to expand the pattern into all file names that match the expression.
$ ls as*.*
ascendmax.alert    ascendmax.emerg    ascendmax.info     ascendmax.warning
ascendmax.crit     ascendmax.err      ascendmax.notice
$ cd ..
$ ls log/as*.*
log/ascendmax.alert    log/ascendmax.err      log/ascendmax.warning
log/ascendmax.crit     log/ascendmax.info
log/ascendmax.emerg    log/ascendmax.notice
As part of its parsing routine, the shell expands filename patterns before executing the command listed. The expansion acts as if the user had typed the list of matching files at the command line. In the preceding example, the shell expands the pattern as*.* into all files in the current directory beginning with as and having a period somewhere in the middle. The result is substituted for the pattern, and the command is issued to list the files. The second part shows how the shell can be directed to traverse directory trees and perform expansion on subdirectories.
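
Continuing the same example, handing the pattern to echo makes the expansion itself visible, since echo merely prints whatever arguments it receives after the shell has substituted the matching names:

$ echo log/as*.*
log/ascendmax.alert log/ascendmax.crit log/ascendmax.emerg log/ascendmax.err log/ascendmax.info log/ascendmax.notice log/ascendmax.warning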

Some of the other common programs that use regular expressions are given in the table below.

Table 1.5-2. Programs that Use Regular Expressions 
Utility  Purpose  Reference for Further Reading
grep
Prints each line from its input that matches the pattern specified.  man
sed
Permits line by line editing of its input according to the script of regular expressions and commands given.  man, Sed & Awk (O'Reilly & Associates)
awk
Programming language for extracting and formatting input according to textual patterns. Works best when the input already has a particular structure.  man, Sed & Awk (O'Reilly & Associates)
cut
Extracts text from the input given a list of field numbers and their field separator.  man
uniq
Filters out repeated lines from its input so that only unique lines remain in the output.  man
sort
Sorts the lines of input given.  man
The most common forms of input to these utilities are files or pipes:
$ grep "defs.h" *.cpp | cut -f1 -d: | uniq
bench.cpp
bitvec.cpp
bstream.cpp
gimp.cpp
idendict.cpp
match.cpp
regexp.cpp
rwbag.cpp
rwbagit.cpp
rwfile.cpp
rwtimeio.cpp
strngcv.cpp
toolmisc.cpp
utility.cpp
wcsutil.cpp
wstring.cpp
This command uses files and pipes. The user wants to list all C++ source files that include the header file defs.h. The command begins by searching for the string defs.h in all files ending with the extension .cpp. The files found are passed to cut. Normally, grep prints the file name and every line in a file that contains the pattern. Here, cut is used to trim the lines down to the file name only. The result is in turn passed to uniq which filters out duplicated lines.
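
The chain can be extended with further filters as needed. For instance, appending wc -l to the same command counts the matching source files instead of naming them:

$ grep "defs.h" *.cpp | cut -f1 -d: | uniq | wc -l
      16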

1.6 Quoting

Just as regular expressions separate UNIX wizards from users, the use of quoting differentiates shell scripters from shell hackers. Quoting is simple in form and powerful in semantics. With quoting, users can make metacharacters act like regular text or nest commands within others. A long command can easily be separated onto multiple lines by using quotes. But with this powerful tool comes a price. Proper usage of quoting is at best subtle. It is often a source of program errors that are difficult to debug. This should not deter scripters from using quotes. Instead, users must simply realize that quoting takes practice.

Bourne shell quotes come in four forms:

1. Double Quotes ""
Removes the special meaning of all enclosed characters except for $, `, and \.
2. Single Quotes ''
Removes the special meaning of all enclosed characters.
3. Back Quotes ``
Command substitution; instructs the shell to execute the command given between the quotes and substitute the resultant output.
4. Escape \
Removes the special meaning of the character that follows. Inside double quotes, \ acts as an escape only before $, `, ", newline, and another backslash.
The primary use of double and single quotes is to escape the meaning of metacharacters within blocks of text:
$ echo *
bin man profile
$ echo "*"
*
$ echo $USER
rsayle
$ echo "$USER"
rsayle
$ echo '$USER'
$USER
The first two commands show the difference quoting makes. Special characters are interpreted by the shell when they are not quoted, as in the case where the echo command displays the contents of the current directory: bin, man, and profile. The same can be said of the third command, in which the value of the environment variable USER is echoed. Moreover, it is important to remember that there are a few special characters that are still interpreted even within double quotes. Single quotes must be used to perform the escape, as shown by the last command.

Double quotes and single quotes are especially useful for patterns containing white space. Suppose, for example, an administrator knows a person's name but can't remember the user's account name. Filtering the passwd database through grep might be a good way to figure out the account name.

$ ypcat passwd | grep Robert Sayle
grep: can't open Sayle
Grep balked on the user's name! Well, not exactly. Due to its syntax, grep tried to find the string "Robert" in the file "Sayle." The trick is to instruct the shell to ignore the white space between the first and last name.
$ ypcat passwd | grep "Robert Sayle"
rsayle:hOWajBzikAWRo:328:208:Robert Sayle:/home/rsayle:/bin/tcsh
The double quotes tell the shell interpreter to combine all the text contained into one argument, "Robert Sayle." Grep receives the argument and uses it as the key during its search through the output of the ypcat.

To search for quotes within text, single quotes escape double quotes and vice-versa. The following examples illustrate this. The first and third show how one cannot simply use a quotation mark in pattern matching. The shell assumes that more text follows the input. The other examples show how to do it correctly.

$ grep " sloc
> ^C
$ grep '"' sloc
# a given set of files; finds each ";" which denotes an
 echo "sloc: usage: sloc file1 [file2 ...]"
 LINES=`grep ";" ${1} | wc -l`
echo "${RESULT} lines of code"
$ grep ' tgzit
> ^C
$ grep "'" tgzit
# tgzit -- tar's and gzip's the filenames passed
Although these are simple examples, not enough can be said about the difference between single and double quotes. Double quotes function perfectly in most cases, but there are times when a programmer should use one over another. The realm of possibilities will be explored in succeeding subsections as new topics are presented. For now, a good rule of thumb is to use double quotes in order to group words into a single argument, to allow variable substitution as in "$USER", or to allow command substitution. Single quotes should be used when no substitution should occur. With that said, the discussion of quotes in general continues.
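
One illustration of the rule of thumb is worth giving now, ahead of the fuller discussion of back quotes below. Double quotes let the enclosed command substitution take place, while single quotes suppress it entirely; the date and time shown are, of course, only illustrative:

$ echo "Today is `date`"
Today is Wed Jul 24 14:06:57 PDT 1996
$ echo 'Today is `date`'
Today is `date`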

Back quotes are arguably the most powerful form of quoting since they permit a level of nesting called command substitution. In other words, they allow commands to be executed within other commands. When the shell scans a command line and finds an embedded command between back quotes, it runs the nested command. The output of the nested command is then substituted as part of the enclosing command which is subsequently executed. As an example, suppose a programming team places their source files in a common repository.

$ ls -l *.h
-r--r--r--   1 rsayle   fvsw        5373 Aug  3  1995 adminlink.h
-r--r--r--   1 rsayle   fvsw        5623 Aug  7  1995 agentadmin.h
-r--r--r--   1 rsayle   fvsw        4930 Aug 10  1995 agentadminexception.h
-rw-r--r--   1 lshar    fvsw       20264 Aug 14  1995 attribute.h
-rw-r--r--   1 lshar    fvsw        3346 Aug 14  1995 attributeexception.h
-rw-r--r--   1 lshar    fvsw        6819 Aug 14  1995 attributehelper.h
-rw-r--r--   1 lshar    fvsw        3424 Aug 14  1995 attributehelperexception.h
-rw-r--r--   1 lshar    fvsw        7446 Aug 14  1995 attributereference.h
-rw-r--r--   1 lshar    fvsw        3394 Aug 14  1995 attributerefexception.h
-rw-r--r--   1 rsayle   fvsw       35012 Aug 16  1995 attributevalue.h
-r--r--r--   1 rsayle   fvsw        4959 Jul 25  1995 avdictionary.h
-rw-r--r--   1 lshar    fvsw        4851 Aug 17  1995 avexception.h
-rw-r--r--   1 rsayle   fvsw        5024 Jul 25  1995 avtype.h
-rw-r--r--   1 lchang   fvsw        9106 Jul 19  1995 computeropstatemsg.h
-rw-r--r--   1 346      fvsw        8627 Jul 25  1995 elevation.h
-rw-r--r--   1 346      fvsw        9454 Jul 25  1995 latitude.h
-rw-r--r--   1 rsayle   fvsw        5025 Aug  3  1995 linkagentadmin.h
-rw-r--r--   1 lshar    fvsw        6260 Jun 19  1995 linkfactory.h
-rw-r--r--   1 rsayle   fvsw        4871 Jul 26  1995 linkmibloader.h
-rw-r--r--   1 346      fvsw        9512 Jul 25  1995 longitude.h
-rw-r--r--   1 lshar    fvsw       17087 Aug 14  1995 managedobject.h
-rw-r--r--   1 lshar    fvsw       14056 Aug 14  1995 mib.h
-rw-r--r--   1 lshar    fvsw        3268 Aug 14  1995 mibexception.h
-r--r--r--   1 rsayle   fvsw        5263 Aug  2  1995 mibloader.h
-r--r--r--   1 rsayle   fvsw        4910 Aug 10  1995 mibloaderexception.h
-rw-r--r--   1 lshar    fvsw        3255 Aug 14  1995 moexception.h
-rw-r--r--   1 lshar    fvsw        8101 Aug 23  1995 mofactory.h
-rw-r--r--   1 lshar    fvsw        3346 Aug 14  1995 mofactoryexception.h
-rw-r--r--   1 lshar    fvsw        6134 Aug 14  1995 mofactoryhelper.h
-rw-r--r--   1 lshar    fvsw        3424 Aug 14  1995 mofactoryhelperexception.h
-rw-r--r--   1 lchang   fvsw        5008 Aug  7  1995 msgsvccbhandler.h
-rw-r--r--   1 lchang   fvsw        3232 Aug  7  1995 msgsvccbhdlrexcep.h
-rw-r--r--   1 lchang   fvsw        7365 Aug  7  1995 msgsvchandler.h
-rw-r--r--   1 lchang   fvsw        3215 Aug  7  1995 msgsvchdlrexcep.h
-rw-r--r--   1 rsayle   fvsw        4823 Aug 10  1995 newavexception.h
-r--r--r--   1 rsayle   fvsw        3815 Aug 10  1995 nmmsgids.h
-rw-r--r--   1 346      fvsw        9342 Jul 25  1995 operationalstate.h
-rw-r--r--   1 rsayle   fvsw       10085 Aug  7  1995 opstatemsg.h
-r--r--r--   1 rsayle   fvsw        4790 Aug 10  1995 ovsexception.h
-rw-r--r--   1 lshar    fvsw        4174 Aug 14  1995 parserhelper.h
-rw-rw-rw-   1 lchang   fvsw        3207 Aug 17  1995 regidexception.h
-rw-rw-rw-   1 lchang   fvsw        9503 Aug 17  1995 regidserver.h
-rw-r--r--   1 346      fvsw       10514 Jul 25  1995 rollablecounter.h
-r--r--r--   1 tbass    fvsw        2423 Jul 12  1995 servicedefs.h
-r--r--r--   1 rsayle   fvsw        2785 Jul 26  1995 startup.h
-r--r--r--   1 rsayle   fvsw        5210 Aug  7  1995 svagentadmin.h
-rw-r--r--   1 lshar    fvsw        7909 Aug 17  1995 svfactory.h
-r--r--r--   1 rsayle   fvsw        5544 Aug  2  1995 svmibloader.h
-r--r--r--   1 rsayle   fvsw       10938 Aug 24  1995 svovshandler.h
The ls command output can be simply piped to grep with the account name, but in order to generalize the script, command substitution could be used.
$ whoami                   
rsayle
$ ls -l *.h | grep `whoami`
-r--r--r--   1 rsayle   fvsw        5373 Aug  3  1995 adminlink.h
-r--r--r--   1 rsayle   fvsw        5623 Aug  7  1995 agentadmin.h
-r--r--r--   1 rsayle   fvsw        4930 Aug 10  1995 agentadminexception.h
-rw-r--r--   1 rsayle   fvsw       35012 Aug 16  1995 attributevalue.h
-r--r--r--   1 rsayle   fvsw        4959 Jul 25  1995 avdictionary.h
-rw-r--r--   1 rsayle   fvsw        5024 Jul 25  1995 avtype.h
-rw-r--r--   1 rsayle   fvsw        5025 Aug  3  1995 linkagentadmin.h
-rw-r--r--   1 rsayle   fvsw        4871 Jul 26  1995 linkmibloader.h
-r--r--r--   1 rsayle   fvsw        5263 Aug  2  1995 mibloader.h
-r--r--r--   1 rsayle   fvsw        4910 Aug 10  1995 mibloaderexception.h
-rw-r--r--   1 rsayle   fvsw        4823 Aug 10  1995 newavexception.h
-r--r--r--   1 rsayle   fvsw        3815 Aug 10  1995 nmmsgids.h
-rw-r--r--   1 rsayle   fvsw       10085 Aug  7  1995 opstatemsg.h
-r--r--r--   1 rsayle   fvsw        4790 Aug 10  1995 ovsexception.h
-r--r--r--   1 rsayle   fvsw        2785 Jul 26  1995 startup.h
-r--r--r--   1 rsayle   fvsw        5210 Aug  7  1995 svagentadmin.h
-r--r--r--   1 rsayle   fvsw        5544 Aug  2  1995 svmibloader.h
-r--r--r--   1 rsayle   fvsw       10938 Aug 24  1995 svovshandler.h
The whoami command returns the current account name. When used in the script, the shell executes whoami and substitutes the result for the argument to grep. Hence, anyone using the script would get results specific to their account name.
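
Command substitution also pairs naturally with variables, which later chapters cover in detail. In the sketch below, the result of hostname is captured in a variable named HOST (an invented name), and the machine name itchy is borrowed from the earlier wall example; the actual output depends on the machine running the commands:

$ HOST=`hostname`
$ echo "Compiled on $HOST"
Compiled on itchy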

Finally, the last quoting mechanism is the escape represented by a backslash character, \. It removes the special meaning of the character immediately following. The most common use is for line continuation:

$ /usr/bin/dump 0usbdf 6000 126 54000 /dev/nrst0 \
> /dev/sd0h >> backup.log 2>&1
The long backup command above could not fit within one line so an escape was entered before the end of the line. It escaped the newline causing the shell to wait for more input before parsing the command.

Actually, a backslash escapes the special meaning of any character immediately following it. Returning to the first example in which single and double quotes were used to escape metacharacters, a back slash can provide some similar and some not so familiar results.

$ echo *; echo "*"; echo \*
bin man profile
*
*
$ echo $USER; echo "$USER"; echo '$USER'; echo \$USER; echo \$$USER
rsayle
rsayle
$USER
$USER
$rsayle
$ echo hello; echo "hello"; echo hell\o
hello
hello
hello
As can be seen, the escape functions most like single quotes, but it is imperative to keep in mind that it does so for exactly one character following the back slash. The last two echo permutations of the USER variable demonstrate this fact. As an aside, the last command string shows how escapes can precede characters that have no hidden meaning to the shell. There is no effect.

For the final example, double quotes and escapes are used to set a variable's value for which the directory list is too long to fit upon one line:

$ DIRS2ARCHIVE=" \
> $HOME/tmp/bin \
> $HOME/tmp/man \
> $HOME/tmp/profile \
> "
$ echo $DIRS2ARCHIVE
/home/rsayle/tmp/bin /home/rsayle/tmp/man /home/rsayle/tmp/profile