2. Introduction to Scripting

Customizing UNIX with Shell Scripts
A Simple Example

2.1 Customizing UNIX with Shell Scripts

Why the heck would anyone want to write shell scripts? After all, a script is just a bunch of commands that could be issued from the shell prompt anyway. Well, sure. But how many times does a user have to reenter the same command before his fingers become knotted and arthritic? Wouldn't it be easier to store those commands in a file that can be executed by the computer? Certainly the CPU can execute commands faster than a human can type. Besides, isn't it smarter to modularize tasks so they can be combined and reused in an efficient manner? The fact of the matter is that shell scripts can provide this and more.

Shell scripts can help organize repeated commands. An administrator who often repeats a series of operations can record the command set into a file. The administrator can then instruct the shell to read the file and execute the commands within it. Thus, the task has been automated; one command now does the work of tens, hundreds, even thousands of others.
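As a rough sketch of that idea, the lines below record a few commands in a file and then hand the file to the shell. The file name status.cmds and the scratch directory under /tmp are made up for illustration; any names would do.

```shell
# Work in a scratch directory so the current directory is left alone.
workdir=/tmp/recorddemo.$$
mkdir "$workdir" && cd "$workdir" || exit 1

# Record a series of commands in a file (status.cmds is a hypothetical name).
cat > status.cmds <<'EOF'
echo "Current directory:"
pwd
echo "Files in it:"
ls
EOF

# Instruct the Bourne shell to read the file and execute the commands within it.
sh status.cmds
```

One command, sh status.cmds, now does the work of the four stored in the file.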

Shell scripting also helps users work with the operating system more intelligently. The techniques presented in this discussion can be applied at the command line prompt as well as in a file. They are, after all, commands themselves, although they are built into the Bourne shell. By developing a habit of scripting at the command line, users can save time and effort by applying a script to a task rather than executing it by brute force. A good example is processing a set of files with the same command. A user can type the same command with a different file name over and over, or the user can write a loop that iterates over the files and automatically generates the same command for each one.
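The loop mentioned above might look something like the following sketch. The sample files and the choice of cp as the repeated command are assumptions made for the sake of the example.

```shell
# Scratch directory with three sample files (names invented for illustration).
workdir=/tmp/loopdemo.$$
mkdir "$workdir" && cd "$workdir" || exit 1
printf 'one\ntwo\n' > a.txt
printf 'three\n' > b.txt
printf 'four\nfive\nsix\n' > c.txt

# Brute force would mean typing: cp a.txt a.txt.bak, cp b.txt b.txt.bak, ...
# The loop issues the same command once for every file that matches *.txt.
for file in *.txt
do
    cp "$file" "$file.bak"
    echo "backed up $file"
done
```

The same loop can be typed directly at a Bourne shell prompt; it does not have to live in a file.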

Furthermore, shell scripting is simpler than conventional programming. Unlike programs written in compiled languages such as C and C++, scripts are interpreted. There is no need to compile statements into machine code and link the result with libraries to form an executable. The script is the executable. Commands entered into it are immediately ready to run. On the other hand, scripts run slower than compiled programs and are often harder to debug, mainly because few debugging tools are available. Even so, shell scripts are easier to develop and modify.

Hopefully, the point is clear. Scripting is powerful, simple, and fast to develop. Anything that someone can think of entering at the command line can be automated by a shell script:

Reformat a file set so that the only white space in every file is spaces.
Translate a file into another format for printing.
Start up and shut down a system.
Extract and insert information from and into a database.
Clean up a UNIX account.
Collect and present system performance measurements.
Monitor remote workstations or accounts.
Prepend a template prologue to source code for documentation.
And all the other thinks you can think!
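To make one of these concrete, here is a minimal sketch of the first item, narrowed to replacing tabs with spaces. The file names are invented, and the use of tr with a temporary file is just one plausible approach, not the only one.

```shell
# Scratch directory with two sample files that contain tab characters.
workdir=/tmp/tabdemo.$$
mkdir "$workdir" && cd "$workdir" || exit 1
printf 'a\tb\n' > first.txt
printf '\tindented\tline\n' > second.txt

for file in *.txt
do
    # tr cannot edit a file in place, so write the result to a
    # temporary file and then move it over the original.
    tr '\t' ' ' < "$file" > "$file.tmp" && mv "$file.tmp" "$file"
done
```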

2.2 A Simple Example

At the risk of sounding cliche, a good shell script reads like a well-written story. All right, maybe not a story, but it does take a form similar to a classical essay. It has an introduction, a body, and a conclusion. The introduction prepares the script much as an essay's introduction prepares the reader for the paper's contents. A script's body presents commands to the shell just as an author presents an argument through an essay's body. And, in both cases, the conclusion wraps up loose ends. The shell script listed below illustrates this point. It is read from top to bottom, with each line being executed in sequence as it is encountered. Normally, the text is stored within a file that has execute permissions set. A user simply enters the file's name at the command line prompt, and the program is run. The script is shown here for illustration.

#!/bin/sh
# cleandir -- remove all the large, unused files from the current directory

echo "Removing all object files..."
rm *.o              # all object files have a .o extension
echo "done."
echo "Removing core dumps..."
rm core             # there's only ever one core
echo "done."
exit 0
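
For readers who want to try the script, the following sketch stores it in a file, sets execute permission, and runs it. The scratch directory and the stand-in files main.o, util.o, core, and notes.txt are assumptions added so that there is something safe to remove.

```shell
# Work in a scratch directory so no real files are deleted.
workdir=/tmp/cleandemo.$$
mkdir "$workdir" && cd "$workdir" || exit 1

# Store the script in a file named after its prologue comment.
cat > cleandir <<'EOF'
#!/bin/sh
# cleandir -- remove all the large, unused files from the current directory

echo "Removing all object files..."
rm *.o              # all object files have a .o extension
echo "done."
echo "Removing core dumps..."
rm core             # there's only ever one core
echo "done."
exit 0
EOF

# Set execute permission on the file.
chmod +x cleandir

# Create empty stand-in files for the script to remove, plus one it should keep.
touch main.o util.o core notes.txt

# Run it. The ./ prefix is needed when the current directory is not in PATH.
./cleandir
```

Afterward, only cleandir and notes.txt should remain in the scratch directory.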

Before analyzing the script, a quick note about comments is in order. Comments are text that is ignored by the interpreter. Programmers provide comments to describe their code to readers. Comments begin with a hash mark (#). They may start anywhere within a line, and the comment always extends from the hash mark to the end of the line. A comment spans one line only; multiline comments are not allowed. The script above shows the two types of commenting. The file prologue comment extends across a whole line:

# cleandir -- remove all the large, unused files from the current directory

On the other hand, the comments following the remove statements are inline. They share the same line with the command:

rm *.o              # all object files have a .o extension
...
rm core             # there's only ever one core
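
A short sketch shows both kinds of comment together, along with one wrinkle worth knowing: a hash mark inside quotes is ordinary text, not the start of a comment. The variable name label is made up for the example.

```shell
# A full-line comment: the shell ignores everything on this line.
label="Issue #42"   # inline comment; the hash mark inside the quotes is plain text
echo "$label"       # inline comment following a command
```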

The first line of the script is not an ordinary comment. Instead, it specifies which shell shall be used to process the succeeding commands. Returning to the comparison of a script to an essay, the first line introduces the script's shell. It is the introduction. The basic syntax for declaring the shell has the form: #![full path to shell program]. The exclamation point following the hash mark signals that the line is more than a comment: when the file is executed, the text that follows is taken as the path to the program that will run the script. In the example above, the Bourne shell, /bin/sh, is used.

It is not mandatory for a script to specify what shell program to use. If no such specification is given, then by default, the script is executed in a subshell of the calling shell. In other words, the script uses the same shell as the invoking shell. This is potentially dangerous. Suppose a user operates from a shell other than the Bourne shell. The tcsh shell is a good example. It has a much more user-friendly command line interface than sh, and it uses the same syntax as csh. This syntax looks like C source code and is very different from sh syntax. Suppose also that the user writes a script with Bourne shell syntax but neglects to make #!/bin/sh the first line. When the user invokes the script from the tcsh prompt, the interpreter reads the script and determines what shell to use. Since no shell is specified, it invokes a subshell of tcsh and executes the script. Chances are the program fails miserably. Section 9, Debugging Hints, contains an example that demonstrates such a failure.
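
Besides adding the #!/bin/sh line, another way to remove the ambiguity is to name the interpreter on the command line. The sketch below assumes a hypothetical script called greet that is written in Bourne shell syntax and deliberately has no shell declaration.

```shell
workdir=/tmp/shelldemo.$$
mkdir "$workdir" && cd "$workdir" || exit 1

# greet is a made-up script in Bourne shell syntax with no #! line.
cat > greet <<'EOF'
for name in world
do
    echo "hello, $name"
done
EOF

# Naming the interpreter explicitly runs the file under sh,
# regardless of which shell the user happens to be typing into.
sh greet
```

This works from any prompt, but it depends on the user remembering to type sh every time, which is why declaring the shell inside the script is the safer habit.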

A shell script's body is the list of commands following the shell declaration. It tells the program's story: what it is doing and how it is doing it. The shell processes the command body sequentially, with branching, looping, and function constructs being exceptions to the rule. Any command that can be issued from a shell prompt may be included in a script's body. The example's body is the set of echo and rm statements:

echo "Removing all object files..."
rm *.o              # all object files have a .o extension
echo "done."
echo "Removing core dumps..."
rm core             # there's only ever one core
echo "done."

This script tells the story of cleaning up a directory. It begins by using echo to inform the invoker that it will remove the object files. The next line actually performs this action by running rm on all files with a .o extension in their name. This command in turn is followed by two more echo commands. The first indicates that the previous action completed. The second informs the user what the script will do next. A second remove follows the two echo statements. It deletes any local core dump. After it completes, control passes on to the last echo, which informs the user that the core dump removal finished.

By default, a script concludes as soon as it has reached the last statement in its body. At this point, it returns control to its invoker. It also returns an exit status code. The exit code of a script is the return status of the last command run by the script. It is possible, however, to control this value by using the exit command. It takes the form exit status, where status is a non-negative integer (in practice, a value from 0 to 255). Zero usually indicates that no errors were encountered by the script. Non-zero values indicate detection of various faults defined by the script. It is up to the programmer to determine what these faults are and to assign exit codes to them. The example terminates gracefully when the body has been executed. It signals success by returning a zero status code:

...
echo "done."
exit 0
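
The invoker can inspect the returned status through the special parameter $?, which holds the exit status of the most recently executed command. The following sketch uses two throwaway one-line scripts passed to sh -c; the variable names status and last_status are invented for the example.

```shell
# The "||" runs its right-hand side only when the left-hand command
# returns a non-zero status, which makes it a compact way to save $?.

# An explicit exit command sets the script's status.
status=0
sh -c 'exit 3' || status=$?
echo "explicit exit status: $status"

# Without an exit command, the status is that of the last command run.
# false is a standard utility that always returns a non-zero status.
last_status=0
sh -c 'echo "working..."; false' || last_status=$?
echo "implicit exit status: $last_status"
```

At an interactive prompt, the simpler habit is to type echo $? immediately after running a script.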

It must also be noted that, unlike an essay, a script can provide a conclusion anywhere within its body. To terminate the execution of a script before it reaches its end, the exit command must be used. This is generally done when a program detects an unrecoverable error, such as missing or invalid arguments. Once again, scripters are responsible for identifying error conditions and deciding whether or not to terminate the program.
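
An early conclusion of this kind might be sketched as follows. The script countlines, its usage message, and the choice of 1 as the error status are all assumptions made for illustration.

```shell
workdir=/tmp/exitdemo.$$
mkdir "$workdir" && cd "$workdir" || exit 1

# countlines is a hypothetical script that requires one file name argument.
cat > countlines <<'EOF'
#!/bin/sh
# countlines -- print the number of lines in the named file

if [ $# -lt 1 ]
then
    echo "usage: countlines file" >&2
    exit 1              # conclude early: nothing useful can be done
fi

wc -l < "$1"
exit 0
EOF

printf 'a\nb\n' > sample.txt

# With an argument, the script runs to its normal conclusion.
sh countlines sample.txt
ok_status=$?

# Without one, it stops at the first exit and reports a non-zero status.
if sh countlines
then
    bad_status=0
else
    bad_status=$?
fi
echo "with argument: $ok_status, without argument: $bad_status"
```

The usage message is sent to standard error with >&2 so that it does not mix with the script's normal output.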