7. Handling Command Line Options

  1. UNIX Command Line Option Styles
  2. Getopts Statements

7.1 UNIX Command Line Option Styles

Options are switches to commands.  They instruct the program to perform specific processing.  Options appear on the command line following the command itself.  They are recognizable by a preceding dash and come in two forms, separated and stacked:
  1. separated: rdist -b -h -q -c /etc ren:/etc
  2. stacked: ls -alg
Not all programs require their options to be preceded by a dash character.  In such cases, the options are stacked and immediately follow the command.  The ps and tar programs are such exceptions: ps waux and tar cvf /tmp/backup /home/user.

Some options take mandatory arguments.  When the program reads such an option, it looks for the associated argument and processes it accordingly.  Usually, each of these options is listed separately with its argument as the next command line entry.  Consider a make command of this form: make all -f build.mk -targetdir ./proj/bin.  Here, the command line instructs make to build all parts of the program.  The -f option tells make to take its build rules from the file build.mk.  The -targetdir switch is meant to tell it to place the results in the directory ./proj/bin; standard make defines no such option, so it serves here only to show that options need not be single letters.  Sometimes, programs interpret whole words as options.

Not all commands require the separation of options and arguments.  For some programs it is perfectly fine to stack the options and then list their arguments in the same order.  Tar is a perfect example of this: tar cvbf 120 /tmp/backup /home/user.  In the example, tar's first two options cause the program to create a new archive and to use verbose mode so that the user can see what it is doing.  Then the options with arguments are listed.  The b option specifies the blocking factor to use, and the f option specifies the archive file to write.  The options' arguments then follow in order.  Tar uses a blocking factor of 120 records (of 512 bytes each) per block and writes the archive to /tmp/backup.  Finally, the last argument to tar is the directory to archive, the user's home directory.
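
The matching of stacked key letters to the arguments that follow them can be sketched in the shell itself.  The function name parse_keys is hypothetical, and the sketch only imitates the way a tar-style command line is read; it is not how tar is implemented.

```sh
#!/bin/sh
# parse_keys: hypothetical helper that walks a tar-style key string.
# The letters b and f each consume the next command line entry, in order.
parse_keys() {
  keys=${1}
  shift
  while [ -n "${keys}" ]
  do
    rest=${keys#?}              # everything after the first letter
    key=${keys%"${rest}"}       # the first letter itself
    keys=${rest}
    case "${key}" in
      c) echo "create archive" ;;
      v) echo "verbose" ;;
      b) echo "blocking factor ${1}"; shift ;;
      f) echo "archive ${1}"; shift ;;
      *) echo "unknown key ${key}"; return 1 ;;
    esac
  done
  echo "remaining: $*"
}

parse_keys cvbf 120 /tmp/backup /home/user
```

Each letter that needs an argument takes the next unused entry, which is why the arguments must appear in the same order as their letters.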

A program's man page lists its legal options and the arguments they take.


7.2 Getopts Statements

Considering the programming facilities presented thus far, it is plain to see that options are easy to handle using the positional parameters, a loop, some shifts, and a case statement:

#!/bin/sh
#
# setether: set an Ethernet interface's IP configuration
#

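# Read options until only the interface name is left.
while [ $# -gt 1 ]
do
  case "${1}" in
    a) ARP="arp"              # no argument: shift by one
       shift
       ;;
    b) BROADCAST=${2}         # option plus argument: shift by two
       shift 2
       ;;
    i) IPADDRESS=${2}
       shift 2
       ;;
    m) NETMASK=${2}
       shift 2
       ;;
    n) NETWORK=${2}
       shift 2
       ;;
    *) echo "setether: illegal option: ${1}"
       exit 1
       ;;
  esac
done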

INTERFACE=${1}

ifconfig ${INTERFACE} ${IPADDRESS} netmask ${NETMASK} broadcast ${BROADCAST} ${ARP}
route add -net ${NETWORK}

The setether script processes a number of options and arguments in order to properly configure an Ethernet interface for use on an IP network.  It uses a while loop and a case statement to do this.  The script switches on the options using the case block.  The options are the one letter switches and are expected by the script to be in the first position on the command line.  Each option's argument is expected to follow the option in the second position.  As setether processes the options, it shifts the positional parameters by two in order to correctly handle the next one.  The only exception to this is the a option, which toggles on ARP for the interface.  Because it has no argument, the option requires the script to shift by one parameter only.  After reading all the options and their arguments, the script takes the last argument, which it expects to be the interface to be configured.  Lastly, the script does its real work; namely, it sets the interface using the ifconfig command, and then it adds the network route.

Processing options in this manner works but is rigid.  The programmer must pay careful attention to the shift pattern in order to correctly handle options with arguments versus those without arguments.  Another limitation is that it does not handle stacked options.  Although stacked options are not an absolute necessity, it is best to give the user the most flexibility with the command, so that the user can choose whichever method of specifying the options is most comfortable.  Because the script assumes that every argument immediately follows its option, the user does not gain this advantage.  Luckily, there is a better way to deal with options.

The shell provides a special utility, called getopts, specifically designed for processing command line options and their arguments.  Its syntax is: getopts options variable.  Getopts takes two arguments.  The first is a list of valid command line options.  If an option requires an argument, a scripter places a colon directly after the option.  The second is the name of a variable in which getopts stores the option it has just parsed.

To use getopts, the programmer employs a loop.  Each time getopts is called within the loop, it processes the next command line option.  If it finds a valid option, then the option is stored in the variable.  If it encounters an invalid option, getopts stores a ? inside variable for error handling.  When an option takes an argument, getopts stores the associated argument in the special variable OPTARG.  If it cannot locate the option's argument, getopts prints a diagnostic and stores a ? in variable, just as it does for an invalid option.  When no options remain, getopts returns a nonzero status, which ends the loop.
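
A minimal sketch of such a loop follows.  The function name parse_demo and the option letters v and o are hypothetical, chosen only for illustration.  OPTIND is reset because the function may be called more than once, and the diagnostics that getopts prints are discarded so that only the script's own messages appear.

```sh
#!/bin/sh
# parse_demo: hypothetical helper that reports what getopts finds.
# It accepts -v (no argument) and -o (requires an argument).
parse_demo() {
  OPTIND=1                          # restart parsing for each call
  while getopts vo: opt "$@" 2>/dev/null
  do
    case "${opt}" in
      v)  echo "flag v" ;;
      o)  echo "option o with argument ${OPTARG}" ;;
      \?) echo "invalid option or missing argument" ;;
    esac
  done
}

parse_demo -v -o out.txt
```

Calling parse_demo -x, or parse_demo -o with nothing after it, takes the ? branch instead.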

From a user's point of view, a script using getopts follows one simple rule.  Options must always be preceded by a dash.  If listed separately, each option takes a dash.  If stacked, the option list begins with a dash.  This is required for getopts to differentiate between options and their arguments.  It is hardly a burden for users to do this, and it helps UNIX beginners learn the standard syntax.
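
As a small illustration of this rule, the following sketch parses the same three options given separately, stacked, and without a dash.  The function name show_opts and the option letters a, l, and g are hypothetical.

```sh
#!/bin/sh
# show_opts: hypothetical helper that prints every option getopts recognizes.
show_opts() {
  OPTIND=1                 # restart parsing for each call
  found=""
  while getopts alg opt "$@"
  do
    found="${found}${opt}"
  done
  echo "${found}"
}

show_opts -a -l -g     # separated: each option carries its own dash
show_opts -alg         # stacked: one dash introduces the whole list
show_opts alg          # no dash: an ordinary argument, so no options are found
```

The first two calls report the same three options; the third reports none, because getopts stops at the first entry that does not begin with a dash.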

Returning to the previous example, setether can be rewritten with getopts:

#!/bin/sh
#
# setether: set an Ethernet interface's IP configuration
#

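# A colon after a letter means that option requires an argument.
while getopts ab:e:i:m:n: option
do
  case "${option}" in
    a) ARP="arp";;
    b) BROADCAST=${OPTARG};;
    e) INTERFACE=${OPTARG};;
    i) IPADDRESS=${OPTARG};;
    m) NETMASK=${OPTARG};;
    n) NETWORK=${OPTARG};;
    # getopts sets option to ? for an illegal option or a missing argument
    *) echo "setether: illegal option: ${option}"
       exit 1
       ;;
  esac
done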

ifconfig ${INTERFACE} ${IPADDRESS} netmask ${NETMASK} broadcast ${BROADCAST} ${ARP}
route add -net ${NETWORK}

Now setether loops upon the results of getopts.  The script lists the valid command line options in the string ab:e:i:m:n:.  The colons trailing the last five options force getopts to find the options' arguments.  Then when the case statement tests for the current option processed, the script reads the argument from OPTARG.  Upon completion of processing the arguments, the program sets the interface's IP configuration.

The new version improves upon the original in a number of ways.  For one, it assumes no special order of the command line arguments.  The script even parameterizes the target interface, through the -e option, so that it can be specified in any position.  The real advantage is that the program lets getopts do the work.  The programmer need not be concerned with any special shifting patterns necessary to get to the next argument.  Another advantage is that the options can be stacked.  If the user so desires, the options can be listed together behind a single dash, with an option that takes an argument placed last in the stack.  Also, if an option is missing its argument, getopts performs error checking to make sure that one is present.  There is now some built-in exception handling that was not present before.  In the original version, it is quite possible that the user could have forgotten an argument.  Then the script would have happily read the next option as the previous option's argument.  This would have had deleterious results on the script's operation because the shift pattern would be incorrect.  Finally, the new version is compact.  It is smaller in code size and much easier to read since it focuses on the task at hand, namely, the processing of command line arguments.
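
The difference in error handling can be seen in a cut-down sketch of both approaches.  The function names manual_parse and getopts_parse are hypothetical, only the b and i options are kept, and nothing is configured; each function merely reports what it parsed.

```sh
#!/bin/sh
# manual_parse: hypothetical, cut-down copy of the shift-based loop.
manual_parse() {
  BROADCAST="" IPADDRESS=""
  while [ $# -gt 1 ]
  do
    case "${1}" in
      b) BROADCAST=${2}; shift 2 ;;
      i) IPADDRESS=${2}; shift 2 ;;
      *) echo "illegal option: ${1}"; return 1 ;;
    esac
  done
  echo "broadcast=${BROADCAST} ip=${IPADDRESS} interface=${1}"
}

# getopts_parse: the same two options handled by getopts.
getopts_parse() {
  OPTIND=1                          # restart parsing for each call
  BROADCAST="" IPADDRESS=""
  while getopts b:i: opt "$@" 2>/dev/null
  do
    case "${opt}" in
      b) BROADCAST=${OPTARG} ;;
      i) IPADDRESS=${OPTARG} ;;
      *) echo "illegal option or missing argument"; return 1 ;;
    esac
  done
  echo "broadcast=${BROADCAST} ip=${IPADDRESS}"
}

# The user forgot the argument to b.  The manual loop silently takes the
# interface name as the broadcast address; getopts reports the problem.
manual_parse i 10.0.0.1 b eth0
getopts_parse -i 10.0.0.1 -b
```

Note that getopts can only notice a missing argument when nothing follows the option.  If another option follows, as in -b -i 10.0.0.1, getopts also takes -i as the argument to -b.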

The only disadvantage to getopts is the fact that it can only handle single character options.  It will not properly process options that are words as some programs do.  If a script has so many options that it runs out of characters to handle them, then the scripter should consider processing positional parameters.  On the other hand, the scripter should also consider that the program being written could possibly use a redesign in order to make it easier to use.
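
For a script that does need word options, a loop over the positional parameters might look like the following sketch.  The function name parse_words and the -verbose switch are hypothetical; -targetdir is borrowed from the earlier example.

```sh
#!/bin/sh
# parse_words: hypothetical helper that handles word-style options by hand.
# -verbose takes no argument; -targetdir takes one.
parse_words() {
  VERBOSE="no" TARGETDIR="."
  while [ $# -gt 0 ]
  do
    case "${1}" in
      -verbose)
        VERBOSE="yes"
        shift
        ;;
      -targetdir)
        if [ $# -lt 2 ]
        then
          echo "parse_words: -targetdir requires an argument" >&2
          return 1
        fi
        TARGETDIR=${2}
        shift 2
        ;;
      -*)
        echo "parse_words: illegal option: ${1}" >&2
        return 1
        ;;
      *)
        break                 # first non-option ends option processing
        ;;
    esac
  done
  echo "verbose=${VERBOSE} targetdir=${TARGETDIR} rest=$*"
}

parse_words -verbose -targetdir ./proj/bin all
```

The explicit check on the number of remaining parameters restores, by hand, the missing-argument test that getopts would otherwise supply.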