An Introduction To The UNIX Shell
S. R. Bourne
ABSTRACT
The shell is a command programming language that provides an interface to the UNIX†
operating system. Its features include control-flow primitives, parameter passing, vari-
ables and string substitution. Constructs such as while, if then else, case and for are avail-
able. Two-way communication is possible between the shell and commands. String-val-
ued parameters, typically file names or flags, may be passed to a command. A return
code is set by commands that may be used to determine control-flow, and the standard
output from a command may be used as shell input.
The shell can modify the environment in which commands run. Input and output can be
redirected to files, and processes that communicate through ‘pipes’ can be invoked.
Commands are found by searching directories in the file system in a sequence that can be
defined by the user. Commands can be read either from the terminal or from a file, which
allows command procedures to be stored for later use.
† UNIX is a registered trademark of AT&T Bell Laboratories in the USA and other countries.
1.0 Introduction
The shell is both a command language and a programming language that provides an interface to the UNIX
operating system. This memorandum describes, with examples, the UNIX shell. The first section covers
most of the everyday requirements of terminal users. Some familiarity with UNIX is an advantage when
reading this section; see, for example, "UNIX for beginners" (Kernighan, 1978). Section 2 describes
those features of the shell primarily intended for use within shell procedures. These include the control-
flow primitives and string-valued variables provided by the shell. A knowledge of a programming language
would be a help when reading this section. The last section describes the more advanced features of the
shell. References of the form "see pipe (2)" are to a section of the UNIX manual (Ritchie and Thompson, 1978).
1.1 Simple commands
Simple commands consist of one or more words separated by blanks. The first word is the name of the
command to be executed; any remaining words are passed as arguments to the command. For example,
who
is a command that prints the names of users logged in. The command
ls -l
prints a list of files in the current directory. The argument -l tells ls to print status information, size and the
date of last modification for each file.
characters, words and lines found. If only the number of lines is required then
wc -l <file
could be used.
This mechanism is useful both to save typing and to select names according to some pattern. It may also be
used to find files. For example,
echo /usr/fred/*/core
finds and prints the names of all core files in sub-directories of /usr/fred . (echo is a standard UNIX com-
mand that prints its arguments, separated by blanks.) This last feature can be expensive, requiring a scan of
all sub-directories of /usr/fred .
There is one exception to the general rules given for patterns. The character ‘.’ at the start of a file name
must be explicitly matched.
echo *
will therefore echo all file names in the current directory not beginning with ‘.’ .
echo .*
will echo all those file names that begin with ‘.’ . This avoids inadvertent matching of the names ‘.’ and ‘..’
which mean ‘the current directory’ and ‘the parent directory’ respectively. (Notice that ls suppresses infor-
mation for the files ‘.’ and ‘..’ .)
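The rule for a leading ‘.’ can be checked in a scratch directory. The following is a minimal sketch, not part of the original memorandum: the file names alpha, beta and .hidden are invented for illustration, and the mktemp command is assumed to be available.

```shell
# Scratch directory; the file names alpha, beta and .hidden are illustrative.
dir=`mktemp -d` || exit 1
cd "$dir" || exit 1
: >alpha; : >beta; : >.hidden

plain=`echo *`       # names not beginning with '.'
dotted=`echo .h*`    # the leading '.' is matched explicitly

echo "plain:  $plain"
echo "dotted: $dotted"

cd / && rm -rf "$dir"
```

The pattern .h* is used rather than .* so that the result does not depend on how a particular shell treats the names ‘.’ and ‘..’ .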
1.6 Quoting
Characters that have a special meaning to the shell, such as < > * ? & , are called metacharacters. A
complete list of metacharacters is given in appendix B. Any character preceded by a \ is quoted and loses
its special meaning, if any. The \ is elided so that
echo \?
will echo a single ? , and
echo \\
will echo a single \ . To allow long strings to be continued over more than one line the sequence \newline is
ignored.
\ is convenient for quoting single characters. When more than one character needs quoting the above mech-
anism is clumsy and error prone. A string of characters may be quoted by enclosing the string between sin-
gle quotes. For example,
echo xx'****'xx
will echo
xx****xx
The quoted string may not contain a single quote but may contain newlines, which are preserved. This
quoting mechanism is the most simple and is recommended for casual use.
A third quoting mechanism using double quotes is also available that prevents interpretation of some but
not all metacharacters. Discussion of the details is deferred to section 3.4 .
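The two mechanisms described so far can be compared side by side. This is a small sketch under the assumption that printf is available; the variable names q, b and s are chosen only for the illustration. Assignments are used so that the quoted values can be inspected afterwards.

```shell
q=\?             # the backslash is removed; q holds a single ?
b=\\             # the first backslash quotes the second; b holds a single \
s=xx'****'xx     # the quotes are removed and the * characters are not expanded

# printf is used here so that the backslash is printed as it stands.
printf '%s %s %s\n' "$q" "$b" "$s"
```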
1.7 Prompting
When the shell is used from a terminal it will issue a prompt before reading a command. By default this
prompt is ‘$ ’ . It may be changed by saying, for example,
PS1=yesdear
that sets the prompt to be the string yesdear . If a newline is typed and further input is needed then the shell
will issue the prompt ‘> ’ . Sometimes this can be caused by mistyping a quote mark. If it is unexpected
then an interrupt (DEL) will return the shell to read another command. This prompt may be changed by
saying, for example,
PS2=more
1.9 Summary
• ls
Print the names of files in the current directory.
• ls >file
Put the output from ls into file.
• ls | wc -l
Print the number of files in the current directory.
• ls | grep old
Print those file names containing the string old.
• ls | grep old | wc -l
Print the number of files whose name contains the string old.
• cc pgm.c &
Run cc in the background.
UNIX files have three independent attributes, read, write and execute. The UNIX command chmod (1) may
be used to make a file executable. For example,
chmod +x wg
will ensure that the file wg has execute status. Following this, the command
wg fred
is equivalent to
sh wg fred
This allows shell procedures and programs to be used interchangeably. In either case a new process is cre-
ated to run the command.
As well as providing names for the positional parameters, the number of positional parameters in the call is
available as $# . The name of the file being executed is available as $0 .
A special shell parameter $* is used to substitute for all positional parameters except $0 . A typical use of
this is to provide some default arguments, as in,
nroff -T450 -ms $*
which simply prepends some arguments to those already given.
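The behaviour of $# and $* can be seen without writing a separate procedure file. In this sketch the built-in command set is assumed to be available for assigning the positional parameters directly, and the file names report.ms and notes.ms are hypothetical.

```shell
# 'set --' assigns the positional parameters so the sketch runs on its own.
set -- report.ms notes.ms

count=$#                  # number of positional parameters
all="-T450 -ms $*"        # default arguments prepended to those given

echo "$count arguments; nroff would receive: $all"
```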
case $# in
*) . . . ;;
*) . . . ;;
esac
Another example of the use of the case construction is to distinguish between different forms of an argu-
ment. The following example is a fragment of a cc command.
for i
do case $i in
   -[ocs]) . . . ;;
   -*)     echo "unknown flag $i" ;;
   *.c)    /lib/c0 $i . . . ;;
   *)      echo "unexpected argument $i" ;;
   esac
done
To allow the same commands to be associated with more than one pattern the case command provides for
alternative patterns separated by a | . For example,
case $i in
-x|-y) . . .
esac
is equivalent to
case $i in
-[xy]) . . .
esac
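A runnable sketch of alternative patterns follows. It wraps the case command in a shell function named classify; both the function and its name are assumptions made for the illustration, since functions are not described in this memorandum.

```shell
# classify is a hypothetical helper that reports how one argument is matched.
classify() {
    case $1 in
    -x|-y)  echo flag ;;      # either alternative selects this branch
    *.c)    echo source ;;
    *)      echo other ;;
    esac
}

classify -x
classify -y
classify main.c
classify README
```

Patterns are tried in order, so the final * acts as a default.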
ed $3 <<%
g/$1/s//$2/g
w
%
The call
edg string1 string2 file
is then equivalent to the command
ed file <<%
g/string1/s//string2/g
w
%
and changes all occurrences of string1 in file to string2 . Substitution can be prevented using \ to quote the
special character $ as in
ed $3 <<+
1,\$s/$1/$2/g
w
+
(This version of edg is equivalent to the first except that ed will print a ? if there are no occurrences of the
string $1 .) Substitution within a here document may be prevented entirely by quoting the terminating
string, for example,
grep $i <<\#
...
#
The document is presented without modification to grep. If parameter substitution is not required in a here
document this latter form is more efficient.
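The three treatments of $ in a here document can be compared with cat in place of ed. This is only a sketch: the functions with_subst and without_subst and the variable name are hypothetical names introduced for the illustration.

```shell
name=fred

# Unquoted terminator: $name is substituted, \$ yields a literal $.
with_subst() {
cat <<%
hello $name, literal \$name
%
}

# Quoted terminator: the document is passed on without modification.
without_subst() {
cat <<\%
hello $name
%
}

with_subst
without_subst
```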
echo $user
and is used when the parameter name is followed by a letter or digit. For example,
tmp=/tmp/ps
ps a >${tmp}a
will direct the output of ps to the file /tmp/psa, whereas,
ps a >$tmpa
would cause the value of the variable tmpa to be substituted.
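The difference is easy to observe by giving tmpa a value of its own. In this sketch the value elsewhere is an arbitrary illustrative string.

```shell
tmp=/tmp/ps
tmpa=elsewhere            # an unrelated variable whose name happens to begin with tmp

with_braces=${tmp}a       # value of tmp followed by the letter a
without_braces=$tmpa      # value of the variable tmpa

echo "$with_braces $without_braces"
```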
Except for $? the following are set initially by the shell. $? is set after executing each command.
$? The exit status (return code) of the last command executed as a decimal string. Most com-
mands return a zero exit status if they complete successfully, otherwise a non-zero exit sta-
tus is returned. Testing the value of return codes is dealt with later under if and while com-
mands.
$# The number of positional parameters (in decimal). Used, for example, in the append com-
mand to check the number of parameters.
$$ The process number of this shell (in decimal). Since process numbers are unique among
all existing processes, this string is frequently used to generate unique temporary file
names. For example,
ps a >/tmp/ps$$
...
rm /tmp/ps$$
$! The process number of the last process run in the background (in decimal).
$- The current shell flags, such as -x and -v .
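A short sketch of these parameters in use follows. The commands true, false, date and sleep are assumed to be available, and the file name /tmp/sketch$$ is hypothetical.

```shell
true
ok=$?                      # 0: true always succeeds
false || failed=$?         # the non-zero status of false, read before another command runs

scratch=/tmp/sketch$$      # $$ makes the name unique to this shell
date >"$scratch"

sleep 0 &                  # a background command
bg=$!                      # its process number
wait                       # wait for the background command to finish

rm "$scratch"
echo "ok=$ok failed=$failed scratch=$scratch background=$bg"
```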
Some variables have a special meaning to the shell and should be avoided for general use.
$MAIL When used interactively the shell looks at the file specified by this variable before it issues
a prompt. If the specified file has been modified since it was last looked at the shell prints
the message you have mail before prompting for the next command. This variable is typi-
cally set in the file .profile, in the user’s login directory. For example,
MAIL=/usr/mail/fred
$HOME The default argument for the cd command. The current directory is used to resolve file
name references that do not begin with a / , and is changed using the cd command. For
example,
cd /usr/fred/bin
makes the current directory /usr/fred/bin .
cat wn
will print on the terminal the file wn in this directory. The command cd with no argument
is equivalent to
cd $HOME
This variable is also typically set in the user’s login profile.
$PATH A list of directories that contain commands (the search path ). Each time a command is
executed by the shell a list of directories is searched for an executable file. If $PATH is not
set then the current directory, /bin, and /usr/bin are searched by default. Otherwise $PATH
consists of directory names separated by : . For example,
PATH=:/usr/fred/bin:/bin:/usr/bin
specifies that the current directory (the null string before the first : ), /usr/fred/bin, /bin and
/usr/bin are to be searched in that order. In this way individual users can have their own
‘private’ commands that are accessible independently of the current directory. If the com-
mand name contains a / then this directory search is not used; a single attempt is made to
execute the command.
$PS1 The primary shell prompt string, by default, ‘$ ’.
$PS2 The shell prompt when further input is needed, by default, ‘> ’.
$IFS The set of characters used by blank interpretation (see section 3.4).
The value tested by the while command is the exit status of the last simple command following while.
Each time round the loop command-list1 is executed; if a zero exit status is returned then command-list2 is
executed; otherwise, the loop terminates. For example,
while test $1
do . . .
shift
done
is equivalent to
for i
do . . .
done
shift is a shell command that renames the positional parameters $2, $3, . . . as $1, $2, . . . and loses $1 .
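The equivalence can be checked by collecting the arguments seen by each loop. This sketch assumes the built-in command set to supply three illustrative arguments, and it tests $# rather than $1 so that an empty argument would not end the loop early; that variation is not in the original text.

```shell
set -- a b c

for_out=
for i                          # with no 'in' list, for runs over the positional parameters
do for_out="$for_out$i "
done

while_out=
while test $# -gt 0            # variation on 'test $1'
do while_out="$while_out$1 "
   shift                       # $2 becomes $1, and so on
done

echo "for: $for_out"
echo "while: $while_out"
```

Unlike the for loop, the while loop consumes the positional parameters: $# is 0 when it finishes.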
Another kind of use for the while/until loop is to wait until some external event occurs and then run some
commands. In an until loop the termination condition is reversed. For example,
An example of the use of if, case and for constructions is given in section 2.10 .
A multiple test if command of the form
if . . .
then . . .
else if . . .
then . . .
else if . . .
...
fi
fi
fi
may be written using an extension of the if notation as,
if . . .
then ...
elif ...
then ...
elif ...
...
fi
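A concrete instance of the elif form is sketched here. The function size_class and its thresholds are invented for the illustration; the numeric test operator -lt is assumed to be supported by test.

```shell
# size_class is a hypothetical helper that classifies one number.
size_class() {
    if test "$1" -lt 10
    then echo small
    elif test "$1" -lt 100
    then echo medium
    else echo large
    fi
}

size_class 7
size_class 42
size_class 1000
```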
The following example is the touch command which changes the ‘last modified’ time for a list of files. The
command may be used in conjunction with make (1) to force recompilation of a list of files.
flag=
for i
do case $i in
   -c) flag=N ;;
   *)  if test -f $i
       then ln $i junk$$; rm junk$$
       elif test $flag
       then >$i
       else echo file \'$i\' does not exist
       fi
   esac
done
The -c flag is used in this command to force subsequent files to be created if they do not already exist.
Otherwise, if the file does not exist, an error message is printed. The shell variable flag is set to some non-
null string if the -c argument is encountered. The commands
ln . . .; rm . . .
make a link to the file and then remove it thus causing the last modified date to be updated.
The sequence
if command1
then command2
fi
may be written
command1 && command2
Conversely,
command1 || command2
executes command2 only if command1 fails. In each case the value returned is that of the last simple com-
mand executed.
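Both conditional forms can be tried with the commands true and false, which are assumed to be available. The variable names in this sketch are illustrative only.

```shell
and_ran=no; or_ran=no; skipped=no

true  && and_ran=yes      # second command runs because the first succeeds
false || or_ran=yes       # second command runs because the first fails
false && skipped=yes      # second command is not run

echo "and_ran=$and_ran or_ran=$or_ran skipped=$skipped"
```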
In the first form, command-list is simply executed. The second form executes command-list as a separate process.
For example,
(cd x; rm junk )
executes rm junk in the directory x without changing the current directory of the invoking shell.
The commands
cd x; rm junk
have the same effect but leave the invoking shell in the directory x.
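The effect on the current directory can be observed with pwd. This sketch uses / as the target directory, since it exists on every system; the variable names are illustrative.

```shell
start=`pwd`

(cd /; pwd >/dev/null)       # the cd affects only the parenthesized process
after_paren=`pwd`            # still the starting directory

cd /; after_plain=`pwd`      # without parentheses the invoking shell moves
cd "$start"                  # return to where we began

echo "$after_paren $after_plain"
```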
cd /usr/man
for i
do case $i in
   [1-9]*) s=$i ;;
   -t)     N=t ;;
   -n)     N=n ;;
   -*)     echo unknown flag \'$i\' ;;
   *)      if test -f man$s/$i.$s
           then ${N}roff man0/${N}aa man$s/$i.$s
           else : 'look through all manual sections'
                found=no
                for j in 1 2 3 4 5 6 7 8 9
                do if test -f man$j/$i.$j
                   then man $j $i
                        found=yes
                   fi
                done
                case $found in
                no) echo "$i: manual page not found"
                esac
           fi
   esac
done
Figure 1. A version of the man command
echo ${d=.}
which substitutes the same string as
echo ${d-.}
and if d were not previously set then it will be set to the string ‘.’ . (The notation ${. . .=. . .} is not available
for positional parameters.)
If there is no sensible default then the notation
echo ${d?message}
will echo the value of the variable d if it has one, otherwise message is printed by the shell and execution of
the shell procedure is abandoned. If message is absent then a standard message is printed. A shell proce-
dure that requires some parameters to be set might start as follows.
: ${user?} ${acct?} ${bin?}
...
Colon (:) is a command that is built in to the shell and does nothing once its arguments have been evalu-
ated. If any of the variables user, acct or bin are not set then the shell will abandon execution of the proce-
dure.
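The difference between the - and = forms shows up only in what the variable holds afterwards. In this sketch the built-in command unset is assumed to be available, and the word unset used as a default value is just an illustrative marker.

```shell
unset d

a=${d-.}                  # substitutes . but leaves d unset
after_dash=${d-unset}     # d is still unset, so the default is used again

b=${d=.}                  # substitutes . and also assigns it to d
after_equals=${d-unset}   # d now has the value .

: ${d?d must be set}      # d is set, so execution continues

echo "$a $after_dash $b $after_equals"
```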
The entire string between grave accents (`. . .`) is taken as the command to be executed and is replaced with
the output from the command. The command is written using the usual quoting conventions except that a `
must be escaped using a \ . For example,
ls `echo "$1"`
is equivalent to
ls $1
Command substitution occurs in all contexts where parameter substitution occurs (including here docu-
ments) and the treatment of the resulting text is the same in both cases. This mechanism allows string pro-
cessing commands to be used within shell procedures. An example of such a command is basename which
removes a specified suffix from a string. For example,
basename main.c .c
will print the string main . Its use is illustrated by the following fragment from a cc command.
case $A in
...
*.c) B=`basename $A .c`
...
esac
that sets B to the part of $A with the suffix .c stripped.
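The fragment can be run on its own once A is given a value. The file name main.c is the one used in the text; the final echo is added only to show the result.

```shell
A=main.c

case $A in
*.c) B=`basename $A .c` ;;   # the output of basename becomes the value of B
esac

echo "$B"
```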
Here are some composite examples.
The following table gives, for each quoting mechanism, the shell metacharacters that are evaluated.
        metacharacter
        \    $    *    `    "    '
   '    n    n    n    n    n    t
   `    y    n    n    t    n    n
   "    y    y    n    y    t    n

   t    terminator
   y    interpreted
   n    not interpreted
In cases where more than one evaluation of a string is required the built-in command eval may be used. For
example, if the variable X has the value $y, and if y has the value pqr then
eval echo $X
will echo the string pqr .
In general the eval command evaluates its arguments (as do all commands) and treats the result as input to
the shell. The input is read and the resulting command(s) executed. For example,
wg='eval who|grep'
$wg fred
is equivalent to
who|grep fred
In this example, eval is required since there is no interpretation of metacharacters, such as | , following
substitution.
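Both uses of eval can be tried without depending on who is logged in. This sketch substitutes echo and tr for who and grep; that swap, and the variable names, are assumptions made so that the output is predictable.

```shell
y=pqr
X='$y'
once=`echo $X`             # one evaluation: the text $y
twice=`eval echo $X`       # two evaluations: the value of y

p='echo abc | tr a-c A-C'
unpiped=`$p`               # | is passed to echo as an ordinary argument
piped=`eval $p`            # eval re-reads the text, so | forms a pipeline

echo "$once $twice"
echo "$unpiped"
echo "$piped"
```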
Those signals marked with an asterisk produce a core dump if not caught. However, the shell itself ignores
quit which is the only external signal that can cause a dump. The signals in this list of potential interest to
shell programs are 1, 2, 3, 14 and 15.
flag=
trap 'rm -f junk$$; exit' 1 2 3 15
for i
do case $i in
   -c) flag=N ;;
   *)  if test -f $i
       then ln $i junk$$; rm junk$$
       elif test $flag
       then >$i
       else echo file \'$i\' does not exist
       fi
   esac
done
The trap command appears before the creation of the temporary file; otherwise it would be possible for the
process to die without removing the file.
Since there is no signal 0 in UNIX it is used by the shell to indicate the commands to be executed on exit
from the shell procedure.
A procedure may, itself, elect to ignore signals by specifying the null string as the argument to trap. The
following fragment is taken from the nohup command.
trap '' 1 2 3 15
which causes hangup, interrupt, quit and kill to be ignored both by the procedure and by invoked com-
mands.
Traps may be reset by saying
trap 2 3
which resets the traps for signals 2 and 3 to their default values. A list of the current values of traps may be
obtained by writing
trap
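The use of signal 0 to run commands on exit can be sketched with a parenthesized command list, which runs as a separate process. The file name /tmp/trapsketch$$ is hypothetical, and the sketch assumes that the shell in use runs the exit trap when such a process finishes.

```shell
scratch=/tmp/trapsketch$$

(
    trap 'rm -f "$scratch"' 0      # set before the file is created
    : >"$scratch"
    test -f "$scratch" && echo created
)

# The separate process has exited, so its exit trap has removed the file.
test -f "$scratch" || echo removed
```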
The procedure scan (Figure 5) is an example of the use of trap where there is no exit in the trap command.
scan takes each directory in the current directory, prompts with its name, and then executes commands
typed at the terminal until an end of file or an interrupt is received. Interrupts are ignored while executing
the requested commands but cause termination when scan is waiting for input.
d=`pwd`
for i in *
do if test -d $d/$i
   then cd $d/$i
        while echo "$i:"
              trap exit 2
              read x
        do trap : 2; eval $x; done
   fi
done
read x is a built-in command that reads one line from the standard input and places the result in the variable
ed file &
would allow both the editor and the shell to read from the same input at the same time.
The other modification to the environment of a background command is to turn off the QUIT and INTER-
RUPT signals so that they are ignored by the command. This allows these signals to be used at the terminal
without causing background commands to terminate. For this reason the UNIX convention for a signal is
that if it is set to 1 (ignored) then it is never changed even for a short time. Note that the shell command
trap has no effect for an ignored signal.
Acknowledgements
The design of the shell is based in part on the original UNIX shell (Thompson) and the
PWB/UNIX shell (Mashey), some features having been taken from both. Similarities also exist
with the command interpreters of the Cambridge Multiple Access System (Hartley)
and of CTSS.
I would like to thank Dennis Ritchie and John Mashey for many discussions during
the design of the shell. I am also grateful to the members of the Computing Science Research Center and to
Joe Maranzano for their comments on drafts of this document.
Appendix A - Grammar
item:           word
                input-output
                name = value
simple-command: item
                simple-command item
command:        simple-command
                ( command-list )
                { command-list }
                for name do command-list done
                for name in word . . . do command-list done
                while command-list do command-list done
                until command-list do command-list done
                case word in case-part . . . esac
                if command-list then command-list else-part fi
pipeline:       command
                pipeline | command
andor:          pipeline
                andor && pipeline
                andor || pipeline
command-list:   andor
                command-list ;
                command-list &
                command-list ; andor
                command-list & andor
file:           word
                & digit
                & -
pattern:        word
                pattern | word
empty:
digit:          0 1 2 3 4 5 6 7 8 9
b) patterns
* match any character(s) including none
? match any single character
[...] match any of the enclosed characters
c) substitution
${...} substitute shell variable
`...` substitute command output
d) quoting
\ quote the next character
'...' quote the enclosed characters except for '
"..." quote the enclosed characters except for $ ` \ "
e) reserved words
if then else elif fi
case in esac
for while until do done
{ }