Sed - Important SED Commands and Help
Rod Lovett
3.1
Introduction
A non-interactive editor designed to be useful in three areas:
1) To edit files too large for comfortable interactive editing.
2) To edit any size file when the sequence of editing commands is too complicated to be comfortably typed in interactive mode.
3) To perform multiple 'global' editing functions efficiently in one pass through the input (which may be a sequence of files).
Since only a few lines of input reside in core at one time, and no temporary files are used, the effective size of the file(s) that can be edited is limited only by the requirement that the input and output fit simultaneously into available secondary storage. Complicated editing scripts can be created separately and given to 'sed' as a command file.
The principal losses of functionality compared to an interactive editor are the lack of relative addressing (because of the line-at-a-time operation it is not possible to go backwards in a file) and the lack of immediate verification that a command has done what was intended. Further, only a limited amount of 'internal memory' is available, so it is hard to remember text from one line to the next.
Sed is a direct descendant of 'ed'. The most striking family resemblance between the two editors is in the class of patterns (regular expressions) they recognize.
3.2
By default, 'sed' copies the standard input to the standard output, perhaps performing one or more editing commands on each line before writing it to the output. Pictorially we have:
Standard Input -> command 1 -> command 2 -> command 3 -> Standard Output
3.3
Each line in turn is passed successively over each command. Thus line 1 is passed successively to command1, command2 and command3. After command1 has operated on line 1 the (possibly) modified line 1 is passed to command2. After passing through all the commands the (possibly) modified line is sent to Standard Output. Thus the commands are applied one at a time; the input to each command is the output from the previous command. (This behaviour may be modified by flags on the command line; see shortly.) The default linear order of application of editing commands can be changed by the flow-of-control commands 't' and 'b' (see later). However, even in this case the input line to each command is the output line of the previous command.
Pattern matching
The range over which an attempt is made to match patterns is called the "pattern space". Ordinarily the pattern space is one line of input text. Examples are developed throughout the text. Many of them are based on the following lines in file 'text':
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
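The successive application of commands described above can be tried on the sample file. The following is only a sketch: the scratch directory made with mktemp and the variable name 'chained' are assumptions for illustration, not part of the original notes.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

# Recreate the five-line sample file 'text'.
cat > text <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# Command 1 turns "In" into "Down in"; command 2 then receives that
# modified line and turns its "Down" into "Up".
# The trailing 'sed 1q' keeps only the first output line.
chained=$(sed -e 's/In/Down in/' -e 's/Down/Up/' text | sed 1q)
echo "$chained"
```

The second command acts on "Down", a word that exists only because the first command put it there, which is the point of the pipeline picture.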
3.4
Usage
Sed can be called in many different ways:
1. sed 'ed command' filename(s)
in/' text
Or the editor commands on the command line can be introduced by the '-e' flag:
3. sed -e /e/d -e 's/In/Down in/' text
where any command containing white space must be put in quotes.
4. sed -f edcommands text > newtext
where 'edcommands' contains the editor commands, e.g. 'edcommands' contains:
/e/d
s/In/Down in/
3.5
Usage
6. sed -n command text
or
sed -n 'list of commands' text
e.g. sed -n /an/p text
Addresses
The general format of an editing command is:
[address[,address]] function [arguments]
One or both addresses may be omitted; the format of addresses is given shortly. Any number of blanks or tabs may separate the addresses from the function. The function must be present.
The arguments may be required or optional, according to which function is used. Blanks and tabs at the beginning of lines are ignored.
The addresses select the lines for editing. Addresses may be either line numbers or context addresses (regular expressions).
3.6
Addresses
The application of a group of commands can be controlled by one address (or address pair) by grouping the commands with curly braces (see later).
A line number is a decimal integer. As each line is read from the input, a line-number counter is incremented; a line-number address matches (selects) the input line when the internal counter equals the address line number. The counter runs cumulatively through multiple input files; it is not reset when a new input file is opened. As a special case, the character '$' matches the last line of the last input file.
A context address is a pattern (regular expression) enclosed in forward slashes. The regular expressions recognized by 'sed' are given in Appendix A. If a command has no addresses, it is applied to every line in the input. If a command has one address, it is applied to all lines which match that address. If a command has two addresses, it is applied to the first line which matches the first address, and to all subsequent lines until (and including) the first subsequent line which matches the second address. Then an attempt is made on subsequent lines to again match the first address, and the process is repeated.
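The address forms described above (line number, '$', and a pair of context addresses) can be sketched as follows. The scratch directory and the variable names 'second', 'last' and 'range' are assumptions for illustration; -n with 'p' is used so that only the selected lines appear.

```shell
cd "$(mktemp -d)" || exit 1

# Recreate the five-line sample file 'text'.
cat > text <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# One line-number address: selects line 2 only.
second=$(sed -n '2p' text)

# The special address '$': selects the last line of the input.
last=$(sed -n '$p' text)

# Two context addresses: from the first line matching /Alph/ up to
# and including the next line matching /man/.
range=$(sed -n '/Alph/,/man/p' text)

printf '%s\n' "$second" "$last" "$range"
```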
3.7
sed '/did/,/an/d' text
Here Lines 4 and 5 reach Standard Output.
To delete a blank line use:
sed '/^$/d' text
Now we will look at 'quit'.
sed '2q' text
The output is Line 1 and Line 2. All lines up to and including the matching line number are read in and output to Standard Output.
who | sed '10q'
is equivalent to:
who | head
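The delete and quit examples above can be checked with a short script. This is a sketch only: the scratch directory, the variable names and the three-line input used for the blank-line case are assumptions for illustration.

```shell
cd "$(mktemp -d)" || exit 1

# Recreate the five-line sample file 'text'.
cat > text <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# Delete from the first line matching /did/ to the next line
# matching /an/ (lines 1 to 3), so lines 4 and 5 remain.
kept=$(sed '/did/,/an/d' text)

# Quit after line 2: lines 1 and 2 are output.
first_two=$(sed '2q' text)

# Delete empty lines from a small hypothetical input.
no_blank=$(printf 'one\n\ntwo\n' | sed '/^$/d')

printf '%s\n' "$kept" "$first_two" "$no_blank"
```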
3.8
Outputs Lines 1-3 unchanged and
Through caverns measureless by man      Line 4
Down by a sunless sea.                  Line 5
The first occurrence of 'to' (in this case the only occurrence) on each line is substituted by 'by'.
sed 's/a/Z/g' text
In XZnZdu did KublZ KhZn
A stZtely pleZsure dome decree;
Where Alph, the sZcred river, rZn
Through cZverns meZsureless to mZn
Down to Z sunless seZ.
sed '/In/,/Where/s/a/Z/' text
In XZnadu did Kubla Khan
A stZtely pleasure dome decree;
Where Alph, the sZcred river, ran
Lines 4 and 5 unchanged.
3.9
sed '/In/,/Where/s/a/Z/g' text
In XZnZdu did KublZ KhZn
A stZtely pleZsure dome decree;
Where Alph, the sZcred river, rZn
Lines 4 and 5 unchanged.
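The difference between substituting the first occurrence and substituting globally, and the effect of an address pair, can be sketched as follows. The scratch directory and variable names are assumptions for illustration; the address '1' and the 'p' flag with -n are used here only to isolate a single line of output.

```shell
cd "$(mktemp -d)" || exit 1

# Recreate the five-line sample file 'text'.
cat > text <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# Without 'g' only the first 'a' on the line is replaced.
first_only=$(sed -n '1s/a/Z/p' text)

# With 'g' every 'a' on the line is replaced.
every=$(sed -n '1s/a/Z/gp' text)

# With the address pair /In/,/Where/ the substitution stops after
# line 3, so line 4 comes through unchanged.
line4=$(sed '/In/,/Where/s/a/Z/' text | sed -n '4p')

printf '%s\n' "$first_only" "$every" "$line4"
```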
sed 's/^/<TAB>/' text
indents all lines by one tab stop (even blank lines). A better method is:
sed '/./s/^/<TAB>/' text
as above but doesn't indent blank lines (the '.' matches any character). Alternatively:
sed -f edcmds text
where edcmds contains
/^$/!s/^/<TAB>/
produces the same effect. Here, !s means "don't substitute", i.e. don't perform the substitution on the blank lines.
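The two ways of indenting only the non-blank lines can be compared directly. In this sketch the tab is held in a shell variable (an assumption made so that the tab is visible in the listing), and the three-line input is hypothetical.

```shell
# Hold a literal tab in a variable so it can be spliced into the script.
tab=$(printf '\t')

# Address /./ selects lines containing at least one character.
with_dot=$(printf 'x\n\ny\n' | sed '/./s/^/'"$tab"'/')

# Address /^$/ followed by ! selects every line that is NOT empty.
with_bang=$(printf 'x\n\ny\n' | sed '/^$/!s/^/'"$tab"'/')

printf '%s\n' "$with_dot"
```

Both forms leave the empty middle line untouched and indent the other two.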
Typing /^$/!s/^/<TAB>/ directly on the command line doesn't work since the shell interprets !s as a history event. However, we could have used:
3.10
sed -f double text
where double contains:
s/$/\
/
adds a newline at the end of each line, i.e. gives us double line spacing. Alternatively:
sed 'r blankline' text
where "blankline" contains a blank line, does the same thing. 'r' is the read command and causes the specified file to be read in when the specified pattern (in this case there is no pattern and hence all lines are referenced) is found.
sed -f single text
where single contains:
s/[ <TAB>][ <TAB>]*/\
/g
replaces each string of blanks or tabs with a newline and thus splits its input into one word per line. The regular expression [ <TAB>] matches a blank or <TAB> and [ <TAB>]* matches any number (including zero) of blanks or tabs. So the whole expression matches one or more blanks and/or tabs.
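A newline in the replacement part of 's' is written as a backslash followed by a real newline. The sketch below tries both uses on small hypothetical inputs; holding the tab in a variable and the variable names themselves are assumptions for illustration.

```shell
tab=$(printf '\t')

# One word per line: every run of blanks and/or tabs becomes a newline.
words=$(printf 'In Xanadu\tdid  Kubla\n' | sed 's/[ '"$tab"'][ '"$tab"']*/\
/g')

# Double spacing: add a newline at the end of each line.
# (Command substitution drops the trailing newlines of the result.)
doubled=$(printf 'a\nb\n' | sed 's/$/\
/')

printf '%s\n' "$words"
printf '%s\n' "$doubled"
```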
3.11
$a\
Samuel Taylor Coleridge\
Poet
produces:
In Xanadu did Kubla Khan             Line 1
A stately pleasure dome decree;      Line 2
Down to a sunless sea.               Line 5
Samuel Taylor Coleridge
Poet
Here we append to the last (note the $) line the given text. All newlines in the appended text except the last newline must be escaped.
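Appending after the last line, with the inner newline of the appended text escaped, can be sketched as follows. The scratch directory and the variable names are assumptions for illustration, and the script is given inline rather than in a command file.

```shell
cd "$(mktemp -d)" || exit 1

# Recreate the five-line sample file 'text'.
cat > text <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# '$' addresses the last line; 'a\' appends the text that follows.
# Every newline inside the appended text except the last is escaped.
appended=$(sed '$a\
Samuel Taylor Coleridge\
Poet' text)

# Show just the two appended lines.
tail2=$(printf '%s\n' "$appended" | tail -n 2)
printf '%s\n' "$tail2"
```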
3.12
1i\
A poem by Samuel Taylor Coleridge
produces:
A poem by Samuel Taylor Coleridge
In Xanadu did Kubla Khan             Line 1
A stately pleasure dome decree;      Line 2
A more sophisticated use of 'i' and 'a' is as follows. Suppose we have a list of login names and uids of the form:
cs_s010 5791
cs_s41 2343
cs_s405 1946
etc.
(obtained, say, from: awk -F: '{ printf("%s\t%d\n",$1,$3) }' /etc/passwd) in file data, then
sed -f dobox data
3.13
where dobox contains:
s/^/| /
s/ [0-9]/ |&/
s/$/ |/
1i\
|-----------|--------|
a\
|-----------|--------|
produces:
|-----------|--------|
| cs_s002 | 2050 |
|-----------|--------|
| cs_s016 | 2062 |
|-----------|--------|
Note the use of the '&' in the line
s/ [0-9]/ |&/
viz: '&' stands for the string matched by the pattern, so that
cs_s002 2050
becomes
cs_s002 | 2050
3.14
If we want only the changed lines to be made available we can write, e.g.
sed 's/to/by/w changes' text
Now Standard Output is, as expected,
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless by man
Down by a sunless sea.
and the file 'changes' contains the lines:
Through caverns measureless by man      Line 4
Down by a sunless sea.                  Line 5
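Writing only the changed lines to a file with the 'w' flag of the substitute command can be sketched as follows. The scratch directory, the file name 'stdout.txt' and the variable name 'changed' are assumptions for illustration.

```shell
cd "$(mktemp -d)" || exit 1

# Recreate the five-line sample file 'text'.
cat > text <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# Every line still goes to Standard Output (captured in stdout.txt);
# lines on which the substitution succeeded are also written to 'changes'.
sed 's/to/by/w changes' text > stdout.txt

changed=$(cat changes)
printf '%s\n' "$changed"
```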
3.15
sed 's/to/by/p' text
would duplicate the matching lines on Standard Output and we would get:
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless by man
Through caverns measureless by man
Down by a sunless sea.
Down by a sunless sea.
This is because we get automatic output of our lines to Standard Output. However, sometimes it is convenient to turn off automatic output (with the -n flag). In this case we only get lines explicitly requested with the 'p' command. Thus:
sed -n 's/to/by/p' text
produces:
Through caverns measureless by man
Down by a sunless sea.
Thus
sed -n '/pattern/p' text
is equivalent to
grep 'pattern' text
and both
sed -n '/pattern/!p' text
sed '/pattern/d' text
are equivalent to
grep -v 'pattern' text
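The equivalences with grep can be checked on the sample file. In this sketch the pattern 'an', the scratch directory and the variable names are assumptions chosen for illustration.

```shell
cd "$(mktemp -d)" || exit 1

# Recreate the five-line sample file 'text'.
cat > text <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# Lines containing the pattern.
sed_match=$(sed -n '/an/p' text)
grep_match=$(grep 'an' text)

# Lines NOT containing the pattern, three ways.
sed_notp=$(sed -n '/an/!p' text)
sed_del=$(sed '/an/d' text)
grep_rest=$(grep -v 'an' text)

printf '%s\n' "$sed_match"
printf '%s\n' "$sed_del"
```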
3.16
Let us write a shell script which outputs all files that are younger than file 'file'. Remembering that 'ls -lt' gives us the files in reverse chronological order we write script 'newer' as follows:
# Usage newer filename
if ($#argv != 1) then
    echo Usage: $0 filename
    exit 1
endif
ls -lt | sed '/'$1'/q'
exit 0
Notice that $1 is outside the quotes of the 'sed' command, so that the shell substitutes the file name before 'sed' sees it.
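The same idea can be sketched as a Bourne-shell function. This is a variant, not the script above: it uses 'ls -t' (names only) rather than 'ls -lt', anchors the pattern to the whole name, and does not print the reference file itself. The file names old, ref and new, their timestamps and the function body are assumptions for illustration, and the file name is assumed to contain no regular-expression special characters.

```shell
cd "$(mktemp -d)" || exit 1

# Three hypothetical files with known, distinct modification times.
touch -t 202001010000 old
touch -t 202001020000 ref
touch -t 202001030000 new

# Print the names of files newer than "$1".
# 'ls -t' lists newest first; with -n, 'p' prints each name until the
# reference name is reached, at which point 'q' quits without printing.
newer() {
    ls -t | sed -n -e "/^$1\$/q" -e p
}

result=$(newer ref)
echo "$result"
```

Here the file name sits inside double quotes, so the shell still substitutes $1 before 'sed' runs.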
The Grouping of Commands
Commands may be grouped using the curly braces thus:
{\
w temp\
q\
}
will quit after the matching line has been written to "temp".
3.17
w temp
'b' by itself causes a branch to the end of the program (bypassing the 'q' and the 'w temp') and then the next line is fetched. 'temp' will thus hold all lines NOT matching [IW], i.e.
A stately pleasure dome decree;        Line 2
Through caverns measureless to man     Line 4
Down to a sunless sea.                 Line 5
and Standard Output will contain all lines matching [IW], i.e.
In Xanadu did Kubla Khan               Line 1
Where Alph, the sacred river, ran      Line 3
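One script that gives the split described above is sketched below. It is an assumption, not a reconstruction of the original script: here the commands bypassed by 'b' are 'w temp' and 'd'. The scratch directory and variable names are likewise for illustration only.

```shell
cd "$(mktemp -d)" || exit 1

# Recreate the five-line sample file 'text'.
cat > text <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# Lines containing I or W branch to the end of the script and are
# printed automatically. All other lines are written to 'temp' and
# then deleted, so they never reach Standard Output.
matching=$(sed -e '/[IW]/b' -e 'w temp' -e 'd' text)
non_matching=$(cat temp)

printf '%s\n' "$matching"
```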
} w temp
:again
where here we are branching to the specified label.
3.18
/[IW]/
outputs:
In XAnadu did Kubla Khan      Line 1
In XAnAdu did Kubla Khan      Line 1
In XAnAdu did KublA Khan      Line 1
In XAnAdu did KublA KhAn      Line 1
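A loop built from a label and the 't' command produces output of this shape. The script below is a sketch consistent with the four lines shown, not necessarily the original; it is applied to the first line of the poem only, and the variable name 'steps' is an assumption.

```shell
# ':again' defines a label. 's/a/A/p' replaces the first remaining 'a'
# and prints the result. 't again' branches back to the label only if
# a substitution was made, so the loop ends when no 'a' is left.
steps=$(printf 'In Xanadu did Kubla Khan\n' |
        sed -n -e ':again' -e 's/a/A/p' -e 't again')

printf '%s\n' "$steps"
```

The line holds four lower-case 'a's, so the loop body succeeds four times and then falls through.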
The Next function
Copies the current input line to Standard Output and brings the next input line into the pattern space. Consider:
text
which produces the following output.
In Xanadu did Kubla Khan               Line 1
Where Alph, the sacred river, ran      Line 3
Down to a sunless sea.                 Line 5
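One way to obtain lines 1, 3 and 5 with the next function is sketched below. It is an assumption rather than the original command: automatic output is turned off with -n, each odd line is printed with 'p', and 'n' then fetches the following line, which is never printed.

```shell
cd "$(mktemp -d)" || exit 1

# Recreate the five-line sample file 'text'.
cat > text <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# 'p' prints the current line; 'n' replaces it with the next input
# line. With -n that replacement line is discarded at the end of the
# script, so only every other line is output.
odd=$(sed -n -e 'p' -e 'n' text)

printf '%s\n' "$odd"
```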
3.19
3.20
The regular expressions recognised by 'sed' are as follows:
1. An ordinary character (not one of those discussed below) is a regular expression, and matches that character.
2. A circumflex, '^', at the beginning of a regular expression matches the null character at the beginning of a line.
3. A dollar-sign, '$', at the end of a regular expression matches the null character at the end of a line.
4. The characters '\n' match an embedded newline character, but not the newline at the end of the pattern space.
5. A period, '.', matches any character except the terminating newline at the end of the pattern space.
6. A regular expression followed by the character '*' matches any number (including zero) of adjacent occurrences of the regular expression it follows.
7. A string of characters in square brackets, '[]', matches any character in the string, and no others. If, however, the first character is a circumflex, '^', the regular expression matches any character except the characters in the string and the terminal newline of the pattern space.
8. A concatenation of regular expressions is a regular expression which matches the concatenation of strings matched by the components of the regular expression.
To use one of the special characters (^ $ . * [ ] \ /) as itself, precede the special character by a backslash, '\'.
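Several of these rules can be tried on the sample file. The patterns, the scratch directory and the variable names below are assumptions chosen for illustration.

```shell
cd "$(mktemp -d)" || exit 1

# Recreate the five-line sample file 'text'.
cat > text <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree;
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# '^' anchors the match to the beginning of the line.
starts_d=$(sed -n '/^D/p' text)

# '$' anchors the match to the end of the line.
ends_semi=$(sed -n '/;$/p' text)

# A backslash makes '.' stand for itself.
literal_dot=$(sed -n '/sea\./p' text)

# A bracketed string matches any one of its characters.
k_or_x=$(sed -n '/[KX]a/p' text)

# '*' allows zero or more of the preceding character.
star=$(sed -n '/cre*d/p' text)

printf '%s\n' "$starts_d" "$ends_semi" "$literal_dot" "$k_or_x" "$star"
```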
3.22