Lec 05

Uploaded by

daasebreseth
awk, sed, tr, cut

Objectives

After studying this lesson, you should be able to use:
– awk: a pattern scanning and processing language
– sed: stream editor
– tr: translate characters
– cut: cut specific columns vertically
awk

• awk, a pattern scanning and processing language, helps produce reports that look professional.
• Named after its developers Aho, Weinberger, and Kernighan.
• Searches files for lines that match specified patterns and then performs the associated actions.
awk

awk [-Fsep] 'pattern {action}' filenames

• awk checks whether the input records in the specified files satisfy the pattern.
• If they do, awk executes the action associated with it.
• If no pattern is specified, the action affects every input record.
awk

awk [-Fsep] 'pattern {action}' filenames

• The -Fsep option lets you specify the field separator. By default this is set to whitespace (SPACE and TAB). -F: means the field separator is a colon.

• A common use of awk is to process input files by formatting them and then outputting the results in the chosen form.
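The three ways of combining a pattern and an action can be sketched as follows. The sample data (a name and a shell separated by a colon) and the variable names are assumptions made up for this illustration.

```shell
# Hypothetical colon-separated records: name:shell
users='alice:/bin/bash
bob:/bin/sh
carol:/bin/bash'

# Pattern only: records that satisfy the pattern are printed unchanged
bash_users=$(printf '%s\n' "$users" | awk -F: '$2 == "/bin/bash"')

# Action only: the action runs for every input record
all_names=$(printf '%s\n' "$users" | awk -F: '{ print $1 }')

# Pattern and action: the action runs only for matching records
bash_names=$(printf '%s\n' "$users" | awk -F: '$2 == "/bin/bash" { print $1 }')

printf '%s\n' "$bash_names"
```

Here the data is piped in on standard input; naming a file after the quoted program works the same way.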
Different Way to Run awk

• awk -f awkFile inputFile
– Since awk itself can be a complex language, you can store all the commands in a file and run it with the -f flag.
– We will not cover it in this lecture.
Important awk Concepts

• Record
– Every line of an input file is a record.
• The current record can be referenced with $0.
• awk operates on one record at a time.
• Field
– A record consists of fields, which by default are
separated by any number of SPACES or TABS.
– Each field is numbered and can be referred to by its number:
• $1 is the first field, $2 is the second, etc.
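A small sketch of records and fields, using made-up input with mixed spaces and tabs (the variable names are illustrative only):

```shell
# Two records; fields are separated by any run of spaces or tabs
swapped=$(printf 'alpha  beta\tgamma\none two\n' | awk '{ print $2, $1 }')

# $0 is the whole record, exactly as it was read
whole=$(printf 'alpha  beta\tgamma\none two\n' | awk '{ print $0 }')

printf '%s\n' "$swapped"
```

The first command prints the second field followed by the first for each record, so the first record becomes "beta alpha" regardless of how much whitespace separated the fields.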
awk Example

• A sample data file named countries:

Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia

• Fields: country name, area (thousands of square miles), population (millions), continent
awk Example

We could use awk to format it:

awk -F: '{ printf "%-10s\t%d\t%d\t%15s\n",$1,$2,$3,$4 }' countries
Output:
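A sketch that regenerates this report, assuming the countries file is recreated in a temporary directory (tmpdir and report are illustrative names, not part of the lecture):

```shell
tmpdir=$(mktemp -d)

# Recreate the sample data file from the previous slide
cat > "$tmpdir/countries" <<'EOF'
Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia
EOF

# Same command as on the slide; one formatted line per record
report=$(awk -F: '{ printf "%-10s\t%d\t%d\t%15s\n",$1,$2,$3,$4 }' "$tmpdir/countries")
printf '%s\n' "$report"

rm -r "$tmpdir"
```

Each line starts with the country name padded to 10 characters, followed by tab-separated area and population, and ends with the continent right-justified in a 15-character column.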
Some Built-in Variables

• NF - Number of fields in current record
• $NF - Last field of current record
• NR - Number of records processed so far
• FILENAME - Name of current input file
• FS - Field separator (default: SPACE or TAB)
• $0 - Entire line
• $1, $2, …, $n - Field 1, 2, …, n
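A minimal sketch of NR, NF, and $NF on made-up input (the variable name summary is illustrative):

```shell
# For each record print: record number, field count, last field
summary=$(printf 'a b c\nd e\n' | awk '{ print NR, NF, $NF }')
printf '%s\n' "$summary"
# 1 3 c
# 2 2 e
```

Because NF holds the number of fields, $NF always refers to the last field, however many fields the record has.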
PRINTF - formatting output

• The awk version of printf is similar to that of the C language.
printf "control-string", arg1, arg2, ... , argn
• The control-string determines how printf will format arg1 - argn. Within the control-string, you can use "\n" to indicate a NEWLINE and "\t" to indicate a TAB.
• The control-string contains conversion
specifications, one for each argument.
PRINTF - formatting output

A conversion specification has the following format:

%[-][x[.y]]conv

-     causes printf to left-justify the argument.
x     is the minimum field width.
.y    is the number of places to the right of a decimal point in a number.
conv  is a letter from the following list:

d   decimal                  o   unsigned octal
e   exponential notation     s   string of characters
f   floating point number    x   unsigned hexadecimal
g   use f or e, whichever is shorter
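The conversion specifications can be tried out without an input file. The sketch below assumes a BEGIN block, which is standard awk but is not covered in this lecture; it runs its action once before any input is read. The values being formatted are made up.

```shell
# %-6s  left-justified string, width 6
# %6s   right-justified string, width 6
# %5.2f floating point, width 5, 2 decimal places
# %d    decimal, %x hexadecimal, %o octal
formatted=$(awk 'BEGIN { printf "[%-6s][%6s][%5.2f][%d][%x][%o]\n", "ab", "ab", 3.14159, 42, 255, 8 }')
printf '%s\n' "$formatted"
# [ab    ][    ab][ 3.14][42][ff][10]
```

The square brackets are only there to make the padding visible.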
awk Example Revisited
awk -F: '{ printf "%-10s\t%d\t%d\t%15s\n",$1,$2,$3,$4 }' countries
• The -F: option instructs awk to separate the input data into fields delimited by colons.
• No pattern was specified, so awk performed the printf action for every line of the file.
• $1,$2,$3,$4 means printing four fields. Each field has a conversion specification, e.g.,
– %-10s\t indicates that field $1 is to appear as a string. The - flag specifies that the string is to be left-justified. The minimum field width is 10. \t indicates a TAB.
Selecting Records

• awk opens a file and reads it serially, one line at a time.
• By specifying a pattern, we can select only those lines that contain a certain string of characters.
• A string of characters placed between forward slashes (/.../) is treated as a regular expression. Any line in which that pattern occurs is selected.
• awk '/Europe/' countries
– displays all countries that are in Europe.
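A sketch of regular-expression selection on the countries data. The file is recreated in a temporary directory, and tmpdir, europe, and starts_with_c are illustrative names.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/countries" <<'EOF'
Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia
EOF

# Select every line containing the string Europe
europe=$(awk '/Europe/' "$tmpdir/countries")

# ^ anchors the match to the start of the line: names beginning with C
starts_with_c=$(awk '/^C/' "$tmpdir/countries")

printf '%s\n' "$europe"
rm -r "$tmpdir"
```

With no action given, awk prints each selected record as is, so the first command prints the England and France lines.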
Selecting Records (Contd)

• If you want to select records on the basis of data in a particular field, you can use a comparison operator such as the double equals sign (==).
• awk -F: '$3 == 55' countries
– The third field (which tells us each country's population) is tested against the value 55, and one record is selected.
• Comparison operators are:
==  equal to                    !=  not equal to
>   greater than                <   less than
>=  greater than or equal to    <=  less than or equal to
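A sketch of field comparisons on the countries data, recreated here in a temporary directory (tmpdir, pop55, and large are illustrative names):

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/countries" <<'EOF'
Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia
EOF

# Records whose third field equals 55
pop55=$(awk -F: '$3 == 55' "$tmpdir/countries")

# Names of countries whose area (second field) is greater than 3000
large=$(awk -F: '$2 > 3000 { print $1 }' "$tmpdir/countries")

printf '%s\n' "$pop55"
printf '%s\n' "$large"
rm -r "$tmpdir"
```

The first command selects only the France record; the second prints Canada, USA, Brazil, and China.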
Using Logic Operators

• We can use logical operators to combine several conditions.
– To select records that satisfy more than one condition, awk uses the symbol && as the and operator.
– || indicates the or operator.
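A sketch of && and || on the countries data from the earlier slides, recreated in a temporary directory (tmpdir, populous_asia, and old_world are illustrative names):

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/countries" <<'EOF'
Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia
EOF

# and: in Asia AND population above 500 million
populous_asia=$(awk -F: '$4 == "Asia" && $3 > 500 { print $1 }' "$tmpdir/countries")

# or: in Europe OR in Asia
old_world=$(awk -F: '$4 == "Europe" || $4 == "Asia" { print $1 }' "$tmpdir/countries")

printf '%s\n' "$populous_asia"
rm -r "$tmpdir"
```

The first command prints China and India; the second prints England, France, Japan, China, and India.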
Questions

Sample file named cars:

• Q1: How do you select all the cars that were made in or after 1991 (column 3) and cost less than $6,250 (column 4)?
• Q2: How do you select cars made by either ford or buick?
Data processing & Arithmetic

Sample file named wages:

The three field titles are:
Employee, rates of pay per hour, weekly hours
Data processing & Arithmetic (Contd)
• If the tax is 25%, we can calculate and
display each employee’s GROSS pay and
TAX like this:
awk '{ printf "%-10s\t%.2f \t%d\t%.2f \t%.2f\n",
$1,$2,$3,$2*$3,$2*$3*0.25 }' wages
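A sketch of this calculation. The two wage records below are hypothetical stand-ins in the same three-field layout (employee, hourly rate, weekly hours), and payroll is an illustrative variable name.

```shell
# Hypothetical wages data: employee, rate of pay per hour, weekly hours
wages='Brooks 10.00 35
Everest 8.50 40'

# Gross pay is rate * hours; tax is 25% of gross pay
payroll=$(printf '%s\n' "$wages" |
  awk '{ printf "%-10s\t%.2f \t%d\t%.2f \t%.2f\n", $1,$2,$3,$2*$3,$2*$3*0.25 }')

printf '%s\n' "$payroll"
```

For the first record the gross pay is 10.00 * 35 = 350.00 and the tax is 87.50; for the second it is 8.50 * 40 = 340.00 and 85.00.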
Questions

What is the output of the following commands?
• Q3: awk -F: '{ print $1 }' /etc/passwd | sort
• Q4: awk -F: '{ print "username: " $1 "\t\tuid:" $3 }' /etc/passwd
sed

• sed stands for stream editor; it works as a filter, processing input line by line.
• sed is a non-interactive editor used to make global changes to entire files at once.
• An interactive editor like vi would be too cumbersome for replacing large amounts of information at once.
• The sed command is primarily used to substitute one pattern for another.
sed

• Syntax:
sed 'command' file(s)
sed -e 'command' -e 'command' … file(s)
sed -f scriptfile file(s)
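The three invocation forms can be sketched as follows. The file names fruit and script.sed and their contents are made up for this illustration.

```shell
tmpdir=$(mktemp -d)
printf 'red apple\ngreen pear\n' > "$tmpdir/fruit"

# Hypothetical script file: one sed command per line
printf 's/red/RED/\ns/pear/PEAR/\n' > "$tmpdir/script.sed"

# A single command
one=$(sed 's/red/RED/' "$tmpdir/fruit")

# Several commands, each introduced by -e, applied in order to every line
two=$(sed -e 's/red/RED/' -e 's/pear/PEAR/' "$tmpdir/fruit")

# The same commands read from a script file
three=$(sed -f "$tmpdir/script.sed" "$tmpdir/fruit")

printf '%s\n' "$two"
rm -r "$tmpdir"
```

The -e and -f forms produce identical output here; the input file itself is not modified, since sed writes its result to standard output.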
Useful sed Commands

Command   Example       Explanation
d         4,8d          Delete the 4th through 8th lines
s         s/old/new/    Replace old with new
Patterns Revisited
^             beginning of the line
$             end of the line
.             any single character
(character)*  arbitrarily many occurrences of (character)
(character)?  0 or 1 instance of (character); this is an extended-regex operator, so in sed it needs the -E option (GNU sed also accepts \? in basic mode)
[abcdef]      match any character enclosed in [ ] (in this instance, a b c d e or f)
[^abcdef]     match any character NOT enclosed in [ ] (in this instance, any character other than a b c d e or f)
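A sketch of these patterns in use. It assumes sed's -n option, which suppresses the default output, together with the p command so that only matching lines are printed; the word list and variable names are made up.

```shell
words='cat
cot
coat
ct
cart'

# o* matches zero or more o characters between c and t
star=$(printf '%s\n' "$words" | sed -n '/^co*t$/p')

# . matches exactly one character of any kind
dot=$(printf '%s\n' "$words" | sed -n '/^c.t$/p')

# [^a] matches one character that is not a
negated=$(printf '%s\n' "$words" | sed -n '/^c[^a]t$/p')

printf '%s\n' "$star"
```

With this input, the star pattern selects cot and ct, the dot pattern selects cat and cot, and the negated class selects only cot.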
sed Substitute
• SUBSTITUTE (s)
[address1[,address2]]s/pattern/replacement/[flags]

Without a flag, only the first instance of pattern on each line is replaced.

Flags:
n       (a number) replace only the nth instance of pattern on each line with replacement
g       replace all instances of pattern with replacement
p       write the line to STDOUT if a successful substitution takes place
w file  write the line to file if a successful substitution takes place
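A sketch of the substitution flags on a made-up line. The -n option used in the last command suppresses sed's default output, so only lines written by the p flag appear.

```shell
line='one two one two one'

# No flag: only the first instance on the line is replaced
first=$(printf '%s\n' "$line" | sed 's/one/1/')      # 1 two one two one

# Numeric flag: only the 2nd instance is replaced
second=$(printf '%s\n' "$line" | sed 's/one/1/2')    # one two 1 two one

# g flag: every instance is replaced
all=$(printf '%s\n' "$line" | sed 's/one/1/g')       # 1 two 1 two 1

# p flag with -n: print only lines where a substitution happened
printed=$(printf 'keep one\nskip\n' | sed -n 's/one/1/p')   # keep 1

printf '%s\n' "$first" "$second" "$all" "$printed"
```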
sed Substitute (Contd)
[address1[,address2]]s/pattern/replacement/[flags]

• An address can be
– a regular expression enclosed by forward slashes /regex/, or
– a line number.
• The $ symbol can be used to denote the last line.
• If one address is given, then the substitution is applied only to lines that match that address.
• If two addresses are given separated by a comma, then the substitution is applied to every line from the first line that matches address1 through the next line that matches address2, inclusive.
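A sketch of the address forms on five made-up lines (the variable names are illustrative):

```shell
text='a1
a2
a3
a4
a5'

# One line number: only line 2 is changed
by_line=$(printf '%s\n' "$text" | sed '2s/a/X/')

# Two line numbers: lines 2 through 4
by_range=$(printf '%s\n' "$text" | sed '2,4s/a/X/')

# $ as the second address: line 4 through the last line
to_end=$(printf '%s\n' "$text" | sed '4,$s/a/X/')

# One regular expression: only lines containing 3
by_regex=$(printf '%s\n' "$text" | sed '/3/s/a/X/')

# Two regular expressions: from the line matching 2 through the line matching 4
regex_range=$(printf '%s\n' "$text" | sed '/2/,/4/s/a/X/')

printf '%s\n' "$by_range"
```

Here the numeric range 2,4 and the regex range /2/,/4/ happen to select the same lines, so both produce a1, X2, X3, X4, a5.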
Questions
• Q5: What does the following command do?
sed 's/Tx/Texas/' foo

• Q6: What is the output of the following command?
cat animal
I have three dogs and two cats.
sed -e 's/dog/cat/g' -e 's/cat/elephant/g' animal
Questions
• Q7: What is the output of the following command?

cat animal1
The black cat was chased by the brown dog.
The black cat was not chased by the brown dog.
sed -e '/not/s/black/white/g' animal1
sed Delete

• DELETE (d)
[address1[,address2]]d
• sed 6d foo
– deletes line 6
• Q8: How do you delete lines 1-10 from the file foo?
• Q9: How do you delete line 11 through the end of the file foo?
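A sketch of the d command with different kinds of address, using four made-up lines rather than the file foo (the variable names are illustrative):

```shell
lines='a
b
c
d'

# Line number address: delete line 2
second_gone=$(printf '%s\n' "$lines" | sed '2d')

# $ address: delete the last line
last_gone=$(printf '%s\n' "$lines" | sed '$d')

# Regular-expression range: delete from the line matching b through the line matching c
range_gone=$(printf '%s\n' "$lines" | sed '/b/,/c/d')

printf '%s\n' "$range_gone"
```

The last command leaves only the lines a and d.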
Questions
• Q10: What does the following command do?
sed '/^Co*t/,/[0-9]$/d' foo
• Q11: What is the output of the following
command?
cat linefile
line 1 (one)
line 2 (two)
line 3 (three)
sed -e '/^line.*one/s/line/LINE/' -e '/line/d' linefile
Questions

• Q12: How do you delete every line in the file log that contains the string warning?

• sed can also delete just a string rather than the entire line, by substituting the text with nothing.
• Q13: How do you remove the string draft everywhere it occurs in the file foo?
tr

• tr translates characters from stdin to stdout.
• tr [options] string1 [string2]
Options:
-c  complement the set of characters specified by string1. The complement is the set of all characters not in string1
-d  delete all occurrences of input characters specified by string1
-s  replace instances of repeated characters with a single character
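A sketch of plain translation and of each option on made-up input (the variable names are illustrative):

```shell
# Translation: each character in string1 maps to the character
# at the same position in string2
upper=$(printf 'hello world\n' | tr '[a-z]' '[A-Z]')          # HELLO WORLD

# -s squeezes each run of repeated spaces into a single space
squeezed=$(printf 'too    many   spaces\n' | tr -s ' ')       # too many spaces

# -d deletes every digit
no_digits=$(printf 'room 101\n' | tr -d '0-9')

# -c with -d deletes everything that is NOT a digit,
# including the space and the trailing newline
only_digits=$(printf 'room 101\n' | tr -cd '0-9')             # 101

printf '%s\n' "$upper" "$squeezed" "$no_digits" "$only_digits"
```

Note that tr reads only standard input, so a file has to be supplied with < or through a pipe.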
tr Examples

• tr '[a-z]' '[A-Z]' < trfile
– prints the contents of trfile with all lowercase characters replaced by uppercase
• tr ' ' '\012' < trfile
– turns spaces into newlines (octal ASCII code 012)
• Q14: How do you translate only lowercase a through m to uppercase A through M in a file?
Questions

• tr -d string1 lets you delete any character matched in string1.
• tr -d '[a-z]'
– deletes all lowercase characters
• Q15: How do you delete all vowels?
• Q16: How do you delete all characters except vowels?
cut

• cut - cut out selected fields of each line of a file
• The cut command has a very narrow set of capabilities, but when you're extracting specific columns of information, it's a winner.
cut Examples
• cut -d: -f1 /etc/passwd
– Extract usernames from /etc/passwd
– -d option is used to specify : as the field separator,
default is a TAB
– -f option specifies the first field
– -d may only be used with -f
• Q17: What is the output of the following
command?
who am i | cut -f1 -d' '
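A sketch of field selection with cut on made-up records laid out like simplified /etc/passwd lines (the data and variable names are assumptions for illustration):

```shell
# Hypothetical colon-separated records: name:password:uid:home
records='alice:x:1001:/home/alice
bob:x:1002:/home/bob'

# A single field
names=$(printf '%s\n' "$records" | cut -d: -f1)

# A list of fields; the delimiter is kept between them in the output
name_home=$(printf '%s\n' "$records" | cut -d: -f1,4)

# An open-ended range: field 3 through the last field
from_uid=$(printf '%s\n' "$records" | cut -d: -f3-)

printf '%s\n' "$name_home"
```

The second command prints alice:/home/alice and bob:/home/bob.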
Questions
• cat cutfile
Line number 1
Line number 2
Line number 3
Line number 4
• Q18: How can you get the following output
using cut?
1
2
3
4
Lecture Summary

• awk: a pattern scanning and processing language
• sed: stream editor
• tr: translate one character to another
• cut: cut specific columns vertically
Next Lecture

• Shell programming (Chap 17)
• Quiz #2
