ICB Lecturenote 7

Chapter 7 focuses on working with text and data files in Linux/Unix systems, covering commands for viewing, sorting, and editing files. Key tools discussed include 'cat' for file manipulation, 'sort' for organizing data, and 'uniq' for filtering duplicates. By the end of the chapter, readers will be proficient in handling various file operations efficiently.

Uploaded by

Papun Kumar Sahoo

Chapter 7

Playing with Text and Data Files

Working with text and data files is an essential part of using Linux/Unix
systems.

These systems store most of their configuration, logs, and data in text files.
Knowing how to view, analyze, edit, and manipulate such files quickly
can save time and improve efficiency.

In this chapter, we will learn:

• How to view and analyze text files in different ways.
• Commands to sort, search, split, and compare file contents.
• Basic and advanced editing tools like Pico and Vim.

By the end of this chapter, you will be able to:

• Open and read files in various formats.
• Filter and organize data.
• Create and edit files efficiently.

7.1.1 A Quick Start: cat

Definition:
cat (short for concatenate) is a Linux command used to view the contents
of files, combine multiple files, and even create new files.
Common Uses:
1. View the content of a file
cat filename.txt

Displays the full content of filename.txt on the terminal.

2. Combine and display multiple files
cat file1.txt file2.txt

Shows the contents of file1.txt followed by file2.txt.

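3. Create a new file
cat > newfile.txt

• Type your content.

• Press CTRL+D on an empty line to end input; the text is saved to newfile.txt.

Note that > overwrites newfile.txt if it already exists.

4. Append text to an existing file
cat >> existing.txt

• Type new content.

• Press CTRL+D on an empty line to end input; the new text is added to the end of existing.txt without overwriting it.
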
Example:
cat fruits.txt

Output:
Apple
Banana
Mango
Orange
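
The four uses above can also be tried without typing at the keyboard. The following is a minimal sketch rather than part of the original notes: it works in a temporary directory, uses printf in place of typed input, and the file names (file1.txt, file2.txt, combined.txt) and the extra line "Papaya" are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create two small files; printf stands in for typing text and pressing CTRL+D.
printf 'Apple\nBanana\n' > file1.txt
printf 'Mango\nOrange\n' > file2.txt

# Concatenate both files into a third one (> overwrites combined.txt if it exists).
cat file1.txt file2.txt > combined.txt

# Append one more line without overwriting (>> adds to the end).
printf 'Papaya\n' >> combined.txt

# View the result.
cat combined.txt

# Keep the contents in a variable for later checks.
combined=$(cat combined.txt)
```

Redirecting the output of cat into a file is the same mechanism as cat > newfile.txt; the only difference is that the input comes from files instead of the keyboard.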

7.1.2 Text Sorting

Definition:
sort is a Linux command used to arrange the lines of a text file in alphabetical or
numerical order. It can also reverse the order or sort based on specific fields.
Common Uses:
1. Sort alphabetically
sort filename.txt

Arranges lines in ascending (A–Z) order.

2. Sort in reverse order

sort -r filename.txt

Arranges lines in descending (Z–A) order.

3. Sort numerically

sort -n numbers.txt

Sorts lines as numbers instead of text.

4. Sort by a specific column (useful for tables)

sort -k2 data.txt

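Sorts on a key that starts at the second whitespace-separated column and runs to the end of the line. Use sort -k2,2 data.txt to compare the second column only.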

Example:
cat fruits.txt
Mango
Apple
Orange
Banana

Command:
sort fruits.txt

Output:
Apple
Banana
Mango
Orange

Example 2 – Sorting and Removing Duplicates

File: names.txt
Rahul
Anita
Rahul
Suman
Anita
Command:

sort -u names.txt

Output:

Anita
Rahul
Suman

You can combine options, e.g., sort -nr for numeric sorting in reverse order.
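
The difference between text and numeric sorting on a column is easiest to see side by side. This is a small sketch under stated assumptions: scores.txt and its contents are hypothetical, and LC_ALL=C is set only so that the text comparison behaves the same on every system.

```shell
#!/bin/sh
# Use the C locale so text comparison is plain byte order (assumption for reproducibility).
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-column table: name, then score.
printf 'Rahul 9\nAnita 100\nSuman 25\nMeera 3\n' > scores.txt

# Key on column 2 only, compared as text: "100" sorts before "25".
text_order=$(sort -k2,2 scores.txt)

# Adding n to the key compares column 2 as a number.
numeric_order=$(sort -k2,2n scores.txt)

# Adding r as well reverses it: highest score first.
reverse_order=$(sort -k2,2nr scores.txt)

printf '%s\n\n%s\n\n%s\n' "$text_order" "$numeric_order" "$reverse_order"
```

Writing the key as -k2,2 rather than -k2 keeps the comparison confined to the second column, which matters once a table has three or more columns.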

7.1.3 Extract Unique Lines

Definition:
The uniq command in Linux is used to filter out repeated lines from a file.
However, it only removes consecutive duplicates, so files should be sorted first for
best results.

Common Uses:
1. Remove consecutive duplicates

uniq filename.txt

Displays the file content with consecutive duplicates removed.

2. Count occurrences of each line

uniq -c filename.txt

Prefixes each line with the number of consecutive times it appears.

3. Show only duplicate lines

uniq -d filename.txt

Displays one copy of each line that is repeated consecutively.

4. Show only unique lines

uniq -u filename.txt

Displays only the lines that are not repeated consecutively.

Example – Removing Duplicates
File: names.txt
Anita
Anita
Rahul
Rahul
Suman

Command:
uniq names.txt

Output:
Anita
Rahul
Suman

Example – Counting Occurrences

uniq -c names.txt

Output:
2 Anita
2 Rahul
1 Suman

For accurate results with all duplicates removed, combine sort with uniq:
sort names.txt | uniq
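
Why sorting matters can be checked with the unsorted names.txt from Example 2 in section 7.1.2. The sketch below is illustrative only; it recreates that file in a temporary directory and compares uniq with and without a preceding sort.

```shell
#!/bin/sh
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Unsorted names, as in Example 2 of section 7.1.2.
printf 'Rahul\nAnita\nRahul\nSuman\nAnita\n' > names.txt

# Without sorting, no two equal lines are adjacent, so uniq removes nothing.
unsorted=$(uniq names.txt)

# Sorting first groups equal lines, so uniq can collapse them.
deduped=$(sort names.txt | uniq)

# -d keeps one copy of each repeated line; -u keeps lines that occur once.
repeated=$(sort names.txt | uniq -d)
single=$(sort names.txt | uniq -u)

# Counts per name, printed to the terminal.
sort names.txt | uniq -c
```

Here unsorted still contains all five lines, while deduped contains each name once.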

Search Commands in Vim
• /regex → Searches forward for regex.
Example: /apple → moves the cursor to the next "apple" after the cursor position.

• ?regex → Searches backward for regex.

Example: ?apple → searches upward for "apple".

• n → Jumps to the next match in the same direction.

Example: after /apple, press n to go to the next "apple".

• Shift + n (N) → Jumps to the previous match, that is, repeats the search in the opposite direction.

Substitute Commands
1. Current line only
• :s/regex/xyz/
→ Replaces first occurrence of regex in the current line.
Example: :s/apple/orange/

o Line: apple is red → becomes → orange is red.

• :s/regex/xyz/g
→ Replaces all occurrences of regex in the current line.
Example: :s/apple/orange/g

o Line: apple apple pie → becomes → orange orange pie.

• :s/regex/xyz/c
→ Asks for confirmation before each replacement in the line.
Example: :s/apple/orange/c

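o Vim will ask replace with orange (y/n/a/q/l/^E/^Y)? for each match. Answer y to replace, n to skip, a to replace all remaining matches, q to quit, or l to replace this match and stop.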

2. Whole file
• :%s/regex/xyz/g
→ Replaces all occurrences of regex in the whole file.
Example: :%s/apple/orange/g

o File becomes:

  orange is red
  orange is sweet
  banana is yellow
  orange pie is tasty

• :%s/regex/xyz/gc
→ Same as above, but asks for confirmation before replacing each one.

3. Between specific lines

• :x,ys/regex/xyz/g
→ Replace between line x and line y.
Example: :2,3s/apple/orange/g

o Only lines 2 and 3 will be checked, so the file becomes:

  apple is red
  orange is sweet
  banana is yellow
  apple pie is tasty
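
The same s/regex/replacement/flags syntax is also understood by the stream editor sed, which makes it possible to try these substitutions from the shell. This is a sketch that uses sed as a stand-in, not Vim itself: sed is not covered in these notes, the four-line sample file is inferred from the example output above, the file name fruit_notes.txt is made up, and sed has no c (confirm) flag.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample file inferred from the substitution examples (assumption).
printf 'apple is red\napple is sweet\nbanana is yellow\napple pie is tasty\n' > fruit_notes.txt

# Counterpart of :%s/apple/orange/g (every line, every match).
whole_file=$(sed 's/apple/orange/g' fruit_notes.txt)

# Counterpart of :2,3s/apple/orange/g (only lines 2 and 3).
lines_2_3=$(sed '2,3s/apple/orange/g' fruit_notes.txt)

printf '%s\n\n%s\n' "$whole_file" "$lines_2_3"
```

Unlike Vim, sed prints the result and leaves fruit_notes.txt unchanged, so it is a safe way to preview what a substitution would do.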

Finding Matching Lines of Text using grep, egrep

1. Using grep
grep is used for text matching with basic regular expressions.
Example 1: Find lines containing "human"
grep "human" genomes.txt
Output:
H. sapiens (human) - 3,400,000,000 bp - 30.000 genes
Example 2: Find lines containing "human", with line numbers
grep -n "human" genomes.txt
Output:
1:H. sapiens (human) - 3,400,000,000 bp - 30.000 genes
2. Using egrep (or grep -E)
egrep allows extended regex patterns (like |). It is equivalent to grep -E, which is the preferred form because egrep is deprecated in current GNU grep.

Example 1: Match multiple patterns (|)

grep -E 'bacteria|human' genomes.txt
Output:
H. sapiens (human) - 3,400,000,000 bp - 30.000 genes
E. coli (bacteria) - 4,670,000 bp - 3237 genes
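
These searches can be reproduced on a reduced copy of the file. The sketch below assumes a genomes.txt that contains only the two lines quoted in the examples (the full file is not shown in these notes), and it adds the -c option, which is not discussed above, to count matching lines.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Reduced stand-in for genomes.txt: only the two lines quoted in the text.
printf '%s\n' \
  'H. sapiens (human) - 3,400,000,000 bp - 30.000 genes' \
  'E. coli (bacteria) - 4,670,000 bp - 3237 genes' > genomes.txt

# -n prefixes each match with its line number and a colon.
numbered=$(grep -n 'human' genomes.txt)

# -E enables extended regular expressions, so | means "either pattern".
either=$(grep -E 'bacteria|human' genomes.txt)

# -c prints the number of matching lines instead of the lines themselves.
count=$(grep -c -E 'bacteria|human' genomes.txt)

printf '%s\n%s\n%s\n' "$numbered" "$either" "$count"
```

Matches are always printed in the order they occur in the file, not in the order the patterns are written.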

Text File Comparisons using diff command

Suppose we have two files:

file1.txt:
apple
banana
grapes
mango

file2.txt:
apple
banana
orange
mango

1. Basic Comparison using "diff"

diff file1.txt file2.txt
Output:
3c3
< grapes
---
> orange
Meaning:
• Line 3 changed (c)

• In file1.txt it was grapes

• In file2.txt it is orange
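
The comparison above can be reproduced as follows. This is a minimal sketch that recreates the two files in a temporary directory; it also records the exit status of diff, which is 0 when the files are identical and 1 when they differ.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two example files.
printf 'apple\nbanana\ngrapes\nmango\n' > file1.txt
printf 'apple\nbanana\norange\nmango\n' > file2.txt

# Capture the report; diff returns a non-zero status when the files differ.
status=0
report=$(diff file1.txt file2.txt) || status=$?

printf '%s\n' "$report"
echo "exit status: $status"
```

The exit status makes diff usable in scripts: a status of 0 means nothing changed, so no report needs to be read.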

2. less command
• It’s faster and safer than opening large files in editors like nano or vi.
• It doesn’t modify files — read-only view.
• Allows scrolling, searching, and navigation easily.
Ex: less filename

Key / Command    Action

/word            Search forward for a word
?word            Search backward for a word
n                Repeat the last search in the same direction
N                Repeat the last search in the opposite direction
g                Go to the beginning of the file
G                Go to the end of the file
q                Quit less
• Open multiple files:
less file1.txt file2.txt
Then use:
• :n → Next file

• :p → Previous file

Search for a word:

Inside less, type / followed by the word, then press Enter.
Ex: /word

3. Counting Characters, words and Lines

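Ex: wc genomes.txt

wc prints the line count, the word count, and the character (byte) count, followed by the file name. For genomes.txt this gives 6 lines, 42 words and 246 characters.

The character count also includes invisible characters such as line breaks.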
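
The individual counts can be requested one at a time with -l, -w and -c. The sketch below is an illustration on the four-line fruits.txt from section 7.1.1, not on genomes.txt; reading the file through < keeps the file name out of the output.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The four-line fruits.txt from section 7.1.1.
printf 'Apple\nBanana\nMango\nOrange\n' > fruits.txt

# -l counts lines, -w counts words, -c counts bytes.
lines=$(wc -l < fruits.txt)
words=$(wc -w < fruits.txt)
bytes=$(wc -c < fruits.txt)

echo "lines: $lines, words: $words, bytes: $bytes"
```

The file has 22 visible letters, yet the byte count is 26, because each of the four line breaks is counted as one character.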

4. Splitting Files into pieces
The split command is used to split a file into a series of smaller files.
split [options] filename [prefix]
filename → the file you want to split
prefix → (optional) name prefix for output files (default: x)

Ex: split -l 2 genomes.txt genomes.

Explanation

split          The command to split files
-l 2           Split into chunks of 2 lines per file
genomes.txt    The input file to be split
genomes.       The prefix for output files

The output files are named by adding a two-letter suffix to the prefix: genomes.aa, genomes.ab, genomes.ac, and so on.
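
The effect of the split example can be checked on a small input. This is a sketch under stated assumptions: the six-line genomes.txt below uses placeholder contents rather than the real file, and rejoined.txt is a made-up name used to show that the pieces can be put back together with cat.

```shell
#!/bin/sh
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical six-line input; the contents are placeholders, not the real genomes.txt.
printf 'line1\nline2\nline3\nline4\nline5\nline6\n' > genomes.txt

# Two lines per piece, output names start with "genomes.".
split -l 2 genomes.txt genomes.

# The pieces get two-letter suffixes: genomes.aa, genomes.ab, genomes.ac.
pieces=$(ls genomes.a?)
second=$(cat genomes.ab)

# Concatenating the pieces in order reproduces the original file.
cat genomes.a? > rejoined.txt
cmp genomes.txt rejoined.txt && echo "pieces rejoin to the original"

printf '%s\n' "$pieces"
```

Because the suffixes sort alphabetically, a shell pattern such as genomes.a? lists the pieces in the order in which they were cut.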
