[go: up one dir, main page]

0% found this document useful (0 votes)
58 views29 pages

DSA - Unit VI File Organization

Uploaded by

bsjadhav.foe
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
58 views29 pages

DSA - Unit VI File Organization

Uploaded by

bsjadhav.foe
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 29

DSA Unit VI File Organization

File Management
A file management system is used for file maintenance (or management) operations. It
is a type of software that manages data files in a computer system.

File management, in its simplest form, refers to the process of organizing and
controlling data files stored on the cloud, or a device, such as a computer or
smartphone. Virtual filing systems involve the way files are named, stored, retrieved,
and organized.

File Handling
File Handling is the storing of data in a file using a program. In C programming
language, the programs store results, and other data of the program to a file using file
handling in C. Also, we can extract/fetch data from a file to work with it in the
program.
File handling in C enables us to create, update, read, and delete the files stored on the
local file system through our C program. The following operations can be performed
on a file.
o Creation of the new file
o Opening an existing file
o Reading from the file
o Writing to the file
o Deleting the file
o Moving to a specific location in a file
o Closing a file

Why file handling in C


There are occasions when a program‟s output, after it has been compiled and run, does not
meet our objectives. In these circumstances, the user would want to verify the program‟s
output several times. It takes a lot of work for any programmer to compile and run the same
program repeatedly. This is the precise situation where file handling is helpful.
 Reusability: The process of File handling helps the user by allowing him to preserve all
the information or the data that is generated after he runs the designed program.
 Saves Time: In some cases, the programs need a large amount of user input. Here, file
handling allows the user to access part of the code using individual commands easily.
 Commendable storing capacity: While storing data in the files, the user can leave behind
the stress of storing information in bulk in the program.
 Portability: The availability of contents in any file is easily transferable to another
without any hassle and data loss in the system. Hence, saving a lot of time and effort along
with minimizing the risk of flawed coding.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 1


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

What is File?
A file a container in a computer system that stores data, information, settings, or commands,
which are used with a computer program. In graphical user interface (GUI), such as
Microsoft operating systems, represent the files as icons, which associate to the program that
opens the file. For instance, the picture is shown as an icon;

Functions for file handling


There are many functions in the C library to open, read, write, search and close the file. A list
of file functions are given below:

No. Function Description

1 fopen() opens new or existing file

2 fprintf() write data into the file

3 fscanf() reads data from the file

4 fputc() writes a character into the file

5 fgetc() reads a character from file

6 fclose() closes the file

7 fseek() sets the file pointer to given position

8 fputw() writes an integer to file

9 fgetw() reads an integer from file

10 ftell() returns current position

11 rewind() sets the file pointer to the beginning of the file

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 2


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Opening File: fopen()


We must open a file before it can be read, write, or update. The fopen() function is used to
open a file. The syntax of the fopen() is given below.
FILE *fopen( const char * filename, const char * mode );
The fopen() function accepts two parameters:
o The file name (string). If the file is stored at some specific location, then we must
mention the path at which the file is stored. For example, a file name can be
like "c://some_folder/some_file.ext".
o The mode in which the file is to be opened. It is a string.
We can use one of the following modes in the fopen() function.

Mode Description

r opens a text file in read mode

w opens a text file in write mode

a opens a text file in append mode

r+ opens a text file in read and write mode

w+ opens a text file in read and write mode

a+ opens a text file in read and write mode

rb opens a binary file in read mode

wb opens a binary file in write mode

ab opens a binary file in append mode

rb+ opens a binary file in read and write mode

wb+ opens a binary file in read and write mode

ab+ opens a binary file in read and write mode


The fopen function works in the following way.
o Firstly, It searches the file to be opened.
o Then, it loads the file from the disk and place it into the buffer. The buffer is used to
provide efficiency for the read operations.
o It sets up a character pointer which points to the first character of the file.
Consider the following example which opens a file in write mode.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 3


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Closing File: fclose()


The fclose() function is used to close a file. The file must be closed after performing all the
operations on it. The syntax of fclose() function is given below:
int fclose( FILE *fp );

Concept of Fields, Records and Files


Fields:
A combination of one or more characters is called field. It is the smallest unit of data that can
be accessed by the user. The name of each field in a record is unique. The data type of a field
indicates the type of data that can be stored in the field. Each field contains one specific piece
of information. A field size defines the maximum number of characters that can be stored in a
field.
For example, Employee Number, Employee Name, Grade and Designation are fields.
Employee Number 0007

Employee Name Umar akmal

Grade 1/4

Designation Senior Manager

Records:
A collection of related fields treated as a single as a single unit is called a record. For
example a employee‟s record includes a set of fields that contains Employer Number,
Employee Name, Grade and designation etc.

Files:
A collection of related records treated as a single unit is called file. File is also known as data
set. Files are stored in disk like hard disk, CD-ROM or DVD-ROM etc. A Student file may
contain the records of hundreds of students. Each student‟s record consists of same fields but
each field contains different data.

Types of file:
1) Text File: A text file is the one in which data is stored in the form of ASCII characters
and is normally used for storing a stream of characters. Text files are organized around lines,
each of which ends with a newline character („\n‟). The source code files are themselves text
files.

2) Binary File
A binary file is the one in which data is stored in the file in the same way as it is stored in the
main memory for processing. It is stored in binary format instead of ASCII characters. It is
normally used for storing numeric information (int, float, double). Normally a binary file can
be created only from within a program and its contents can be read only by a program.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 4


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

File Organization:
File organization ensures that records are available for processing. It is used to determine an
efficient file organization for each base relation.
File organization defines the way the data record is stored and retrieved from the files. File
organization provides the data record relationship specifying how the data records are
mapped to the disk blocks. We can either store fixed length record in the files or use flexible
file structure to support data records of varying length.
There are multiple ways of file organization such as sequential, direct, indexed sequential and
others. We can select the file organization type based on the use case, information retrieval
performance, ease of information management and others.
Given below are the most commonly used file organizations:
 Sequential files – The data records are stored in a pre-defined order. For instance, the data
records are sequenced based on the key values. In this file organization type we need to
sequentially access the record.
 Relative files – The data record is stored at a fixed location that can be accessed using a
static key value. As we can access the data record using a fixed value, we can retrieve or
insert the data record randomly in this file organization type.
 Direct files – Similar to the relative file that supports non-integer key value.
 Indexed sequential files – In this file organization we add an index to provide the random
access to the sequential files.
 Indexed files – The data records are indexed to improve the performance of the data
retrieval. Figure 1 provides various file organizations that are most commonly used.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 5


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Types of File Organization


There are three types of organizing the file:
1. Sequential access file organization
2. Direct access file organization
3. Indexed sequential access file organization

1. Sequential access file organization


 A sequential file is a file made up of a sequence of individual records. Each record contains
some logically related information, while all the records taken together contain the sum total
of information in the file. One of the distinguishing features of sequential files is that the
records are arranged in a particular sequence and can only be accessed in that order. The file
is written one record at a time, from beginning to end, and afterwards can only be read in the
same order.
 Storing and sorting in contiguous block within files on tape or disk is called as sequential
access file organization.
 In sequential access file organization, all records are stored in a sequential order. The
records are arranged in the ascending or descending order of a key field.
 Sequential file search starts from the beginning of the file and the records can be added at
the end of the file.
 In sequential file, it is not possible to add a record in the middle of the file without rewriting
the file.
 To perform file processing, include the <fstream> header, which provides classes for file
input (ifstream), file output (ofstream), and both input and output (fstream).

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 6


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Primitive operations on sequential file:


1. Creating a File:
o To create a file, you can use the ofstream class.
o Specify the filename and open mode (e.g., ios_base::out) during object creation.
o If the file already exists, you can choose to overwrite it or append data to it.
2. Reading from a File:
o Use the ifstream class to read data from a file.
o Open the file with the appropriate mode (e.g., ios_base::in).
o Read data using methods like getline(), >>, or read().
3. Writing to a File:
o For writing data to a file, use the ofstream class.
o Open the file with the mode ios_base::out.
o Write data using methods like << or write().
4. Other Operations:
o Apart from reading and writing, you can perform additional operations like seeking within
the file, checking for end-of-file, and error handling.

Advantages of sequential file


 It is simple to program and easy to design.
 Sequential file is best use if storage space.
Disadvantages of sequential file
 Sequential file is time consuming process.
 It has high data redundancy.
 Random searching is not possible.

2. Direct access file organization


 Direct access file is also known as random access or relative file organization.
 In direct access file, all records are stored in direct access storage device (DASD), such as
hard disk. The records are randomly placed throughout the file.
 The records does not need to be in sequence because they are updated directly and rewritten
back in the same location.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 7


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

 This file organization is useful for immediate access to large amount of information. It is
used in accessing large databases.
 It is also called as hashing.

Advantages of direct access file organization


 Direct access file helps in online transaction processing system (OLTP) like online railway
reservation system.
 In direct access file, sorting of the records are not required.
 It accesses the desired records immediately.
 It updates several files quickly.
 It has better control over record allocation.
Disadvantages of direct access file organization
 Direct access file does not provide back up facility.
 It is expensive.
 It has less storage space as compared to sequential file.

3. Indexed sequential access file organization


 Indexed sequential access file combines both sequential file and direct access file
organization.
 In indexed sequential access file, records are stored randomly on a direct access device such
as magnetic disk by a primary key.
 This file have multiple keys. These keys can be alphanumeric in which the records are
ordered is called primary key.
 The data can be access either sequentially or randomly using the index. The index is stored
in a file and read into memory when the file is opened.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 8


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Advantages of Indexed sequential access file organization


 In indexed sequential access file, sequential file and random file access is possible.
 It accesses the records very fast if the index table is properly organized.
 The records can be inserted in the middle of the file.
 It provides quick access for sequential and direct processing.
 It reduces the degree of the sequential search.
Disadvantages of Indexed sequential access file organization
 Indexed sequential access file requires unique keys and periodic reorganization.
 Indexed sequential access file takes longer time to search the index for the data access or
retrieval.
 It requires more storage space.
 It is expensive because it requires special software.
 It is less efficient in the use of storage space as compared to other file organizations.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 9


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Types of indexing
1. Primary Index
 In primary indexing, the data file is sorted or clustered based on the primary key as
shown in the below figure.
 An index file (also known as the index table) is created alongside the data file.
 The index file contains pairs of primary key values and pointers to the corresponding
data records.
 Each entry in the index file corresponds to a block or page in the data file.

Types of Primary Indexing


Dense Indexing:
 In Dense Index has an index entry for every search key value in the data file. This
approach ensures efficient data retrieval but requires more storage space.
 It contains an index record for every search key value in the file. This will result in
making searching faster. The total number of records in the index table and main table
are the same. It will result in the requirement for more space to store the index of
records itself.
No of Index Entry = No of DB Record

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 10


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Advantages
1. Gives quick access to records, particularly for Small datasets.
2. Effective for range searches since each key value has an entry.
Disadvantages
1. Can be memory-intensive and may require a significant amount of storage space.
2. Insertions and deletions result in a higher maintenance overhead because the index must
be updated more frequently.

Sparse Indexing:
Sparse indexing involves having fewer index entries than records. The index entries point
to blocks of records rather than individual records. While it reduces storage overhead, it
may require additional disk accesses during retrieval.
OR
Sparse index contains an index entry only for some records. In the place of pointing to all
the records in the main table index points records in a specific gap. This indexing helps you
to overcome the issues of dense indexing in DBMS.
No of Index Entry = No of Block

Advantages
1. Uses less storage space than thick indexes, particularly for large datasets.
2. Lessens the effect that insertions and deletions have from index maintenance
operations.
Disadvantages
1. Since there may not be an index entry for every key value, access may involve
additional steps.
2. might not be as effective as dense indexes for range queries.

Difference Between Dense Index and Sparse Index


Dense index Sparse index

The index size is larger in dense index. In sparse index, the index size is smaller.

Time to locate data in index table is less. Time to locate data in index table is more.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 11


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Dense index Sparse index

There is more overhead for insertions and Sparse indexing have less overhead for
deletions in dense index. insertions and deletions.

Records in dense index need not to be In case of sparse index, records need to be
clustered. clustered.

Computing time in RAM (Random access In sparse index, computing time in RAM is
memory) is less with dense index. more.

Data pointers in dense index point to each In sparse index, data pointers point to
record in the data file. fewer records in data file.

In sparse index, search performance may


Search performance is generally faster in
require additional steps, which will result
dense index.
in slowing down the process.

Advantages of Primary Indexing


 Primary indexing allowed fast retrieval of records based on their primary key values.
 Primary indexing reduces the need for full-table scans or sequential lookups, due to
which it reduce disk I/O operations.
 The data file is organized based on the primary key, ensuring that records are stored in a
logical order. This organization simplifies range queries and other operations.
 Each block in the data file corresponds to an entry in the primary index. Therefore, the
average number of blocks accessed using the primary index can be estimated as
approximately log₂(B + 1), where B represents the number of index blocks.
Disadvantages of Primary Indexing
 Primary indexing requires additional space to store the index field alongside the actual
data records. This extra storage can impact overall system costs, especially for large
datasets.
 When records are added or deleted, the data file needs reorganization to maintain the
order based on the primary key.
 After record deletion, the space used by the deleted record must be released for reuse by
other records. Managing this memory cleanup can be complex.

2. Secondary Index
Secondary indexing is a database management technique used to create additional
indexes on data stored in a database. The main purpose of secondary indexing is to
improve the performance of queries and to simplify the search for specific records
within a database. A secondary index provides an alternate means of accessing data in a
database, in addition to the primary index. The primary index is typically created when
the database is created and is used as the primary means of accessing data in the
database. Secondary indexes, on the other hand, can be created and dropped at any
time, allowing for greater flexibility in managing the database.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 12


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Benefits
 Improved Query Performance: Secondary indexes can improve the performance of
queries by reducing the amount of data that needs to be scanned to find the desired
records.
 Flexibility: Secondary indexes provide greater flexibility in managing a database, as
they can be created and dropped at any time.
 Simplified Search: Secondary indexes simplify the search for specific records within a
database, making it easier to find the desired data.
 Reduced Data Storage Overhead: Secondary indexes use a compact data structure
that requires less space to store compared to the original data.
Types of Secondary Indexes
 B-tree Index: A B-tree index is a type of index that stores data in a balanced tree
structure. B-tree indexes are commonly used in relational databases and provide
efficient search, insert, and delete operations.
 Hash Index: A hash index is a type of index that uses a hash function to map data to a
specific location within the index. Hash indexes are commonly used in non-relational
databases, such as NoSQL databases, and provide fast access to data.
 Bitmap Index: A bitmap index is a type of index that uses a bitmap to represent the
data in a database. Each bit in the bitmap represents a specific record in the database,
and the value of the bit indicates whether the record is present or not. Bitmap indexes
are commonly used in data warehousing and business intelligence applications, as they
provide efficient access to large amounts of data.
When to Use Secondary Indexing
Secondary indexing should be used in database management systems when there is a need
to improve the performance of data retrieval operations that search for data based on
specific conditions. Secondary indexing is particularly useful in the following scenarios:
 Queries with Complex Search Criteria: Large Data Sets
 Frequently Accessed Data:
 Sorting and Aggregating Data:
 Data Structure:
3. Clustering Index
What is Clustering Indexing?
Clustering indexing is a database indexing technique that is used to physically arrange the
data in a table based on the values of the clustered index key. This means that the rows in
the table are stored on disk in the same order as the clustered index key. With a clustered
index, the database can more efficiently retrieve data because it doesn‟t have to scan the
entire table to find the data it needs. Instead, it can use the clustered index to quickly locate
the data, resulting in faster query execution times and improved overall performance.
Advantages
 Improved Query Performance: Clustering indexing results in faster query
performance, as the data is stored in a way that makes it easier to retrieve the desired
information. This is because the index is built based on the clustered data, reducing the
number of disk I/Os required to retrieve the data.
 Reduced Disk Space Usage: Clustering indexing reduces the amount of disk space
required to store the index. This is because the index contains only the information
necessary to retrieve the data, rather than storing a copy of the data itself.
 Better Handling of Complex Queries: Clustering indexing provides better
performance for complex queries that involve multiple columns. This is because the
data is stored in a way that makes it easier to retrieve the relevant information.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 13


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

 Improved Insert Performance: Clustering indexing can result in improved insert


performance, as the database does not have to update the index every time a new record
is inserted.
 Improved Data Retrieval: Clustering indexing can also improve the efficiency of data
retrieval operations. In a clustered index, the data is stored in a logical order, which
makes it easier to locate and retrieve the data. This can result in faster data retrieval
times, particularly for large databases.
Disadvantages
 Increased Complexity: Clustering indexing is a more complex technology compared to
other indexing mechanisms, such as B-Tree indexing.
 Reduced Update Performance: Clustering indexing can result in reduced update
performance, as the database must reorganize the data to reflect the changes.
 Limited to One Clustered Index: A table can have only one clustered index, as having
multiple clustered indexes would result in conflicting physical orderings of the data.
When to Use Clustering Indexing
Clustering indexing is a useful technique for improving the performance of database
queries and data storage. However, it‟s important to understand when to use clustering
indexing and when it may not be the best choice.
Here are some guidelines on when to use clustering indexing:
 When Data is Often Retrieved in a Specific Order: If your queries often retrieve data
in a specific order, clustering indexing can be a great choice. By physically arranging
the data in the table according to the clustered index key, the database can quickly
locate and retrieve the data it needs.
 When Query Performance is a Concern: If query performance is a concern,
clustering indexing can be a great option. By using the clustered index to quickly locate
the data
 When Disk Space Utilization is a Concern: Clustering indexing can also help to
improve disk space utilization.

File Handling through C++ Classes


File handling is used to store data permanently in a computer. Using file handling we can
store our data in secondary memory (Hard disk).
How to achieve the File Handling
For achieving file handling we need to follow the following steps:-
STEP 1-Naming a file
STEP 2-Opening a file
STEP 3-Writing data into the file
STEP 4-Reading data from the file
STEP 5-Closing a file.

Streams in C++ :-
We give input to the executing program and the execution program gives back the output.
The sequence of bytes given as input to the executing program and the sequence of bytes
that comes as output from the executing program are called stream. In other words, streams
are nothing but the flow of data in a sequence.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 14


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

The input and output operation between the executing program and the devices like
keyboard and monitor are known as “console I/O operation”. The input and output
operation between the executing program and files are known as “disk I/O operation”.
Classes for File stream operations :-
The I/O system of C++ contains a set of classes which define the file handling methods.
These include ifstream, ofstream and fstream classes. These classes are derived from
fstream and from the corresponding iostream class. These classes, designed to manage the
disk files, are declared in fstream and therefore we must include this file in any program
that uses files.
1. ios:-
 ios stands for input output stream.
 This class is the base class for other classes in this class hierarchy.
 This class contains the necessary facilities that are used by all the other derived
classes for input and output operations.
2. istream:-
 istream stands for input stream.
 This class is derived from the class „ios‟.
 This class handle input stream.
 The extraction operator(>>) is overloaded in this class to handle input streams from
files to the program execution.
 This class declares input functions such as get(), getline() and read().

3. ostream:-
 ostream stands for output stream.
 This class is derived from the class „ios‟.
 This class handle output stream.
 The insertion operator(<<) is overloaded in this class to handle output streams to files
from the program execution.
 This class declares output functions such as put() and write().
4. streambuf:-
 This class contains a pointer which points to the buffer which is used to manage the
input and output streams.
5. fstreambase:-
 This class provides operations common to the file streams. Serves as a base for fstream,
ifstream and ofstream class.
 This class contains open() and close() function.
6. ifstream:-
 This class provides input operations.
 It contains open() function with default input mode.
 Inherits the functions get(), getline(), read(), seekg() and tellg() functions from the
istream.
7. ofstream:-
 This class provides output operations.
 It contains open() function with default output mode.
 Inherits the functions put(), write(), seekp() and tellp() functions from the ostream.
8. fstream:-
 This class provides support for simultaneous input and output operations.
 Inherits all the functions from istream and ostream classes through iostream.
9. filebuf:-
 Its purpose is to set the file buffers to read and write.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 15


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

 We can also use file buffer member function to determine the length of the file.

In C++, files are mainly dealt by using three classes fstream, ifstream, ofstream available in
fstream headerfile.
ofstream: Stream class to write on files
ifstream: Stream class to read from files
fstream: Stream class to both read and write from/to files.

Example 1:
/* File Handling with C++ using ifstream & ofstream class object*/
/* To write the Content in File*/
/* Then to read the content of file*/
#include <iostream>

/* fstream header file for ifstream, ofstream,


fstream classes */
#include <fstream>

using namespace std;

// Driver Code
int main()
{
// Creation of ofstream class object
ofstream fout;

string line;

// by default ios::out mode, automatically deletes


// the content of file. To append the content, open in ios:app

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 16


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

// fout.open("sample.txt", ios::app)
fout.open("sample.txt");

// Execute a loop If file successfully opened


while (fout) {

// Read a Line from standard input


getline(cin, line);

// Press -1 to exit
if (line == "-1")
break;

// Write line in file


fout << line << endl;
}

// Close the File


fout.close();

// Creation of ifstream class object to read the file


ifstream fin;

// by default open mode = ios::in mode


fin.open("sample.txt");

// Execute a loop until EOF (End of File)


while (getline(fin, line)) {

// Print line (read from file) in Console


cout << line << endl;
}
// Close the file
fin.close();
return 0;
}

Example 2:
/* File Handling with C++ using fstream class object */
/* To write the Content in File */
/* Then to read the content of file*/
#include <iostream>

/* fstream header file for ifstream, ofstream,


fstream classes */

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 17


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

#include <fstream>

using namespace std;

// Driver Code
int main()
{
// Creation of fstream class object
fstream fio;

string line;

// by default openmode = ios::in|ios::out mode


// Automatically overwrites the content of file, To append
// the content, open in ios:app
// fio.open("sample.txt", ios::in|ios::out|ios::app)
// ios::trunc mode delete all content before open
fio.open("sample.txt", ios::trunc | ios::out | ios::in);

// Execute a loop If file successfully Opened


while (fio) {

// Read a Line from standard input


getline(cin, line);

// Press -1 to exit
if (line == "-1")
break;

// Write line in file


fio << line << endl;
}

// Execute a loop until EOF (End of File)


// point read pointer at beginning of file
fio.seekg(0, ios::beg);

while (fio) {

// Read a Line from File


getline(fio, line);

// Print line in Console


cout << line << endl;
}

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 18


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

// Close the file


fio.close();

return 0;
}

Example 3:
Q: write a single file handling program in c++ to reading and writing data on a
file.

#include<iostream>
#include<fstream>

using namespace std;


main()
{
int rno,fee;
char name[50];

cout<<"Enter the Roll Number:";


cin>>rno;

cout<<"\nEnter the Name:";


cin>>name;

cout<<"\nEnter the Fee:";


cin>>fee;

ofstream fout("d:/student.doc");

fout<<rno<<"\t"<<name<<"\t"<<fee; //write data to the file student

fout.close();

ifstream fin("d:/student.doc");

fin>>rno>>name>>fee; //read data from the file student

fin.close();

cout<<endl<<rno<<"\t"<<name<<"\t"<<fee;

return 0;
}

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 19


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Example 4:
// Q: WA C++ file handling program to read data from the file called student.doc

#include<iostream>
#include<fstream>

using namespace std;

main()
{
int rno,fee;
char name[50];
ifstream fin("d:/student.doc");
fin>>rno>>name>>fee; //read data from the file student
fin.close();
cout<<endl<<rno<<"\t"<<name<<"\t"<<fee;
return 0;
}

Explain different file opening modes in C++


Here, we have opened the file using an instance of the ofstream or fstream class using two
arguments: filename and opening mode. A file can be opened in different modes, as shown in
the following table.

File Opening Description


mode

ios::in Open file for reading data.

ios::out Creates file if it does not exist. All existing data is erased from the file as new data is
written into the file.

ios::app Creates file if it does not exist. New data is written at the end of the file without
disturbing the existing data.

ios::ate Open file for output. Data can be written anywhere in the file, similar to append.

ios::trunc Discard file content, which is the default action for ios::out.

ios::binary Open file in binary (non-text) mode.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 20


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Linked organization
In a linked organization, the elements or components of the system are connected through
relationships, but the order in which they occur may not be fixed. In Linked Organization,
elements might or might not be stored in consecutive memory locations and the order is
determined by the links between elements. This makes it easy to insert and delete elements
without requiring any movement of other elements and it can be extended or reduced
according to requirements.
Linked organizations differ from sequential organizations essentially in that the logical
sequence of recordsis generally different from the physical sequence.
• In sequential i th record is placed at location li, then the i+1strecord is placed at li + c
where c is the length of i th record or some fixed constant.
• In linked organization the next logical record is obtained by following link value from
present record. Linking in order of increasing primary key eases insertion deletion.
• Searching for a particular record is difficult since no index is available, so only sequential
search possible.
• We can facilitate indexes by maintaining indexes corresponding to ranges of employee
numbers eg. 501-700, 701-900. all records with same range will be linked together in a list.
• We can generalize this idea for secondary key level also. We just set up indexes for each
key and allow recordsto be in more than one list. This leads to the multilist structure for file
representation.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 21


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Difference Between Sequential Organization and Linked


Organization:

S.N Linked Organization Sequential Organization

1. Data is stored in nodes that are linked Data is stored in a linear sequence
together

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 22


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

S.N Linked Organization Sequential Organization

2. Each node contains data and a pointer to Each element is stored one after the other
the next node

3. Allows for fast insertions and deletions Allows for fast traversal and access

4. More complex to implement Simpler to implement

5. Can be used for implementing data Can be used for implementing data
structures like linked lists and trees structures like arrays and stacks

6. Requires more memory for pointers Less memory is required as no pointers


needed

7. Can be used for dynamic data structures Suitable for static data structures

8. Flexible in terms of size and structure Fixed-size and structure

9. Random access is not possible Random access is possible

10. Pointers may be pointing to null or non- All elements are stored in contiguous
existent memory in case of broken links memory

11. Can have multiple pointers in case of Only one pointer is required to traverse
bidirectional links the list

12. Can have a circular linked list Only a linear sequential list is possible

13. Can have multiple head and tail pointers Only one head and tail pointer is required

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 23


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

S.N Linked Organization Sequential Organization

14. Can have variable length of data in each Fixed length of data in each element
node

15. Can be used for implementing more Can be used for simple data structures like
complex data structures like graphs. queues and stacks.

Coral Rings
This doubly linked multilist structure is used. Each list is circular list with headnode •
„Alink‟field is used to link all records with same key value
• „Blink‟ is used for some records back pointer and for others it is pointer to head
• node. Owing to these back pointers, deletion is easy without going to start.
• Indexes are maintained as per multilists.

Inverted Files
• Inverted files are similar to multilists. Multilists records with the same key value are linked
together with link information being kept in individual record. In case of inverted files the
link information is kept in index itself.
• EG. We assume that every key is dense. Since the index entries are variable length, index
maintenance becomes complex fro multilists. Benefits being Boolean queries require only
one access per record satisfying the query. Queries of type k1=xx and k2=yy can be handled
similarly by intersecting two lists.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 24


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

• The retrieval works in two steps. In the first step, the indexes are processed to obtain a list
of records satisfying the query and in the second, these records are retrieved using the list.
The no. of disk accesses needed is equal to the no. of records being retrieved + the no. to
process the indexes.
• Inverted files represent one extreme of file organization in which only the index structures
are important. The records themselves can be stored in any way.
• Inverted files may also result in space saving compared with other file structures when
record retrieval doesn‟t require retrieval of key fields. In this case key fields may be deleted
from the records unlike multilist structures

Cellular Partitions
To reduce the file search times, the storage media may be divided into cells. A cell may be an
entire disk pack or it may simply be a cylinder.
• Lists are localized to lie within a cell.
• A multilist organization in which the list for key1=prog list included records on several
different cylinders then we could break the list into several smaller lists where each prog list
included only those records in the same cylinder.
• The index entry for prog will now contain several entries of the type (addr, length) where
addr is a pointer to start of a list of records with key1=prog and length is the no. of records on
the list.
• By doing this all records of the same cell may be accessed without moving the read/write
heads.

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 25


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Multiway Merging
 Merging is a process of combining two or more types of structures into one single
structure, which is an important component of algorithms such as merge sort.
 If we have two or generally more arrays, we can merge them and get a single array or
list.
 Practically speaking, the size of the merged array is the same as the total size of the
merged arrays.
Two-Way Merge
A two-way merging, also known as binary merging, is generally an algorithm that takes two
sorted lists and merges them into one list in sorted order. It‟s a widely used approach in
merge sort that outputs the minimum item in each step given list in sorted order and builds a
sorted list with all items in any of the input lists in proportion to the total of the lengths of the
input lists.
If we are merging two arrays size and respectively, then the merged array size will be the
sum of and . Furthermore, merging requires maximum comparisons to get the merged array.
The detailed implementation of the two-way merge is shown in the diagrams below.
We compare the first items in two arrays in sorted order and then append the smaller element
to the output array:

After appending the smaller item to our output array, we continue comparing and appending
them to until the input arrays are empty:

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 26


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

At this point, is empty. The other input array must still be empty, so we take the smaller item
in and append it to . We repeat this process until is empty.

k-way merge
The k-way merge pattern involves merging k sorted arrays into a single sorted array. It
exploits the sorted input to achieve merged order efficiently.
-way merge algorithms, also known as multiway merges, are algorithms that take -sorted lists
as input and produce one sorted list as an output with size equal to the sum of sizes of all
input arrays.
The term “-way merge algorithms” refers to merging approaches that accept more than two
sorted lists:

Pseudocode for the -way merging algorithm:

The detailed implementation of the – way merge is shown in the diagram below:
Use k-way merge when:
 You need to combine multiple sorted arrays or lists
 Finding the kth smallest/largest element in sorted arrays

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 27


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

We first create tournament trees by comparing the list‟s index items and append the winner
with the least value to the output list:

We continue the process till the lists are empty:

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 28


© Copyright 2023-2024 www.javatpoint.com.
DSA Unit VI File Organization

Subject Teacher : Ms. Atole S. M. 210252: Data Structures and Algorithms 29


© Copyright 2023-2024 www.javatpoint.com.

You might also like