[go: up one dir, main page]

0% found this document useful (0 votes)
3 views13 pages

Module 1_DBMS (1)

Download as pdf or txt
Download as pdf or txt
Download as pdf or txt
You are on page 1/ 13

Module 1

What is data?
Data is a collection of a distinct small unit of information. It can be used in a variety of forms like text,
numbers, media, bytes, etc. it can be stored in pieces of paper or electronic memory, etc.Word 'Data'
originates from the word 'datum' that means 'single piece of information.' It is plural of the word datum.In
computing, Data is information that can be translated into a form for efficient movement and processing.
Data is interchangeable.

What is a database?
A database is an organized collection of data, so that it can be easily accessed and managed. You can
organize data into tables, rows, columns, and index it to make it easier to find relevant information. The
main purpose of the database is to operate a large amount of information by storing, retrieving, and
managing data.

What is database management?

A database-management system (DBMS) is a collection of interrelated data and a set of programs to


access that data. This is a collection of related data with an implicit meaning and hence is a database. The
collection of data, usually referred to as the database, contains information relevant to an enterprise. The
primary goal of a DBMS is to provide a way to store and retrieve database information that is both
convenient and efficient. By data, we mean known facts that can be recorded and that have implicit
meaning.

The management system is important because without the existence of some kind of rules and regulations
it is not possible to maintain the database. We have to select the particular attributes which should be
included in a particular table; the common attributes to create a relationship between two tables; if a new
record has to be inserted or deleted then which tables should have to be handled etc. These issues must be
resolved by having some kind of rules to follow in order to maintain the integrity of the database.

Database systems are designed to manage large bodies of information. Management of data involves both
defining structures for storage of information and providing mechanisms for the manipulation of
information. In addition, the database system must ensure the safety of the information stored, despite
system crashes or attempts at unauthorized access. If data are to be shared among several users, the
system must avoid possible anomalous results. Because information is so important in most organizations,
computer scientists have developed a large body of concepts and techniques for managing data.

Database Management System (DBMS) and Its Applications:


A Database management system is a computerized record-keeping system. It is a repository or a container
for collection of computerized data files. The overall purpose of DBMS is to allow the users to define,
store, retrieve and update the information contained in the database on demand. Information can be
anything that is of significance to an individual or organization.
Databases touch all aspects of our lives. Some of the major areas of application are as follows:
1. Banking
2. Airlines
3. Universities
4. Manufacturing and selling
5. Human resources
Enterprise Information
◦ Sales: For customer, product, and purchase information.
◦ Accounting: For payments, receipts, account balances, assets and other accounting information.
◦ Human resources: For information about employees, salaries, payroll taxes, and benefits, and for
generation of paychecks.
◦ Manufacturing: For management of the supply chain and for tracking production of items in factories,
inventories of items in warehouses and stores, and orders for items. Online retailers: For sales data noted
above plus online order tracking,generation of recommendation lists,and maintenance of online product
evaluations.
Banking and Finance
◦ Banking: For customer information, accounts, loans, and banking transactions.
◦ Credit card transactions: For purchases on credit cards and generation of monthly statements.
◦ Finance: For storing information about holdings, sales, and purchases of financial instruments such as
stocks and bonds; also for storing real-time market data to enable online trading by customers and
automated trading by the firm.
• Universities: For student information, course registrations, and grades (in addition to standard enterprise
information such as human resources and accounting).
• Airlines: For reservations and schedule information. Airlines were among the first to use databases in a
geographically distributed manner.
• Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances
on prepaid calling cards, and storing information about the communication networks.

Advantages of DBMS:
● Controlling of Redundancy: Data redundancy refers to the duplication of data (i.e storing the
same data multiple times). In a database system, by having a centralized database and centralized
control of data by the DBA the unnecessary duplication of data is avoided. It also eliminates the
extra time for processing the large volume of data. It results in saving the storage space.
● Improved Data Sharing : DBMS allows a user to share the data in any number of application
programs.
● Data Integrity : Integrity means that the data in the database is accurate. Centralized control of
the data helps in permitting the administrator to define integrity constraints to the data in the
database. For example: in customer databases we can enforce an integrity that must accept the
customer only from Noida and Meerut city.
● Security : Having complete authority over the operational data, enables the DBA in ensuring that
the only means of access to the database is through proper channels. The DBA can define
authorization checks to be carried out whenever access to sensitive data is attempted.
● Data Consistency : By eliminating data redundancy, we greatly reduce the opportunities for
inconsistency. For example: if a customer address is stored only once, we cannot have
disagreement on the stored values. Also updating data values is greatly simplified when each
value is stored in one place only. Finally, we avoid the wasted storage that results from redundant
data storage.
● Efficient Data Access : In a database system, the data is managed by the DBMS and all access to
the data is through the DBMS providing a key to effective data processing
● Enforcements of Standards : With the centralized of data, DBA can establish and enforce the
data standards which may include the naming conventions, data quality standards etc.
● Data Independence : Ina database system, the database management system provides the
interface between the application programs and the data. When changes are made to the data
representation, the meta data obtained by the DBMS is changed but the DBMS continues to
provide the data to the application program in the previously used way. The DBMs handles the
task of transformation of data wherever necessary.
● Reduced Application Development and Maintenance Time : DBMS supports many important
functions that are common to many applications, accessing data stored in the DBMS, which
facilitates the quick development of applications.

Database Architecture:
The architecture of a database system is greatly influenced by the underlying computer system on which
the database system runs. Database systems can be centralized, or client-server, where one server machine
executes work on behalf of multiple client machines. Database systems can also be designed to exploit
parallel computer architectures. Distributed databases span multiple geographically separated machines

A database system is partitioned into modules that deal with each of the responsibilities of the overall
system. The functional components of a database system can be broadly divided into the storage manager
and the query processor components. The storage manager is important because databases typically
require a large amount of storage space. The query processor is important because it helps the database
system simplify and facilitate access to data.

Database applications are usually partitioned into two or three parts, as in Figure.In a two-tier
architecture, the application resides at the client machine, where it invokes database system functionality
at the server machine through query language statements. Application program interface standards like
ODBC and JDBC are used for interaction between the client and the server. In contrast, in a three-tier
architecture, the client machine acts as merely a front end and does not contain any direct database calls.
Instead, the client communicates with an application server, usually through a forms interface.

The application server in turn communicates with a database system to access data. The business logic of
the application, which says what actions to carry out under what conditions, is embedded in the
application server, instead of being distributed across multiple clients. Three-tier applications are more
appropriate for large applications, and for applications that run on the World Wide Web.
Two-tier architecture Three-tier architecture

● Query Processor:The query processor components include:


○ DDL interpreter, which interprets DDL statements and records the definitions in the
data dictionary.
○ DML compiler, which translates DML statements in a query language into an evaluation
plan consisting of low-level instructions that the query evaluation engine understands.A
query can usually be translated into any of a number of alternative evaluation plans that
all give the same result. The DML compiler also performs query optimization, that is, it
picks the lowest cost evaluation plan from among the alternatives.
○ Query evaluation engine, which executes low-level instructions generated by the DML
compiler.
● Storage Manager: A storage manager is a program module that provides the interface between
the low level data stored in the database and the application programs and queries submitted to
the system. The storage manager is responsible for the interaction with the file manager. The raw
data are stored on the disk using the file system, which is usually provided by a conventional
operating system. The storage manager translates the various DML statements into low-level
file-system commands. Thus, the storage manager is responsible for storing, retrieving, and
updating data in the database.The storage manager components include:

● Authorization and integrity manager, which tests for the satisfaction of integrity
constraints and checks the authority of users to accessdata.·
● Transaction manager, which ensures that the database remains in a consistent (correct)
state despite system failures, and that concurrent transaction executions proceed without
conflicting.·
● File manager, which manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.·
● Buffer manager, which is responsible for fetching data from disk storage into main
memory, and deciding what data to cache in main memory. The buffer manager is a
critical part of the database system, since it enables the database to handle data sizes that
are much larger than the size of main memory
Transaction Manager:A transaction is a collection of operations that performs a single logical function
in a database application. Each transaction is a unit of both atomicity and consistency. Thus, we require
that transactions do not violate any database-consistency constraints. That is, if the database was
consistent when a transaction started, the database must be consistent when the transaction successfully
terminates. Transaction -manager ensures that the database remains in a consistent (correct) state despite
system failures (e.g., power failures and operating system crashes) and transaction failures.

Data Abstraction
For the system to be usable, it must retrieve data efficiently. The need for efficiency has led designers to
use complex data structures to represent data in the database. Since many database-system users are not
computer trained, developers hide the complexity from users through several levels of abstraction, to
simplify users’ interactions with the system:

● Physical level (or Internal View / Schema): The lowest level of abstraction describes how the
data are actually stored. The physical level describes complex low-level data structures in detail.
● Logical level (or Conceptual View / Schema): The next-higher level of abstraction describes
what data are stored in the database, and what relationships exist among those data. The logical
level thus describes the entire database in terms of a small number of relatively simple structures.
Although implementation of the simple structures at the logical level may involve complex
physical-level structures, the user of the logical level does not need to be aware of this
complexity. This is referred to as physical data independence. Database administrators, who must
decide what information to keep in the database,use the logical level of abstraction.
● View level (or External View / Schema): The highest level of abstraction describes only part of
the entire database. Even though the logical level uses simpler structures, complexity remains
because of the variety of information stored in a large database.Many users of the database
system do not need all this information; instead, they need to access only a part of the database.
The view level of abstraction exists to simplify their interaction with the system. The system may
provide many views for the same database. Figure shows the relationship among the three levels
of abstraction.
Data Independence
Data independence is the ability to modify the scheme without affecting the programs and the application
to be rewritten. Data is separated from the programs, so that the changes made to the data will not affect
the program execution and the application.

We know the main purpose of the three levels of data abstraction is to achieve data independence. If the
database changes and expands over time, it is very important that the changes in one level should not
affect the data at other levels of the database. This would save time and cost required when changing the
database.

There are two levels of data independence based on three levels of abstraction. These are as follows −

● Physical Data Independence


● Logical Data Independence

Physical Data Independence means changing the physical level without affecting the logical level or
conceptual level. Using this property, we can change the storage device of the database without affecting
the logical schema.

The changes in the physical level may include changes using the following −

● A new storage device like magnetic tape, hard disk, etc.


● A new data structure for storage.
● A different data access method or using an alternative files organization technique.
● Changing the location of the database.

Logical view of data is the user view of the data. It presents data in the form that can be accessed by the
end users.

Codd’s Rule of Logical Data Independence says that users should be able to manipulate the Logical View
of data without any information of its physical storage. Software or the computer program is used to
manipulate the logical view of the data. Database administrator is the one who decides what information
is to be kept in the database and how to use the logical level of abstraction. It provides the global view of
Data. It also describes what data is to be stored in the database along with the relationship.

The data independence provides the database in simple structure. It is based on application domain
entities to provide the functional requirement. It provides abstraction of system functional requirements.
Static structure for the logical view is defined in the class object diagrams. Users cannot manipulate the
logical structure of the database.

The changes in the logical level may include −

● Change the data definition.


● Adding, deleting, or updating any new attribute, entity or relationship in the database.
Data-Manipulation Language
A data-manipulation language (DML) is a language that enables users to access or manipulate data as
organized by the appropriate data model. The types of access are:

● Retrieval of information stored in the database


● Insertion of new information into the database
● Deletion of information from the database
● Modification of information stored in the database

There are basically two types:

● Procedural DMLs require a user to specify what data is needed and how to get that data.
● Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what data
is needed without specifying how to get those data. Declarative DMLs are usually easier to learn
and use than are procedural DMLs. However, since a user does not have to specify how to get the
data, the database system has to figure out an efficient means of accessing data. A query is a
statement requesting the retrieval of information.

Data-Definition Language (DDL)

We specify a database schema by a set of definitions expressed by a special language called a


data-definition language (DDL). The DDL is also used to specify additional properties of the data.We
specify the storage structure and access methods used by the database system by a set of statements in a
special type of DDL called a data storage and definition language. These statements define the
implementation details of the database schemas, which are usually hidden from the users.

The data values stored in the database must satisfy certain consistency constraints.For example, suppose
the university requires that the account balance of a department must never be negative. The DDL
provides facilities to specify such constraints. The database system checks these constraints every time the
database is updated. In general, a constraint can be an arbitrary predicate pertaining to the database.
However, arbitrary predicates may be costly to test. Thus, database systems implement integrity
constraints that can be tested with minimal overhead.

Data Models
Data models define how the logical structure of a database is modeled. Data Models are fundamental
entities to introduce abstraction in a DBMS. Data models define how data is connected to each other and
how they are processed and stored inside the system.

The very first data model could be flat data-models, where all the data used are to be kept in the same
plane. Earlier data models were not so scientific, hence they were prone to introduce lots of duplication
and update anomalies.
1. Entity-Relationship Model

Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships among
them. While formulating real-world scenarios into the database model, the ER Model creates entity set,
relationship set, general attributes and constraints. ER Model is best used for the conceptual design of a
database.

ER Model is based on −

● Entities and their attributes.


● Relationships among entities.

● Entity − An entity in an ER Model is a real-world entity having properties called attributes. Every
attribute is defined by its set of values called domain. For example, in a school database, a student
is considered as an entity. Students have various attributes like name, age, class, etc.
● Relationship − The logical association among entities is called relationship. Relationships are
mapped with entities in various ways. Mapping cardinalities define the number of associations
between two entities.

● Mapping cardinalities −

○ one to one
○ one to many
○ many to one
○ many to many

2. Relational Model
The most popular data model in DBMS is the Relational Model. It is more scientific a model than
others. This model is based on first-order predicate logic and defines a table as an n-ary relation.
The main highlights of this model are −

● Data is stored in tables called relations.


● Relations can be normalized.
● In normalized relations, values saved are atomic values.
● Each row in a relation contains a unique value.
● Each column in a relation contains values from the same domain.

3. Network Model

The network model was created to represent complex data relationships more effectively when compared
to hierarchical models, to improve database performance and standards.

It has entities which are organized in a graphical representation and some entities are accessed through
several paths. A User perceives the network model as a collection of records in 1:M relationships.

The features of a Network Model are as follows −

● Ability to Merge Relationships − In this model, because of more relationships the data is more
related. It has an ability to manage one-to-one relationships as well as many-to-many
relationships.
● Many paths − There can be more than one path to the same record because of more relationships.
It makes data access fast and simple.
● Circular Linked List − The operations in this model are done with the help of the circular linked
list. The current position is maintained with the help of a program and navigates through the
records based on relationships.

4. Object Oriented Data Modeling

● Objects: The real world entities and situations are represented as objects in the Object oriented
database model.
● Attributes and Method: Every object has certain characteristics. These are represented using
Attributes. The behavior of the objects is represented using Methods.
● Class: Similar attributes and methods are grouped together using a class. An object can be called
as an instance of the class.
● Inheritance: A new class can be derived from the original class. The derived class contains
attributes and methods of the original class as well as its own.

Integrity Constraints
● Integrity constraints are a set of rules. It is used to maintain the quality of information.
● Integrity constraints ensure that the data insertion, updating, and other processes have to be
performed in such a way that data integrity is not affected.
● Thus, integrity constraint is used to guard against accidental damage to the database.

Types of Integrity Constraints

1. Domain constraints

● Domain constraints can be defined as the definition of a valid set of values for an attribute.
● The data type of domain includes string, character, integer, time, date, currency, etc. The value of
the attribute must be available in the corresponding domain.
2. Entity integrity constraints

● The entity integrity constraint states that primary key value can't be null.
● This is because the primary key value is used to identify individual rows in relation and if the
primary key has a null value, then we can't identify those rows.
● A table can contain a null value other than the primary key field.

3. Referential Integrity Constraints

● A referential integrity constraint is specified between two tables.


● In the Referential integrity constraints, if a foreign key in Table 1 refers to the Primary Key of
Table 2, then every value of the Foreign Key in Table 1 must be null or be available in Table 2.
4. Key constraints

● Keys are the entity set that is used to identify an entity within its entity set uniquely.
● An entity set can have multiple keys, but out of which one key will be the primary key. A primary
key can contain a unique and null value in the relational table.

You might also like