Introduction to Databases
Lecturer#1
Chapter#1
DATABASE/DBMS
Components of DBMS Environment
Five major components
1. Hardware: The DBMS and applications require hardware to run, e.g. a single
personal computer, a single mainframe, or a network of computers. The choice
depends on the organization's requirements. Client-server is an important architecture.
2. Software: It includes the DBMS software itself and the application programs, together
with the operating system.
Application programs are written in third-generation languages, e.g. C, C++,
C#, Java, etc.
3. Data: The most important component of the DBMS is the data. It acts as a bridge
between the machine components and the human components.
Data
• Perhaps the most important component of the DBMS environment—certainly from
the end-users’ point of view—is the data. In Figure 1.8, we observe that the data acts
as a bridge between the machine components and the human components. The
database contains both the operational data and the metadata, the “data about data.”
The structure of the database is called the schema. In Figure 1.7, the schema consists
of four files, or tables, namely: PropertyForRent, PrivateOwner, Client, and Lease. The
PropertyForRent table has eight fields, or attributes, namely: propertyNo, street, city,
zipCode, type (the property type), rooms (the number of rooms), rent (the monthly
rent), and ownerNo. The ownerNo attribute models the relationship between
PropertyForRent and PrivateOwner: that is, an owner Owns a property for rent, as
depicted in the ER diagram of Figure 1.6. For example, in Figure 1.2 we observe that
owner CO46, Joe Keogh, owns property PA14. The description of the data is known as
the system catalog (or data dictionary or metadata—the “data about data”).
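The split between operational data and metadata can be sketched with Python's built-in sqlite3 module. This is only an illustration, not the textbook's own implementation: the column types and the PrivateOwner columns fName and lName are assumptions, and SQLite's sqlite_master table stands in for the system catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# The schema: two of the four tables, with assumed column types.
conn.executescript("""
CREATE TABLE PrivateOwner (
    ownerNo TEXT PRIMARY KEY,
    fName   TEXT,              -- assumed column
    lName   TEXT               -- assumed column
);
CREATE TABLE PropertyForRent (
    propertyNo TEXT PRIMARY KEY,
    street     TEXT,
    city       TEXT,
    zipCode    TEXT,
    type       TEXT,
    rooms      INTEGER,
    rent       REAL,
    ownerNo    TEXT REFERENCES PrivateOwner(ownerNo)  -- models "Owns"
);
""")

# Operational data: owner CO46 (Joe Keogh) owns property PA14.
conn.execute("INSERT INTO PrivateOwner VALUES ('CO46', 'Joe', 'Keogh')")
conn.execute(
    "INSERT INTO PropertyForRent (propertyNo, ownerNo) VALUES ('PA14', 'CO46')"
)
conn.commit()

# Metadata ("data about data"): ask the catalog which tables and columns exist.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
column_names = [row[1] for row in conn.execute("PRAGMA table_info(PropertyForRent)")]

print(table_names)
print(column_names)
```

The same connection answers questions about the data and about the structure of the data, which is the sense in which the database is self-describing.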
4. Procedures
Procedures refer to the instructions and rules that govern the design and use of the
database. The users of the system and the staff who manage the database require
documented procedures on how to use or run the system. These may consist of
instructions on how to:
• Log on to the DBMS.
• Use a particular DBMS facility or application program.
• Start and stop the DBMS.
• Make backup copies of the database.
• Handle hardware or software failures. This may include procedures on how to
identify the failed component, how to fix the failed component (for example,
telephone the appropriate hardware engineer), and, following the repair of the
fault, how to recover the database.
• Change the structure of a table, reorganize the database across multiple disks,
improve performance, or archive data to secondary storage.
5. People: Roles in the Database Environment
• Data Administrator (DA): The DA is responsible for the management of the data
resource, including database planning; development and maintenance of standards,
policies, and procedures; and conceptual/logical database design. The DA consults
with and advises senior managers, ensuring that the direction of database
development will ultimately support corporate objectives.
• Database Administrator (DBA): The DBA is responsible for the physical
realization of the database, including physical database design and
implementation, security and integrity control, maintenance of the
operational system, and ensuring satisfactory performance of the
applications for users. The role of the DBA is more technically
oriented than the role of the DA, requiring detailed knowledge of the
target DBMS and the system environment. In some organizations
there is no distinction between these two roles; in others, the
importance of the corporate resources is reflected in the allocation of
teams of staff dedicated to each of these roles. We discuss data and
database administration in more detail in Section 20.15.
• Database Designers: In large database design projects, we can
distinguish between two types of designer: logical database
designers and physical database designers. The logical database
designer is concerned with identifying the data (that is, the
entities and attributes), the relationships between the data, and
the constraints on the data that is to be stored in the database.
The logical database designer must have a thorough and complete
understanding of the organization’s data and any constraints on
this data (the constraints are sometimes called business rules).
• The physical database designer decides how the logical database
design is to be physically realized. This involves:
• mapping the logical database design into a set of tables
and integrity constraints;
• selecting specific storage structures and access methods for the data
to achieve good performance;
• designing any security measures required on the data.
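As a rough sketch of the first two tasks, the example below maps a small part of the design to a table with integrity constraints and then adds an index as the access method. It uses Python's sqlite3 module. The reduced column set, the index name idx_property_city, and the city value are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Task 1: map the logical design to a table plus integrity constraints.
conn.execute("""
CREATE TABLE PropertyForRent (
    propertyNo TEXT PRIMARY KEY,
    city       TEXT NOT NULL,
    rent       REAL NOT NULL CHECK (rent > 0)
)
""")

# Task 2: choose an access method, here an index for lookups by city.
conn.execute("CREATE INDEX idx_property_city ON PropertyForRent (city)")

# Ask SQLite how it would run a lookup by city.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT propertyNo FROM PropertyForRent WHERE city = ?",
    ("Glasgow",),
).fetchall()
plan_text = " ".join(row[3] for row in plan)  # the fourth column holds the plan detail
print(plan_text)
```

The query text is the same with or without the index. Only the physical access path changes, which is why this decision belongs to the physical designer rather than to the application.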
• Application Developers: Once the database has been implemented,
the application programs that provide the required functionality for
the end-users must be implemented. This is the responsibility of the
application developers. Typically, the application developers work
from a specification produced by systems analysts. Each program
contains statements that request the DBMS to perform some
operation on the database, which includes retrieving data, inserting,
updating, and deleting data. The programs may be written in a third-
generation or fourth-generation programming language, as discussed
previously.
End Users
• The end-users are the “clients” of the database, which has been designed and implemented and
is being maintained to serve their information needs.
• End-users can be classified according to the way they use the system:
• Naïve users are typically unaware of the DBMS. They access the database through specially
written application programs that attempt to make the operations as simple as possible. They
invoke database operations by entering simple commands or choosing options from a menu. This
means that they do not need to know anything about the database or the DBMS. For example,
the checkout assistant at the local supermarket uses a bar code reader to find out the price of the
item. However, there is an application program present that reads the bar code, looks up the
price of the item in the database, reduces the database field containing the number of such items
in stock, and displays the price on the till.
• Sophisticated users: At the other end of the spectrum, the sophisticated end-user is familiar
with the structure of the database and the facilities offered by the DBMS. Sophisticated
end-users may use a high-level query language such as SQL to perform the required
operations. Some sophisticated end-users may even write application programs for their own use.
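The supermarket checkout example above can be sketched as a small application program that hides the database from the naïve user. This is a minimal illustration using Python's sqlite3 module. The Item table, its columns, the bar code value, and the function name scan_item are all hypothetical.

```python
import sqlite3


def scan_item(conn, barcode):
    """Look up the price, reduce the stock count by one, and return the price."""
    with conn:  # one transaction: committed on success, rolled back on error
        row = conn.execute(
            "SELECT price FROM Item WHERE barcode = ?", (barcode,)
        ).fetchone()
        if row is None:
            raise KeyError(f"unknown bar code: {barcode}")
        conn.execute(
            "UPDATE Item SET inStock = inStock - 1 WHERE barcode = ?", (barcode,)
        )
        return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Item (barcode TEXT PRIMARY KEY, price REAL, inStock INTEGER)"
)
conn.execute("INSERT INTO Item VALUES ('5000112', 1.25, 10)")
conn.commit()

# All the checkout assistant does is scan; the program does the rest.
price = scan_item(conn, "5000112")
remaining = conn.execute(
    "SELECT inStock FROM Item WHERE barcode = '5000112'"
).fetchone()[0]
print(price, remaining)
```

A sophisticated user could instead type the SELECT and UPDATE statements directly, because they know the table structure.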
History of DBMS
Time Frame | Development | Comments
1960s onwards | File-based systems | Early stage of the database system. Decentralized approach: each department stored and controlled its own data.
Mid-1960s | Hierarchical and network data models | Represents first-generation DBMSs. The main hierarchical system is IMS from IBM and the main network system is IDMS/R from Computer Associates. Lacked data independence and required complex programs to be developed to process the data.
1970 | Relational model proposed | Publication of E. F. Codd's seminal paper "A relational model of data for large shared data banks," which addresses the weaknesses of first-generation systems.
1970s | Prototype RDBMSs developed | During this period, two main prototypes emerged: the Ingres project at the University of California at Berkeley (started in 1970) and the System R project at IBM's San José Research Laboratory in California (started in 1974), which led to the development of SQL.
1976 | ER model proposed | Publication of Chen's paper "The entity-relationship model: toward a unified view of data." ER modeling becomes a significant component in methodologies for database design.
1979 | Commercial RDBMSs appear | Commercial RDBMSs like Oracle, Ingres, and DB2 appear. These represent the second generation of DBMSs.
1987 | ISO SQL standard | SQL is standardized by the ISO (International Organization for Standardization). There are subsequent releases of the standard in 1989, 1992 (SQL2), 1999 (SQL:1999), 2003 (SQL:2003), 2008 (SQL:2008), and 2011 (SQL:2011).
1990s | OODBMSs and ORDBMSs appear | This period initially sees the emergence of OODBMSs and later ORDBMSs (Oracle 8, with object features released in 1997).
1990s | Data warehousing systems appear | This period also sees releases from the major DBMS vendors of data warehousing systems and thereafter data mining products.
Mid-1990s | Web-database integration | The first Internet database applications appear. DBMS vendors and third-party vendors recognize the significance of the Internet and support web-database integration.
1998 | XML | XML 1.0 ratified by the W3C. XML becomes integrated with DBMS products and native XML databases are developed.
Advantages of DBMS
• Control of data redundancy: As we discussed in Section 1.2, traditional file-
based systems waste space by storing the same information in more than
one file. For example, in Figure 1.5, we stored similar data for properties for
rent and clients in both the Sales and Contracts Departments. In contrast,
the database approach attempts to eliminate the redundancy by integrating
the files so that multiple copies of the same data are not stored. However,
the database approach does not eliminate redundancy entirely, but
controls the amount of redundancy inherent in the database. Sometimes it
is necessary to duplicate key data items to model relationships; at other
times, it is desirable to duplicate some data items to improve performance.
The reasons for controlled duplication will become clearer as you read the
next few chapters.
• Data consistency: By eliminating or controlling redundancy, we
reduce the risk of inconsistencies occurring. If a data item is stored
only once in the database, any update to its value has to be
performed only once and the new value is available immediately to all
users. If a data item is stored more than once and the system is aware
of this, the system can ensure that all copies of the item are kept
consistent. Unfortunately, many of today’s DBMSs do not
automatically ensure this type of consistency.
• More information from the same amount of data: With the
integration of the operational data, it may be possible for the
organization to derive additional information from the same data. For
example, in the file-based system illustrated in Figure 1.5, the
Contracts Department does not know who owns a leased property.
Similarly, the Sales Department has no knowledge of lease details.
When we integrate these files, the Contracts Department has access
to owner details and the Sales Department has access to lease details.
We may now be able to derive more information from the same
amount of data.
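Once the files are integrated, the Contracts Department's question "who owns this leased property?" becomes a join. The sketch below uses Python's sqlite3 module. The table layouts are reduced to the columns needed here, and the lease number, client number, and the single ownerName column are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PrivateOwner    (ownerNo TEXT PRIMARY KEY, ownerName TEXT);
CREATE TABLE PropertyForRent (propertyNo TEXT PRIMARY KEY, ownerNo TEXT);
CREATE TABLE Lease           (leaseNo INTEGER PRIMARY KEY, propertyNo TEXT, clientNo TEXT);

INSERT INTO PrivateOwner    VALUES ('CO46', 'Joe Keogh');
INSERT INTO PropertyForRent VALUES ('PA14', 'CO46');
INSERT INTO Lease           VALUES (1, 'PA14', 'CR1');  -- hypothetical lease and client
""")

# Lease details (Contracts) and owner details (Sales) combined in one query.
rows = conn.execute("""
    SELECT l.leaseNo, p.propertyNo, o.ownerName
    FROM Lease AS l
    JOIN PropertyForRent AS p ON p.propertyNo = l.propertyNo
    JOIN PrivateOwner    AS o ON o.ownerNo    = p.ownerNo
""").fetchall()
print(rows)
```

No new data was collected. The extra information comes from linking data that the two departments already held separately.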
• Sharing of data: Typically, files are owned by the people or
departments that use them. On the other hand, the database belongs
to the entire organization and can be shared by all authorized users.
In this way, more users share more of the data. Furthermore, new
applications can build on the existing data in the database and add
only data that is not currently stored, rather than having to define all
data requirements again. The new applications can also rely on the
functions provided by the DBMS, such as data definition and
manipulation, and concurrency and recovery control, rather than
having to provide these functions themselves.
• Improved data integrity: Database integrity refers to the validity and
consistency of stored data. Integrity is usually expressed in terms of
constraints, which are consistency rules that the database is not
permitted to violate. Constraints may apply to data items within a
single record or to relationships between records. For example, an
integrity constraint could state that a member of staff’s salary cannot
be greater than $40,000 or that the branch number contained in a
staff record, representing the branch where the member of staff
works, must correspond to an existing branch office. Again,
integration allows the DBA to define integrity constraints, and the
DBMS to enforce them.
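The two constraints in the example can be written declaratively and left to the DBMS to enforce. The following is a sketch using Python's sqlite3 module. The table and column names, the staff and branch numbers, and the helper try_insert are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Branch (branchNo TEXT PRIMARY KEY);
CREATE TABLE Staff (
    staffNo  TEXT PRIMARY KEY,
    salary   REAL CHECK (salary <= 40000),               -- constraint within a record
    branchNo TEXT NOT NULL REFERENCES Branch(branchNo)   -- constraint between records
);
INSERT INTO Branch VALUES ('B001');
""")


def try_insert(staff_no, salary, branch_no):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Staff VALUES (?, ?, ?)", (staff_no, salary, branch_no)
            )
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert("S1", 30000, "B001"),  # valid
    try_insert("S2", 45000, "B001"),  # salary above the limit
    try_insert("S3", 30000, "B999"),  # branch does not exist
]
for result in results:
    print(result)
```

The rules are stated once, in the table definitions, so every application that writes to Staff is subject to them without repeating the checks.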
• Improved security: Database security is the protection of the database
from unauthorized users. Without suitable security measures, integration
makes the data more vulnerable than file-based systems. However,
integration allows the DBA to define database security, and the DBMS to
enforce it. This security may take the form of user names and passwords to
identify people authorized to use the database. The access that an
authorized user is allowed on the data may be restricted by the operation
type (retrieval, insert, update, delete). For example, the DBA has access to
all the data in the database; a branch manager may have access to all data
that relates to his or her branch office; and a sales assistant may have
access to all data relating to properties but no access to sensitive data such
as staff salary details.
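A multi-user DBMS would normally express such restrictions with user accounts and SQL privileges (GRANT and REVOKE). SQLite has no user accounts, so the sketch below only approximates the idea with the authorizer hook that Python's sqlite3 module exposes. The Staff table contents and the function name sales_assistant_authorizer are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO Staff VALUES ('S1', 'Ann', 30000)")
conn.commit()


def sales_assistant_authorizer(action, arg1, arg2, db_name, source):
    """Deny any read of Staff.salary; allow everything else."""
    if action == sqlite3.SQLITE_READ and arg1 == "Staff" and arg2 == "salary":
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK


# From here on, the connection behaves like a sales assistant's session.
conn.set_authorizer(sales_assistant_authorizer)

names = [row[0] for row in conn.execute("SELECT name FROM Staff")]

try:
    conn.execute("SELECT salary FROM Staff")
    salary_blocked = False
except sqlite3.DatabaseError:
    salary_blocked = True

print(names, salary_blocked)
```

The check is made by the database engine for every statement, not by each application, which is the point of letting the DBMS enforce security.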
• Enforcement of standards: Again, integration allows the DBA to
define and the DBMS to enforce the necessary standards. These may
include departmental, organizational, national, or international
standards for such things as data formats to facilitate exchange of
data between systems, naming conventions, documentation
standards, update procedures, and access rules.
• Economy of scale: Combining all the organization’s operational data
into one database and creating a set of applications that work on this
one source of data can result in cost savings. In this case, the budget
that would normally be allocated to each department for the
development and maintenance of its file-based system can be
combined, possibly resulting in a lower total cost, leading to an
economy of scale. The combined budget can be used to buy a system
configuration that is more suited to the organization’s needs. This may
consist of one large, powerful computer or a network of smaller
computers.
• Balance of conflicting requirements: Each user or department has
needs that may be in conflict with the needs of other users. Because
the database is under the control of the DBA, the DBA can make
decisions about the design and operational use of the database that
provide the best use of resources for the organization as a whole.
These decisions will provide optimal performance for important
applications, possibly at the expense of less-critical ones.
• Improved data accessibility and responsiveness: Again, as a result of
integration, data that crosses departmental boundaries is directly accessible
to the end users. This provides a system with potentially much more
functionality that can, for example, be used to provide better services to
the end-user or the organization’s clients. Many DBMSs provide query
languages or report writers that allow users to ask ad hoc questions and to
obtain the required information almost immediately at their terminal,
without requiring a programmer to write some software to extract this
information from the database. For example, a branch manager could list all
flats with a monthly rent greater than £400 by entering the following SQL
command at a terminal:
SELECT * FROM PropertyForRent WHERE type = 'Flat' AND rent > 400;
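To see the ad hoc query run, the sketch below loads a few rows into an in-memory SQLite database from Python and executes the same statement. The three properties and their rents are made-up sample data, and the table is reduced to three columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PropertyForRent (propertyNo TEXT PRIMARY KEY, type TEXT, rent INTEGER)"
)
conn.executemany(
    "INSERT INTO PropertyForRent VALUES (?, ?, ?)",
    [
        ("PA14", "House", 650),  # not a flat
        ("PG4", "Flat", 350),    # flat, but rent is not above 400
        ("PG16", "Flat", 450),   # flat with rent above 400
    ],
)
conn.commit()

# The branch manager's query, unchanged.
flats = conn.execute(
    "SELECT * FROM PropertyForRent WHERE type = 'Flat' AND rent > 400"
).fetchall()
print(flats)
```

No program had to be written for this particular question. The query language is general enough to answer it directly.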
• Increased productivity: As mentioned previously, the DBMS provides
many of the standard functions that the programmer would normally
have to write in a file-based application. At a basic level, the DBMS
provides all the low-level file-handling routines that are typical in
application programs. The provision of these functions allows the
programmer to concentrate on the specific functionality required by
the users without having to worry about low-level implementation
details. Many DBMSs also provide a fourth-generation environment,
consisting of tools to simplify the development of database
applications. This results in increased programmer productivity and
reduced development time (with associated cost savings).
• Improved maintenance through data independence: In file-based
systems, the descriptions of the data and the logic for accessing the data
are built into each application program, making the programs dependent
on the data. A change to the structure of the data—such as making an
address 41 characters instead of 40 characters, or a change to the way
the data is stored on disk—can require substantial alterations to the
programs that are affected by the change. In contrast, a DBMS separates
the data descriptions from the applications, thereby making applications
immune to changes in the data descriptions. This is known as data
independence and is discussed further in Section 2.1.5. The provision of
data independence simplifies database application maintenance.
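A small sketch of data independence, using Python's sqlite3 module: the application function below names only the columns it needs, so it keeps working after the table gains a column and an index. The Client columns, the sample row, and the function name client_address are assumptions.

```python
import sqlite3


def client_address(conn, client_no):
    """Application code: depends only on the clientNo and address columns."""
    row = conn.execute(
        "SELECT address FROM Client WHERE clientNo = ?", (client_no,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Client (clientNo TEXT PRIMARY KEY, address TEXT)")
conn.execute("INSERT INTO Client VALUES ('CR76', '56 High St')")
conn.commit()

before = client_address(conn, "CR76")

# The DBA changes the data description and the storage structures...
conn.execute("ALTER TABLE Client ADD COLUMN telNo TEXT")
conn.execute("CREATE INDEX idx_client_address ON Client (address)")
conn.commit()

# ...and the unchanged application still returns the same answer.
after = client_address(conn, "CR76")
print(before, after)
```

In a file-based program that reads fixed-width records, the same change would typically mean editing and recompiling every program that reads the file.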
• Increased concurrency: In some file-based systems, if two or more
users are allowed to access the same file simultaneously, it is possible
that the accesses will interfere with each other, resulting in loss of
information or even loss of integrity. Many DBMSs manage
concurrent database access and ensure that such problems cannot
occur. We discuss concurrency control in Chapter 22.
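The kind of interference described here is often called the lost update problem. The sketch below is a deliberately simplified, single-threaded illustration in Python. The first half replays a bad interleaving by hand, and the second half lets SQLite apply each change as one atomic UPDATE. The Item table and its values are assumptions, and real concurrency control involves locking or similar protocols that are not shown.

```python
import sqlite3

# Uncontrolled access: two users read the same value, then both write.
stock_on_file = 10
read_by_a = stock_on_file      # user A reads 10
read_by_b = stock_on_file      # user B reads 10 before A has written
stock_on_file = read_by_a - 1  # A writes 9
stock_on_file = read_by_b - 1  # B also writes 9, so A's update is lost
file_value = stock_on_file

# With a DBMS: each sale is a single atomic statement inside a transaction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Item (itemNo TEXT PRIMARY KEY, inStock INTEGER)")
conn.execute("INSERT INTO Item VALUES ('I1', 10)")
conn.commit()

for _ in range(2):  # two sales
    with conn:
        conn.execute("UPDATE Item SET inStock = inStock - 1 WHERE itemNo = 'I1'")

db_value = conn.execute("SELECT inStock FROM Item WHERE itemNo = 'I1'").fetchone()[0]
print(file_value, db_value)
```

Two items were sold in both halves, but only the second half records both sales.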
• Improved backup and recovery: Many file-based systems place the
responsibility on the user to provide measures to protect the data
from failures to the computer system or application program. This
may involve performing a nightly backup of the data. In the event of a
failure during the next day, the backup is restored and the work that
has taken place since this backup is lost and has to be re-entered.
• In contrast, modern DBMSs provide facilities to minimize the amount
of processing that is lost following a failure. We discuss database
recovery in Section 22.3.
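One such facility is the transaction: if a failure occurs partway through a unit of work, the DBMS undoes the partial changes instead of leaving the data half-updated. The sketch below simulates an application failure using Python's sqlite3 module. The Account table, its balances, and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (accNo TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()


def transfer(amount, fail_midway=False):
    """Move money from A to B as one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE Account SET balance = balance - ? WHERE accNo = 'A'", (amount,)
        )
        if fail_midway:
            raise RuntimeError("simulated application failure")
        conn.execute(
            "UPDATE Account SET balance = balance + ? WHERE accNo = 'B'", (amount,)
        )


try:
    transfer(50, fail_midway=True)
except RuntimeError:
    pass  # the failure happened after the first UPDATE

# The half-finished transfer has been rolled back.
balances = dict(conn.execute("SELECT accNo, balance FROM Account"))
print(balances)
```

Only the interrupted transaction is lost. Everything committed before the failure is still in place, in contrast to restoring last night's backup.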
Disadvantages of DBMS
• Complexity: The provision of the functionality that we expect of a
good DBMS makes the DBMS an extremely complex piece of
software. Database designers and developers, data and database
administrators, and end-users must understand this functionality to
take full advantage of it. Failure to understand the system can lead to
bad design decisions, which can have serious consequences for an
organization.
• Size: The complexity and breadth of functionality make the DBMS an
extremely large piece of software, occupying many megabytes of disk
space and requiring substantial amounts of memory to run efficiently.
• Cost of DBMSs: The cost of DBMSs varies significantly, depending on the
environment and functionality provided. For example, a single-user DBMS
for a personal computer may only cost $100. However, a large mainframe
multi-user DBMS servicing hundreds of users can be extremely expensive,
perhaps $100,000 or even $1,000,000. There is also the recurrent annual
maintenance cost, which is typically a percentage of the list price.
• Additional hardware costs: The disk storage requirements for the DBMS and
the database may necessitate the purchase of additional storage space.
Furthermore, to achieve the required performance, it may be necessary to
purchase a larger machine, perhaps even a machine dedicated to running
the DBMS. The procurement of additional hardware results in further
expenditure.
• Cost of conversion: In some situations, the cost of the DBMS and extra hardware may be
relatively small compared with the cost of converting existing applications to run on the new
DBMS and hardware. This cost also includes the cost of training staff to use these new
systems, and possibly the employment of specialist staff to help with the conversion and
running of the systems. This cost is one of the main reasons why some organizations feel tied
to their current systems and cannot switch to more modern database technology. The term
legacy system is sometimes used to refer to an older, and usually inferior, system.
• Performance: Typically, a file-based system is written for a specific application, such as
invoicing. As a result, performance is generally very good. However, the DBMS is written to
be more general, to cater for many applications rather than just one. The result is that some
applications may not run as fast as they used to.
• Greater impact of a failure: The centralization of resources increases the vulnerability of the
system. Because all users and applications rely on the availability of the DBMS, the failure of
certain components can bring operations to a halt.