Lecture # 1 Introduction to
Databases
Rashmi Dutta Baruah
Department of Computer Science & Engineering
Outline
• Databases
• Database Management System
• DBMS functionality
• Characteristics and advantages of DBMS
• When not to use DBMS
• Data Models
• Three-schema architecture
• Data independence
• DBMS Language
• DBMS components
• DBMS architecture
2
Databases
• You may not notice, but you encounter several activities
everyday that involve some interaction with a database.
– Banking and Financial Services, Airline/Railway reservation,
Government and Public Services (tax filing) : Traditional
database applications – stored or accessed information is
textual or numeric.
– Online search engines (Google), Social media platforms
(Instagram, LinkedIn), E-commerce platforms (Amazon) : NOSQL
Systems - stored or accessed information can be documents,
programs, images, videos.
3
What is a Database?
• Database
– Collection of related data where data is known facts that
can be recorded and have implicit meaning.
• It has following implicit properties:
– represents some aspects of real-world
– logically coherent collection of data with some inherent
meaning
– designed, built, and populated with data for a specific
purpose
4
What is a DBMS?
• Database Management
System (DBMS)
– general purpose software
that enables creation and
maintenance of database
• Database System
– The DBMS software
together with the data
itself. Sometimes, the
applications are also
included. A simplified database system environment
5
Typical DBMS Functionality
• Define a database in terms of data types, structures
and constraints
• Construct or Load the Database on a secondary
storage medium
• Manipulating the database - querying, generating
reports, insertions, deletions and modifications to its
content
• Concurrent Processing and Sharing by a set of users
and programs – yet, keeping all data valid and
consistent
• Other features include protecting and maintaining
the database.
6
Example of a Database
• Mini-world for the example: Part of a UNIVERSITY
environment.
• Some mini-world entities:
– STUDENTs
– COURSEs
– SECTIONs (of COURSEs)
– (academic) DEPARTMENTs
– INSTRUCTORs
• Some mini-world relationships:
– SECTIONs are of specific COURSEs
– STUDENTs take SECTIONs
– COURSEs have prerequisite COURSEs
– INSTRUCTORs teach SECTIONs
– COURSEs are offered by DEPARTMENTs
– STUDENTs major in DEPARTMENTs
7
Characteristics of Database Approach
• The main characteristics of the database approach versus the
file-processing approach:
• Self-describing nature of a database system
– A DBMS catalog stores the description of the database.
The description is called meta-data
Relation Columns
Relation_name No_of_columns Column_ Data_type Belongs_to_relation
name
STUDENT 4
Name Char(30) STUDENT
COURSE 4
Student_ Char(4) STUDENT
SECTION 5 number
… …
8
Characteristics of Database Approach
• Insulation between programs and data
– Called program-data independence. Allows changing data
storage structures and operations without having to change the
DBMS access programs.
• Support of multiple views of the data
– Each user may see a different view of the database, which
describes only the data of interest to that user.
• Sharing of data and multiuser transaction processing
– allowing a set of concurrent users to retrieve and to update the
database. Concurrency control within the DBMS guarantees that
each transaction is correctly executed or completely aborted
9
Advantages of using Database approach
• Controlling redundancy in data storage and in
development and maintenance efforts
• Sharing of data among multiple users
• Restricting unauthorized access to data
• Providing persistent storage for program Objects
• Providing Storage Structures for efficient Query
Processing
• Providing backup and recovery services.
• Providing multiple interfaces to different classes of users.
• Representing complex relationships among data.
10
Advantages of using Database approach
• Enforcing integrity constraints on the database.
• Drawing Inferences and Actions using rules
• Potential for enforcing standards
• Reduced application development time
• Flexibility to change data structures
• Availability of up-to-date information
• Economies of scale
11
When not to use a DBMS
• Main inhibitors (costs) of using a DBMS:
– High initial investment and possible need for additional hardware.
– Overhead for providing generality, security, concurrency control,
recovery, and integrity functions.
• When a DBMS may be unnecessary:
– If the database and applications are simple, well defined, and not
expected to change.
– If there are stringent real-time requirements that may not be met
because of DBMS overhead.
– If access to data by multiple users is not required.
• When no DBMS may suffice:
– If the database system is not able to handle the complexity of data
because of modeling limitations
– If the database users need special operations not supported by the
DBMS.
12
When not to use a DBMS
Use Case DBMS Useful? Reason
University System
E-commerce Platform
Smart Microwave Oven
Local craft store website
(static)
13
When not to use a DBMS
Use Case DBMS Useful? Reason
Structured, multi-user,
University System Yes
large-scale data
High concurrency,
E-commerce Platform Yes
reliability, analytics
Real-time response, no
Smart Microwave Oven No
persistent complex data
Local craft store website Simple data, rare updates,
No
(static) no query needed
14
Data Models
• Data Model: A set of concepts to describe the
structure of a database, and certain constraints that
the database should obey.
• Data Model Operations: Operations for specifying
database retrievals and updates by referring to the
concepts of the data model. Operations on the data
model may include basic operations and user-defined
operations.
15
Categories of data models
• Conceptual (high-level, semantic) data models:
Provide concepts that are close to the way many
users perceive data. (Also called entity-based or
object-based data models.)
• Physical (low-level, internal) data models: Provide
concepts that describe details of how data is stored
in the computer.
• Implementation (representational) data models:
Provide concepts that fall between the above two,
balancing user views with some computer storage
details.
16
Schemas versus Instances
• Database Schema: The description of a database.
Includes descriptions of the database structure and
the constraints that should hold on the database.
• Schema Diagram: A diagrammatic display of (some
aspects of) a database schema.
• Schema Construct: A component of the schema or
an object within the schema, e.g., STUDENT,
COURSE.
• Database Instance: The actual data stored in a
database at a particular moment in time. Also called
database state (or occurrence).
17
Database Schema Vs. Database State
• Database State: Refers to the content of a database
at a moment in time.
• Initial Database State: Refers to the database when
it is loaded
• Valid State: A state that satisfies the structure and
constraints of the database.
• Distinction
• The database schema changes very infrequently. The
database state changes every time the database is
updated.
• Schema is also called intension, whereas state is called
extension.
18
Three-Schema Architecture
• Proposed to support DBMS characteristics of:
• Program-data independence.
• Support of multiple views of the data.
• Defines DBMS schemas at three levels:
• Internal schema at the internal level to describe physical
storage structures and access paths. Typically uses a physical
data model.
• Conceptual schema at the conceptual level to describe the
structure and constraints for the whole database for a
community of users. Uses a conceptual or an implementation
data model.
• External schemas at the external level to describe the various
user views. Usually uses the same data model as the conceptual
level.
19
Three Schema architecture
20
Image source: https://medium.com/nixis-institute/dbms-three-level-architecture-c96ba7a21f5b
Data Independence
• Logical Data Independence: The capacity to change
the conceptual schema without having to change the
external schemas and their application programs.
• Physical Data Independence: The capacity to change
the internal schema without having to change the
conceptual schema.
21
Data Independence
• Mappings among schema levels are needed to transform
requests and data. Programs refer to an external schema,
and are mapped by the DBMS to the internal schema for
execution.
• When a schema at a lower level is changed, only the
mappings between this schema and higher-level
schemas need to be changed in a DBMS that fully
supports data independence. The higher-level schemas
themselves are unchanged. Hence, the application
programs need not be changed since they refer to the
external schemas.
22
DBMS Languages
• Data Definition Language (DDL): CREATE TABLE student (
Used by the DBA and database s_id INT PRIMARY KEY,
name VARCHAR(100),
designers to specify the dept VARCHAR(50),
conceptual schema of a database. class INT,
INDEX (department)
In many DBMSs, the DDL is also ) ENGINE = InnoDB;
used to define internal and
external schemas (views). In CREATE TABLE student (
s_id INT PRIMARY KEY,
some DBMSs, separate storage name VARCHAR(100) NOT
definition language (SDL) and NULL,
dept_id INT,
view definition language (VDL) FOREIGN KEY
are used to define internal and (department_id) REFERENCES
departments(dept_id)
external schemas. );
Slide 2-23
DBMS Languages
• Data Manipulation Language (DML): Used to specify
database retrievals and updates.
• DML commands (data sublanguage) can be embedded in a
general-purpose programming language (host language), such
as COBOL, C or an Assembly Language.
• Alternatively, stand-alone DML commands can be applied
directly (query language).
• In current DBMSs, the mentioned languages are usually
not considered distinct languages.
Slide 2-24
DBMS Component Modules
25
DBMS Architectures: Centralized and
Client-Server
• Centralized DBMS: combines everything into single system
including- DBMS software, hardware, application programs
and user interface processing software.
• Basic Client-Server Architectures
• Clients
• DBMS Server
• Provides database query and transaction services to the
clients
• Sometimes called query and transaction servers
Slide 2-26
Two Tier Client-Server Architecture
• User Interface Programs and Application Programs
run on the client side
• Interface called ODBC (Open Database Connectivity)
provides an Application program interface (API) allow
client side programs to call the DBMS. Most DBMS
vendors provide ODBC drivers.
Slide 2-27
Three Tier Client-Server Architecture
• Common for Web applications
• Intermediate Layer called Application Server or Web Server:
• stores the web connectivity software and the rules and
business logic (constraints) part of the application used to
access the right amount of data from the database server
• acts like a conduit for sending partially processed data
between the database server and the client.
• Additional Features- Security:
• encrypt the data at the server before transmission
• decrypt data at the client
Slide 2-28
GUI, Presentation
Web Interface Layer
Application
Business Logic
Programs,
Layer
Web Pages
Database Database
Management Services
System Layer
Figure: 3-tier client server architecture
29
Summary
• Database - a collection of related data, where data means
recorded facts –represents some aspect of real world – used
for specific purpose
• DBMS –generalized software package for implementing and
maintaining database
• Discussed characteristics of database approach versus file
processing applications
• Discussed advantages of DBSM and situations where it is not
advantageous to use it
• Languages that DBMS support
• DBMS components and 2-tier and 3-tier client server
architecture
30