Unit 4
Unit 4
Catalogue
4.0 INTRODUCTION
The system catalogue is a collection of tables and views that contain important
information about a database. It is the place where a relational database management
system stores schema metadata, such as information about tables and columns, and
internal bookkeeping information. A system catalogue is available for each database.
Information in the system catalogue defines the structure of the database. For
example, the DDL (data dictionary language) for all tables in the database is stored in
the system catalogue. Most system catalogues are copied from the template database
during database creation, and are thereafter database-specific. A few catalogues are
physically shared across all databases in an installation; these are marked in the
descriptions of the individual catalogues.
The system catalogue for a database is actually part of the database. Within the
database are objects, such as tables, indexes, and views. The system catalogue is
basically a group of objects that contain information that defines other objects in the
database, the structure of the database itself, and various other significant
information.
The system catalogue may be divided into logical groups of objects to provide tables
that are accessible by not only the database administrator, but by any other database
user as well. A user typically queries the system catalogue to acquire information on
the user’s own objects and privileges, whereas the DBA needs to be able to inquire
about any structure or event within the database. In some implementations, there are
system catalogue objects that are accessible only to the database administrator.
The terms system catalogue and data dictionary have been used interchangeably in
most situations. We will make a distinction in the two terms in the context of data
dictionary system, which is an implementation of the data dictionary or system
catalogue. This unit provides more information on the system catalogue or data
dictionary.
65
Database Design and
Implementation
4.1 OBJECTIVES
After going through this unit, you should be able to:
Security and authorisation information is also kept in the catalogue, which describes:
The system catalogue can also be used to store some statistical and descriptive
information about relations. Some such information can be:
Let us show a catalogue structure for base relation with the help of an example.
Figure 1 shows a relational schema, and Figure 2 shows its catalogue. The catalogue
here has stored - relation names, attribute names, attribute type and primary key as
well as foreign key information.
Student
(RollNo, Name, Address, PhoneNo, DOB, email, CourseNo, DNo)
66
Database System
Catalogue
Course
(CourseNo, CourseTitle, Professor)
Dept
(DeptNo, DeptName, Location)
Grade
(RollNo, CourseNo, Grade)
The description of the relational database schema in the Figure 1 is shown as the
tuples (contents) of the catalogue relation in Figure 2. This entry is called as
CAT_ENTRY. All relation names should be unique and all attribute names within a
particular relation should also unique. Another catalogue relation can store
information such as tuple size, current number of tuples, number of indexes, and
creator name for each relation.
Data dictionaries also include data on the secondary keys, indexes and views. The
above could also be extended to the secondary key, index as well as view information
by defining the secondary key, indexes and views. Data dictionaries do not contain
any actual data from the database, it contains only book-keeping information for
managing it. Without a data dictionary, however, a database management system
cannot access data from the database.
67
Database Design and
Implementation
The data dictionary stores useful metadata, such as field descriptions, in a format that
is independent of the underlying database system. Some of the functions served by
the Data Dictionary include:
• ensuring efficient data access, especially with regard to the utilisation of
indexes,
• partitioning the database into both logical and physical regions,
• specifying validation criteria and referential constraints to be automatically
enforced,
• supplying pre-defined record types for Rich Client features, such as security
and administration facilities, attached objects, and distributed processing (i.e.,
grid and cluster supercomputing).
2) Integrity constraints are applied to relations only and are not stored
separately in the Data Dictionary.
If a data dictionary system is used only by designers, users, and administrators, and
not by the DBMS software, it is called a passive data dictionary; otherwise, it is
called an active data dictionary or data directory. An active data dictionary is
automatically updated as changes occur in the database. A passive data dictionary
must be manually updated.
The data dictionary consists of record types (tables) created in the database by
system-generated command files, tailored for each supported back-end DBMS.
Command files contain SQL statements for CREATE TABLE, CREATE UNIQUE
68
Database System
Catalogue
INDEX, ALTER TABLE (for referential integrity), etc., using the specific SQL
statement required by that type of database.
• support for user-designed entity types (this is often called the “extensibility”
feature); this facility is often exploited in support of data modelling, to record
and cross-reference entities, relationships, data flows, data stores, processes,
etc.
• the ability to distinguish between versions of entities (e.g., test and production)
• enforcement of in-house standards and conventions.
• a query facility, both for administrators and casual users, which includes the
ability to perform generic searches on business definitions, user descriptions,
synonyms, etc.
• automated input facilities (e.g., to load record descriptions from a copy library).
• security features
69
Database Design and
Implementation
• simplifying documentation
• automating programming activities.
• reduce maintenance effort by identifying the impact of change as it affects:
• users,
• data base administrators,
• programmers.
• improve the quality of application software by enforcing standards in the
development process
• ensure application system longevity by maintaining documentation beyond
project completions
• data dictionary information created under one database system can easily be
used to generate the same database layout on any of the other database systems
BFC supports (Oracle, MS SQL Server, Access, DB2, Sybase, SQL Anywhere,
etc.)
These benefits are maximised by a fully utilised data dictionary. As the next section
will show, our environment is such that not all of these benefits are immediately
available to us.
Not only is the data dictionary central to every database, it is an important tool for all
users, from end users to application designers and database administrators. SQL
statements are used to access the data dictionary. Because the data dictionary is read-
only, only SELECT statements can be used against its tables and views.
70
Database System
Catalogue
The data dictionary is structured in tables and views, just like other database data. All
the data dictionary tables and views for a given database are stored in that database’s
SYSTEM table space. Access to the data dictionary is allowed through numerous
views, which may be divided into three categories: USER, ALL, and DBA.
• have columns identical to the other views, except that the column OWNER is
implied,
For example, the following query returns all the objects contained in your schema:
SELECT object_name, object_type FROM USER_OBJECTS;
71
Database Design and
Implementation
• Oracle modifies the data dictionary every time that a data definition language
(DDL) statement is issued.
• Any Oracle user can use the data dictionary as a read-only reference for
information about the database.
The system catalogue in Oracle contains information on all three levels of database
schemas: external which has view definitions, conceptual - which defines base tables,
and internal schema - giving the storage and index descriptions. Some of the
catalogue views relating to each of the three schema levels are given below. The
catalogue (metadata) data can be retrieved through SQL statements just as the user’s
actual data.
The first query is the example of the conceptual schema information where an object
owned by a particular user, ‘KARAN’ can be obtained:
SELECT * FROM ALL-CATALOGUE WHERE OWNER = 'KARAN';
The result of this query could be as shown in Figure 3, which indicates that three
base tables are owned by KARAN: STUDENT, COURSE, and DEPT, plus a view
STUCRS. The columns are retrieved as shown.
To find some of the information describing the columns of the GRADE table for
'KARAN', the following query could be submitted:
SELECT COLUMN-NAME, DATA-TYPE, DATA-LENGTH, NUM-DISTINCT,
LOW-VALUE, HIGH-VALUE
FROM USER-TAB-COLUMNS
WHERE TABLE-NAME = ‘GRADE’;
The result of this query could be shown as in Figure 4. Because the USER- TAB-
COLUMNS table of the catalogue has the prefix USER, this query must be submitted
by the owner of the DEPT table. The NUM-DISTINCT column specifies the number
of distinct values for a given column and LOW-VALUE and HIGH-VALUE show
the lowest and highest values for the given column.
GRADE VARCHAR 2 8 F A
72
Database System
Catalogue
select table_name as table_name,
as column_name,
entity_definition as entity_definition,
entity_note as entity_note,
as column_datatype,
as default_value,
as pk, as fk, as ak,
as attribute_definition,
as sample_instance_data,
query as query
from entity
union
select table_usage,
column_name,
column_datatype,
null_option,
default_value,
pk, fk, ak,
attribute_definition,
from attribute
order by 1, 2
At runtime each detail row displays either table data, or column data, but not both. A
listing of GRANT commands is given in Figure 6. This gives the privileges to the
users according to the GRANT commands.
Some of these details are specific to the needs of one particular data base application.
Their primary importance is to the human reader, not computer programme. Many of
these details are available elsewhere; for example, the data types are visible in the
database diagram shown in Figure 7, and the default values and access privileges
may be discovered by querying the system catalogue (Figure 8). The dictionary
73
Database Design and
Implementation
serves to gather the important information together in one place and, of course, to
provide the table and column descriptions that make a dictionary what it is.
Addr_Info Table
Business or any other address Information
End Users have SELECT, INSERT and UPDATE rights
The main reason to store dictionary information in the database diagram itself is the
ease of data entry. It is quite hard to ensure correctness and completeness if the
dictionary data is stored separately from the diagram.
The text stored in the query column has been written to be both readable by human
beings and usable by the SQL command in Figure 7. Precise positive statements are
used, such as “End users have select, insert and update rights” rather than vague or
negative sentences like “End users can’t remove any rows from this table”.
A data dictionary is a valuable tool if it contains practical and useful information, and
if it is kept up to date and accurate. That implies it must reflect the physical reality of
the database rather than simple logical view, and that it must be easy to update and
generate.
The applications of the database systems over the time were required to support
distributed nature of organisation and object oriented system design. Let us discuss
the issues of data dictionary in the context of these advanced database systems.
The distributed database catalogue entries must specify site(s) at which data is being
stored in addition to data in a system catalogue in a centralised DBMS. Because of
data partitioning and replication, this extra information is needed. There are a number
of approaches to implementing a distributed database catalogue.
75
Database Design and
Implementation
Catalogues are generally updated during creation and/or deletion of relations and
modification of attributes. Certain characteristics of the data affect the access and
query evaluation. For example, in some distributed catalogues such as R* information
of locally stored objects including the fragments and replicates is stored on local
catalogues. The object storage sites of any object are maintained at the place where
the object was first created and the new sites where the object is moved is updated.
This is a partitioned scheme where the catalogue is partitioned and replicated as per
the demands of usage patterns.
In some distributed systems local and global relations are different. Global relations
can be accessed from all sites. A catalogue containing information of a global nature
is maintained at all sites. This is a fully replicated or redundant scheme where - one
copy of the catalogue is kept at each site.
76
Database System
Catalogue
Remote catalogue information can be requested by any site and stored for later use.
However, this stored catalogue may become inconsistent over time as the remote
catalogue gets updated. Any query accessing the stored catalogue yields out-of-date
information. Hence, the stored remote catalogue has to be updated from time to time.
Data dictionary can also support the process of database application development and
testing (although this is not a direct relationship to database administration) as they
contain the basic documentation while the systems are in the process of being
developed.
77
Database Design and
Implementation
…………………………………………………………………………………
………
3) How are Object Oriented catalogues implemented?
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
………
4.7 SUMMARY
This unit provides an detailed view of a data dictionary in a DBMS. The data
dictionary is one of the most important implementation tools in a database system.
The system catalogue provides all the information on various entities, attributes,
database statistics etc. It is a very useful tool if implemented actively in a database
system. However, active implementation of a data dictionary is costly.
In this unit we have discussed concepts related to data dictionary and its use in oracle
by the different types of users. We have also presented information on the data
dictionary system and its advantages and disadvantages. We have provided a brief
introduction to system catalogue in distributed systems and how data dictionary is
useful in system administration.
4.8 SOLUTIONS/ANSWERS
Check Your Progress 1
1) False 2) False 3) True 4) False 5) True
78
Database System
Catalogue
c) Partitioned: Partitions and replicates the catalogue as per the demands of
the usage pattern
d) Centralised/partitioned: Combination of the above
79