Spatial, Multimedia, and Mobile Databases
Spatial Databases
Spatial data is data associated with geographic locations such as cities and towns. A spatial database is
optimized to store and query data that represents objects defined in a geometric space.
Characteristics of Spatial Database
A spatial database system has the following characteristics:
• It is a database system.
• It offers spatial data types (SDTs) in its data model and query language.
• It supports spatial data types in its implementation, providing at least spatial indexing and efficient
algorithms for spatial join.
Example
A road map is a visualization of geographic information. A road map is a 2-dimensional object which
contains points, lines, and polygons that can represent cities, roads, and political boundaries such as states or
provinces.
In general, spatial data can be of two types:
• Vector data: This data is represented as discrete points, lines and polygons.
• Raster data: This data is represented as a matrix of square cells.
The spatial data in the form of points, lines, polygons etc. is used by many different databases as shown
above.
Depending on the type of problem that needs to be solved, the type of maps that need to be made, and the
data source, either raster or vector, or a combination of the two can be used. Each data model has strengths
and weaknesses in terms of functionality and representation. As you get more experience with GIS, you will
be able to determine which data type to use for a particular application.
There are two main types of spatial data models: the raster and vector models. The raster data model
represents spatial data as a grid of cells, and each cell has one non-spatial attribute associated with it. The
vector data model represents spatial data as either points, lines, or polygons that are each linked to one or
more non-spatial attributes. These two models represent the world in fundamentally different ways. One is
not inherently better than the other, but they are better suited for different circumstances. The choice of
which model to use is often dictated by three main factors:
1. The type of phenomena we are trying to represent.
2. The scale at which we plan to analyze our data.
3. How we plan to use the data.
Figure 3.18: Representing space in the raster model vs. the vector model.
Raster Data Model
The raster data model represents a phenomenon across space as a gridded set of cells (or pixels). The cell size
determines the resolution of the raster image, that is, the smallest feature we can resolve with the raster. A
10 m resolution raster has cells that are 10 m x 10 m (100 m²); a 2 m resolution raster has cells that are
2 m x 2 m (4 m²). Along with the cell size, the number of rows and columns dictates the extent (or bounds) of a
raster image. A raster with a 1 m cell size, 5 rows, and 5 columns will cover an area of 5 m x 5 m (25 m²).
Because of the full coverage within their bounds, raster data models are very well suited for representing
continuous phenomena, where cell values correspond to measured (or estimated) values at specific locations. In
GIS, rasters are commonly encountered as satellite and drone imagery, elevation models, climate data, model
outputs, and scanned maps.
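The relationship between cell size, rows, columns, and extent described above can be sketched in a few lines of Python. This is only an illustration: the `Raster` class, its field names, and the choice of a lower-left origin are assumptions made here, not the API of any particular GIS package.

```python
from dataclasses import dataclass


@dataclass
class Raster:
    cell_size: float    # edge length of one square cell, in metres
    rows: int
    cols: int
    x_min: float = 0.0  # assumed origin: lower-left corner of the grid
    y_min: float = 0.0

    def extent(self):
        """Return (width, height) covered by the raster, in metres."""
        return (self.cols * self.cell_size, self.rows * self.cell_size)

    def cell_area(self):
        """Area of a single cell, in square metres."""
        return self.cell_size ** 2

    def cell_index(self, x, y):
        """Return (row, col) of the cell containing (x, y), or None if outside."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((y - self.y_min) // self.cell_size)
        if 0 <= row < self.rows and 0 <= col < self.cols:
            return (row, col)
        return None


r = Raster(cell_size=1.0, rows=5, cols=5)
print(r.extent())              # (5.0, 5.0)
print(r.cell_index(2.5, 4.2))  # (4, 2)
```

Rows are counted upward from the assumed lower-left origin; real raster formats often count rows from the top instead.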
Figure 3.19: Example of raster data.
The value of a pixel can be quantitative (e.g. elevation) or qualitative (e.g. land use). Each pixel/cell can only
have a single value associated with it. Multiple bands can be combined to store more information, as is
done with an RGB color photograph. Algebraic expressions can also be evaluated quickly and efficiently
with raster layers as inputs. This is known as raster overlay (or raster math), and it is one of the key
advantages of raster data. If layer A = Average July Temperature and layer B = Average January Temperature,
then A – B will give us the Average Temperature Range across the raster’s domain.
Figure 3.20: Raster math illustration.
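The A – B example can be sketched as a cell-by-cell subtraction. The snippet below is a minimal illustration using plain nested lists; the function name `raster_subtract` and the temperature values are made up for the example.

```python
def raster_subtract(a, b):
    """Cell-by-cell difference of two rasters with identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("rasters must have the same number of rows and columns")
    return [[ca - cb for ca, cb in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical average temperatures in degrees Celsius
july = [[18, 20], [22, 25]]
january = [[-5, -3], [0, 4]]

temperature_range = raster_subtract(july, january)
print(temperature_range)  # [[23, 23], [22, 21]]
```

Because both layers share the same grid, no geometric computation is needed: each output cell depends only on the two input cells at the same row and column.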
Raster data relies on spatial autocorrelation and the First Law of Geography: the model assumes that all
areas within a given cell are equally well represented by the cell value. Depending on the resolution of the
raster and the scale of the task at hand, this may or may not be a reasonable assumption. If you are trying to
represent the coastline of Nova Scotia, 100 m or even 1 km resolution cells will likely suffice (see Figure
3.21). However, 10 km cells severely degrade the quality of the representation, and at a 100 km cell size the
province is unrecognizable.
Figure 3.21: Raster Resolution.
Vector Data Model
The vector data model is much better suited to representing discrete phenomena than the raster data model. A
vector feature is a representation of a discrete object as a set of (x,y) coordinate pairs (points) linked to a
set of descriptive attributes about that object. A vector feature’s coordinates can consist of just one (x,y)
pair to form a single point feature, or multiple points which can be connected to form lines or polygons (see
Figure 3.22). The non-spatial attribute data is typically stored in a tabular format separate from the spatial
data, and it is linked using an index. One of the key advantages of the vector model is the ability to store
many attributes and retrieve them quickly. In GIS, vector data are commonly encountered as political
boundaries, census data, pathways (roads, trails, etc.), point locations (stop signs, fire hydrants), etc.
Figure 3.22: Vector objects (points, lines, or polygons) are stored along with any number of attributes. Point,
line, and polygon data are typically stored in separate files.
Points are “zero-dimensional”: they have no length, width, or area. A point feature is just an individual
(x,y) coordinate pair representing a precise location that has some linked attribute information. Points are
great for representing a variety of objects, depending on the scale. Fire hydrants, light poles, and trees are
suitable to be represented as points in almost any application. If you are making a map of mines in British
Columbia, or cities across Canada, it is probably acceptable to just display them as points.
Figure 3.23: An example of point data showing locations of trees. The points are labeled with their index
(unique ID number) which corresponds to the attribute table below which stores more information about
each tree.
index  Longitude (X)  Latitude (Y)  Name   Age  Height
0      0.44           0.03          Fir    54   119
1      0.55           0.71          Fir    29   56
2      0.89           0.33          Fir    82   197
3      0.18           0.02          Fir    46   98
4      0.65           0.51          Maple  87   212
5      0.43           0.81          Maple  73   172
6      0.38           0.86          Maple  94   233
7      0.68           0.04          Cedar  34   68
8      0.15           0.13          Cedar  36   73
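The idea that point geometry and attribute data are stored separately and linked by an index can be sketched as two dictionaries that share the same keys. The layout below is an assumption for illustration (real GIS formats differ); the values are a subset of the table above, and `locations_where` is a hypothetical helper.

```python
# Geometry: index -> (longitude, latitude), values copied from the table above
points = {
    0: (0.44, 0.03),
    4: (0.65, 0.51),
    5: (0.43, 0.81),
    7: (0.68, 0.04),
}

# Attributes: index -> non-spatial record, stored separately from the geometry
attributes = {
    0: {"name": "Fir", "age": 54, "height": 119},
    4: {"name": "Maple", "age": 87, "height": 212},
    5: {"name": "Maple", "age": 73, "height": 172},
    7: {"name": "Cedar", "age": 34, "height": 68},
}


def locations_where(predicate):
    """Return {index: (x, y)} for every tree whose attributes satisfy predicate."""
    return {i: points[i] for i, rec in attributes.items() if predicate(rec)}


maples = locations_where(lambda rec: rec["name"] == "Maple")
print(maples)  # {4: (0.65, 0.51), 5: (0.43, 0.81)}
```

The attribute filter runs without touching the coordinates; the shared index is only used at the end to fetch the matching locations.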
Lines are one-dimensional: they have length, but no width and thus no area. A line consists of two or more
points. Every line must have a start point and an end point, and it may also have any number of intermediate
points, called vertices. A vertex is any point where two or more line segments meet. Lines are also great for
representing a variety of objects, depending on the scale. Hiking trails, flight paths, coastlines, and power
lines are suitable to be represented as lines in almost all applications. When making smaller scale maps, it is
often sufficient to represent rivers as lines, though at larger scales we might elect to use a polygon.
Figure 3.24: Roads are typically represented as line data. Though they obviously have an area, unless we are
making a very large scale map, we do not need (or have the room) to show that on a map. This British
Columbia road atlas makes use of line data, representing roads as lines and using different colors to denote
the type of road.
Polygons are two-dimensional: they have both a length and a width, and therefore we can also calculate their
area. All polygons consist of a set of three or more points (vertices) connected by line segments called
“edges” that form an enclosed shape. Some polygons can also have “holes” (think doughnuts!); these holes are
sometimes called interior rings. Each interior ring is a separate set of vertices and edges that is wholly
contained within the polygon, and no two interior rings can overlap. Polygons are useful for representing many
different objects depending on the scale: political boundaries, Köppen climate zones, lakes, continents, etc.
At large scales they can represent things like buildings, which we might choose to represent as points at
smaller scales.
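Because a polygon is an enclosed ring of vertices, its area can be computed directly from the coordinates. The sketch below uses the shoelace formula and subtracts the area of any interior rings. It assumes planar (x,y) coordinates, rings that do not self-intersect, and holes that lie wholly inside the exterior ring; the function names are illustrative.

```python
def ring_area(ring):
    """Unsigned area of a closed ring [(x, y), ...] using the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_area(exterior, interiors=()):
    """Area of the exterior ring minus the area of each interior ring (hole)."""
    return ring_area(exterior) - sum(ring_area(hole) for hole in interiors)


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
hole = [(1, 1), (2, 1), (2, 2), (1, 2)]

print(polygon_area(square))          # 16.0
print(polygon_area(square, [hole]))  # 15.0
```

For geographic coordinates (longitude and latitude) the data would first need to be projected to a planar coordinate system before this calculation gives a meaningful area.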
Sometimes a discrete object has multiple parts which are spatially separated. In these circumstances, the
vector model allows for multi-polygon, multi-line, or multi-point objects. A good example of when a
multi-polygon would be useful is the Stats Canada provincial boundary file (see Figure 3.25). Roads sometimes
need to be stored as multi-lines as well; for example, Highway 1 crosses the Strait of Georgia from
Vancouver to Nanaimo. If we want to represent the entire highway as one object, we need to use a
multi-line.
Figure 3.25: This is the official Stats Canada provincial boundary layer. All the coastal provinces and
territories have islands. We do not need to represent every island as a separate object, so we can ‘bundle’
the polygons together as multipolygons. The landlocked provinces do not have any coastlines and are
represented as simple polygons rather than multipolygons. The attribute table below corresponds to the map
and lists the geometry type (Polygon or MultiPolygon).
PRNAME                     Province ID  Population  Area     Geometry Type
Newfoundland and Labrador  10           525572      373872   MultiPolygon
Prince Edward Island       11           157329      5660     MultiPolygon
Nova Scotia                12           971451      53338    MultiPolygon
New Brunswick              13           779940      71450    MultiPolygon
Quebec                     24           8536855     1365128  MultiPolygon
Ontario                    35           14666590    917741   MultiPolygon
Manitoba                   46           1389952     553556   MultiPolygon
Saskatchewan               47           1206019     591670   Polygon
Alberta                    48           4511223     642317   Polygon
British Columbia           59           5111756     925186   MultiPolygon
Yukon                      60           41774       474391   MultiPolygon
Vector data also has a resolution, although it has a somewhat different definition in the context of the vector
model. Vector resolution is determined by the smallest resolvable feature. Another way to describe vector
resolution is as the distance between vertices. The greater the distance between vertices, the fewer
vertices there are per polygon and the lower the resolution. If a vector object (line or polygon) has many
vertices, we will have a higher resolution representation of the feature.
Figure 3.26: Vector image of Nova Scotia at different resolutions.
Here the original polygon (top left) has been downsampled to lower resolutions by setting the minimum
allowable distance between vertices. As the distance between vertices increases, the resolution decreases
and the coastline becomes less distinguishable.
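One simple way to enforce a minimum distance between vertices is to walk along the line and drop any vertex that is too close to the last one kept. The sketch below does this. It is an assumed thinning rule chosen for illustration, not necessarily the method used to produce the figure, and production tools often use other simplification algorithms.

```python
import math


def downsample(vertices, min_distance):
    """Keep a vertex only if it is at least min_distance from the last kept vertex.

    The first and last vertices are always kept so the line keeps its end points.
    """
    if not vertices:
        return []
    kept = [vertices[0]]
    for v in vertices[1:-1]:
        if math.dist(kept[-1], v) >= min_distance:
            kept.append(v)
    if len(vertices) > 1:
        kept.append(vertices[-1])
    return kept


coast = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 1), (5, 3)]
print(downsample(coast, 2.0))  # [(0, 0), (2, 0), (4, 1), (5, 3)]
```

Raising `min_distance` removes more vertices, which lowers the resolution of the line in the sense described above.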
Spatial queries
Spatial queries are queries in a spatial database that can be answered on the basis of geometric information
only, i.e., the spatial position and extent of the objects involved. A spatial query is defined by a query space
S, i.e., either the whole spatial database, or a portion of it obtained through suitable filters; by a query object
q that can either belong or not belong to the database; and by a spatial relation ℜ. A generic query is thus
defined as follows:
Return all objects s ∈ S that are in relation ℜ with q.
A classification of spatial queries follows directly from the classification of spatial relations. The
computational techniques that can be adopted to answer spatial queries depend on the nature of the query space
(and on the model which encodes it), on the nature of the query object, and on the nature of the relation.
However, a simple exhaustive study of all possibilities shows that most spatial queries can eventually be
reduced to a few basic problems in computational geometry.
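The generic query "return all objects s ∈ S that are in relation ℜ with q" can be sketched as a function that takes the query space, the query object, and the relation as arguments. The relations shown (distance and bounding-box containment for points) and all names are illustrative assumptions. The exhaustive scan is used only for clarity; a real spatial database would rely on a spatial index.

```python
import math


def spatial_query(S, q, relation):
    """Return all objects s in S for which relation(s, q) holds."""
    return [s for s in S if relation(s, q)]


def within_distance(d):
    """Build a relation that is true when point s is at most d from point q."""
    return lambda s, q: math.dist(s["xy"], q) <= d


def inside_box(s, box):
    """Relation: point s lies inside the box (xmin, ymin, xmax, ymax)."""
    x, y = s["xy"]
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


trees = [
    {"id": 0, "xy": (0.44, 0.03)},
    {"id": 1, "xy": (0.55, 0.71)},
    {"id": 5, "xy": (0.43, 0.81)},
]

near = spatial_query(trees, (0.5, 0.75), within_distance(0.1))
print([t["id"] for t in near])    # [1, 5]

in_box = spatial_query(trees, (0.4, 0.0, 0.5, 0.1), inside_box)
print([t["id"] for t in in_box])  # [0]
```

In the first query q is a point that does not belong to the database; in the second q is a rectangle. The query function itself stays the same, and only the relation changes.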
Multimedia Database
A multimedia database is a collection of interrelated multimedia data that includes text, graphics (sketches,
drawings), images, animations, video, audio, etc., and typically holds vast amounts of multisource multimedia
data. The framework that manages the different types of multimedia data, which can be stored, delivered and
utilized in different ways, is known as a multimedia database management system.
Contents of a multimedia database management system:
1. Media data – The actual data representing an object.
2. Media format data – Information such as sampling rate, resolution, encoding scheme etc. about
the format of the media data after it goes through the acquisition, processing and encoding phase.
3. Media keyword data – Keyword descriptions relating to the generation of the data. It is also known as
content descriptive data. Example: date, time and place of recording.
4. Media feature data – Content dependent data such as the distribution of colors, kinds of texture
and different shapes present in data.
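The four kinds of content listed above can be sketched as one record with four fields. The class name `MediaRecord`, its field names, and the sample values are assumptions made for illustration. The color distribution stands in for media feature data because it is computed from the content itself.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class MediaRecord:
    media_data: list      # the raw content, here a tiny list of RGB pixels
    media_format: dict    # e.g. resolution and encoding scheme
    media_keywords: dict  # content-descriptive data such as date and place
    media_features: dict = field(default_factory=dict)  # derived from the content


def color_distribution(pixels):
    """Fraction of pixels per color: a simple content-dependent feature."""
    counts = Counter(pixels)
    return {color: n / len(pixels) for color, n in counts.items()}


pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (255, 0, 0)]
record = MediaRecord(
    media_data=pixels,
    media_format={"width": 2, "height": 2, "encoding": "raw RGB"},
    media_keywords={"date": "2024-01-01", "place": "hypothetical studio"},
)
record.media_features["color_distribution"] = color_distribution(record.media_data)

print(record.media_features["color_distribution"])
# {(255, 0, 0): 0.75, (0, 0, 255): 0.25}
```

The format and keyword fields describe the media from the outside, while the feature field is the only one that requires reading the pixels.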
Types of multimedia applications based on data management characteristics are:
1. Repository applications – A large amount of multimedia data as well as metadata (media format
data, media keyword data, and media feature data) is stored for retrieval purposes, e.g.,
repositories of satellite images, engineering drawings, and radiology scans.
2. Presentation applications – They involve delivery of multimedia data subject to temporal
constraints. Optimal viewing or listening requires the DBMS to deliver data at a certain rate, offering a
quality of service above a certain threshold. Here data is processed as it is delivered. Example:
annotating video and audio data, real-time editing analysis.
3. Collaborative work using multimedia information – It involves executing a complex task by
merging drawings, changing notifications. Example: Intelligent healthcare network.
There are still many challenges to multimedia databases, some of which are:
1. Modelling – Work in this area can draw on both database and information retrieval techniques.
Documents constitute a specialized area and deserve special consideration.
2. Design – The conceptual, logical and physical design of multimedia databases has not yet been
fully addressed, because performance and tuning issues at each level are far more complex. The data
comes in a variety of formats such as JPEG, GIF, PNG and MPEG, which are not easy to convert from one
form to another.
3. Storage – Storing a multimedia database on any standard disk presents problems of
representation, compression, mapping to device hierarchies, archiving, and buffering during
input-output operations. In a DBMS, a “BLOB” (Binary Large Object) facility allows untyped bitmaps to be
stored and retrieved.
4. Performance – For an application involving video playback or audio-video synchronization,
physical limitations dominate. The use of parallel processing may alleviate some problems, but such
techniques are not yet fully developed. In addition, multimedia databases consume a lot of
processing time as well as bandwidth.
5. Queries and retrieval – For multimedia data such as images, video and audio, accessing data through
queries opens up many issues, such as efficient query formulation, query execution and optimization,
which still need to be worked on.
Areas where multimedia databases are applied are:
• Documents and record management: Industries and businesses that keep detailed records and a
variety of documents. Example: Insurance claim record.
• Knowledge dissemination: Multimedia database is a very effective tool for knowledge
dissemination in terms of providing several resources. Example: Electronic books.
• Education and training: Computer-aided learning materials can be designed using multimedia
sources which are nowadays very popular sources of learning. Example: Digital libraries.
• Marketing, advertising, retailing, entertainment and travel. Example: a virtual tour of cities.
• Real-time control and monitoring: Coupled with active database technology, multimedia
presentation of information can be a very effective means of monitoring and controlling complex
tasks. Example: Manufacturing operation control.
Mobile Database
A mobile database is a database that a mobile computing device can connect to over a mobile (or wireless)
network. Here the client and the server communicate over wireless connections. Mobile computing is growing
very rapidly, and it has huge potential in the database field. Mobile databases are used on a range of devices,
for example Android-based and iOS-based devices. Common examples of mobile databases are Couchbase Lite
and ObjectBox.
Features of Mobile database:
Here, we will discuss the features of the mobile database as follows:
• A cache is maintained to hold frequently used data and transactions so that they are not lost due to
connection failure.
• As the use of laptops, mobile phones and PDAs increases, more data resides on the mobile system.
• Mobile databases are physically separate from the central database server.
• Mobile databases reside on mobile devices.
• Mobile databases are capable of communicating with a central database server or other mobile
clients from remote sites.
• With the help of a mobile database, mobile users are able to keep working when the wireless
connection is poor or non-existent (disconnected).
• A mobile database is used to analyze and manipulate data on mobile devices.
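The caching and disconnected-operation features above can be sketched with a toy client that applies writes locally, queues them while offline, and replays them when a connection is available. Everything here is an illustrative assumption: the class `MobileClient`, the use of a plain dictionary as the central server, and the absence of any conflict handling.

```python
class MobileClient:
    """Toy mobile database client that keeps working while disconnected."""

    def __init__(self, server):
        self.server = server    # stands in for the central database (a dict)
        self.local = {}         # local copy of the data held on the device
        self.pending = []       # transactions not yet sent to the server
        self.connected = False

    def write(self, key, value):
        """Apply a write locally and queue it so it is not lost while offline."""
        self.local[key] = value
        self.pending.append((key, value))
        if self.connected:
            self.sync()

    def sync(self):
        """Replay queued writes on the server in order; return how many were sent."""
        if not self.connected:
            return 0
        sent = len(self.pending)
        for key, value in self.pending:
            self.server[key] = value
        self.pending.clear()
        return sent


server = {}
client = MobileClient(server)
client.write("order:1", "2 coffees")  # offline: held in the local cache only
client.write("order:2", "1 tea")
print(server)         # {}

client.connected = True
print(client.sync())  # 2
print(server)         # {'order:1': '2 coffees', 'order:2': '1 tea'}
```

A real mobile database would also need to handle writes made on the server in the meantime; this sketch simply lets the most recent replayed write win.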
A mobile database environment typically involves three parties:
1. Fixed Hosts: These perform the transaction and data management functions with the help of
database servers.
2. Mobile Units: These are portable computers that move around a geographical region covered by the
cellular network that these units use to communicate with base stations.
3. Base Stations: These are two-way radio installations in fixed locations, which pass communications
between the mobile units and the fixed hosts.
Limitations:
Here, we will discuss the limitations of mobile databases as follows.
• It has limited wireless bandwidth.
• Wireless communication speed is slow.
• Accessing it consumes battery power, which is limited on mobile devices.
• It is less secure.
• It is hard to make theft-proof.
Data Processing
Collecting, manipulating, and processing data for the required use is known as data processing. It is
normally performed by a computer; the process includes retrieving, transforming, or classifying
information.
However, the processing of data largely depends on the following:
• The volume of data that needs to be processed
• The complexity of data processing operations
• Capacity and built-in technology of the respective computer system
• Technical skills
• Time constraints
Methods of Data Processing
Let us now discuss the different methods of data processing.
• Single user programming
• Multiple programming
• Real-time processing
• On-line processing
• Time sharing processing
• Distributed processing
Single User Programming
It is usually done by a single person for personal use. This technique is suitable even for small offices.
Multiple Programming
This technique makes it possible to store and execute more than one program in the Central Processing Unit
(CPU) simultaneously. The multiple programming technique thereby increases the overall working efficiency of
the computer.
Real-time Processing
This technique lets the user interact directly with the computer system, which eases data processing. It is
also known as the direct mode or interactive mode technique and is developed exclusively to perform one task.
It is a form of online processing that always remains under execution.
On-line Processing
This technique facilitates the direct entry and execution of data; data is not stored or accumulated first and
processed later. The technique is designed to reduce data entry errors, as it validates data at various points
and ensures that only correct data is entered. This technique is widely used for online applications.
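The validate-then-process behaviour of on-line processing can be sketched as a function that checks each record at entry time and applies it immediately only if it is valid. The record fields (`account`, `amount`), the validation rules, and the function name `process_entry` are hypothetical and chosen only for illustration.

```python
def process_entry(record, ledger):
    """Validate one record at entry time and apply it immediately if it is valid.

    Returns a list of error messages; an empty list means the record was applied.
    """
    errors = []
    account = record.get("account")
    if not isinstance(account, str) or not account:
        errors.append("account is required")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount <= 0:
        errors.append("amount must be a positive number")
    if errors:
        return errors  # rejected: nothing is stored
    ledger[account] = ledger.get(account, 0) + amount
    return []


ledger = {}
print(process_entry({"account": "A-1", "amount": 50}, ledger))  # []
print(process_entry({"account": "A-1", "amount": -5}, ledger))
# ['amount must be a positive number']
print(ledger)  # {'A-1': 50}
```

Each record is handled as soon as it arrives, so invalid input is rejected before it can reach the stored data, rather than being collected into a batch and checked later.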
Time-sharing Processing
This is another form of online data processing that allows several users to share the resources of an online
computer system. This technique is adopted when results are needed swiftly. As the name suggests, the
system is time based: processing time is shared among the users.
Following are some of the major advantages of time-sharing processing:
• Several users can be served simultaneously
• All users get an almost equal amount of processing time
• Users can interact with running programs
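One common way to give several users a nearly equal share of processing time is round-robin scheduling with a fixed time slice. The text does not name a scheduling policy, so round-robin is an assumption here, and the function name, user names, and job lengths are made up for the example.

```python
from collections import deque


def time_share(jobs, time_slice):
    """Round-robin: give each job at most time_slice units per turn until all finish.

    jobs maps a user name to the processing time that user needs.
    Returns the list of (user, units_run) slices in the order they were executed.
    """
    if time_slice <= 0:
        raise ValueError("time_slice must be positive")
    queue = deque(jobs.items())
    schedule = []
    while queue:
        user, remaining = queue.popleft()
        run = min(time_slice, remaining)
        schedule.append((user, run))
        if remaining > run:
            queue.append((user, remaining - run))  # unfinished: back of the queue
    return schedule


print(time_share({"ana": 3, "ben": 5, "caro": 2}, time_slice=2))
# [('ana', 2), ('ben', 2), ('caro', 2), ('ana', 1), ('ben', 2), ('ben', 1)]
```

Every user gets a turn before anyone gets a second one, which is why short jobs finish quickly even when a long job is also running.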
Distributed Processing
This is a specialized data processing technique in which various computers (which are located remotely) remain
interconnected with a single host computer, forming a network of computers.
All these computer systems are interconnected by a high-speed communication network, which facilitates
communication between the computers. The central computer system maintains the master database and
monitors accordingly.