GIS Data Structures
SPATIAL DATA MODELS
GIS data represents real-world objects, which can be divided into two abstractions:
- Discrete (soil, land use, cities)
- Continuous (elevation or rainfall)
Traditionally, two broad methods are used to store data in a GIS for both abstractions: raster and vector.
RASTER & VECTOR
RASTER DATA MODEL
- The cell or pixel is the basic spatial unit of raster / grid data.
- Pixels are generally square in shape (square tessellation).
- Pixels are organized into an array of rows and columns called a grid or raster.
- Rows and columns are numbered from 0; hence, the origin for raster data is the upper left corner.
- Pixel locations are referenced by their row and column position, so every pixel can be uniquely identified.
- Pixels are assigned an integer, floating point, or NO DATA value.
- Each pixel represents some kind of geographic phenomenon.
- The number of rows and columns does not have to be the same.
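The row and column addressing described above can be sketched in a few lines of Python. This is only an illustration: the grid values, the `NODATA` marker, and the extent parameters (`x_min`, `y_max`, `cell_size`) are hypothetical and not taken from any particular dataset or GIS package.

```python
# Hypothetical 3-row by 4-column raster; None stands in for NO DATA.
NODATA = None
grid = [
    [12.5, 13.0, 13.4, NODATA],
    [12.1, 12.8, 13.1, 13.6],
    [11.9, 12.3, 12.9, 13.2],
]

n_rows = len(grid)      # rows and columns need not be equal in number
n_cols = len(grid[0])


def cell_value(row, col):
    """Return the value stored at a zero-based (row, col) position."""
    return grid[row][col]


def cell_center(row, col, x_min, y_max, cell_size):
    """Map a (row, col) position to the map coordinates of the cell centre.

    Rows count downward from the upper left origin, so y decreases as
    the row index grows.
    """
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size
    return x, y


print(n_rows, n_cols)
print(cell_value(0, 0), cell_value(0, 3))
print(cell_center(1, 2, x_min=500000.0, y_max=2000000.0, cell_size=30.0))
```

Because the origin is the upper left corner, the y coordinate is computed by subtracting from the top edge of the extent rather than adding to the bottom edge.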
Point representation
Line / Arc representation
Polygon / Area representation
Raster dataset attribute table
Raster Data Types
Continuous Raster
Thematic Raster
PIXEL SIZE
Advantages
- Simple data structure
- Resolution is set by cell size
- Easily modified
- Display/output good for images
- Faster and very efficient for overlay operations
- Raster data is mainly obtained from satellite images and scanning
- Raster is used when data change continuously across a region (high spatial variability is efficiently represented)
Disadvantages
- Not all phenomena map directly onto a raster representation
- Requires large storage
- Errors in perimeter and shape
- Displays jagged edges at large scale
- Implementing topology is difficult
- Network analysis is difficult
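The link between cell size and storage can be made concrete with a little arithmetic. The sketch below assumes an uncompressed raster with a fixed number of bytes per cell; the function name `raster_bytes` and the example extent are hypothetical.

```python
import math


def raster_bytes(width_m, height_m, cell_size_m, bytes_per_cell):
    """Estimate uncompressed storage for a raster covering a rectangular extent."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols * bytes_per_cell


# A 30 km by 30 km area stored as 4-byte values.
coarse = raster_bytes(30000, 30000, cell_size_m=30, bytes_per_cell=4)
fine = raster_bytes(30000, 30000, cell_size_m=15, bytes_per_cell=4)

print(coarse)         # bytes at 30 m cells
print(fine)           # bytes at 15 m cells
print(fine / coarse)  # halving the cell size quadruples the storage
```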
Vector Model
Figure: primitive features (points, lines / arcs, polygons) and databases.
VECTOR DATA MODEL
- Derived from the formulation of spatial concepts that emphasize real-world objects.
- The geometry primitives of the vector data model are point, line and polygon; objects can be built from these primitives.
- Object location is determined by its representative location point.
- The uniqueness of the vector data model lies in its management and storage of geometry primitives.
- The origin for vector data is the lower left corner.
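As a rough illustration of the three geometry primitives, the following Python sketch stores each one as a list of coordinates. The class names, the helper methods and the sample features are hypothetical and do not correspond to any specific GIS library.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coordinate = Tuple[float, float]


@dataclass
class Point:
    xy: Coordinate


@dataclass
class Line:
    vertices: List[Coordinate]

    def length(self) -> float:
        """Sum the lengths of the straight segments between vertices."""
        total = 0.0
        for (x1, y1), (x2, y2) in zip(self.vertices, self.vertices[1:]):
            total += ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
        return total


@dataclass
class Polygon:
    ring: List[Coordinate]  # closed ring: first vertex repeated at the end

    def area(self) -> float:
        """Planar area by the shoelace formula."""
        s = 0.0
        for (x1, y1), (x2, y2) in zip(self.ring, self.ring[1:]):
            s += x1 * y2 - x2 * y1
        return abs(s) / 2.0


well = Point((2.0, 3.0))
road = Line([(0.0, 0.0), (3.0, 4.0)])
parcel = Polygon([(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)])

print(well.xy, road.length(), parcel.area())
```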
VECTOR DATA TYPES
- Spaghetti model
- Topology model
The Spaghetti Model
- The spaghetti model is the simplest vector data model.
- The model is a direct representation of a graphical image.
- There is NO explicit topological information.
- The geometries may be points, lines or polygons.
- There are no constraints on how geometries may be positioned (for example, two lines may intersect without a point being added at the intersection, and two polygons may intersect without restriction).
Advantages
- Simple
- Ease of editing
- Efficient for display and plotting
Disadvantages
- Redundant storage of data
- Major deficiencies in dealing with neighbourhood and inclusion, though connectivity is possible
- Computationally expensive to build topological or network relationships among features
- Cannot be used to effectively represent surfaces
- Inefficient for most types of spatial analysis
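The redundancy and the cost of recovering relationships can be seen in a small sketch. Two hypothetical adjacent parcels are stored as independent coordinate lists, as a spaghetti model would store them; the shared boundary is held twice and can only be found by comparing geometry.

```python
# Two adjacent square parcels, each stored as its own closed ring.
parcel_a = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
parcel_b = [(2, 0), (4, 0), (4, 2), (2, 2), (2, 0)]


def edges(ring):
    """Return the ring's edges as unordered vertex pairs."""
    return {frozenset(pair) for pair in zip(ring, ring[1:])}


# Vertices actually stored (ignoring the repeated closing vertex)
# versus the distinct vertices that exist on the ground.
stored_vertices = len(parcel_a[:-1]) + len(parcel_b[:-1])
unique_vertices = len(set(parcel_a) | set(parcel_b))

# No adjacency is recorded, so the common boundary has to be
# discovered by comparing the edges of both polygons.
shared_edges = edges(parcel_a) & edges(parcel_b)

print(stored_vertices, unique_vertices)
print(shared_edges)
```

With many polygons, this kind of pairwise comparison is what makes building neighbourhood relationships expensive when no topology is stored.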
Topological Model
- Connections and relationships between objects are independent of their coordinates.
- Overcomes the major weakness of the spaghetti model, allowing for GIS analysis (overlaying, network, contiguity, connectivity).
- Requires all lines to be connected, polygons closed, and loose ends removed.
Arc-Node Topology
Figure: an arc running from its From Node to its To Node, with the Left Polygon and Right Polygon on either side.
Vector Topologic Data Model
Figure labels: polygons A, B, C; nodes n1, n2; arcs a1, a2, a3, a4.

Arc Coordinate Data
Arc   StartXY   IntermediateXY                  EndXY
a1    4,5       (4,8), (8,8), (8,1), (4,1)      4,3
a2    4,5       (6,7), (6,3)                    4,3
a3    4,5       (1,3)                           4,3
a4    4,3                                       4,5

Arc Topology
Arc   Start   End
a1    n1      n2
a2    n1      n2
a3    n1      n2
a4    n2      n1
Left / Right polygon: A A C B

Node Topology
Node   Arcs
n1     a4, a2, a1, a3
n2     a2, a4, a3, a1

Polygon Topology
ID   Arcs
A    a1, a2
B    a2, a4
C    a3, a4
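As a rough sketch of how the tables above relate to one another, the node topology can be derived from the arc coordinate data and the arc start/end nodes. The coordinates and node assignments are read from the tables on this slide, with the intermediate coordinates assigned to arcs a1 to a3 in order (an assumption about the original layout); the variable names are hypothetical.

```python
# Arc coordinate data: start, intermediate and end vertices of each arc.
arc_coords = {
    "a1": [(4, 5), (4, 8), (8, 8), (8, 1), (4, 1), (4, 3)],
    "a2": [(4, 5), (6, 7), (6, 3), (4, 3)],
    "a3": [(4, 5), (1, 3), (4, 3)],
    "a4": [(4, 3), (4, 5)],
}

# Arc topology: (start node, end node) of each arc.
arc_nodes = {
    "a1": ("n1", "n2"),
    "a2": ("n1", "n2"),
    "a3": ("n1", "n2"),
    "a4": ("n2", "n1"),
}

node_arcs = {}  # node topology: which arcs meet at each node
node_xy = {}    # coordinate of each node, taken from the arc end points

for arc, (start, end) in arc_nodes.items():
    ends = ((start, arc_coords[arc][0]), (end, arc_coords[arc][-1]))
    for node, xy in ends:
        node_arcs.setdefault(node, []).append(arc)
        # Every arc meeting at a node must agree on that node's coordinate.
        assert node_xy.setdefault(node, xy) == xy

print(node_xy)
print(node_arcs)
```

The derived lists contain the same arcs as the node topology table; the order in which arcs are listed around each node is not reproduced by this sketch.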
Planar Enforcement:
- No two individual features can overlap.
- There are no holes or islands that are not themselves features.
- Every feature is represented as a record in the attribute table.
Topological Model
(The Intelligent mode of representation)
- Where is it? (location)
- What is it next to? (adjacency)
- Is it inside or outside? (containment)
- How far is it? (connectivity)
Topology represents the structuring of coordinate data which clearly describes adjacency, containment, and connectivity.
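One way to see how stored topology answers the adjacency question without touching coordinates is sketched below. It uses the polygon-to-arc lists from the Vector Topologic Data Model example (A: a1, a2; B: a2, a4; C: a3, a4) and treats two polygons as adjacent when they share an arc. The variable names are hypothetical.

```python
from itertools import combinations

# Polygon topology: the arcs that bound each polygon.
polygon_arcs = {
    "A": {"a1", "a2"},
    "B": {"a2", "a4"},
    "C": {"a3", "a4"},
}

# Two polygons are adjacent if at least one arc bounds both of them.
adjacency = {}
for p, q in combinations(sorted(polygon_arcs), 2):
    common = polygon_arcs[p] & polygon_arcs[q]
    if common:
        adjacency[(p, q)] = common

print(adjacency)
```

No coordinate comparison is needed: the answer comes entirely from the identifiers recorded in the topology tables.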
Advantages
- More efficient data storage (compact data structure)
- Topological encoding is more efficient
- Suitable for most usage and compatible with data
- Good graphic presentation
- Efficient projection transformation
- Efficient for network analysis
- Accurate map output
Disadvantages
- Overlay operations are not efficient
- Complex data structure
- High spatial variability is inefficiently represented
3-D Data Representation
Triangulated Irregular Network (TIN)
TIN is a vector data structure that partitions geographic space into contiguous, non-overlapping triangles. The vertices of each triangle are sample data points with x, y and z values. These points are connected by lines to form Delaunay triangles.
- TIN is a vector topological data model for representing surfaces.
- TIN represents a surface as a set of interconnected triangular facets derived from sample points.
Associated data tables:
- Node table - lists each triangle and its defining nodes
- Edge table - lists 3 adjacent triangles for each facet
- XY coordinate table - stores node coordinates
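The three associated tables can be sketched with plain dictionaries. The sample points and triangle names below are hypothetical, the triangulation is written by hand rather than computed (no Delaunay construction is attempted), and z values are kept alongside x and y for convenience. The adjacency table is derived by finding triangles that share an edge.

```python
from itertools import combinations

# Coordinate table: node id -> (x, y, z) of a sample point.
node_xyz = {
    1: (0.0, 0.0, 10.0),
    2: (4.0, 0.0, 12.0),
    3: (4.0, 3.0, 15.0),
    4: (0.0, 3.0, 11.0),
    5: (8.0, 1.5, 14.0),
}

# Node table: each triangle and its three defining nodes.
triangle_nodes = {
    "T1": (1, 2, 3),
    "T2": (1, 3, 4),
    "T3": (2, 5, 3),
}


def triangle_edges(nodes):
    """Return the three edges of a triangle as unordered node pairs."""
    a, b, c = nodes
    return {frozenset((a, b)), frozenset((b, c)), frozenset((c, a))}


# Edge table: for each triangle, the triangles that share an edge with it
# (at most three per facet).
adjacent = {name: [] for name in triangle_nodes}
for (t1, n1), (t2, n2) in combinations(triangle_nodes.items(), 2):
    if triangle_edges(n1) & triangle_edges(n2):
        adjacent[t1].append(t2)
        adjacent[t2].append(t1)

print(adjacent)
```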
Figure: TIN components (node, edge, face), shown with contours and a TIN.
Advantages
- Slope and aspect are calculated for each triangle and stored as attributes of the facet
- For areas of complex relief, TIN works better than raster
- More detailed representation where the density of data points is higher
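A minimal sketch of how slope and aspect might be computed for one facet is given below. It assumes a non-degenerate triangle, takes +y as north, and reports aspect as the compass bearing of the downslope direction; the function name `facet_slope_aspect` is hypothetical and this is not the routine of any particular GIS package.

```python
import math


def facet_slope_aspect(p1, p2, p3):
    """Return (slope_deg, aspect_deg) for the plane through three (x, y, z) points.

    Aspect is the bearing of the downslope direction, measured clockwise
    from north (+y); it is None for a horizontal facet.
    """
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # Normal vector of the facet: cross product of two edge vectors.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    if nz < 0:  # make the normal point upward
        nx, ny, nz = -nx, -ny, -nz
    horizontal = math.hypot(nx, ny)
    slope = math.degrees(math.atan2(horizontal, nz))
    if horizontal == 0:
        return slope, None
    # The horizontal part of an upward normal points downslope.
    aspect = math.degrees(math.atan2(nx, ny)) % 360.0
    return slope, aspect


# A facet lying on the plane z = y: it rises toward the north, so it faces south.
print(facet_slope_aspect((0, 0, 0), (10, 0, 0), (0, 10, 10)))
# A horizontal facet has zero slope and no defined aspect.
print(facet_slope_aspect((0, 0, 5), (1, 0, 5), (0, 1, 5)))
```

Since each facet is planar, one slope and one aspect value describe the whole triangle, which is why they can be stored as facet attributes.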
Disadvantages
- Significantly more processing is required to generate the TIN file initially (but the representation is then more efficient)
- Errors along edges often need correction
What is a Database?
A system whose overall purpose is to record and maintain data. The data concerned can be anything that is deemed to be of significance to the organization.
ATTRIBUTE DATA MODELS
(Describes conceptual structuring of data stored in database)
Hierarchical Data Model
Network Data Model
Relational Data Model
Object Data Model
(Object oriented and Object relational models)
Hierarchical Data Model (parts superior to suppliers)
- Now obsolete, a hierarchical DBMS assumes hierarchical relationships between data, i.e., a tree structure.
- The root may have any number of dependents, each of these may have any number of lower-level dependents, and so on, to any number of levels. (Examples are IBM's IMS and Informatics Mark IV.)
- Many-to-many relationships cannot be modelled with this structure.
- A true model for representing hierarchical structures from the real world.
- Asymmetry is a major drawback.
- Update operations are difficult.
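The asymmetry and the update problem can be illustrated with a small tree in which parts are superior to suppliers. The part and supplier records below are hypothetical, and nested Python dictionaries only imitate the structure; they are not how a hierarchical DBMS such as IMS stores data.

```python
# Parts are parents; each part owns its own copies of supplier records.
hierarchy = {
    "P1": {"name": "bolt", "suppliers": [{"id": "S1", "city": "Pune"},
                                         {"id": "S2", "city": "Delhi"}]},
    "P2": {"name": "nut", "suppliers": [{"id": "S1", "city": "Pune"}]},
}


def suppliers_of(part_id):
    """Top-down query: follow the tree from a part to its dependents."""
    return [s["id"] for s in hierarchy[part_id]["suppliers"]]


def parts_of(supplier_id):
    """Bottom-up query: every part has to be scanned (asymmetry)."""
    return [part_id for part_id, part in hierarchy.items()
            if any(s["id"] == supplier_id for s in part["suppliers"])]


# Supplier S1 serves two parts, so its record is stored twice; changing
# its city would mean updating every copy.
copies_of_s1 = sum(s["id"] == "S1"
                   for part in hierarchy.values()
                   for s in part["suppliers"])

print(suppliers_of("P1"), parts_of("S1"), copies_of_s1)
```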
Network Data Model
- Network DBMSs allowed complex data structures to be built but were inflexible and required careful design.
- The network model represents many-to-many relationships more directly than the hierarchical approach.
- The network structure is more symmetric than the hierarchical structure.
- Fast and very efficient in storage.
- The best examples are airline booking systems.
- A precursor to, and largely superseded by, relational DBMSs.
- Technically obsolete (although many are in commercial use).
Relational Data Model
The relational model of data is based on the realization that files that obey certain constraints may be considered as mathematical relations.
- In much of the relational literature, tables are referred to as relations.
- Rows of such tables are referred to as tuples, also generally known as records.
- Columns are referred to as attributes.
- The most popular type of DBMS in use; very simple and easy to understand.
- Relational DBMSs have to employ many tables to conform absolutely to the various normalization rules.
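The terminology can be tried out with Python's built-in `sqlite3` module. The sketch below is illustrative only: the `owner` and `parcel` tables and their contents are hypothetical. Owner details are kept in their own table, as normalization suggests, and the two relations are combined with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner (
        owner_id INTEGER PRIMARY KEY,
        name     TEXT
    );
    CREATE TABLE parcel (
        parcel_id INTEGER PRIMARY KEY,
        land_use  TEXT,
        owner_id  INTEGER REFERENCES owner(owner_id)
    );
    INSERT INTO owner VALUES (1, 'A. Rao'), (2, 'B. Singh');
    INSERT INTO parcel VALUES (101, 'residential', 1),
                              (102, 'agriculture', 1),
                              (103, 'commercial', 2);
""")

cursor = conn.execute("""
    SELECT p.parcel_id, p.land_use, o.name
    FROM parcel AS p
    JOIN owner AS o ON o.owner_id = p.owner_id
    ORDER BY p.parcel_id
""")
attributes = [column[0] for column in cursor.description]  # column names
tuples = cursor.fetchall()                                 # rows (records)
conn.close()

print(attributes)
for row in tuples:
    print(row)
```

Each owner's name is stored once, at the cost of needing a second table and a join to see it next to the parcels.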
Object Data Model
Object orientation for a database means the capability of storing and retrieving objects in addition to mere data.
Objects are complex and not well handled by standard Relational DBMS.
Most systems can handle images, video and other objects but do so in a non-standard way in many cases. The first system to announce the use of an Object Oriented DBMS is Taos from Data Research Associates.
An Object-Relational database (ORDBMS) adds features associated with object oriented systems to a RDBMS.
- It enables you to make the features in GIS datasets smarter by endowing them with natural behaviors and relationships among features.
- It brings a physical model closer to its logical model.
- It lets you implement the majority of custom behaviors without writing any code (e.g., overpasses and underpasses).
Basic OO Characteristics
Polymorphism
The behaviors (or methods) of an object class can adapt to variations of objects.
Encapsulation
An object is accessed only through a well-defined set of software methods, organized into software interfaces.
Inheritance
An object class can be defined to include the behavior of another object class and have additional behaviors.
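The three characteristics can be shown together in a short Python sketch. The classes `Feature`, `Road` and `Overpass` and their attributes are hypothetical and serve only to illustrate the ideas; they are not part of any GIS product's object model.

```python
class Feature:
    def __init__(self, name):
        self._name = name  # internal state, reached only through methods

    def name(self):
        """Encapsulation: callers use this method, not the attribute."""
        return self._name

    def describe(self):
        return f"{self.name()}: generic feature"


class Road(Feature):
    """Inheritance: a Road has all Feature behavior plus its own."""

    def __init__(self, name, lanes):
        super().__init__(name)
        self._lanes = lanes

    def describe(self):
        # Polymorphism: same method name, class-specific behavior.
        return f"{self.name()}: road with {self._lanes} lanes"


class Overpass(Road):
    """Extends Road with one additional attribute and behavior."""

    def __init__(self, name, lanes, clearance_m):
        super().__init__(name, lanes)
        self._clearance_m = clearance_m

    def describe(self):
        return super().describe() + f", crossing {self._clearance_m} m above grade"


features = [Feature("Survey mark"), Road("NH 44", 4), Overpass("Flyover 7", 2, 5.5)]
descriptions = [f.describe() for f in features]  # one call, three behaviors
for line in descriptions:
    print(line)
```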
GIS Standards & Interoperability
Documented agreements containing technical specifications or other precise criteria to be used consistently as rules, guidelines, or definitions of characteristics, to ensure that materials, products, procedures, and services are fit for their purpose (as defined by ISO).
Standards facilitate data sharing and increase interoperability among geographic information systems.
Interoperability enables sharing and exchange of information and processes in heterogeneous, autonomous, and distributed computing environments. However, interoperability presents a much greater challenge in GIS than in other fields of information science because of the greater complexity of geographic information.
GIS Standards include
- Spatial Data Standards
- Metadata Standards
- Database Standards
- User Interface Standards
- Networking Standards
- Database Query Standards
- Display and Plotting Standards
- Data Exchange Standards
ISO/TC 211
Digital Geographic Information Working Group
GLOBAL SPATIAL DATA INFRASTRUCTURE ASSOCIATION
Organizations, both national and international, involved in developing geospatial standards and interoperability:
- OGC: Open Geospatial Consortium (http://www.opengeospatial.org)
- ISO: International Organization for Standardization (http://www.iso.ch/)
- ANSI: American National Standards Institute (http://www.ansi.org/)
- W3C: World Wide Web Consortium (http://www.w3c.org/)
- WS-I: Web Services Interoperability Organization (http://www.ws-i.org/)
- IHO: International Hydrographic Organization (http://www.iho.shom.fr/)
- LIF: Location Interoperability Forum (http://www.openmobilealliance.org/lif/)
- GSDI: Global Spatial Data Infrastructure (http://www.gsdi.org/)
- CEN: European Committee for Standardization (http://www.cenorm.be/)
- DGIWG: Digital Geographic Information Working Group (http://www.digest.org/)
National Spatial Data Infrastructure (NSDI) (www.nsdiindia.gov.in)
NNRMS standards
National Spatial Data Infrastructure (NSDI)
A clearing house for information on spatial data (metadata) generated by various national and state agencies.
NSDI Stakeholders
NSDI METADATA STANDARDS
NATIONAL SPATIAL DATA EXCHANGE (NSDE) FORMAT
For interoperability in GIS, the Open Geospatial Consortium (OGC) is the key.
The Open Geospatial Consortium (OGC), an international voluntary consensus standards organization, originated in 1994. In the OGC, more than 400 commercial, governmental, nonprofit and research organizations worldwide collaborate in a consensus process encouraging the development and implementation of open standards for geospatial content and services, GIS data processing and data sharing. The OGC standards baseline comprises more than 30 standards.
ISO/TC 211 is a standard technical committee formed within ISO, tasked with covering the areas of digital geographic information and geomatics. It is responsible for preparation of a series of International Standards and Technical Specifications numbered in the range starting at 19101.
ISO/TC 211
The Infrastructure for Geospatial Standardization
Data Models for Geographic Information
Geographic Information Management
Geographic Information Services
Encoding of Geographic Information
Specific Thematic Areas