Introduction to GIS
Lecture 01: Introduction to GIS and GIS Data Models
Outline today
• Part 1: What is GIS?
• GIS definitions
• GIS components
• GIS history
• Common GIS software
• Part 2: GIS Data Model
• Vector/Raster data model
• Common ESRI GIS file types
• ArcCatalog
Definition 1: A GIS is a toolbox
• “A powerful set of tools for storing and retrieving at will, transforming
and displaying spatial data from the real world for a particular set of
purposes.” (Burrough, 1986)
• “A system for capturing, storing, checking, manipulating, analyzing, and
displaying data which are spatially referenced to the Earth” (Department
of Environment, 1987)
• “An information technology which stores, analyses, and displays both
spatial and non‐spatial data” (Parker, 1988)
Definition 2: A GIS is an Information System
(Database definitions)
• a System ‐ a group of connected entities and activities
• an Information System ‐ a set of procedures, executed on raw data, to
produce information for decision making
• a Geographic Information System ‐ an Information System using
geographically referenced data
Definition 3: GIS is an approach to science
• The science behind the technology
• Addresses the fundamental issues arising from use
• Is the science needed to keep technology at the cutting edge
• Systematic study of geographic information and geographic information
system technologies using scientific methods
• Analogy: GIScience is to GIS as statistics is to statistical software
packages
Reference: Michael F. Goodchild and the project, NCGIA Core
Curriculum in GIScience, 1997
GIScience: Contributing Disciplines
• Geography
• Cartography
• Remote Sensing
• Photogrammetry
• Surveying
• Geodesy
• Statistics
• Operations research
• Computer Science
• Mathematics
• Information Science
• Management science
Scholarly Journals Emphasizing GIS Research
• International Journal of GIScience (formerly Intern’l Journal of GISystems)
• Cartography and GIScience (formerly American Cartographer and
Cartography and GISystems)
• Computers and Geosciences
• Computers, Environment and Urban Systems
• Photogrammetric Engineering and Remote Sensing
• Transactions in GISystems
• Geographical and Environmental Modeling
• Geographical Analysis
• GeoInformatica
• Annals of the Association of American Geographers
• Journal of Geographical Systems (successor to Geographical Systems)
Components of a GIS
1. Data
2. Hardware (computer system)
3. Software
4. Brainware: People/Procedures/Plan
5. Infrastructure: GIS operating environment
Basic Elements of a GIS: Data
• Non‐Spatial Data
• Attributes or information that describes the spatial entity
• Spatial Data: geographically referenced data
• Latitude and longitude
• X and Y coordinates
• Street address
• Range and township
• Location shown on a map
GIS Data: Spatial is Special
• Geographic location is a key feature of 80‐90% of all government data.
(http://www.fgdc.gov/publications/homeland.html)
• Experts estimate that as much as 80% of the cost associated with a GIS
system is related to the development and maintenance of its spatial data
• Federal Agencies alone are spending $2.5 – 3.0 billion annually on
collection and management of geospatial data (National Academy of
Public Administration NAPA ‐ Geographic Information for the 21st
Century, 2003/05/01)
Data and Information
Data – numbers, text, symbols
• Sea surface temperature, soil type, population density
Information – differentiated from data
• implying some degree of selection, organization, and preparation for
particular purpose, or
• data given some degree of interpretation
Geographic Information (map, digital form)
• Information about places on Earth’s surface
• Knowledge about where something is
• Knowledge about what is at a given location
• Can be very detailed or very coarse
• Can be relatively static or change rapidly
• Can be very sparse or voluminous
Geographic versus spatial
• Geographic refers to Earth’s surface and near surface
• Spatial refers to any space (more general)
Basic Elements of a GIS: Hardware
• Fast computer with a video card & video memory
• High resolution display
• Networking capabilities
• Optional:
• Digitizing tablet or large format scanner
• Flat bed scanner
• Large format plotter
• Printer (color or black & white)
Basic Elements of a GIS: Software
• Data management capabilities
• Analysis tools
• Display tools
• Information dissemination capabilities (import, export and map
creation)
• Data entry features
• Editing capabilities
Basic Elements of a GIS: a Plan
• The recipe for implementing a GIS at the project or program level
• Clear description of the problem (frame the question)
• Understanding of the data needs to solve the problem (collect your
data)
• Understanding of the GIS users and managers, desired outcomes:
efficiency, increased knowledge
• Choose analysis methods
• Process the data
• Present the results
A typical GIS process: start to think about your
final project now
1. Understanding basic geographic concepts
• Projections, datums, coordinate systems
• Reading maps
2. Formulating a game plan
• Planning the process
3. Acquiring data
• Data storage formats
• Data sources
• Data challenges
4. Database manipulation
• Attribute data
• Database management
• Computer database types
5. Analysis techniques
• Spatial analysis
• Models and modeling: cartographic, interpolation, dynamic modelling
6. Presenting the results
• Map creation and design
Evolution of GIS: From Stand‐alone to Web Services
[Figure: timeline running from 1870 through the 1950s‐1970s, 1980s, 1990s, and
2000s; labels include "Mapping Troop Movements" and "Manual Map overlays"]
GIS Web Service Chain
• Portrayal service assembles orthoimage from several imagery services
• Reprojection service reprojects the image from one coordinate system to
another one
• Overlay service overlays the input image and the vector data and sends the
overlay to the client
• Vector data provider service returns a certain layer at the extent specified
[Figure: service chain diagram with Web Coverage Services, Portrayal Service,
Reprojection Service, Overlay Service, Vector Data Provider Service, and the
client]
Prevailing GIS Software
• ESRI: ArcGIS (desktop and server), ArcView and other products
(http://www.esri.com)
• Quantum GIS, or QGIS, http://qgis.org/en/site/ (latest release: 2.18
Las Palmas)
• Intergraph: GeoMedia and MGE (http://www.intergraph.com)
• MapInfo Inc, MapInfo: http://www.mapinfo.com
• Autodesk: Autodesk Map, http://www.autodesk.com
• Baylor University, Texas; University of Hannover, Germany, GRASS
(Geographic Resources Analysis Support System): http://grass.itc.it/ (free)
Prevailing GIS Software (ctd)
• Manifold.net, Manifold System (http://www.manifold.net)
• Clark Labs, IDRISI, http://www.clarklabs.org
• PCI Geomatics, Geomatica, http://www.pcigeomatics.com
• Caliper Corporation, TransCAD, Maptitude, http://www.caliper.com
• Leica, ERDAS http://gis.leica‐geosystems.com/
• ERMapper, ERMapper, http://www.ermapper.com
• SuperMap, SuperMap GIS, http://www.supermap.com (China)
• Free GIS software list: http://www.freegis.org
• Open source GIS: http://www.opensourcegis.org
Part 2: GIS Data Models
Spatial Information is usually modeled in one of two ways:
• Vector Data Model: spatial data are represented by three objects: points,
lines, and areas. (We will use the topological vector model often.)
• Raster Data Model: space is divided into a regularly spaced grid; each cell
is “coded” according to what is on the surface.
[Figure: a forest, a city, and a river shown in both models]
GIS Data Model Level
1. Conceptual Model (object view or field view)
2. GIS Data Model (Vector or Raster)
3. GIS Data Structure (Shape, Coverage, Geodatabase)
4. GIS File Structure
GIS Data Conceptual Model
The strategy chosen depends on whether one takes the field‐view versus
object‐view of reality…
• Field‐view: geographic phenomena that vary continuously throughout
space. Examples: elevation, precipitation, etc.
• Object‐view: an empty space ‘littered’ with discrete objects. Examples:
roads, buildings, utilities, etc.
GIS Data Model: Vector vs. Raster
[Figure: the real world shown as a vector representation (point, line,
polygon) and as a raster representation, a grid with rows and columns numbered
0 to 9 whose cells are coded R, T, or H]
Data Modeling Processing
After Bernhardsen 1999, p. 39
Multiple Representations
Spatial Data
[Figure: classification of spatial data into Raster Data, Vector Data,
Attribute Data, and Metadata. Vector data are Non-topological or Topological;
topological data are Simple or Higher-level Data (TIN, Regions, Dynamic
Segmentation).]
Vector or Raster?
• Type of operations
• Experience and viewers of GIS users
• Data availability
• Data quality and storage
From: Intro. to GIS, Chang, 1997
Vector Data Model
• point (node): 0‐dimension
• single x,y coordinate pair
• zero area
• tree, oil well, label location
• example point: (1,2)
• line (arc): 1‐dimension
• two (or more) connected x,y coordinates
• road, stream
• example line: (1,2), (2,1)
• polygon: 2‐dimensions
• four or more ordered and connected x,y coordinates
• first and last x,y pairs are the same
• encloses an area
• census tracts, county, lake
• example polygon: (1,2), (2,1), (1,1), (1,2)
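The three vector primitives can be sketched as plain coordinate lists. This is a minimal Python illustration, not the data structure of any GIS package; the helper names `is_closed`, `ring_area`, and `line_length` are made up for this example, and the coordinates are the ones from the slide.

```python
import math

# Vector features as plain coordinate tuples (illustrative only).
point = (1, 2)
line = [(1, 2), (2, 1)]
polygon = [(1, 2), (2, 1), (1, 1), (1, 2)]  # closed ring: first pair == last pair


def is_closed(ring):
    """A polygon ring needs at least four pairs and must end where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def line_length(coords):
    """Sum of straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def ring_area(ring):
    """Unsigned area of a closed ring using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(is_closed(polygon))   # the ring starts and ends at (1, 2)
print(line_length(line))    # length of the two-vertex line
print(ring_area(polygon))   # area enclosed by the triangle
```

A point has no length or area, a line has length but no area, and only the closed polygon encloses an area, which matches the 0, 1, and 2 dimensions listed above.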
Two Common Vector Models
• Spaghetti model (non‐topological)
• Topological model
Spaghetti Vector Model
• Lines and points are entered and may be visible, but the program does
not recognize the relationships between lines.
• Each line is represented as a separate feature with a start node and an
end node, possibly vertices in between.
• Still exists during data entry and editing
• You have to build topology somehow
Why Topology Matters
• Getting data to line up, connect, intersect, move together
• Important for GIS operations and analyses
• Coordinate transformation
• Map projection
• Area calculations
• Queries
• In order to do this we use topology
Definition: Topology
• ESRI: The spatial relationships between connecting or adjacent coverage
features (e.g., arcs, nodes, polygons, and points). For example, the
topology of an arc includes its from‐ and to‐ nodes and its left and right
polygons.
• Textbook: (Study of) shape‐invariant spatial properties of line or area
features such as adjacency, contiguity, and connectivity, often recorded
in a set of related tables (Bolstad, p. 32‐33)
• Webster: “(Study of) those properties of geometric forms that remain
invariant under certain transformations, as bending, stretching, etc.”
(Webster’s Encyclopedic Unabridged Dictionary)
Topological Vector Data Model
• The connections and relationships between objects are described
independently of their coordinates
Topology vs. Coordinate (Continued)
• A topologically accurate map: relationships between subway stations are
accurately shown.
• Actual locations and shapes of the tracks and tunnels are not accurate.
How to define GIS Topology
• Based on
• Point/Node: Where lines begin, end, or intersect
• Line/Link: Line segments between two nodes
• Polygon: composed of alternating links and nodes
• Unique identifiers are assigned to each link, node, and polygon
[Figure: a polygon with its Point/Node and Line/Link elements labeled]
Define Topology
• Topology (relations) can be described in 3 tables.
• Polygon Topology Table: Links composing all the polygons.
• Node Topology Table: List of the links that meet at each node.
• Link (or line segment) Topology Table: List of the beginning and end nodes for
each link; polygons to the right and left of the link (“From node,” “To node,”
“right poly,” “left poly”).
A Topology Example
[Figure: a network of labeled nodes (N...), links (L1 to L9), and polygons
A, B, C, and D]

Polygon Topology Table
Polygon   Links
A         L1,L2,L3,L4,L5,L6…
B         L7,L8,L9,L6,L5,L4
D         L1,L2,L3,L7,L8,L9

Link Topology Table
Link   Start   End   Left   Right
L1     N1      N2    D      A
L2     N2      N3    D      A
L5     N5      N6    B      A
L9     N8      N1    D      B

Node Topology Table
Node   Links
N1     L1,L6,L9
N2     L1,L2
N3     L2,L3
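The three topology tables are related: the node and polygon tables can be derived from the link table. The sketch below shows this in Python using only the four link rows listed above, so the derived tables are partial (for instance, N1 lacks L6 because the L6 row is not listed). The function names are hypothetical and not part of any GIS package.

```python
from collections import defaultdict

# Link topology rows from the example: link -> (start node, end node, left poly, right poly)
links = {
    "L1": ("N1", "N2", "D", "A"),
    "L2": ("N2", "N3", "D", "A"),
    "L5": ("N5", "N6", "B", "A"),
    "L9": ("N8", "N1", "D", "B"),
}


def node_table(link_rows):
    """Node topology: for each node, the links that start or end there."""
    nodes = defaultdict(list)
    for link, (start, end, _left, _right) in link_rows.items():
        nodes[start].append(link)
        nodes[end].append(link)
    return dict(nodes)


def polygon_table(link_rows):
    """Polygon topology: for each polygon, the links that bound it."""
    polygons = defaultdict(list)
    for link, (_start, _end, left, right) in link_rows.items():
        polygons[left].append(link)
        polygons[right].append(link)
    return dict(polygons)


def adjacent_polygons(link_rows):
    """Pairs of polygons that share a link, found without using any coordinates."""
    return {frozenset((left, right)) for _s, _e, left, right in link_rows.values()}


print(node_table(links))
print(polygon_table(links))
print(adjacent_polygons(links))
```

Adjacency and connectivity fall out of the table alone, which is the point of storing relationships independently of coordinates.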
Vector Data Creation
• Input of the spatial data
• Digitizing/scan then vectorized
• Build topology
• Input of the attribute data
• Linking spatial and attribute data
Common Vector GIS Data Files
• Coverage (topological)
• Shape File (non‐topological)
• GeoDatabase (topological)
• MapInfo (topological)
• TIN (topological)
• CAD (non‐topological)
Raster Data Model
• The raster model represents reality (and feature geometry) through
uniform, regular cells (pixels)
• Within each cell, the terrain is generalized to an areal unit in which attributes
are constant
[Figure: the real world, a grid overlaid on the real world, and the resulting
raster representation made of cells]
Raster Data Model
• The finer the grid size, the more precise the information about the real
world is
[Figure: the real world rasterized at coarse detail and at finer detail]
Cells are usually assigned the value of the object taking up the greatest part of
the cell area (Bolstad discusses this at length).
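The "greatest part of the cell" rule can be illustrated by aggregating a fine grid into a coarser one. The sketch below is a simplified Python illustration: it approximates area share by counting fine cells in each block. The function name `majority_aggregate` and the land-cover codes (F, W, U) are invented for the example.

```python
from collections import Counter


def majority_aggregate(grid, factor):
    """Merge factor x factor blocks of cells into one coarse cell.

    Each coarse cell takes the value that occurs most often in its block,
    a stand-in for "the object taking up the greatest part of the cell".
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [
                grid[i][j]
                for i in range(r, min(r + factor, rows))
                for j in range(c, min(c + factor, cols))
            ]
            out_row.append(Counter(block).most_common(1)[0][0])
        coarse.append(out_row)
    return coarse


# Hypothetical 4 x 4 land-cover grid: F = forest, W = water, U = urban
fine = [
    list("FFFW"),
    list("FFWW"),
    list("FUWW"),
    list("UUUW"),
]

coarse = majority_aggregate(fine, 2)
print(coarse)
```

Minority values inside a block disappear from the coarse grid, which is why a coarser cell size carries less precise information.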
Attribute Data and Coordinate System
• Each cell is identified by a row and column number (x,y).
• Attribute values are stored for each cell based on the majority feature
(attribute) in the cell, such as land use type
• Location coordinates are calculated by adding or subtracting the cell size
multiplied by the number of rows or columns to or from the known coordinates
of the origin.
• Generally, the midpoint of a cell is considered its location. BUT NOT ALWAYS
(e.g. ENVI)
[Figure: grid with columns and rows numbered 1 2 3 4 5 6...]
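The coordinate arithmetic can be sketched in a few lines of Python. The sketch assumes the origin is the upper-left corner of the raster, that row numbers increase downward, and that rows and columns are counted from zero; these are common conventions but they are assumptions here, and the origin and cell size values are invented.

```python
def cell_center(origin_x, origin_y, cell_size, row, col):
    """Map coordinates of a cell midpoint.

    Assumes an upper-left origin, rows counted downward, zero-based indices.
    """
    x = origin_x + (col + 0.5) * cell_size   # move right: add
    y = origin_y - (row + 0.5) * cell_size   # move down: subtract
    return x, y


def cell_of(origin_x, origin_y, cell_size, x, y):
    """Inverse lookup: which (row, col) contains the map coordinate (x, y)."""
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    return row, col


# Hypothetical raster: upper-left corner at (500000, 4200000), 30 m cells
print(cell_center(500000.0, 4200000.0, 30.0, 2, 3))
print(cell_of(500000.0, 4200000.0, 30.0, 500105.0, 4199925.0))
```

Software that references a cell by a corner instead of its midpoint would drop the `0.5` offset, which is the "not always" case noted above.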
Resolution
• In general, resolution can be defined as the minimum linear dimension of the
smallest unit of geographic space for which data are recorded
• In the raster model the smallest units are generally rectangular
(occasionally systems have used hexagons or triangles); these smallest units
are known as cells or pixels
• High resolution refers to rasters with small cell dimensions
• High resolution means lots of detail, lots of cells, large rasters, small
cells
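The link between cell size and raster size is simple arithmetic. The sketch below uses a made-up 3000 m by 3000 m study area; `cell_count` is a hypothetical helper name.

```python
import math


def cell_count(width, height, cell_size):
    """Number of cells needed to cover a width x height extent."""
    cols = math.ceil(width / cell_size)
    rows = math.ceil(height / cell_size)
    return rows * cols


# Hypothetical 3000 m x 3000 m study area at three cell sizes
for size in (30, 15, 10):
    print(size, cell_count(3000, 3000, size))
```

Halving the cell size quadruples the number of cells, so higher resolution comes with a steep cost in storage.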
Raster Cell Size
Resolution and Scale
Raster as Thematic Layers
• Each layer can be treated separately
• But they can also be combined in a GIS because all objects are linked to a
coordinate system.
• Spatial operations:
• Overlay
• Map algebra
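Because aligned layers share the same grid, overlay and map algebra reduce to cell-by-cell operations. The following pure-Python sketch assumes both layers have identical dimensions and alignment; the layer values and the helper name `local_op` are invented for illustration.

```python
def local_op(layer_a, layer_b, func):
    """Apply func to each pair of cells at the same row and column."""
    if len(layer_a) != len(layer_b) or any(
        len(ra) != len(rb) for ra, rb in zip(layer_a, layer_b)
    ):
        raise ValueError("layers must have the same shape")
    return [
        [func(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


# Two hypothetical thematic layers on the same 2 x 2 grid
elevation = [[120, 135], [150, 180]]   # metres
forest = [[1, 0], [1, 1]]              # 1 = forest, 0 = not forest

# Overlay query: forest cells above 130 m
high_forest = local_op(elevation, forest, lambda e, f: int(f == 1 and e > 130))
print(high_forest)
```

Real GIS packages perform the same operation on arrays, but the principle is the same: the shared coordinate system makes the cells line up.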
Primary Uses for Raster Data
• Models describing continuous attributes of the real world.
• Elevation, soils, temperature, etc.
• Images (satellites, scanned maps, photographs).
• Output (e.g. printers, plotters, monitors).
[Figure: physical variables and derived variables, e.g. distance from points]
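A derived variable such as distance from points can be computed for every cell. The sketch below is a brute-force illustration, assuming source points are given as (row, col) cell indices and distances are measured between cell centres; `distance_raster` is a hypothetical name and at least one source point is required.

```python
import math


def distance_raster(rows, cols, cell_size, points):
    """Distance from each cell centre to the nearest source cell, in map units."""
    return [
        [
            min(math.hypot(r - pr, c - pc) for pr, pc in points) * cell_size
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 3 x 3 raster with 10 m cells and one source point in the corner
grid = distance_raster(3, 3, 10.0, [(0, 0)])
for row in grid:
    print([round(value, 1) for value in row])
```

The result is a continuous surface stored as a raster, even though the input was a discrete point.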
Vector vs. Raster Data Model
Raster                                        Vector
Continuous data                               Discrete data
Simple data structure                         Complex data structure
Large data volumes                            Compact data file
Easy overlay                                  Overlay is more difficult
Rapid data collection                         Slow data collection
Poor network analysis                         Possibility of network analysis
No topology stored (no relationships shown)   Efficient topology
High spatial variability                      Low spatial variability
Suitable for highly variable data             Good for homogeneous data
Lower positional accuracy                     Potentially excellent positional accuracy
Determined by cell size                       Given by (X,Y) coordinates
Low geometric accuracy                        High geometric accuracy
Better suited for imagery                     Better suited for graphics
Data Conversion
• Data can be transformed from one of these data models to the other
• Vectorization: raster to vector
• Rasterization: vector to raster
Some information is always lost when converting from one data format to the
other.
Rasterization
[Figure: vector format converted to raster format]
Key points:
• Rasterization loses topological features
• No information about relationships
• Positional accuracy decreases
• Depends on cell size: positional accuracy ~ ½ cell size
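The half-cell rule can be checked by rasterizing a single point. This sketch assumes a lower-left origin with y increasing upward (a different convention from an upper-left image origin) and invented coordinates; `snap_to_cell_center` is a hypothetical helper.

```python
import math


def snap_to_cell_center(x, y, cell_size, origin_x=0.0, origin_y=0.0):
    """Return the centre of the cell that contains (x, y).

    After rasterization only the cell is known, so the centre is the best
    available estimate of the original location.
    """
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    cx = origin_x + (col + 0.5) * cell_size
    cy = origin_y + (row + 0.5) * cell_size
    return cx, cy


cell = 10.0
x, y = 12.0, 27.0                      # original vector point (made up)
cx, cy = snap_to_cell_center(x, y, cell)
print((cx, cy), abs(cx - x), abs(cy - y))
```

The displacement along each axis can never exceed half the cell size, so choosing a smaller cell is the only way to reduce this error.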
Vectorization
[Figure: raster format converted to vector format]
Key points:
• Feature boundaries become jagged in the vector representation
• Topology is created (relationships)
Data In ArcCatalog
Data Files Display Differently
Windows Explorer vs. ArcCatalog
ESRI GIS Data Files
[Figure: ArcCatalog icons for GeoDatabase, Shapefile, Coverage, MXD file, and
Image]
Pay attention to the icon shape and color
How shapefiles are stored
Shapefiles: simple, non‐topological format, storing the geometric location
and attribute information.
• .shp ‐ the file that stores the feature geometry. Required.
• .shx ‐ the file that stores the index of the feature geometry. Required.
• .dbf ‐ the dBASE file that stores the attribute information of features.
Required.
• .sbn and .sbx ‐ the files that store the spatial index of the features.
Optional.
• .fbn and .fbx ‐ the files that store the spatial index of the features for
shapefiles that are read‐only. Optional.
• .ain and .aih ‐ the files that store the attribute index of the active fields in
a table or a theme's attribute table. Optional.
• .prj ‐ the file that stores the coordinate system information. Optional.
• .xml ‐ metadata for ArcGIS. Optional.
Tech details: http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf
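Since a shapefile is really a set of files sharing one base name, a quick check for missing parts is useful before copying or submitting data. The sketch below uses only the Python standard library; `shapefile_parts` is a hypothetical helper, and the `.xml` metadata file is left out of the check because its exact naming is not given here.

```python
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")
OPTIONAL = (".sbn", ".sbx", ".fbn", ".fbx", ".ain", ".aih", ".prj")


def shapefile_parts(shp_path):
    """Report which component files exist next to the given .shp path.

    Returns (found, missing): found maps each extension to True/False,
    missing lists the required extensions that are absent.
    """
    base = Path(shp_path)
    found = {ext: base.with_suffix(ext).exists() for ext in REQUIRED + OPTIONAL}
    missing = [ext for ext in REQUIRED if not found[ext]]
    return found, missing
```

For example, `shapefile_parts("Data/roads.shp")` (a made-up path) returns a non-empty `missing` list whenever any of the three required files is absent, which is a common cause of a shapefile that will not open.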
How coverage files are stored?
Tech details: http://avce00.maptools.org/docs/v7_bin_cover.html
GeoDatabase
• Database data sets (RDBMS data sources), rather than file‐based datasets
• Supported by major RDBMS, such as DB2, SQL Server, Oracle, Microsoft Access
(personal Geodatabase)
MXD File Format
• ArcMap Map Document file
• It does not save your GIS data. Only map document settings, such as
symbology and layer names, are saved.
• Very important for your exercise: Relative Path vs. Full Path
Relative Path vs. Absolute Path
Absolute (full) and relative path
• An absolute, or full, path begins with a drive letter followed by a colon,
such as D:. For example: G:\classes\FoundGIS\Data
• A relative path refers to a location that is relative to a current directory.
Why use relative vs. absolute path?
• Using absolute pathnames:
• You can move the document or toolbox anywhere on your computer and the
data will be found when you reopen the document or tool.
• On most personal computers, the location of data is usually constant. That is,
you typically don't move your data around much on your personal computer.
In such cases, absolute pathnames are preferred.
• You can reference data on other disk drives.
• Using relative pathnames: (recommended in this class)
• When moving a map document or toolbox, the referenced data has to move
as well.
• When delivering documents, toolboxes, and data to another user, relative
pathnames should be used. Otherwise, the recipient's computer must have
the same directory structure as yours.
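The difference can be seen by converting between the two forms. The sketch below uses Python's standard `ntpath` module, which applies Windows path rules on any operating system. The folder and file names under `G:\classes\FoundGIS` and `D:\handin` are made up, and the helper names are hypothetical; this illustrates the idea only and is not how ArcMap stores paths internally.

```python
import ntpath  # Windows path rules, usable on any platform


def to_relative(data_path, document_dir):
    """Express data_path relative to the folder holding the map document."""
    return ntpath.relpath(data_path, start=document_dir)


def to_absolute(relative_path, document_dir):
    """Resolve a stored relative path against the document's current folder."""
    return ntpath.normpath(ntpath.join(document_dir, relative_path))


doc_dir = r"G:\classes\FoundGIS\Maps"          # hypothetical map document folder
data = r"G:\classes\FoundGIS\Data\roads.shp"   # hypothetical data file

rel = to_relative(data, doc_dir)
print(rel)

# After copying the whole project folder to another drive, the same relative
# path still resolves, as long as Maps and Data moved together.
print(to_absolute(rel, r"D:\handin\Maps"))
```

The stored absolute path `G:\...` would break on the second machine, while the relative path keeps working because the document and its data kept the same positions relative to each other.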
Do not lose the big picture!
• Part 1: What is GIS?
• GIS definitions
• GIS components
• GIS history
• Common GIS software
• Part 2: GIS Data Model
• Vector and Raster
• Common ESRI GIS file types
• ArcCatalog