DATA ANALYSIS
Data analysis here refers to the size of the output cell and the area for analysis. The analysis area can
be defined by the minimum and maximum X and Y coordinates of a raster or of a composite of rasters.
Ex. An elevation raster created from a DEM can include no-data cells along its boundary.
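A minimal sketch of how the analysis extent and the output cell size together fix the shape of the output raster, and of how no-data cells can be skipped in a calculation. The extent, the 30-unit cell size, and the NODATA value of -9999 are assumptions chosen only for illustration.

```python
import math

NODATA = -9999  # assumed placeholder value for cells with no data


def grid_shape(xmin, ymin, xmax, ymax, cell_size):
    """Return (rows, cols) needed to cover the analysis extent."""
    cols = math.ceil((xmax - xmin) / cell_size)
    rows = math.ceil((ymax - ymin) / cell_size)
    return rows, cols


def mean_ignoring_nodata(grid):
    """Average all cells except those flagged as NODATA."""
    values = [v for row in grid for v in row if v != NODATA]
    return sum(values) / len(values)


# Hypothetical extent in map units and a 30-unit output cell.
print(grid_shape(xmin=0, ymin=0, xmax=900, ymax=600, cell_size=30))  # (20, 30)

# Hypothetical elevation raster with no-data cells along its left boundary.
elevation = [
    [NODATA, 120, 125],
    [NODATA, 130, 135],
]
print(mean_ignoring_nodata(elevation))  # 127.5
```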
Analysis is of two types:
I. Spatial
II. Non-spatial
SPATIAL ANALYSIS
Spatial analysis deals with adding significance to geographic data and converting the data into
useful information.
Spatial analysis in GIS involves three types of operations:
i. Attribute Query also known as non-spatial Query
ii. Spatial Query and
iii. Generation of new data sets from the original database
Spatial analysis can be
A. Inductive: examining empirical evidence in the search for patterns.
B. Deductive: focusing on the testing of known theories or principles against data.
C. Normative: using spatial analysis to develop or prescribe new or better designs.
Spatial analysis allows us to study real-world processes, including the present situation of specific
areas and features, the change in that situation, or trends, e.g. decreasing or increasing urbanisation
in Delhi and the NCR (National Capital Region).
Thus spatial analysis is a process that involves looking at geographic patterns in the data and
their relationships. The actual methods for spatial analysis can be very simple, such as making
a map of the theme you are analysing, or more complex, involving models that mimic the real world
by combining many data layers.
Spatial analysis also relates to the use of technological inputs in collecting, storing, retrieving,
displaying, manipulating, managing and analysing spatial information.
Spatial Elements
Spatial objects identified in the real world can be classified into four types: Points, Lines, Areas,
and Surfaces.
Point features are spatial phenomena each of which occurs at one location in space. Each feature is
said to be discrete in that it can occupy only a given point in space at any time, and it is considered to
have no spatial dimension, that is, no width or length. Examples of such features would be a house or a
village.
Line features are conceptualised as occupying only a single dimension in coordinate space. Roads
and rivers are examples of linear features.
Area features have two dimensions, length and width. An area is composed of a series of lines that
begin and end at the same location.
Surfaces add the third dimension of height to area features, so they have three dimensions: length,
width, and height. For instance, hills, valleys, and ridges can be described by citing their locations
together with their heights.
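The four spatial elements can be sketched as simple coordinate structures. The following is only an illustration: the names and coordinates are hypothetical, and the shoelace formula used for the area is one common choice rather than something prescribed in these notes.

```python
import math

# Point: one coordinate pair, no length or width.
village = (3.0, 4.0)

# Line: an ordered sequence of points; it has length only.
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]

# Area: a ring of lines that begins and ends at the same location.
field = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]

# Surface: locations with a third value, the height (z).
hill = [(0.0, 0.0, 100.0), (4.0, 0.0, 120.0), (4.0, 3.0, 150.0), (0.0, 3.0, 110.0)]


def line_length(coords):
    """Sum of the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def ring_area(ring):
    """Area of a closed ring using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(line_length(road))            # 11.0
print(ring_area(field))             # 12.0
print(max(z for _, _, z in hill))   # 150.0
```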
NON-SPATIAL ANALYSIS
Non-spatial data does not have a direct connection to a specific location. It
contains information that is independent of geographic coordinates or
addresses.
Characteristics:
o Attribute-Based: Non-spatial data focuses on “what” rather than
“where.” It includes characteristics such as population statistics,
land use classifications, economic indicators, and other attribute
information.
Examples of Non-Spatial Data:
o Someone’s height, sales figures, temperature readings, or text-
based information.
Applications:
o Traditional statistical analysis, text mining, and other attribute-
based analyses.
OVERLAY ANALYSIS
Overlay analysis involves combining information from one GIS layer with
another to create a new layer that highlights specific features or relationships.
Overlay analysis produces a composite map by combining the attributes and geometry of different
datasets or entities.
Overlay is the operation of comparing variables among multiple coverages.
In overlay analysis, new spatial data sets are created by merging data from two or more input
data layers.
Overlay analysis is one of the most common and powerful GIS techniques.
It analyses multiple layers that share a common coordinate system and determines which features
coincide across the layers. Overlay operations combine data from the same entity or from different
entities and create new geometries and new units of change.
Two overlay methods:
Union: Union is the laying of one GIS database over another to produce a combination of
the two that keeps the features of both.
Intersection: Intersection is the laying of one GIS database over another to produce a
combination of the two that keeps only the overlapping areas between the two GIS
databases.
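The difference between the two methods can be sketched with the simplest possible areas, axis-aligned rectangles given as (xmin, ymin, xmax, ymax). This is a deliberately reduced illustration with hypothetical layer names; a real GIS overlay works on arbitrary polygons and also carries attributes across.

```python
def intersection(a, b):
    """Return the overlapping rectangle of a and b, or None if they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def area(rect):
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


def union_area(a, b):
    """Area covered by either rectangle; the overlap is counted only once."""
    overlap = intersection(a, b)
    return area(a) + area(b) - (area(overlap) if overlap else 0)


# Hypothetical layers: a land parcel and a flood zone.
parcel = (0, 0, 4, 4)
flood_zone = (2, 2, 6, 6)

print(intersection(parcel, flood_zone))  # (2, 2, 4, 4)
print(union_area(parcel, flood_zone))    # 28
```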
Overlay operations can be classified into the following:
Points-in-Polygon: A point-in-polygon operation is a key feature of GIS analysis. Point
features of one layer can be overlaid on polygon features of another layer to identify
the polygons within which each point falls.
Line-in-Polygon: Line features of one layer can be overlaid on polygon features of
another layer to identify the polygons within which each line, or each part of a line,
falls.
Polygon Overlay: Polygon overlay is an important operation for vector GIS, and it has
two distinct versions. Polygon overlay has numerous uses, of which land use is the
simplest. Another popular use is population statistics.
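A point-in-polygon overlay can be sketched with the ray-casting test, which is one common technique and is assumed here rather than taken from these notes. The district polygons and well locations are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how many polygon edges a horizontal ray from the point crosses.

    An odd count means the point is inside. Points lying exactly on a
    boundary are not handled specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def overlay_points(points, polygons):
    """For each point, return the name of the polygon it falls in (or None)."""
    result = {}
    for pname, p in points.items():
        result[pname] = next(
            (gname for gname, g in polygons.items() if point_in_polygon(p, g)), None
        )
    return result


districts = {
    "north": [(0, 5), (10, 5), (10, 10), (0, 10)],
    "south": [(0, 0), (10, 0), (10, 5), (0, 5)],
}
wells = {"w1": (2, 7), "w2": (8, 1), "w3": (12, 3)}

print(overlay_points(wells, districts))
# {'w1': 'north', 'w2': 'south', 'w3': None}
```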
Applications of overlay analysis
1. Site Selection and Suitability Modeling:
o Overlay analysis helps identify optimal locations for specific activities
or facilities. Examples include:
New Housing Development: Determining suitable sites based
on factors like land cost, proximity to services, slope, and flood
risk.
Wildlife Habitat: Identifying areas suitable for deer habitat or
other wildlife.
Economic Growth: Predicting where economic development is
likely to occur.
Risk Assessment: Locating areas susceptible to mudslides or
other hazards.
2. Environmental Impact Assessment:
o Overlaying environmental factors (such as wetlands, forests, or water
bodies) with proposed development sites helps assess potential
impacts.
o For instance, overlaying a proposed road network with sensitive
ecosystems can highlight areas requiring mitigation.
3. Zoning and Land Use Planning:
o Overlaying zoning regulations, land use policies, and existing land
cover helps create comprehensive land use plans.
o It ensures that development aligns with local regulations and
environmental considerations.
4. Natural Resource Management:
o Overlaying data on soil types, vegetation, and water availability aids in
sustainable resource management.
o For example, identifying suitable locations for reforestation or
agricultural expansion.
5. Emergency Response and Preparedness:
o Overlaying hazard maps (flood zones, earthquake risk, etc.) with
critical infrastructure (hospitals, fire stations) helps plan emergency
response.
o It assists in identifying vulnerable areas and allocating resources
effectively.
6. Healthcare Facility Placement:
o Overlaying population density, transportation networks, and healthcare
accessibility helps determine optimal locations for hospitals, clinics,
and pharmacies.
7. Transportation Planning:
o Overlaying road networks, traffic patterns, and population centers aids
in designing efficient transportation systems.
o It informs decisions on road expansions, public transit routes, and bike
lanes.
8. Conservation and Biodiversity:
o Overlaying habitat suitability models with protected areas helps
prioritize conservation efforts.
o It guides decisions on land acquisition and habitat restoration.
RASTER ANALYSIS
Raster analysis involves analyzing spatial information contained in grid
datasets.
Raster operations include four basic types of operations:
local operations process the contents of data sets pixel by pixel, performing
operations on each pixel or comparing the contents of the same pixel on each layer;
focal operations compare the contents of a pixel with those of neighbouring pixels,
using a fixed neighbourhood (often the pixel’s eight immediate neighbours);
zonal operations perform operations on zones, or contiguous blocks of pixels having
the same values;
global operations are performed for all pixels. Examples of each type of operation
are given in the following sections.
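The four operation types can be sketched on small grids stored as nested lists. The grids, the zone layer, and the choice of sum, mean and maximum as the statistics are assumptions for illustration only.

```python
elevation = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
rainfall = [
    [10, 10, 10],
    [20, 20, 20],
    [30, 30, 30],
]
zones = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
]


def local_sum(a, b):
    """Local: combine the same pixel on each layer."""
    return [[ca + cb for ca, cb in zip(ra, rb)] for ra, rb in zip(a, b)]


def focal_mean(grid):
    """Focal: mean of each pixel and its immediate neighbours (3x3 window, cut off at edges)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = sum(window) / len(window)
    return out


def zonal_mean(values, zone_grid):
    """Zonal: mean of the value layer within each zone of the zone layer."""
    totals = {}
    for vrow, zrow in zip(values, zone_grid):
        for v, z in zip(vrow, zrow):
            s, n = totals.get(z, (0, 0))
            totals[z] = (s + v, n + 1)
    return {z: s / n for z, (s, n) in totals.items()}


def global_max(grid):
    """Global: one statistic computed over all pixels."""
    return max(max(row) for row in grid)


print(local_sum(elevation, rainfall)[0])  # [11, 12, 13]
print(focal_mean(elevation)[1][1])        # 5.0
print(zonal_mean(elevation, zones))       # {1: 2.0, 2: 5.0, 3: 8.0}
print(global_max(elevation))              # 9
```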
VECTOR ANALYSIS
Vector analysis deals with geometric objects represented as points, lines, and
polygons. It focuses on spatial relationships and attributes.
Common Techniques:
Overlay Analysis: Combines multiple vector layers to create new
features (e.g., union, intersection, difference).
Buffering: Creates a zone around vector features based on a
specified distance.
Network Analysis: Analyzes connectivity and flow (e.g., shortest
path, service areas).
Spatial Joins: Combines attributes from different layers based on
spatial relationships.
Topological Operations: Maintain the integrity of spatial
relationships (e.g., snapping, splitting, merging).
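Of the techniques listed above, the shortest-path part of network analysis can be sketched as follows. Dijkstra's algorithm is assumed here as the method (these notes do not name one), and the road network with its travel costs is hypothetical.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm on a graph given as {node: {neighbour: cost}}.

    Returns (total_cost, path), or (inf, []) if the goal cannot be reached.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical road junctions A-D with travel costs between them.
roads = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "C": 1, "D": 5},
    "C": {"A": 2, "B": 1, "D": 8},
    "D": {"B": 5, "C": 8},
}

print(shortest_path(roads, "A", "D"))  # (8.0, ['A', 'C', 'B', 'D'])
```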
BUFFERING OPERATIONS
Buffering is the process of creating one or more zones around selected features,
within a pre-specified distance from features.
Buffers can be created for any kind of object such as points, lines, or areas and are very widely used
in GIS analysis.
The applications of buffering are mentioned below:
A buffer zone is often treated as a protection zone and is used for planning and regulatory
purposes.
It is useful in locating the areas and populations served by, or deprived of, facilities and essential
services such as hospitals, post offices, roads etc.
It can also be used to undertake population and environmental studies.
It helps in determining the spatial proximity or nearness of various geographic features.
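A buffer query can be sketched without constructing the buffer polygon itself: a feature lies inside the buffer of a line when its distance to the line is no more than the buffer distance. The road, villages and 5-unit distance below are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Position of the nearest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(points, line, distance):
    """Names of the points lying within `distance` of the line."""
    selected = []
    for name, p in points.items():
        nearest = min(distance_to_segment(p, a, b) for a, b in zip(line, line[1:]))
        if nearest <= distance:
            selected.append(name)
    return selected


road = [(0, 0), (10, 0)]
villages = {"v1": (2, 3), "v2": (5, 8), "v3": (12, 0)}

print(within_buffer(villages, road, 5))  # ['v1', 'v3']
```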
RECLASSIFICATION
Reclassification can be defined as the repackaging of existing information based on initial values,
location, size, shape, or contiguity. Reclassification involves the (re)assignment of thematic
values to the categories of an existing map.
A GIS provides a wide variety of ways to classify and reclassify the stored attribute data to
achieve the desired result. Whatever the user's working method and application, the process can
properly be called reclassification, because the data that are input to a GIS have already been classified.
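Reclassification based on initial values can be sketched as a lookup of each cell against a table of value ranges. The slope grid and the three class ranges are hypothetical.

```python
def reclassify(grid, classes):
    """Assign each cell a new value from (lower, upper, new_value) ranges.

    The lower bound is inclusive and the upper bound exclusive. Cells that
    match no range become None.
    """
    def lookup(value):
        for lower, upper, new_value in classes:
            if lower <= value < upper:
                return new_value
        return None

    return [[lookup(v) for v in row] for row in grid]


# Hypothetical slope raster (degrees) regrouped into 1 = gentle, 2 = moderate, 3 = steep.
slope = [
    [2, 7, 12],
    [18, 25, 40],
]
slope_classes = [(0, 5, 1), (5, 15, 2), (15, 90, 3)]

print(reclassify(slope, slope_classes))  # [[1, 2, 2], [3, 3, 3]]
```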
QUERY
Querying involves selecting a subset of records from a database based
on specific criteria.
Querying allows us to ask questions about geographic features and their
attributes, as well as relationships between them.
SPATIAL QUERY: This includes queries made on the spatial location of features. Some spatial
queries are described below:
Query Location: a location is represented by a point. A locational query provides the record number
or point-ID number, the coordinates specific to the spatial data model, the latitude and longitude of
the point, and the projection coordinates.
Query Distance: computes both the current and cumulative distances. There are several ways
to provide the distance information, viz. the longitude and latitude of the interactive cursor
location, the pixel resolution of the current window, the great circle distance between two points
etc.
Circular/Rectangular/Polygon Area Search: useful for querying a data layer and selecting only
those entities that fall within a user-specified radius, rectangle, or polygon area.
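The great circle distance and the circular area search can be sketched together. The haversine formula and a mean Earth radius of 6371 km are assumptions of this sketch, and the city coordinates are approximate values used only for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great circle distance between two latitude/longitude points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def circular_search(features, centre, radius_km):
    """Names of the features lying within radius_km of the centre point."""
    lat0, lon0 = centre
    return [
        name
        for name, (lat, lon) in features.items()
        if great_circle_km(lat0, lon0, lat, lon) <= radius_km
    ]


# Approximate (latitude, longitude) pairs, for illustration only.
cities = {
    "Delhi": (28.61, 77.21),
    "Gurugram": (28.46, 77.03),
    "Mumbai": (19.08, 72.88),
}

print(circular_search(cities, centre=cities["Delhi"], radius_km=50))
# ['Delhi', 'Gurugram']
```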
NON SPATIAL (attribute) QUERY: This refers to the characteristics of any feature or
phenomenon, which are stored in the form of a table in a database with a unique identifier key.
This ‘attribute’ table can be queried to locate the rows or columns that satisfy the search
conditions. The data linkage between the spatial elements and their corresponding attribute
tables is established through the common feature identifier key.
Topological query: Topology is defined as the inter-relationships amongst features that are
independent of distance or direction. The key topological relationships are connectivity,
adjacency and containment.
Spatio-temporal query: This query answers simple questions about the histories of features and
areas and about their attributes at past moments in time.
SQL based query formulation: Structured Query Language (SQL) is a tool to extract data
from a database and present the results in a useful manner. SQL is used extensively in many
database applications and has become a standard for relational DBMS.
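An SQL-based attribute query and a simple rectangular area search can be sketched with Python's built-in sqlite3 module. The table layout, village names, populations and coordinates are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE villages ("
    "id INTEGER PRIMARY KEY, name TEXT, population INTEGER, x REAL, y REAL)"
)
conn.executemany(
    "INSERT INTO villages VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Rampur", 1200, 2.0, 3.0),
        (2, "Sitapur", 5400, 8.0, 1.0),
        (3, "Devgaon", 800, 12.0, 9.0),
    ],
)

# Attribute (non-spatial) query: rows whose attribute satisfies the condition.
large = conn.execute(
    "SELECT name FROM villages WHERE population > 1000 ORDER BY id"
).fetchall()

# Rectangular area search: rows whose coordinates fall inside a search window.
in_window = conn.execute(
    "SELECT name FROM villages "
    "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ? ORDER BY id",
    (5, 15, 0, 10),
).fetchall()

print(large)      # [('Rampur',), ('Sitapur',)]
print(in_window)  # [('Sitapur',), ('Devgaon',)]
```

The id column plays the role of the unique identifier key that links each attribute row to its spatial element.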
VECTOR BASED OVERLAY
Overlay of vector data combines point, line, and polygon features. In this data model,
operations rely on the geometry and topology of the features. Vector based overlay is time
consuming, complex and computationally expensive. For example, take the stream order
network layer of the Ganga watershed and lay it over the village layer. The result
shows which orders of stream of the Ganga flow through which village.
Point in Polygon Overlay: this operation combines the point attributes of one layer with
the polygon attributes of the analysis layer. It is a spatial operation
in which a point coverage is overlaid on a polygon coverage to determine which points
fall within which polygon boundaries.
Polygon on Polygon Overlay: before starting a polygon on polygon overlay, the input layers
should be checked to confirm that they are topologically correct.
Line in Area Overlay: a line in area overlay requires checking the linear objects
or attributes that will be combined or merged with the area layer. The layers should also be
topologically correct.
INTERSECT: performs the intersection of two input layers. The resultant layer
keeps those portions of the first input layer's features that fall within the second input layer's
polygons.
CLIP: creates a new map that includes only those features of the input layer that fall within the
areal extent of the clip map. The input layer may contain points, lines, or polygons, but the analysis
layer (clip layer) must be a polygon layer.
ERASE: the reverse of Clip. The features of the input layer that fall within the
boundary of the analysis layer are erased, and those that fall outside the boundary are
retained.
SPLIT: divides the input coverage into two or more coverages by performing a series of clip
operations.
IDENTITY: overlays polygons and keeps all input layer features, together with only those
features of the analysis layer that overlap the input layer.
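Clip, Erase and Split can be sketched for a point input layer. To keep the example short, each polygon of the analysis layer is reduced to an axis-aligned rectangle (xmin, ymin, xmax, ymax). This simplification, the well locations and the zone names are assumptions of the sketch.

```python
def inside(point, rect):
    """True if the point lies within the rectangle (boundary included)."""
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax


def clip(features, rect):
    """Keep only the input features that fall within the clip area."""
    return {name: p for name, p in features.items() if inside(p, rect)}


def erase(features, rect):
    """Reverse of clip: keep only the input features outside the area."""
    return {name: p for name, p in features.items() if not inside(p, rect)}


def split(features, zones):
    """A series of clip operations, one per zone of the analysis layer."""
    return {zone: clip(features, rect) for zone, rect in zones.items()}


wells = {"w1": (1, 1), "w2": (6, 2), "w3": (9, 9)}
study_area = (0, 0, 5, 5)
halves = {"west": (0, 0, 5, 10), "east": (5, 0, 10, 10)}

print(clip(wells, study_area))   # {'w1': (1, 1)}
print(erase(wells, study_area))  # {'w2': (6, 2), 'w3': (9, 9)}
print(split(wells, halves))
# {'west': {'w1': (1, 1)}, 'east': {'w2': (6, 2), 'w3': (9, 9)}}
```

In this sketch a point lying exactly on a boundary shared by two zones would be assigned to both; a real implementation needs a rule for such cases.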
RASTER BASED OVERLAY
Raster overlay analysis involves combining multiple layers of raster
datasets that represent different themes. It allows us to analyze or identify
relationships between each layer.
The raster data processing methods can be classified into the following categories:
Local operations: Local operations are based on point-by-point or cell-by-cell analysis. The
most important of this group is overlay analysis.
Neighbourhood operations: Local neighbourhood operations are also known as focal
operations. They use the topological relationship of adjacency between cells in the input raster
layer to create a new layer.
Regional operations: Operations on regions (regional operations) are also known as zonal
operations. Generally, a region is defined as an area with homogeneous characteristics.
Advantages:
o Efficiency: Raster overlay is computationally less demanding than
vector overlay, which makes it efficient.
o Cell-by-Cell Combination: It operates on a cell-by-cell basis,
making it straightforward.
o Map Algebra: It involves pixel-based calculations using map
algebra.
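The cell-by-cell combination and map algebra described above can be sketched as follows. The two criterion layers (1 = suitable, 0 = unsuitable) and the weights 3 and 2 are hypothetical.

```python
def local_overlay(a, b, op):
    """Combine two rasters cell by cell with the given map-algebra operator."""
    return [[op(ca, cb) for ca, cb in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical binary criterion layers.
slope_ok = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
outside_floodplain = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]

# Boolean overlay: a cell is suitable only if both criteria are met.
suitable = local_overlay(slope_ok, outside_floodplain, lambda s, f: s * f)

# Weighted overlay: slope counts 3, flood safety counts 2 (assumed weights).
score = local_overlay(slope_ok, outside_floodplain, lambda s, f: 3 * s + 2 * f)

print(suitable)  # [[1, 0, 0], [1, 0, 0], [0, 1, 1]]
print(score)     # [[5, 3, 0], [5, 2, 0], [3, 5, 5]]
```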
Use Cases:
o Land Cover Change Detection: Overlaying historical land cover
data with current data to identify changes over time.
o Habitat Suitability Modelling: Combining environmental factors
(e.g., elevation, vegetation) to assess habitat suitability for species.
o Environmental Impact Assessment: Analyzing the impact of
proposed developments on existing features (e.g., wetlands,
forests).
o Risk Assessment: Combining hazard maps (e.g., flood risk,
landslide susceptibility) to assess overall risk.
o Urban Planning: Overlaying transportation networks, land use,
and infrastructure to optimize urban development.
RASTER DATA MODELS
The raster data model is widely used in applications ranging far beyond geographic
applications. Most likely we are already familiar with this model, as it is the basis
of digital photographs: the ubiquitous JPEG, BMP and TIFF file formats
are based on the raster data model. The raster model consists of rows
and columns of equally sized pixels interconnected to form a planar
surface. These pixels are used as building blocks for creating points,
lines, areas, networks and surfaces.
The raster model averages all values within a given pixel to yield a
single value.
Three methods of encoding raster data are:
1. Cell-By-Cell Data Encoding: The least intensive encoding method, in which
the value of every cell is recorded individually.
2. Run-Length Data Encoding: This method encodes cell values in runs of
similarly valued pixels.
3. Quad-Tree Raster Encoding: This method divides a raster into a hierarchy
of quadrants.
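Run-length encoding can be sketched for a single raster row. The land-cover codes (W = water, F = forest, U = urban) are hypothetical.

```python
def run_length_encode(row):
    """Collapse a row of cell values into (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([value, 1])  # start a new run
    return [(value, count) for value, count in runs]


def run_length_decode(runs):
    """Expand (value, run_length) pairs back into the cell-by-cell row."""
    row = []
    for value, count in runs:
        row.extend([value] * count)
    return row


# Cell-by-cell encoding stores all ten values of this row.
row = ["W", "W", "W", "F", "F", "W", "U", "U", "U", "U"]

# Run-length encoding stores four runs instead.
encoded = run_length_encode(row)
print(encoded)  # [('W', 3), ('F', 2), ('W', 1), ('U', 4)]
```

The saving depends on the data: a row in which every neighbouring cell differs gains nothing from run-length encoding.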
Advantages and Disadvantages
Advantages
The technology required to create raster graphics is inexpensive and
ubiquitous.
Nearly everyone currently owns some sort of raster image generator.
The underlying data structure is relatively simple.
Disadvantages
Raster files are typically very large, particularly in the case of raster
images.
The output images are less visually appealing than their vector counterparts.
The raster model is not suitable for some types of spatial analysis.
VECTOR DATA MODEL
Vector Data model Structures
1. Spaghetti Data Model
Each point, line and polygon feature is represented as a stream of X
and Y coordinates.
2. Topological Data Model
This model is characterised by the inclusion of topological information within the
dataset.
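The difference between the two structures can be sketched with two adjacent square polygons, A and B. The arc-node layout used for the topological version (arcs with from/to nodes and left/right polygons) is one common arrangement and is assumed here. All names and coordinates are hypothetical.

```python
# Spaghetti model: each polygon is an independent string of X and Y
# coordinates, so the shared edge (4,0)-(4,4) is stored twice and nothing
# records that A and B are neighbours.
spaghetti = {
    "A": [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)],
    "B": [(4, 0), (8, 0), (8, 4), (4, 4), (4, 0)],
}

# Topological model: nodes, plus arcs that record which polygon lies to the
# left and right of the direction of travel. None marks the outside.
nodes = {1: (4, 0), 2: (4, 4)}
arcs = {
    "a1": {"from": 1, "to": 2, "vertices": [], "left": "A", "right": "B"},
    "a2": {"from": 2, "to": 1, "vertices": [(0, 4), (0, 0)], "left": "A", "right": None},
    "a3": {"from": 1, "to": 2, "vertices": [(8, 0), (8, 4)], "left": "B", "right": None},
}


def neighbours(polygon, arc_table):
    """Adjacency read directly from the arc table, with no geometry involved."""
    result = set()
    for arc in arc_table.values():
        sides = (arc["left"], arc["right"])
        if polygon in sides:
            other = sides[1] if sides[0] == polygon else sides[0]
            if other is not None:
                result.add(other)
    return result


print(neighbours("A", arcs))  # {'B'}
```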
Advantages
Vector data are found to represent features better than the raster data model.
Vector data tend to be more compact in data structure, so file sizes are
much smaller.
Disadvantages
Vector data models tend to be more complex than the raster data model.
The algorithms for manipulating and analyzing vector data are complex
and can lead to intensive processing.
Types of GIS Models
Binary Model
Index Model
Regression Model
Process Model