Data Driven+Geography

The article discusses the shift from a data-scarce to a data-rich environment in geographic research, emphasizing the emergence of data-driven geography influenced by Big Data. It highlights the challenges associated with this transition, including issues of data volume, velocity, and variety, as well as the implications for research methodology and theory. The authors argue that while data-driven approaches may seem revolutionary, they are rooted in longstanding geographic research themes and present both opportunities and challenges for understanding spatial dynamics.

See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/282538532

Data-driven geography

Article in GeoJournal · August 2015

DOI: 10.1007/s10708-014-9602-6

2 authors:

Harvey Miller, The Ohio State University
Michael Goodchild, University of California, Santa Barbara

All content following this page was uploaded by Harvey Miller on 26 April 2016.

GeoJournal (2015) 80:449–461
DOI 10.1007/s10708-014-9602-6

Data-driven geography
Harvey J. Miller • Michael F. Goodchild

Published online: 10 October 2014

© Springer Science+Business Media Dordrecht 2014

Abstract The context for geographic research has shifted from a data-scarce to a data-rich environment, in which the most fundamental changes are not just the volume of data, but the variety and the velocity at which we can capture georeferenced data, trends often associated with the concept of Big Data. A data-driven geography may be emerging in response to the wealth of georeferenced data flowing from sensors and people in the environment. Although this may seem revolutionary, in fact it may be better described as evolutionary. Some of the issues raised by data-driven geography have in fact been longstanding issues in geographic research, namely, large data volumes, dealing with populations and messy data, and tensions between idiographic versus nomothetic knowledge. The belief that spatial context matters is a major theme in geographic thought and a major motivation behind approaches such as time geography, disaggregate spatial statistics and GIScience. There is potential to use Big Data to inform both geographic knowledge-discovery and spatial modeling. However, there are challenges, such as how to formalize geographic knowledge to clean data and to ignore spurious patterns, and how to build data-driven models that are both true and understandable.

Keywords Big data · GIScience · Spatial statistics · Geographic knowledge discovery · Geographic thought · Time geography

H. J. Miller (✉)
Department of Geography, The Ohio State University, Columbus, OH, USA
e-mail: miller.81@osu.edu

M. F. Goodchild
Department of Geography, University of California, Santa Barbara, Santa Barbara, CA, USA
e-mail: good@geog.ucsb.edu

Introduction

A great deal of attention is being paid to the potential impact of data-driven methods on the sciences. The ease of collecting, storing, and processing digital data may be leading to what some are calling the fourth paradigm of science, following the millennia-old tradition of empirical science describing natural phenomena, the centuries-old tradition of theoretical science using models and generalization, and the decades-old tradition of computational science simulating complex systems. Instead of looking through telescopes and microscopes, researchers are increasingly interrogating the world through large-scale, complex instruments and systems that relay observations to large databases to be processed and stored as information and knowledge in computers (Hey et al. 2009).

This fundamental change in the nature of the data available to researchers is leading to what some call Big Data. Big Data refer to data that outstrip our
capabilities to analyze. This has three dimensions, the so-called "three Vs": (1) volume: the amount of data that can be collected and stored; (2) velocity: the speed at which data can be captured; and (3) variety: encompassing both structured (organized and stored in tables and relations) and unstructured (text, imagery) data (Dumbill 2012). Some of these data are generated from massive simulations of complex systems such as cities (e.g., TRANSIMS; see Cetin et al. 2002), but a large portion of the flood is from sensors and software that digitize and store a broad spectrum of social, economic, political, and environmental patterns and processes (Graham and Shelton 2013; Kitchin 2014). Sources of geographically (and often temporally) referenced data include location-aware technologies such as the Global Positioning System and mobile phones; in situ sensors carried by individuals in phones, attached to vehicles, and embedded in infrastructure; remote sensors carried by airborne and satellite platforms; radiofrequency identification (RFID) tags attached to objects; and georeferenced social media (Miller 2007, 2010; Sui and Goodchild 2011; Townsend 2013).

Yet despite the enthusiasm over Big Data and data-driven methods, the role they can play in scholarly research, and specifically in geographic research, may not be immediately apparent. Are theory and explanation archaic when we can measure and describe so much, so quickly? Does data velocity really matter in research, with its traditions of careful reflection? Can the obvious problems associated with variety (lack of quality control, lack of rigorous sampling design) be overcome? Can we make valid generalizations from ongoing, serendipitous (instead of carefully designed and instrumented) data collection? In short, can Big Data and data-driven methods lead to significant discoveries in geographic research? Or will the research community continue to rely on what for the purposes of this paper we will term Scarce Data: the products of public-sector statistical programs that have long provided the major input to research in quantitative human geography?

Our purpose in this paper is to explore the implications of these tensions (theory-driven versus data-driven research, prediction versus discovery, law-seeking versus description-seeking) for research in geography. We anticipate that geography will provide a distinct context for several reasons: the specific issues associated with location, the integration of the social and the environmental, and the existence within the discipline of traditions with very different approaches to research. Moreover, although data-driven geography may seem revolutionary, in fact it may be better described as evolutionary since its challenges have long been themes in the history of geographic thought and the development of geographical techniques.

The next section of this paper discusses the concepts of Big Data and data-driven geography, addressing the question of what is special about the new flood of georeferenced data. The "Data-driven geography: challenges" section of this paper discusses major challenges facing data-driven geography; these include dealing with populations (not samples), messy (not clean) data, and correlations (not causality). The "Theory in data-driven geography" section discusses the role of theory in data-driven geography. "Approaches to data-driven geography" identifies ways to incorporate Big Data into geographic research. The final section concludes this paper with a summary and some cautions on the broader impacts of data-driven geography on society.

Big data and data-driven geography

Humanity's current ability to acquire, process, share, and analyze huge quantities of data is without precedent in human history. It has led to the coining of such terms as the "exaflood" and the metaphor of "drinking from a firehose" (Sui et al. 2013; Waldrop 1990). It has also led to the suggestion that we are entering a new, fourth phase of science that will be driven not so much by careful observation by individuals, or theory development, or computational simulation, as by this new abundance of digital data (Hey et al. 2009).

It is worth recognizing immediately, however, that the firehose metaphor has a comparatively long history in geography, and that the discipline is by no means new to an abundance of voluminous data. The Landsat program of satellite-based remote sensing began in the early 1970s by acquiring data at rates that were well in excess of the analytic capacities of the computational systems of the time; subsequent improvements in sensor resolution and the proliferation of military and civilian satellites have meant that four decades later data volumes continue to challenge even the most powerful computational systems.

Volume is clearly not the only characteristic that distinguishes today's data supply from that of previous eras. Today, data are being collected from many sources, including social media, crowd sourcing, ground-based sensor networks, and surveillance cameras, and our ability to integrate such data and draw inferences has expanded along with the volume of the supply. The phrase Big Data implies a world in which predictions are made by mining data for patterns and correlations among these new sources, and some very compelling instances of surprisingly accurate predictions have surfaced in the past few years with respect to the results of the Eurovision song contest (O'Leary 2012), the stock market (Preis et al. 2013), and the flu (Butler 2008). The theme of Big Data is often associated not only with volume but with variety, reflecting these multiple sources, and velocity, given the speed with which such data can now be analyzed to make predictions in close to real time.

Ubiquitous, ongoing data flows are a big deal because they allow us to capture spatio-temporal dynamics directly (rather than inferring them from snapshots) and at multiple scales. The data are collected on an ongoing basis, meaning that both mundane and unplanned events can be captured. To borrow Nassim Taleb's metaphor for probable and inconsequential versus improbable but consequential events (Taleb 2007): we do not need to sort the white swans from the black swans before collecting data: we can measure all swans and then figure out later which are white or black. White swans may also combine in surprising ways to form black-swan events.

Big Data is leading to new approaches to research methodology. Fotheringham (1998) defines geocomputation as quantitative spatial analysis where the computer plays a pivotal role. The use of the computer drives the form of the analysis rather than just being a convenient vehicle: analysts design geocomputational techniques with the computer in mind. Similarly, data play a pivotal role in data-driven methods. From this perspective data are not just a convenient way to calibrate, validate, and test but rather the driving force behind the analysis. Consequently, analysts design data-driven techniques with data in mind, and not just large volumes of data, but a wider spectrum of data flowing at higher speeds from the world. In this sense we may indeed be entering a fourth scientific paradigm where scientific methods are configured to satisfy data rather than data configured to satisfy methods.

Data-driven geography: challenges

In Big Data: A Revolution That Will Transform How We Live, Work, and Think, Mayer-Schonberger and Cukier (2013) identify three main challenges of Big Data in science: (1) populations, not samples; (2) messy, not clean data; and (3) correlations, not causality. We discuss these three challenges for geographic research in the following subsections.

Populations, not samples

Back when analysis was largely performed by hand rather than by machines, dealing with large volumes of data was impractical. Instead, researchers developed methods for collecting representative samples and for generalizing to inferences about the population from which they were drawn. Random sampling was thus a strategy for dealing with information overload in an earlier era. In statistical programs such as the US Census of Population it was also a means for controlling costs.

Random sampling works well, but it is fragile: it works only as long as the sampling is representative. A sampling rate of one in six (the rate previously used by the US Bureau of the Census for its more elaborate Long Form) may be adequate for some purposes, but becomes increasingly problematic when analysis focuses on comparatively rare subcategories. Random sampling also requires a process for enumerating and selecting from the population (a sampling frame), which is problematic if enumeration is incomplete. Sample data also lack extensibility for secondary uses. Because randomness is so critical, one must carefully plan for sampling, and it may be difficult to re-analyze the data for purposes other than those for which they were collected (Mayer-Schonberger and Cukier 2013).

In contrast, many of the new data sources consist of populations, not samples: the ease of collecting, storing, and processing digital data means that instead of dealing with a small representation of the population we can work with the entire population and thus escape one of the constraints of the past. But one problem with populations is that they are often self-selected rather than sampled: for example, all people who signed up for Facebook, all people who carry smartphones, or all cars that happened to travel within the City of London between 8:00 a.m. and 11:00 a.m. on 2 September 2013. Geolocated tweets are an attractive
source of information on current trends (e.g., Tsou et al. 2013), but only a small fraction of tweets are accurately geolocated using GPS. Since we do not know the demographic characteristics of any of these groups, it is impossible to generalize from them to any larger populations from which they might have been drawn.

Yet geographers have long had to contend with the issues associated with samples and their parent populations. Consider, for example, an analysis of the relationship between people over 65 years old and people registered as Republicans, the case studied by Openshaw and Taylor in their seminal article on the modifiable areal unit problem (Openshaw and Taylor 1979). The 99 counties of Iowa (their source of data) are all of the counties that exist in Iowa. They are therefore not a random sample of Iowa counties, or even a representative sample of counties of the US, so the methods of inferential statistics that assume random and independent sampling are not applicable. In remote sensing it is common to analyze all of the pixels in a given scene; again, these are not a random sample of any larger population.

However, the cases discussed above are ones where we can be assured that the entire population of interest is included: we are interested in all of the land cover in a scene, or all of the people over 65 and Republicans in Iowa. This is often not true with many new sources of data. A challenge is how to identify the niches to which monitored population data can be applied with reasonable generality. This inverts the classic sampling problem where we identify a question and collect data to answer that question. Instead, we collect the data and determine what questions we can answer.

Another issue concerns what people are volunteering when they volunteer geographic and other information (Goodchild 2007). Social media such as Facebook may have high penetration rates with respect to population, but do not necessarily have high penetration rates into people's lives. Checking in at an orchestra concert or lecture provides a noble image that a person would like to promote, while checking in at a bar at 10 a.m. is an image that a person may be less keen to share. In the classic sociology text The Presentation of Self in Everyday Life, Erving Goffman uses theater as a metaphor and distinguishes between stage and backstage behaviors, with stage behaviors being consistent with the role people wish to play in public life and backstage behaviors being private actions that people wish to keep private (Goffman 1959). While there are certainly cases of over-sharing behavior (especially among celebrities) we cannot be assured that the information people volunteer is an accurate depiction of their complete lives or just of the lives they wish to present to the social sphere. Several geographic questions follow from these observations. What is the geography of stage versus backstage realms in a city or region? Does this distribution vary by age, gender, socioeconomic status, or culture? What do these imply for what we can know about human spatial behavior?

In addition to selective volunteering of information about their lives, there also may be selection biases in the information people volunteer about environments. OpenStreetMap (OSM) is often identified as a successful crowdsourced mapping project: many cities of the world have been mapped by people on a voluntary basis to a remarkable degree of accuracy. However, some regions get mapped more quickly than others, such as tourist locations, recreation areas, and affluent neighborhoods, while locations of less interest to those who participate in OSM (such as poorer neighborhoods) receive less attention (Haklay 2010). While biases exist in official, administrative maps (e.g., governments in developing nations often do not map informal settlements such as favelas), the biases in crowdsourced maps are likely to be more subtle. Similarly, the rise of civic hacking where citizens generate data, maps, and tools to solve social problems tends to focus on the problems that citizens with laptops, fast internet connections, technical skills, and available time consider to be important (Townsend 2013).

Messy, not clean

The new data sources are often messy, consisting of data that are unstructured, collected with no quality control, and frequently accompanied by no documentation or metadata. There are at least two ways of dealing with such messiness. On the one hand, we can restrict our use of the data to tasks that do not attempt to generalize or to make assumptions about quality. Messy data can be useful in what one might term the softer areas of science: initial exploration of study areas, or the generation of hypotheses. Ethnography, qualitative research, and investigations of Grounded Theory (Glaser and Strauss 1967) often focus on using interviews, text, and other sources to reveal what was
otherwise not known or recognized, and in such contexts the kinds of rigorous sampling and documentation associated with Scarce Data are largely unnecessary. We discuss this option in greater detail later in the paper.

On the other hand, we can attempt to clean and verify the data, removing as much as possible of the messiness, for use in traditional scientific knowledge construction. Goodchild and Li (2012) discuss this approach in the context of crowdsourced geographic information. They note that traditional production of geographic information has relied on multiple sources, and on the expertise of cartographers and domain scientists to assemble an integrated picture of the landscape. For example, terrain information may be compiled from photogrammetry, point measurements of elevation, and historic sources; as a result of this process of synthesis the published result may well be more accurate than any of the original sources.

Goodchild and Li (2012) argue that this traditional process of synthesis, which is largely hidden from popular view and not apparent in the final result, will become explicit and of critical importance in the new world of Big Data. They identify three strategies for cleaning and verifying messy data: (1) the crowd solution; (2) the social solution; and (3) the knowledge solution. The crowd solution is based on Linus' Law, named in honor of the developer of Linux, Linus Torvalds: "Given enough eyeballs, all bugs are shallow" (Raymond 2001). In other words, the more people who can access and review your code, the greater the accuracy of the final product. Geographic facts that can be synthesized from multiple original reports are likely to be more accurate than single reports. This is of course the strategy used by Wikipedia and its analogs: open contributions and open editing are evidently capable of producing reasonably accurate results when assisted by various automated editing procedures.

In the geographic case, however, several issues arise that limit the success of the crowd solution. Reports of events at some location may be difficult to compare if the means used to specify location (place names, street address, GPS) are uncertain, and if the means used to describe the event is ambiguous. Geographic facts may be obscure, such as the names of mountains in remote parts of the world, and the crowd may therefore have little interest or ability to edit errors.

Goodchild and Li (2012) describe the social solution as implementing a hierarchical structure of volunteer moderators and gatekeepers. Individuals are nominated to roles in the hierarchy based on their track record of activity and the accuracy of their contributions. Volunteered facts that appear questionable or contestable are referred up the hierarchy, to be accepted, queried, or rejected as appropriate. Schemes such as this have been implemented by many projects, including OSM and Wikipedia. Their major disadvantage is speed: since humans are involved, the solution is best suited to applications where time is not critical.

The third, the knowledge solution, asks how one might know if a purported fact is false, or likely to be false. Spelling errors and mistakes of syntax are simple indicators that all of us use to triage malicious email. In the geographic case, one can ask whether a purported fact is consistent with what is already known about the geographic world, in terms both of facts and theories. Moreover, such checks of consistency can potentially be automated, allowing triage to occur in close to real time; this approach has been implemented, although on a somewhat unstructured basis, by companies that daily receive thousands of volunteered corrections to their geographic databases.

Fig. 1 Syntactical geographic knowledge: Highway on-ramp feature geometry

A purported fact can deviate from established geographic knowledge in either syntax or semantics, or both. Syntax refers to the rules by which the world is constructed, while semantics refers to the meaning of those facts. Syntactical knowledge is often easier to check than semantic knowledge. For example, Fig. 1
illustrates an example of syntactical geographic knowledge. We know from engineering specifications that an on-ramp can only intersect a freeway at a small angle (typically 30 degrees or less). If a road-network database appears to have on-ramp intersections of more than 30 degrees we know that the data are likely to be wrong; in the case of Fig. 1, many of the apparent intersections of the light-blue segments are more likely to be overpasses or underpasses. Such errors have been termed errors of logical consistency in the literature of geographic information science (e.g., Guptill and Morrison 1995).

In contrast, Fig. 2 illustrates semantic geographic knowledge: a photograph of a lake that has been linked to the Google Earth map of The Ohio State University campus. However, this photograph seems to be located incorrectly: we recognize the scene as Mirror Lake, a campus icon to the southeast of the purported location indicated on the map. The purported location must be wrong, but can we be sure? Perhaps the university moved Mirror Lake to make way for a new Geography building? Or perhaps Mirror Lake was so popular that the university created a mirror Mirror Lake to handle the overflow? We cannot immediately and with complete confidence dismiss this empirical fact without additional investigation since it does not violate any known rules by which the world is constructed: there is nothing preventing Mirror Lake from being moved or mirrored. Of course, there are some semantic facts that can be dismissed confidently as absurd: one would not expect to see a lake scene on the top of Mt. Everest or in the Sahara Desert. Nevertheless, there is no firm line between clearly absurd and non-absurd semantic facts; for example, one would not expect to see Venice or New York City in the Mojave Desert, but Las Vegas certainly exists.

Fig. 2 Semantic geographic knowledge: Where is Mirror Lake? (Google Earth; last accessed 24 September 2013 10:00 a.m. EDT)

A major task for the knowledge solution is formalizing knowledge to support automated triage of asserted facts and automated data fusion. Knowledge can be derived empirically or as predictions from theories, models, and simulations. In the latter case, we may be looking for data at variance with predictions as part of the knowledge-discovery and construction processes.

There are at least two major challenges to formalizing geographic knowledge. First, geographic concepts such as neighborhood, region, the Midwest, and developing nations can be vague, fluid, and contested. A second challenge is the development of explicit, formal, and computable representations of geographic knowledge. Much geographic knowledge is buried in formal theories, models, and equations that must be solved or processed, or in informal language that must be interpreted. In contrast, knowledge-discovery techniques require explicit representations such as rules, hierarchies, and concept networks that can be accessed directly without processing (Miller 2010).
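
The syntactic rule from the on-ramp example in this section (an on-ramp meets a freeway at roughly 30 degrees or less) is the kind of knowledge that lends itself to automated triage. The following is a minimal sketch in Python, not a method described by the authors. It assumes that each road feature is reduced to a single straight segment given as two (x, y) points in a planar, projected coordinate system; the function names and the fixed 30-degree threshold are illustrative assumptions.

```python
import math

# Assumed threshold, taken from the "typically 30 degrees or less" rule of thumb.
MAX_RAMP_ANGLE_DEG = 30.0


def segment_bearing(start, end):
    """Return the direction of a segment in degrees, measured from the +x axis."""
    return math.degrees(math.atan2(end[1] - start[1], end[0] - start[0]))


def crossing_angle(ramp, freeway):
    """Return the acute angle (0 to 90 degrees) between two straight segments."""
    diff = abs(segment_bearing(*ramp) - segment_bearing(*freeway)) % 180.0
    return min(diff, 180.0 - diff)


def is_plausible_on_ramp(ramp, freeway, max_angle=MAX_RAMP_ANGLE_DEG):
    """Accept a ramp/freeway junction only if the segments meet at a small angle."""
    return crossing_angle(ramp, freeway) <= max_angle


# Hypothetical planar coordinates (e.g., metres in a projected system).
freeway = ((0.0, 0.0), (100.0, 0.0))
merging_ramp = ((40.0, -10.0), (80.0, 0.0))    # shallow approach to the freeway
crossing_road = ((50.0, -50.0), (50.0, 50.0))  # perpendicular: likely an overpass

print(is_plausible_on_ramp(merging_ramp, freeway))
print(is_plausible_on_ramp(crossing_road, freeway))
```

A check like this only flags candidates for review: a junction that fails the test may be an overpass or underpass rather than a digitizing error, and real road geometries would need to be evaluated near the point of intersection rather than over a whole segment.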

Correlations, not causality

Traditionally, scholarly research concerns itself with knowing why something occurs. Correlations alone are not sufficient, because the existence of correlation does not imply that change in either variable causes change in the other. In the correlation explored by Openshaw and Taylor cited earlier (Openshaw and Taylor 1979), the existence of a correlation between the number of registered Republicans in a county and the number of people aged 65 and over does not imply that either one has a causal effect on the other. Over the years, science has adopted pejorative phrases to describe research that searches for correlations without concern for causality or explanation: "curve-fitting" comes to mind. Nevertheless, correlations may be useful for prediction, especially if one is willing to assume that an observed correlation can be generalized beyond the specific circumstances in which it is observed.

But while they may be sufficient, explanation and causality are not necessary conditions for scientific research: much research, especially in such areas as spatial analysis, is concerned with advancing method, whether its eventual use is for explanation or for prediction. The literature of geographic information science is full of tools that have been designed not for finding explanations but for more mundane activities such as detecting patterns, or massaging data for visualization. Such tools are clearly valuable in an era of data-driven science, where questions of "why" may not be as important. In the next section we extend this argument by taking up the broader question of the role of theory in data-driven geography.

Theory in data-driven geography

In a widely discussed article published in Wired magazine, Anderson called for the end of science as we know it, claiming that the data deluge is making the scientific method obsolete (Anderson 2008). Using physics and biology as examples, he argued that as science has advanced it has become apparent that theories and models are caricatures of a deeper underlying reality that cannot be easily explained. However, explanation is not required for continuing progress: as Anderson states, "Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all."

Duncan Watts makes a similar argument about theory in the social sciences, stating that unprecedented volumes of social data have the potential to revolutionize our understanding of society, but this understanding will not be in the form of general laws of social science or cause-and-effect social relationships. Although Watts suggests the limitations of theory in the era of data-driven science, he does not call for the end of theory but rather for a more modest type of theory that would include general propositions (such as what interventions work for particular social problems) or how more obvious social facts fit together to generate less obvious outcomes. Watts links this approach to calls by sociologist Robert Merton in the mid-twentieth century for middle-range theories: theories that address identifiable social phenomena instead of abstract entities such as the entire social system (Watts 2011). Middle-range theories are empirically grounded: they are based in observations, and serve to derive hypotheses that can be investigated. However, they are not endpoints: rather, they are temporary stepping-stones to general conceptual schemes that can encompass multiple middle-range theories (Merton 1967).

Data-driven science seems to entail a shift away from the general and towards the specific: away from attempts to find universal laws that encompass all places and times and towards deeper descriptions of what is happening at particular places and times. There are clearly some benefits to this change: as Batty (2012) points out, urban science and planning in the era of Scarce Data focused on radical and massive changes to cities over the long term, with little concern for small spaces and local movements. Data-driven urban science and planning can rectify some of the consequent urban ills by allowing greater focus on the local and routine. However, over longer time spans and wider spatial domains the local and routine merge into the long term; a fundamental scientific challenge is how local and short-term Big Data can inform our understanding of processes over longer temporal and spatial horizons; in short, the problem of generalization.

Geography has long experience with partnerships (and tensions) between nomothetic (law-seeking) and idiographic (description-seeking) knowledge (Cresswell 2013). Table 1 provides a summary. The
early history of geography in the time of Strabo (64/63 BCE–24 CE) and Ptolemy (90–168 CE) involved both generalizations about the Earth and intimate descriptions of specific places and regions; these were two sides of the same coin. Bernhardus Varenius (1622–1650) conceptualized geography as consisting of general (scientific) and special (regional) knowledge, although he considered the latter to be subsidiary to the former (Warntz 1989; Goodchild et al. 1999). Alexander von Humboldt (1769–1859) and Carl Ritter (1779–1859), often regarded as the founders of modern geography, tried to derive general laws through careful measurement of geographic phenomena at particular locations and times. In more recent times, the historic balance between nomothetic and idiographic geographic knowledge has become more unstable. The early twentieth century witnessed the dominance of nomothetic geography in the guise of environmental determinism, followed by a backlash against its abuses and the subsequent rise of idiographic geography in the form of areal differentiation: Richard Hartshorne famously declared in The Nature of Geography that the only law in geography is that all areas are unique (Hartshorne 1939). The dominance of idiographic geography and the concurrent crisis in American academic geography (in particular, the closing of Harvard’s geography program in 1948; Smith 1992) led to the Quantitative Revolution of the 1950s and 1960s, with geographers such as Fred Schaefer, William Bunge, Peter Haggett, and Edward Ullman asserting that geography should be a law-seeking science that answers the question “why?” rather than building a collection of facts describing what is happening in particular regions. Physical geographers have, perhaps wisely, disengaged themselves from these debates, but the tension between nomothetic and idiographic approaches persists in human geography (see Cresswell 2013; DeLyser and Sui 2013; Schuurman 2000; Sui 2004; Sui and DeLyser 2012).

Table 1 A brief history of partnerships and tensions between nomothetic (law-seeking) and idiographic (description-seeking) knowledge in geographic thought

Path to geographic knowledge    Advocates
Nomothetic ↔ idiographic        Strabo; Ptolemy
Nomothetic → idiographic        Varenius
Nomothetic ← idiographic        Humboldt; Ritter
Idiographic                     Hartshorne
Nomothetic                      Schaefer
Nomothetic ↔ idiographic        Hägerstrand (time geography); Fotheringham/Anselin (local spatial statistics); Tomlinson/Goodchild (GIScience)

However, attempts to reconcile nomothetic and idiographic knowledge did not die with Humboldt and Ritter. Approaches such as time geography seek to capture context and history and recognize the roles of both agency and structure in human behavior (Cresswell 2013). In spatial analysis, the trend towards local statistics, exemplified by Geographically Weighted Regression (Fotheringham et al. 2002) and Local Indicators of Spatial Association (Anselin 1995), represents a compromise in which the general principles of nomothetic geography are allowed to express themselves differently across geographic space. Goodchild (2004) has characterized GIS as combining the nomothetic, in its software and algorithms, with the idiographic in its databases.

In a sense, the paths to geographic knowledge engendered by data-intensive approaches such as time geography, disaggregate spatial statistics and GIScience are a return to the early foundation of geography where neither law-seeking nor description-seeking was privileged. Geographic generalizations and laws are possible but space matters: spatial dependency and spatial heterogeneity create local context that shapes physical and human processes as they evolve on the surface of the Earth. Geographers have believed this for a long time, but this belief is also supported by recent breakthroughs in complex systems theory, which suggests that patterns of local interactions lead to emergent behaviors that cannot be understood in isolation at either the local or global levels. Understanding the interactions among agents within an environment is the scientific glue that binds the local with the global (Flake 1998).

In short, data-driven geography is not necessarily a radical break with the geographic tradition: geography has a longstanding belief in the value of idiographic knowledge by itself as well as its role in constructing nomothetic knowledge. Although this belief has been tenuous and contested at times, data-driven geography
may provide the paths between idiographic and nomothetic knowledge that geographers have been seeking for two millennia. However, while complexity theory supports this belief, it also suggests that this knowledge may have inherent limitations: emergent behavior is by definition surprising.

Approaches to data-driven geography

If we accept the premise, at least until proven otherwise, that Big Data and data-driven science harmonize with longstanding themes and beliefs in geography, the question that follows is: how can data-driven approaches fit into geographic research? Data-driven approaches can support both geographic knowledge-discovery and spatial modeling. However, there are some challenges and cautions that must be recognized.

Data-driven geographic knowledge discovery

Geographic knowledge-discovery refers to the initial stage of the scientific process where the investigator forms his or her conceptual view of the system, develops hypotheses to be tested, and performs groundwork to support the knowledge-construction process. Geographic data facilitate this crucial phase of the scientific process by supporting activities such as study-site selection and reconnaissance, ethnography, experimental design, and logistics.

Perhaps the most transformative impact of data-driven science on geographic knowledge-discovery will be through data-exploration and hypothesis generation. Similar to a telescope or microscope, systems for capturing, storing, and processing massive amounts of data can allow investigators to augment their perceptions of reality and see things that would otherwise be hidden or too faint to perceive. From this perspective, data-driven science is not necessarily a radically new approach, but rather a way to enhance inference for the longstanding processes of exploration and hypothesis generation prior to knowledge-construction through analysis, modeling, and verification (Miller 2010).

Data-driven knowledge-discovery has a philosophical foundation: abductive reasoning, a form of inference articulated by astronomer and mathematician C. S. Peirce (1839–1914). Abductive reasoning starts with data describing something and ends with a hypothesis that explains the data. It is a weaker form of inference relative to deductive or inductive reasoning: deductive reasoning shows that X must be true, inductive reasoning shows that X is true, while abductive reasoning shows only that X may be true. Nevertheless, abductive reasoning is critically important in science, particularly in the initial discovery stage that precedes the use of deductive or inductive approaches to knowledge-construction (Miller 2010).

Abductive reasoning requires four capabilities: (1) the ability to posit new fragments of theory; (2) a massive set of knowledge to draw from, ranging from common sense to domain expertise; (3) a means of searching through this knowledge collection for connections between data patterns and possible explanations; and (4) complex problem-solving strategies such as analogy, approximation, and guesses. Humans have proven to be more successful than machines in performing these complex tasks, suggesting that data-driven knowledge-discovery should try to leverage these human capabilities through methods such as geovisualization rather than try to automate the discovery process. Gahegan (2009) envisions a human-centered process where geovisualization serves as the central framework for creating chains of inference among abductive, inductive, and deductive approaches in science, allowing more interactions and synergy among these approaches to geographic knowledge building.

One of the problems with Big Data is the size and complexity of the information space implied by a massive multivariate database. A good data-exploration system should generate all of the interesting patterns in a database, but only the interesting ones, to avoid overwhelming the analyst. Two ways to manage the large number of potential patterns are background knowledge and interestingness measures. Background knowledge guides the search for patterns by representing accepted knowledge about the system to focus the search for novel patterns. In contrast, we can use interestingness measures a posteriori to filter spurious patterns by rating each pattern based on dimensions such as simplicity, certainty, utility, and novelty. Patterns with ratings below a user-specified threshold are discarded or ignored (Miller 2010). Both of these approaches require formalization of geographic knowledge, a challenge discussed earlier in this paper.
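
The a posteriori filtering step described above can be illustrated with a minimal Python sketch. It is not drawn from Miller (2010) or from any particular data-mining system: the Pattern class, the 0 to 1 rating scale, the equal weighting of the four dimensions, and the example patterns are all assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Rating dimensions named in the text. The 0-1 scale and the equal
# default weights are assumptions for illustration, not a published scheme.
DIMENSIONS = ("simplicity", "certainty", "utility", "novelty")


@dataclass(frozen=True)
class Pattern:
    """A hypothetical candidate pattern with one rating per dimension."""
    description: str
    simplicity: float
    certainty: float
    utility: float
    novelty: float


def interestingness(pattern, weights=None):
    """Return the weighted mean of the ratings (equal weights by default)."""
    if weights is None:
        weights = {name: 1.0 for name in DIMENSIONS}
    total = sum(weights[name] for name in DIMENSIONS)
    score = sum(weights[name] * getattr(pattern, name) for name in DIMENSIONS)
    return score / total


def filter_patterns(patterns, threshold):
    """Keep only patterns rated at or above the user-specified threshold."""
    return [p for p in patterns if interestingness(p) >= threshold]


# Invented example patterns; the ratings are made up.
candidates = [
    Pattern("cluster of incidents near a transit hub", 0.8, 0.9, 0.7, 0.6),
    Pattern("weak correlation between unrelated layers", 0.2, 0.3, 0.1, 0.2),
]
kept = filter_patterns(candidates, threshold=0.5)
print([p.description for p in kept])
```

In practice the difficult part is not the thresholding itself but assigning defensible ratings, which is where the formalization of geographic knowledge becomes necessary.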

Data-driven modeling

Traditional approaches to modeling are deductive: the scientist develops (or modifies or borrows) a theory and derives a formal representation that can be manipulated to generate predictions about the real world that can be tested with data. Theory-free modeling, on the other hand, builds models based on induction from data rather than through deduction from theory.

The field of economics has flirted with data-driven modeling in the form of general-to-specific modeling (Miller 2010). In this strategy, the researcher starts with the most complex model possible and reduces it to a more elegant one based on data, founded on the belief that, given enough data, only the true specification will survive a sufficiently stringent battery of statistical tests designed to pare variables from the model. This contrasts with the traditional specific-to-general strategy where one starts with a spare model based on theory and conservatively builds a more complex model (Hoover and Perez 1999). However, this approach is controversial, with some arguing that given the enormous number of potential models one would have to be very lucky to encompass the true model within the initial, complex model. On this view, predictive performance is the only relevant criterion; explanation is irrelevant (Hand 1999).

Geography has also witnessed attempts at theory-free modeling, also not without controversy. Stan Openshaw is a particularly strong advocate for using the power of computers to build models from data: examples include the Geographical Analysis Machine (GAM) for spatial clustering of point data, and automated systems for spatial interaction modeling. GAM uses a technique that generates local clusters or “hot spots” without requiring a priori theory or knowledge about the underlying statistical distribution. GAM searches for clusters by systematically expanding circular search regions from locations within a lattice. The system saves circles with observed counts greater than expected and then systematically varies the radii and lattice resolution to begin the search again. The researcher does not need to hypothesize or have any prior expectations regarding the spatial distribution of the phenomenon: the system searches, in a brute-force manner, all possible (or reasonable, at least) spatial resolutions and neighborhoods (Charlton 2008; Openshaw et al. 1987).

GAM is arguably an exploratory technique, while Openshaw’s automated system for exploring a universe of possible spatial interaction models leaps more into the traditional realm of deductive modeling. The automated system uses genetic programming to breed spatial interaction models from basic elements such as the model variables (e.g., origin outflow and destination inflow totals, travel cost, intervening opportunities), functional forms (e.g., square root, exponential), parameterizations, and binary operators (add, subtract, multiply and divide) using goodness-of-fit as a criterion (Diplock 1998; Openshaw 1988).

One challenge in theory-free modeling is that it takes away a powerful mechanism for improving the effectiveness of a search for an explanatory model, namely theory. Theory tells us where to look for explanation, and (perhaps more importantly) where not to look. In the specific case of spatial interaction modeling, for example, the need for models to be dimensionally consistent can limit the options, though the possibility of dimensional analysis (Gibbings 2011) was not employed in Openshaw’s work. The information space implied by a universe of potential models can be enormous even in a limited domain such as spatial interaction. Powerful computers and clever search techniques can certainly improve our chances (Gahegan 2000). But as the volume, variety, and velocity of data increase, the size of the information spaces for possible models also increases, leading to a type of arms race with perhaps no clear winner.

A second challenge in data-driven modeling is that the data drive the form of the model, meaning there is no guarantee that the same model will result from a different data set. Even given the same data set, many different models could be generated that fit the data, meaning that slight alterations in the goodness-of-fit criterion used to drive model selection can produce very different models (Fotheringham 1998). This is essentially the problem of statistical overfitting, a well-known problem with inductive techniques such as artificial neural networks and machine learning. However, despite methods and strategies to avoid overfitting, it appears to be endemic: some estimate that three-quarters of the published scientific papers in machine learning are flawed due to overfitting (The Economist 19 October 2013).

A third challenge in theory-free modeling is the complexity of resulting models. Traditional model building in science uses parsimony as a guiding
principle: the best model is the one that explains the most with the least. This is sometimes referred to as “Occam’s Razor”: given two models with equal validity, the simpler model is better. Model interpretation is an informal but key test: the model builder must be able to explain what the model results say about reality. Models derived computationally from data and fine-tuned based on feedback from predictions can generate reliable predictions from processes that are too complex for the human brain (Townsend 2013; Weinberger 2011). For example, Openshaw’s automated system for breeding spatial interaction models has been known to generate very complex, non-intuitive models (Fotheringham 1998), many of which are also dimensionally inconsistent. Figure 3 illustrates some of the spatial interaction models generated by Openshaw’s automated system; as can be seen, they defy easy comprehension.

Fig. 3 Three of the spatial interaction models generated by Openshaw’s automated modeling system (Openshaw 1988)

The knowledge from data-driven models can be complex and non-compressible: the data are the explanation. But if the explanation is not understandable, do we really have an explanation? Perhaps the nature of explanation is evolving. Perhaps computers are fundamental in data-driven science not only for discovering but also for representing complex patterns that are beyond human comprehension. Perhaps this is a temporary stopgap until we achieve convergence between human and machine intelligence as some predict (Kurzweil 1999). While we cannot hope to resolve this question (or its philosophical implications) within this paper, we can add a cautionary note from Nate Silver: telling stories about data instead of reality is dangerous and can lead to mistaking noise for signal (Silver 2012).

A final challenge in data-driven spatial modeling is de-skilling: a loss of modeling and analysis skills. While allocating mundane tasks to computers frees humans to perform sophisticated activities, there are times when mundane skills become crucial. For example, there are documented cases of airline pilots who, due to a lack of manual flying experience, reacted badly in emergencies when the autopilot shut off (Carr 2013). Although the consequences are rarely life-threatening, one could make a similar argument about automatic model building: if a data-driven modeling process generates anomalous results, will the analyst be able to determine if they are artifacts or genuine? With Openshaw’s automated spatial interaction modeling system, the analyst may become less skilled at spatial interaction modeling and more skilled at combinatorial optimization techniques. While these skills are valuable and may allow the analyst to reach greater scientific heights, they are another level removed from the empirical system being modeled. However, the more anomalous the results, the deeper the thinking required.

A solution to de-skilling is to force the skill: require it as part of education and certification, or design software that encourages or requires analysts to maintain some basic skills. However, this is a difficult case to make compared to the hypnotic call of sophisticated methods with user-friendly interfaces
(Carr 2013). Re-reading Jerry Dobson’s prescient essay on automated geography thirty years later (Dobson 1983), one is impressed by the number of activities in geography that used to be painstaking but are now push-button. Geographers of a certain age may recall courses in basic and production cartography without much nostalgia. What skills that we consider essential today will be considered the pen, ink, and lettering kits of tomorrow? What will we lose?

Conclusion

The context for geographic research has shifted from a data-scarce to a data-rich environment, in which the most fundamental changes are not the volume of data, but the variety and the velocity at which we can capture georeferenced data. A data-driven geography may be emerging in response to the wealth of georeferenced data flowing from sensors and people in the environment. Some of the issues raised by data-driven geography have in fact been longstanding issues in geographic research, namely, large data volumes, dealing with populations and messy data, and tensions between idiographic and nomothetic knowledge. However, the belief that spatial context matters is a major theme in geographic thought and a major motivation behind approaches such as time geography, disaggregate spatial statistics, and GIScience. There is potential to use Big Data to inform both geographic knowledge-discovery and spatial modeling. However, there are challenges, such as how to formalize geographic knowledge to clean data and to ignore spurious patterns, and how to build data-driven models that are both true and understandable.

Cautionary notes need to be sounded about the impact of data-driven geography on broader society (see Mayer-Schonberger and Cukier 2013). We must be cognizant of where this research is occurring: in the open light of scholarly research where peer review and reproducibility are possible, or behind the closed doors of private-sector companies and government agencies, as proprietary products without peer review and without full reproducibility. Privacy is a vital concern, not only as a human right but also as a potential source of backlash that will shut down data-driven research. We must be careful to avoid pre-crimes and pre-punishments (Zedner 2010): categorizing and reacting to people and places based on potentials derived from correlations rather than actual behavior. Finally, we must avoid a data dictatorship: data-driven research should support, not replace, decision-making by intelligent and skeptical humans. Some of the other papers in this special issue explore these challenges in depth.

References

Anderson, C. (2008). The end of theory: The data deluge makes the scientific method obsolete. Wired, 16(07).
Anselin, L. (1995). Local indicators of spatial association: LISA. Geographical Analysis, 27(2), 93–115.
Batty, M. (2012). Smart cities, big data. Environment and Planning B, 39(2), 191–193.
Butler, D. (2008). Web data predict flu. Nature, 456, 287–288.
Carr, N. (2013). The great forgetting. The Atlantic, pp. 77–81.
Cetin, N., Nagel, K., Raney, B., & Voellmy, A. (2002). Large-scale multi-agent transportation simulations. Computer Physics Communications, 147(1–2), 559–564.
Charlton, M. (2008). Geographical Analysis Machine (GAM). In K. Kemp (Ed.), Encyclopedia of Geographic Information Science (pp. 179–180). London: Sage.
Cresswell, T. (2013). Geographic thought: A critical introduction. New York: Wiley-Blackwell.
DeLyser, D., & Sui, D. (2013). Crossing the qualitative-quantitative divide II: Inventive approaches to big data, mobile methods, and rhythmanalysis. Progress in Human Geography, 37(2), 293–305.
Diplock, G. (1998). Building new spatial interaction models by using genetic programming and a supercomputer. Environment and Planning A, 30(10), 1893–1904.
Dobson, J. E. (1983). Automated geography. The Professional Geographer, 35, 135–143.
Dumbill, E. (2012). What is big data? An introduction to the big data landscape, http://strata.oreilly.com/2012/01/what-is-big-data.html. Last accessed 17 April 2014.
Flake, G. W. (1998). The computational beauty of nature: computer explorations of fractals, chaos, complex systems, and adaptation. Cambridge: MIT Press.
Fotheringham, A. S. (1998). Trends in quantitative methods II: Stressing the computational. Progress in Human Geography, 22(2), 283–292.
Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2002). Geographically weighted regression: The analysis of spatially varying relationships. Chichester: Wiley.
Gahegan, M. (2000). On the application of inductive machine learning tools to geographical analysis. Geographical Analysis, 32(1), 113–139.
Gahegan, M. (2009). Visual exploration and explanation in geography: Analysis with light. In H. J. Miller & J. Han (Eds.), Geographic data mining and knowledge discovery (2nd ed., pp. 291–324). London: Taylor and Francis.
Gibbings, J. C. (2011). Dimensional analysis. New York: Springer.
Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory. Chicago: Aldine.
Goffman, E. (1959). The presentation of self in everyday life. New York: Anchor Books.
Goodchild, M. F. (2004). GIScience, geography, form, and process. Annals of the Association of American Geographers, 94(4), 709–714.
Goodchild, M. F. (2007). Citizens as sensors: The world of volunteered geography. GeoJournal, 69(4), 211–221.
Goodchild, M. F., Egenhofer, M. J., Kemp, K. K., Mark, D. M., & Sheppard, E. (1999). Introduction to the Varenius project. International Journal of Geographical Information Science, 13(8), 731–745.
Goodchild, M. F., & Li, L. (2012). Assuring the quality of volunteered geographic information. Spatial Statistics, 1, 110–120. doi:10.1016/j.spasta.2012.03.002.
Graham, M., & Shelton, T. (2013). Geography and the future of big data, big data and the future of geography. Dialogues in Human Geography, 3(3), 255–261.
Guptill, S. C., & Morrison, J. L. (Eds.). (1995). Elements of spatial data quality. Oxford: Elsevier.
Haklay, M. (2010). How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets. Environment and Planning B: Planning and Design, 37(4), 682–703.
Hand, D. J. (1999). Discussion contribution on ‘data mining reconsidered: Encompassing and the general-to-specific approach to specification search’ by Hoover and Perez. Econometrics Journal, 2(2), 241–243.
Hartshorne, R. (1939). The nature of geography: A critical survey of current thought in the light of the past. Washington, DC: Association of American Geographers.
Hey, T., Tansley, S., & Tolle, K. (Eds.). (2009). The fourth paradigm: Data-intensive scientific discovery.
Hoover, K. D., & Perez, S. J. (1999). Data mining reconsidered: Encompassing and the general-to-specific approach to specification search. Econometrics Journal, 2(2), 167–191.
Kitchin, R. (2014). Big data and human geography: Opportunities, challenges and risks. Dialogues in Human Geography, 3(3), 262–267.
Kurzweil, R. (1999). The age of spiritual machines: when computers exceed human intelligence. New York: Vintage.
Mayer-Schonberger, V., & Cukier, K. (2013). Big Data: A revolution that will transform how we live, work, and think.
Merton, R. K. (1967). On sociological theories of the middle range. In R. K. Merton (Ed.), On theoretical sociology (pp. 39–72). New York: The Free Press.
Miller, H. J. (2007). Place-based versus people-based geographic information science. Geography Compass, 1(3), 503–535.
Miller, H. J. (2010). The data avalanche is here. Shouldn’t we be digging? Journal of Regional Science, 50(1), 181–201.
O’Leary, M. (2012). Eurovision statistics: post-semifinal update, Cold Hard Facts (May 23). Available: http://mewo2.com/nerdery/2012/05/23/eurovision-statistics-post-semifinal-update/. Accessed October 25, 2013.
Openshaw, S. (1988). Building an automated modeling system to explore a universe of spatial interaction models. Geographical Analysis, 20(1), 31–46.
Openshaw, S., Charlton, M., Wymer, C., & Craft, A. (1987). A Mark I geographical analysis machine for the automated analysis of point data sets. International Journal of Geographical Information Systems, 1(4), 335–358.
Openshaw, S., & Taylor, P. J. (1979). A million or so correlation coefficients: three experiments on the modifiable areal unit problem. In N. Wrigley (Ed.), Statistical methods in the social sciences (pp. 127–144). London: Pion.
Preis, T., Moat, H. S., & Stanley, H. E. (2013). Quantifying trading behavior in financial markets using Google Trends. Scientific Reports, 3 (1684). doi:10.1038/srep01684.
Raymond, E. S. (2001). The cathedral and the bazaar: Musings on linux and open source by an accidental revolutionary. Sebastopol: O’Reilly Media.
Schuurman, N. (2000). Trouble in the heartland: GIS and its critics in the 1990s. Progress in Human Geography, 24(4), 569–589.
Silver, N. (2012). The signal and the noise: Why most predictions fail – but some don’t.
Smith, N. (1992). History and philosophy of geography: Real wars, theory wars. Progress in Human Geography, 16(2), 257–271.
Sui, D. (2004). GIS, cartography, and the “Third Culture”: Geographic imaginations in the computer age. Professional Geographer, 56(1), 62–72.
Sui, D., & DeLyser, D. (2012). Crossing the qualitative-quantitative chasm I: Hybrid geographies, the spatial turn, and volunteered geographic information (VGI). Progress in Human Geography, 36(1), 111–124.
Sui, D., & Goodchild, M. F. (2011). The convergence of GIS and social media: Challenges for GIScience. International Journal of Geographical Information Science, 25(11), 1737–1748.
Sui, D., Goodchild, M. F., & Elwood, S. (2013). Volunteered geographic information, the exaflood, and the growing digital divide. In D. Sui, S. Elwood, & M. F. Goodchild (Eds.), Crowdsourcing geographic knowledge (pp. 1–12). New York: Springer.
Taleb, N. N. (2007). The black swan: The impact of the highly improbable. New York: Random House.
The Economist. (19 October 2013). Trouble at the lab, pp. 26–30.
Townsend, A. (2013). Smart cities: Big data, civic hackers, and the quest for a new utopia. New York: Norton.
Tsou, M. H., Yang, J. A., Lusher, D., Han, S., Spitzberg, B., Gawron, J. M., et al. (2013). Mapping social activities and concepts with social media (Twitter) and web search engines (Yahoo and Bing): a case study in 2012 US Presidential Election. Cartography and Geographic Information Science, 40(4), 337–348.
Waldrop, M. M. (1990). Learning to drink from a fire hose. Science, 248(4956), 674–675.
Warntz, W. (1989). Newton, the Newtonians, and the Geographia Generalis Varenii. Annals of the Association of American Geographers, 79(2), 165–191.
Watts, D. J. (2011). Everything is Obvious – Once You Know the Answer. United States of America: Crown Business.
Weinberger, D. (2011). The machine that would predict the future, Scientific American, November 15, 2011. http://www.scientificamerican.com/article.cfm?id=the-machine-that-would-predict.
Zedner, L. (2010). Pre-crime and pre-punishment: a health warning. Criminal Justice Matters, 81(1), 24–25.

No ratings yet
Seeing Cities Through Big Data - Research, Methods and Applications in Urban Informatics
553 pages
Geospatial Big Data - Challenges and Opportunities Big Data Research Lee Kang 2015
No ratings yet
Geospatial Big Data - Challenges and Opportunities Big Data Research Lee Kang 2015
8 pages
BATTY Reflections and Speculations On The Progress in Geographic Information Systems GIS A Geographic Perspective
No ratings yet
BATTY Reflections and Speculations On The Progress in Geographic Information Systems GIS A Geographic Perspective
23 pages
Laws of Geography: February 2018
No ratings yet
Laws of Geography: February 2018
25 pages
Spatial Machine Learning: New Opportunities For Regional Science
No ratings yet
Spatial Machine Learning: New Opportunities For Regional Science
43 pages
Wu Liu Hu2022IJGIGeo InformationTechnologyandItsApplications Reprint
No ratings yet
Wu Liu Hu2022IJGIGeo InformationTechnologyandItsApplications Reprint
316 pages
Download
No ratings yet
Download
27 pages
Dendrogram Clustering For 3D Data Analytics in Smart City
No ratings yet
Dendrogram Clustering For 3D Data Analytics in Smart City
7 pages
Handbook of Big Geospatial Data 1st Edition Martin Werner Download
No ratings yet
Handbook of Big Geospatial Data 1st Edition Martin Werner Download
113 pages
Review
No ratings yet
Review
62 pages
2021 OxfordBibliographies GeoAI
No ratings yet
2021 OxfordBibliographies GeoAI
17 pages
Geography
No ratings yet
Geography
3 pages
Scientific Knowledge and Philosophic Thought
No ratings yet
Scientific Knowledge and Philosophic Thought
132 pages
Task 02: Example of Analysing Data and Residual Volatility and Estimating ARCH and GARCH Models
No ratings yet
Task 02: Example of Analysing Data and Residual Volatility and Estimating ARCH and GARCH Models
12 pages
350 BC METAPHYSICS by Aristotle Translated by W. D. Ross
No ratings yet
350 BC METAPHYSICS by Aristotle Translated by W. D. Ross
1 page
Notes & Highlights - How To Read A Book
No ratings yet
Notes & Highlights - How To Read A Book
6 pages
Inductive LearningMathTAP
No ratings yet
Inductive LearningMathTAP
15 pages
Crack CCE 19 FIA CAA Screening
No ratings yet
Crack CCE 19 FIA CAA Screening
88 pages
Logic Philosophy and Human Existence I
100% (1)
Logic Philosophy and Human Existence I
37 pages
AGI-25 Paper 183
No ratings yet
AGI-25 Paper 183
10 pages
Game Research Methodshods Lankoski Bjork Etal Web
100% (2)
Game Research Methodshods Lankoski Bjork Etal Web
373 pages
Bismillah Benar
No ratings yet
Bismillah Benar
14 pages
MLR Insights for Data Analysts
No ratings yet
MLR Insights for Data Analysts
34 pages
Philosophical Method of Inquiry
100% (1)
Philosophical Method of Inquiry
7 pages
Topic 6 Heteroscedasticity
No ratings yet
Topic 6 Heteroscedasticity
15 pages
Syllabus B.A. Philosophy
No ratings yet
Syllabus B.A. Philosophy
10 pages
Math Reasoning & Problem Solving
No ratings yet
Math Reasoning & Problem Solving
5 pages
Econometrics Eviews 4
No ratings yet
Econometrics Eviews 4
14 pages
Hypothesis Testing Spinning The Wheel
No ratings yet
Hypothesis Testing Spinning The Wheel
1 page
Example of Hypothesis
No ratings yet
Example of Hypothesis
1 page
Psychological Statistics Exam
No ratings yet
Psychological Statistics Exam
3 pages
Estimating R 2 Shrinkage in Regression
No ratings yet
Estimating R 2 Shrinkage in Regression
6 pages
Assignment Z Test
0% (1)
Assignment Z Test
2 pages
Nolt, John - Possible Worlds and Imagination in Informal Logic
No ratings yet
Nolt, John - Possible Worlds and Imagination in Informal Logic
4 pages
Excercise-A Independent Sample Case - Parametric Approach Q-1A)
No ratings yet
Excercise-A Independent Sample Case - Parametric Approach Q-1A)
11 pages
Kelly Addo 2023 Ghana S Disaster Management Role and Application of Information Technology
No ratings yet
Kelly Addo 2023 Ghana S Disaster Management Role and Application of Information Technology
30 pages
Unit Ii TFN
No ratings yet
Unit Ii TFN
14 pages
Forecasting Math
No ratings yet
Forecasting Math
40 pages
Chapter 8 Hypothesis Testing
No ratings yet
Chapter 8 Hypothesis Testing
34 pages
Civil Engineering Program Guide
No ratings yet
Civil Engineering Program Guide
16 pages
Sugar Concerns in Coffee Chains
100% (1)
Sugar Concerns in Coffee Chains
6 pages
(Economics Collection) Foldvary, Fred E - The Foundations of Economic Theory-Business Expert Press (2015)
No ratings yet
(Economics Collection) Foldvary, Fred E - The Foundations of Economic Theory-Business Expert Press (2015)
110 pages




