Caglar Koylu

University of Iowa, Geographical and Sustainability Sciences, Faculty Member

Papers by Caglar Koylu

Extracting and Visualizing Geo-Social Semantics from the User Mention Network on Twitter

Proceedings of GIScience 2016 Workshop on Rethinking the ABCs: Agent-Based Models and Complexity Science in the Age of Big Data, CyberGIS, and Sensor Networks, Montreal, Canada, September 27, 2016

This paper introduces an approach for extracting and visualizing geo-social semantics of user mentions on Twitter. The approach consists of three steps. First, data filtering and processing are performed to construct a directed area-to-area mention network in which tweets are aggregated into flow bins between geographic areas. Second, using flow bins as documents, a probabilistic topic model is employed to detect a collection of topics and to classify each area-to-area flow into a mixture of topics with differing probabilities. Third, for each topic, a modularity graph of mentions is obtained and visualized using a flow map and a topic cloud to infer semantics from the set of frequently co-occurring words for each topic. To demonstrate the approach, a dataset of 19 million geo-tagged mentions during the primary elections (Feb-Jun, 2016) in the US was analyzed. The results highlight changing patterns of symmetry, distance and clustering of flows by the topic of content.
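
The first step described above, aggregating mention tweets into directed area-to-area flow bins that later serve as documents for a topic model, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_flow_bins`, the tuple layout of a mention, and the sample data are all assumptions made for the example.

```python
from collections import defaultdict

def build_flow_bins(mentions):
    """Group tweet texts by directed (origin_area, destination_area) pair.

    `mentions` is an iterable of (origin_area, destination_area, text)
    tuples; this layout is hypothetical and chosen only for the sketch.
    """
    bins = defaultdict(list)
    for origin, destination, text in mentions:
        bins[(origin, destination)].append(text)
    # Each flow bin is concatenated into a single "document".
    return {pair: " ".join(texts) for pair, texts in bins.items()}

# Made-up sample mentions between two areas.
mentions = [
    ("IA", "IL", "caucus tonight"),
    ("IA", "IL", "see you at the rally"),
    ("IL", "IA", "good luck"),
]
flow_documents = build_flow_bins(mentions)
```

Because the network is directed, the IA-to-IL bin and the IL-to-IA bin are kept as separate documents, so a topic model fitted on them could assign different topic mixtures to each direction.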

CarSenToGram: Geovisual text analytics for exploring spatio-temporal variation in public discourse on Twitter

Assessing the impact of events on the evolution of online public discourse is challenging due to the lack of data prior to the event and of appropriate methodologies for capturing the progression of the tenor of public discourse in terms of both tone and topic. In this article, we introduce a geovisual analytics framework, CarSenToGram, which integrates topic modeling and sentiment analysis with cartograms to identify the changing dynamics of public discourse on a particular topic across space and time. The main novelty of CarSenToGram is coupling comprehensible spatio-temporal overviews of the overall distribution and of topical and sentiment patterns with increasing levels of information supported by zoom and filter, and details-on-demand interactions. To demonstrate the utility of CarSenToGram, we analyze tweets related to immigration in the month before and the month after the January 27, 2017 travel ban in order to reveal insights into one of the defining moments of President Trump's first year in office. Not only do we find that the travel ban influenced online public discourse and sentiment on immigration, but it also highlighted important partisan divisions within the United States.

Discovering Multi-Scale Community Structures from the Interpersonal Communication Network on Twitter

Despite the controversies of privacy and ethics, spatially-embedded communication data from widespread and emerging online social networks provide an unprecedented opportunity to study human interactions at the global scale. Detecting communities of individuals who live close by and communicate strongly with each other is critical for a variety of application areas such as managing disaster response, controlling disease spread, and developing sustainable urban spaces and infrastructure. The ease of long-distance travel and communication has generated a highly complex network of human interactions, in which long-distance and short-distance ties coexist at multiple scales. Also, there is a hierarchical spatial organization in human interaction networks that reflects historic and socio-political borders. Patterns of human connectivity cross these historic and socio-political borders at multiple geographic scales. Therefore, a comprehensive understanding of human interactions necessitates analysis methods that take into account the complexity introduced by the multi-scale nature of human connectivity. This paper employs a spatially-constrained hierarchical regionalization algorithm to reveal multi-scale community structures in the interpersonal communication network on Twitter. The interpersonal communication network was constructed using a year of reciprocal and geo-located mention tweets in the U.S. between August 2015 and August 2016. The results strikingly showed nested borders of cohesive regions at multiple scales, which are inherent to human communication patterns in the regional hierarchy of the U.S. Unsurprisingly, people communicated with others who live nearby, and multi-scale regions overlap with administrative boundaries of the states, cultural and dialectal regions, and topographical features. Furthermore, visualization of interregional communication patterns revealed a variety of spatial connectivity patterns such as poly-centricity, hierarchies, and spanning trees. Discovery of such patterns is essential for understanding the complex social system that is influenced by long-distance ties.

Design and evaluation of line symbolizations for origin–destination flow maps

We present the results of a user study comparing variants of commonly used line symbolizations for directed origin–destination flow maps. Our design and evaluation consisted of five line symbolizations that employ a combination of the following visual variables: arrowheads, origin–destination coloring (color hue and value), line shortening, line width, tapered edges (varying width from wide to narrow, and narrow to wide), and curvature asymmetry and strength. To guide our evaluation, we used a task-by-type typology and chose four representative tasks that are commonly used in flow map reading: identifying the dominant direction of flows, flows with the highest magnitude (volume), spatial focusing of long flows toward a destination, and clusters of high net-exports (net-outflow). We systematically analyzed user responses and task performance, which we measured by task completion time and accuracy. We designed a web-based flow mapping and testing framework and recruited the participants from Amazon Mechanical Turk. To demonstrate the application and user experiment, we used 16 commodity flow data sets in the United States from 2007 and systematically rotated the layouts to evaluate the effect of layout orientation. From this study, we can conclude that all five symbolizations we tested are potentially useful; however, the influence of the design on performance and perception depends on the type of task. Also, we found that data and layout orientation have significant effects on performance and perception of patterns in flow maps, which we attribute to the change in visual saliency of node and flow patterns in relation to the way users scan the map. We recommend that the choice of line symbolization be guided by a taxonomy of the tasks that end users are expected to perform. We discuss various design trade-offs and recommendations and potential future work for designing and evaluating line symbolizations for flow mapping.

Analysis of big patient mobility data for identifying medical regions, spatio-temporal characteristics and care demands of patients on the move

Background: Patient mobility can be defined as a patient's movement or utilization of a health care service located in a place or region other than the patient's place of residence. Mobility provides freedom to patients to obtain health care from providers across regions and even countries. It is essential to monitor patient choices in order to maintain the quality standards and responsiveness of the health system; otherwise, the health system may suffer from geographic disparities in the accessibility to quality and responsive health care. In this article, we study patient mobility in a national health care system to identify medical regions, spatio-temporal and service characteristics of health care utilization, and demands for patient mobility. Methods: We conducted a systematic analysis of province-to-province patient mobility in Turkey from December 2009 to December 2013, which was derived from 1.2 billion health service records. We first used a flow-based regionalization method to discover functional medical regions from the patient mobility network. We compared the data-driven regions to the regions designated by the government in order to identify the areas of mismatch between planned regional service delivery and the observed utilization in the form of patient flows. Second, we used feature selection and multivariate flow clustering to identify spatio-temporal characteristics and health care needs of patients on the move. Results: Medical regions we derived by analyzing the patient mobility data showed strong overlap with the designated regions of the Ministry of Health. We also identified a number of regions in which the regional service utilization did not match the planned service delivery. Overall, our spatio-temporal and multivariate analysis of regional and long-distance patient flows revealed a strong relationship with the socio-demographic and cultural structure of the society and migration patterns. Also, patient flows exhibited seasonal patterns and yearly trends which correlate with implemented policies throughout the period. We found that policies resulted in different outcomes across the country. We also identified characteristics of long-distance flows which could help inform policy-making by assessing the needs of patients in terms of medical specialization, service level and type. Conclusions: Our approach helped identify (1) the mismatch between regional policy and practice in health care utilization and (2) spatial, temporal, and health service level characteristics and medical specialties that patients seek out by traveling longer distances. Our findings can help identify the imbalance between supply and demand and changes in mobility behaviors, and inform policy-making with insights.

Uncovering geo-social semantics from the Twitter mention network: An integrated approach using spatial network smoothing and topic modeling

Advances in human dynamics research and availability of geo-referenced communication data provide an unprecedented opportunity for studying the semantics of communication and understanding the interplay between online social networks and geography. Among the most extensively studied topics in geographically-embedded communication networks are the effect of geographic proximity on interpersonal communication; the influence of information diffusion and social networks on real-world geographic events such as group activities and demonstrations; and the structural and geographic characteristics of a communication network. However, little is known about how the content of interpersonal communication varies across geographic space. By integrating methods of spatial network smoothing and probabilistic topic modeling, this paper introduces an approach to extracting and visualizing geo-social semantics, i.e., how the semantics of information vary based on the geographic locations and communication ties among the users. Different from previous work that examines the geographic variation in the content produced by individuals, the proposed approach focuses on an analysis of reciprocal conversations among individuals in a geographically-embedded communication network. To demonstrate the approach, geo-located mention tweets in the U.S. from Aug. 1, 2015 to Aug. 1, 2016 were analyzed. Topics extracted from the analysis reflect geo-social dynamics of the society, ways of speaking in the context of friendship, linguistic variation and the use of social media acronyms. Although the tweets were collected during primary and presidential elections, political topics discovered from the reciprocal mentions focused more on civil rights than on the candidates and primaries. While the topic of primary candidates and elections was prominent at locations of primary elections and core supporters of candidates, civil rights was a prominent topic across the whole country.

Understanding Geo-Social Network Patterns: Computation, Visualization, and Usability

Mapping family connectedness across space and time

by Caglar Koylu and Alice Kasakoff

Understanding the structure and evolution of family networks embedded in space and time is crucial for various fields such as disaster evacuation planning and provision of care to the elderly. Computation and visualization can potentially play a key role in analyzing and understanding such networks. Graph visualization methods are effective in discovering network patterns; however, they are limited in discovering spatial and temporal patterns of connections in a network, especially when the network exists and changes across space and time. We introduce a measure of family connectedness that summarizes the dynamic relationships in a family network by taking into account the distance (how far individuals live apart), time (the duration of individuals’ coexistence within a neighborhood), and the relationship (kinship or kin proximity) between each pair of individuals. By mapping the family connectedness over a series of time intervals, the method facilitates the discovery of hot spots (hubs) where family connectedness is strong and the changing patterns of such spots across space and time. We demonstrate our approach using a data set of nine families from the US North. Our results highlight that family connectedness reflects changing demographic processes such as migration and population growth.

Smoothing locational measures in spatial interaction networks

Spatial interactions such as migration and airline transportation naturally form a location-to-location network (graph) in which a node represents a location (or an area) and a link represents an interaction (flow) between two locations. Locational measures, such as net-flow, centrality, and entropy, are often derived to understand the structural characteristics and the roles of locations in spatial interaction networks. However, due to the small-area problem and the dramatic difference in location sizes (such as population), derived locational measures often exhibit spurious variations, which may conceal the underlying spatial and network structures. This paper introduces a new approach to smoothing locational measures in spatial interaction networks. Different from conventional spatial kernel methods, the new method first smooths the flows to/from each neighborhood and then calculates its network measure with the smoothed flows. We use county-to-county migration data in the US to demonstrate and evaluate the new smoothing approach. With smoothed net migration rate and entropy measure for each county, we can discover natural regions of attraction (or depletion) and other structural characteristics that the original (unsmoothed) measures fail to reveal. Furthermore, with the new approach, one can also smooth spatial interactions within sub-populations (e.g., different age groups), which are often so sparse that meaningful measures cannot be derived from them unless they are properly smoothed.
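
The idea of aggregating flows over a neighborhood before computing a locational measure can be illustrated with a small sketch. It is a simplified reading of the abstract, not the paper's method: the function name `neighborhood_net_rate`, the uniform (unweighted) neighborhood, the exclusion of flows internal to the neighborhood, and the toy data are all assumptions.

```python
def neighborhood_net_rate(flows, population, neighborhood):
    """Net flow rate of a neighborhood: (inflow - outflow) / population.

    `flows` maps (origin, destination) to a flow volume. Flows that both
    start and end inside the neighborhood are ignored, which is an
    assumption of this sketch rather than a documented rule.
    """
    members = set(neighborhood)
    inflow = sum(v for (o, d), v in flows.items()
                 if d in members and o not in members)
    outflow = sum(v for (o, d), v in flows.items()
                  if o in members and d not in members)
    total_population = sum(population[m] for m in members)
    return (inflow - outflow) / total_population

# Toy data: three locations with made-up flows and populations.
flows = {
    ("A", "B"): 10, ("B", "A"): 4,
    ("A", "C"): 6,  ("C", "A"): 2,
    ("B", "C"): 5,  ("C", "B"): 1,
}
population = {"A": 100, "B": 50, "C": 50}

# Unsmoothed measure: the neighborhood is the location itself.
raw_rate_A = neighborhood_net_rate(flows, population, ["A"])
# Smoothed measure: location A together with a hypothetical neighbor B.
smoothed_rate_A = neighborhood_net_rate(flows, population, ["A", "B"])
```

In this toy case the raw rate for A is -0.1, while pooling A with B gives about -0.053, which shows how a larger base population dampens the variation of a single small area.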

Articles by Caglar Koylu

Identifying disaster-related tweets and their semantic, spatial and temporal context using deep learning, natural language processing and spatial analysis: a case study of Hurricane Irma

Density-based multi-scale flow mapping and generalization

Mapping large volumes of origin-destination flow data (or spatial interactions) has long been a challenging problem because of the conflict between massive location-to-location connections and the limited map space. Current approaches for flow mapping only work with a small dataset or have to use data aggregation, which not only causes a significant loss of information but may also produce misleading maps. In this paper, we present a density-based flow map generalization approach that can extract flow patterns and facilitate the analysis and visualization of big origin-destination flow data at multiple scales. Unlike existing methods that generalize flow data by spatial unit-based aggregation, our new flow map generalization algorithm is based on flow density distribution. To demonstrate the approach and assess its effectiveness, a case study is carried out to map 829,039 taxi trips within New York City. With parameter settings, the proposed method can discover inherent and abstract flow patterns at different map scales and generalization levels, which naturally supports interactive and multi-scale flow mapping.

A web-based geovisual analytics platform for identifying potential contributors to culvert sedimentation

Sediment accumulation at culverts involves large-scale and interlinked environmental processes that are difficult to address with experimental or physical modeling methods. This article presents an alternative data-driven investigation that sheds light on these processes. Accordingly, a web-based geovisual analytics application, the IowaDOT platform, was developed, which allows users to explore the complex processes associated with sediment deposition at culverts. The platform provides systematic procedures for (1) collecting and integrating analytical variables into a single dataset, (2) quantifying the degree of culvert sedimentation using time series of aerial images, (3) identifying drivers that contribute to culvert sedimentation processes from a variety of culvert structural and upstream landscape characteristics using a tree-based feature selection algorithm, and (4) facilitating the understanding of complex spatial and relational patterns of culvert sedimentation processes using multivariate geovisualizations supported by a self-organizing map (SOM). As the outcomes of this study, these patterns identify culvert sedimentation-prone regions in Iowa and quantify empirical relationships between the drivers and culvert sedimentation degrees. A simple evaluation of the platform was performed to assess the usefulness of the tool and the satisfaction of professional users, and positive feedback was received.

Conference Presentations by Caglar Koylu

Mapping Temporal Trends of Parent-Child Migration from Population-Scale Family Trees

AutoCarto 2020 International Research Symposium on Cartography and Geographic Information Science, 2020

User-generated family trees are invaluable for constructing population-scale family networks and studying population dynamics over many generations and far into the past. Family trees contain information on individuals such as birth and death places and years, and kinship ties, e.g., parent-child, spouse, and sibling relationships. Such information about individuals in family trees makes it possible to extract migration networks over time. Despite recent advances, existing spatial and temporal abstraction techniques for time-variant flow data have limitations due to the lack of knowledge on the effect of temporal partitioning on flow patterns. In this study, we extracted state-to-state migration patterns over a period of 150 years between 1776 and 1926 from a cleaned, geocoded, and connected set of family trees from Rootsweb.com. We used birthplaces and birth years of parents and children to extract intergenerational migration flows between states. To reveal the temporal trends of migration patterns, we evaluated three temporal partitioning strategies: (1) predefined periods in American history, (2) overlapping time periods with fixed length, and (3) time periods with variable length, which have approximately equal volume of moves per time period. To account for the effect of geographic proximity and flow volumes in migration flows, we transformed the raw flows into modularity flows using a doubly constrained gravity model. Our preliminary results revealed longitudinal population mobility in the U.S. on such a large spatial and temporal scale.
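
The third partitioning strategy above, variable-length periods holding approximately equal volumes of moves, can be sketched with a simple quantile-style split. This is an illustrative assumption about how such a partition could be computed, not the procedure used in the study; the function name `equal_volume_periods` and the sample years are hypothetical.

```python
def equal_volume_periods(move_years, n_periods):
    """Split moves into periods holding roughly equal numbers of moves.

    Returns a list of (first_year, last_year, move_count) tuples. Moves
    that share a year can land on either side of a cut, so adjacent
    periods may share a boundary year in this simple sketch.
    """
    ordered = sorted(move_years)
    size = len(ordered) / n_periods
    periods = []
    for k in range(n_periods):
        chunk = ordered[round(k * size):round((k + 1) * size)]
        if chunk:
            periods.append((chunk[0], chunk[-1], len(chunk)))
    return periods

# Made-up move years: dense early and late, sparse in the middle.
move_years = [1780, 1790, 1800, 1850, 1860, 1870, 1880, 1900, 1910]
periods = equal_volume_periods(move_years, 3)
```

Each resulting period holds three moves, but the periods span different numbers of years, which is the trade-off this strategy makes against fixed-length periods.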

Extracting and Visualizing Geo-Social Semantics from the User Mention Network on Twitter

Proceedings of GIScience 2016 Workshop on Rethinking the ABCs: Agent-Based Models and Complexity Science in the age of Big Data, CyberGIS, and Sensor Networks, Montreal, Canada, September 27, 2016., 2016

This paper introduces an approach for extracting and visualizing geo-social semantics of user men... more This paper introduces an approach for extracting and visualizing geo-social semantics of user mentions on Twitter. The approach consists of three steps. First, data filtering and processing is performed to construct a directed area-to-area mention network in which tweets are aggregated into flow bins between geographic areas. Second, using flow bins as documents, a probabilistic topic model is employed to detect a collection of topics, and classify each area-to-area flow into a mixture of topics with differing probabilities. Third, for each topic, a modularity graph of mentions is obtained and visualized using a flow map and a topic cloud to infer semantics from the set of frequently co-occurring words for each topic. To demonstrate, a dataset of 19 million geo-tagged mentions during the primary elections (Feb-Jun, 2016) in the US were analyzed. The results highlight changing patterns of symmetry, distance and clustering of flows by the topic of content.

CarSenToGram: Geovisual text analytics for exploring spatio- temporal variation in public discourse on Twitter

Assessing the impact of events on the evolution of online public discourse is challenging due to ... more Assessing the impact of events on the evolution of online public discourse is challenging due to the lack of data prior to the event and appropriate methodologies for capturing the progression of tenor of public discourse, both in terms of their tone and topic. In this article, we introduce a geovisual analytics framework, CarSenToGram, which integrates topic modeling and sentiment analysis with cartograms to identify the changing dynamics of public discourse on a particular topic across space and time. The main novelty of CarSenToGram is coupling comprehensible spatio-temporal overviews of the overall distribution, topical and sentiment patterns with increasing levels of information supported by zoom and filter, and details-on-demand interactions. To demonstrate the utility of CarSenToGram, in this article, we analyze tweets related to immigration the month before and after the January 27, 2017 travel ban in order to reveal insights into one of the defining moments of President Trump's first year in office. Not only do we find that the travel ban influenced online public discourse and sentiment on immigration, but it also highlighted important partisan divisions within the United States.

Discovering Multi-Scale Community Structures from the Interpersonal Communication Network on Twitter

Despite the controversies of privacy and ethics, spatially-embedded communication data from wides... more Despite the controversies of privacy and ethics, spatially-embedded communication data from widespread and emerging online social networks provide an unprecedented opportunity to study human interactions at the global scale. Detecting communities of individuals who live close by and have strong communication among each other is critical for a variety of application areas such as managing disaster response, controlling disease spread, and developing sustainable urban spaces and infrastructure. The ease of long-distance travel and communication have generated a highly complex network of human interactions, in which long-distance and short-distance ties coexist in multiple scales. Also, there is a hierarchical spatial organization in human interaction networks which reflect historic and socio-political borders. Patterns of human connectivity cross these historic and socio-political borders at multiple geographic scales. Therefore, a comprehensive understanding of human interactions necessitates analysis methods to take into account the complexity introduced by the multi-scale nature of human connectivity. This paper employs a spatially-constrained hierarchical regionalization algorithm to reveal multi-scale community structures in the interpersonal communication network on Twitter. The interpersonal communication network was constructed using a year of reciprocal and geo-located mention tweets in the U.S. between August 2015 and 2016. The results strikingly showed nested borders of cohesive regions at multiple scales, which are inherent to human communication patterns in the regional hierarchy of the U.S. Unsurprisingly, people communicated with others that live nearby, and multi-scale regions overlap with administrative boundaries of the states, cultural and dialectal regions, and topographical features. 
Furthermore, visualization of interregional communication patterns revealed a variety of spatial connectivity patterns such as poly-centricity, hierarchies, and spanning trees. Discovery of such patterns is essential for understanding of the complex social system that is influenced by long-distance ties.

Design and evaluation of line symbolizations for origin–destination flow maps

We present the results of a user study comparing variants of commonly used line symbolizations fo... more We present the results of a user study comparing variants of commonly used line symbolizations for directed origin–destination flow maps. Our design and evaluation consisted of five line symbolizations that employ a combination of following visual variables: arrowheads, origin–destination coloring (color hue, and value), line shortening, line width, tapered edges (varying width from wide to narrow, and narrow to wide), and curvature asymmetry and strength. To guide our evaluation, we used a task-by-type typology and chose four representative tasks that are commonly used in flow map reading: identifying dominant direction of flows, flows with the highest magnitude (volume), spatial focusing of long flows toward a destination, and clusters of high net-exports (net-outflow). We systematically analyzed user responses and task performance which we measured by task completion time and accuracy. We designed a web-based flow mapping and testing framework and recruited the participants from Amazon Mechanical Turk. To demonstrate the application and user experiment , we used 16 commodity flow data sets in the United States from 2007 and systematically rotated the layouts to evaluate the effect of layout orientation. From this study, we can conclude that there is potential usefulness for all of the five symbolizations we tested; however, the influence of the design on performance and perception depends on the type of the task. Also, we found that data and layout orientation have significant effects on performance and perception of patterns in flow maps which we attribute to the change in visual saliency of node and flow patterns in relation to the way users scan the map. We recommend that the choice of line symbolization should be guided by a task taxonomy which end users are expected to perform. 
We discuss various design trade-offs and recommendations and potential future work for designing and evaluating line symbolizations for flow mapping.

Analysis of big patient mobility data for identifying medical regions, spatio-temporal characteristics and care demands of patients on the move

Background: Patient mobility can be defined as a patient's movement or utilization of a health ca... more Background: Patient mobility can be defined as a patient's movement or utilization of a health care service located in a place or region other than the patient's place of residence. Mobility provides freedom to patients to obtain health care from providers across regions and even countries. It is essential to monitor patient choices in order to maintain the quality standards and responsiveness of the health system, otherwise, the health system may suffer from geographic disparities in the accessibility to quality and responsive health care. In this article, we study patient mobility in a national health care system to identify medical regions, spatio-temporal and service characteristics of health care utilization, and demands for patient mobility. Methods: We conducted a systematic analysis of province-to-province patient mobility in Turkey from December 2009 to December 2013, which was derived from 1.2 billion health service records. We first used a flow-based region-alization method to discover functional medical regions from the patient mobility network. We compare the results of data-driven regions to designated regions of the government in order to identify the areas of mismatch between planned regional service delivery and the observed utilization in the form of patient flows. Second, we used feature selection, and multivariate flow clustering to identify spatio-temporal characteristics and health care needs of patients on the move. Results: Medical regions we derived by analyzing the patient mobility data showed strong overlap with the designated regions of the Ministry of Health. We also identified a number of regions that the regional service utilization did not match the planned service delivery. 
Overall, our spatio-temporal and multivariate analysis of regional and long-distance patient flows revealed strong relationship with socio-demographic and cultural structure of the society and migration patterns. Also, patient flows exhibited seasonal patterns, and yearly trends which correlate with implemented policies throughout the period. We found that policies resulted in different outcomes across the country. We also identified characteristics of long-distance flows which could help inform policy-making by assessing the needs of patients in terms of medical specialization, service level and type. Conclusions: Our approach helped identify (1) the mismatch between regional policy and practice in health care utilization (2) spatial, temporal, health service level characteristics and medical specialties that patients seek out by traveling longer distances. Our findings can help identify the imbalance between supply and demand, changes in mobility behaviors, and inform policy-making with insights.

Uncovering geo-social semantics from the Twitter mention network: An integrated approach using spatial network smoothing and topic modeling

Advances in human dynamics research and the availability of geo-referenced communication data provide an unprecedented opportunity for studying the semantics of communication and understanding the interplay between online social networks and geography. Among the most extensively studied topics in geographically embedded communication networks are the effect of geographic proximity on interpersonal communication; the influence of information diffusion and social networks on real-world geographic events such as group activities and demonstrations; and the structural and geographic characteristics of a communication network. However, little is known about how the content of interpersonal communication varies across geographic space. By integrating methods of spatial network smoothing and probabilistic topic modeling, this paper introduces an approach to extracting and visualizing geo-social semantics, i.e., how the semantics of information vary based on the geographic locations and communication ties among the users. Unlike previous work that examines the geographic variation in the content produced by individuals, the proposed approach focuses on an analysis of reciprocal conversations among individuals in a geographically embedded communication network. To demonstrate the approach, geo-located mention tweets in the U.S. from Aug. 1, 2015 to Aug. 1, 2016 were analyzed. Topics extracted from the analysis reflect geo-social dynamics of the society, ways of speaking in the context of friendship, linguistic variation and the use of social media acronyms. Although the tweets were collected during primary and presidential elections, political topics discovered from the reciprocal mentions focused more on civil rights than on the candidates and primaries. While the topic of primary candidates and elections was prominent at locations of primary elections and among core supporters of candidates, civil rights was a prominent topic across the whole country.

Understanding Geo-Social Network Patterns: Computation, Visualization, and Usability

Mapping family connectedness across space and time

by Caglar Koylu and Alice Kasakoff

Understanding the structure and evolution of family networks embedded in space and time is crucial for various fields such as disaster evacuation planning and provision of care to the elderly. Computation and visualization can potentially play a key role in analyzing and understanding such networks. Graph visualization methods are effective in discovering network patterns; however, they are poorly suited to discovering spatial and temporal patterns of connections, especially when the network exists and changes across space and time. We introduce a measure of family connectedness that summarizes the dynamic relationships in a family network by taking into account the distance (how far individuals live apart), time (the duration of individuals’ coexistence within a neighborhood), and the relationship (kinship or kin proximity) between each pair of individuals. By mapping the family connectedness over a series of time intervals, the method facilitates the discovery of hot spots (hubs) where family connectedness is strong and the changing patterns of such spots across space and time. We demonstrate our approach using a data set of nine families from the US North. Our results highlight that family connectedness reflects changing demographic processes such as migration and population growth.

Smoothing locational measures in spatial interaction networks

Spatial interactions such as migration and airline transportation naturally form a location-to-location network (graph) in which a node represents a location (or an area) and a link represents an interaction (flow) between two locations. Locational measures, such as net-flow, centrality, and entropy, are often derived to understand the structural characteristics and the roles of locations in spatial interaction networks. However, due to the small-area problem and the dramatic difference in location sizes (such as population), derived locational measures often exhibit spurious variations, which may conceal the underlying spatial and network structures. This paper introduces a new approach to smoothing locational measures in spatial interaction networks. Unlike conventional spatial kernel methods, the new method first smooths the flows to/from each neighborhood and then calculates its network measure with the smoothed flows. We use county-to-county migration data in the US to demonstrate and evaluate the new smoothing approach. With the smoothed net migration rate and entropy measure for each county, we can discover natural regions of attraction (or depletion) and other structural characteristics that the original (unsmoothed) measures fail to reveal. Furthermore, with the new approach, one can also smooth spatial interactions within sub-populations (e.g., different age groups), which are often sparse and yield no meaningful measures unless properly smoothed.
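The idea of smoothing flows before computing a locational measure can be illustrated with a minimal sketch. It is not the paper's implementation: the neighborhood definition (a fixed list of neighbors per location), the function names, and the choice to ignore flows internal to a neighborhood are all assumptions made for illustration.

```python
import math

def smoothed_net_rate(flows, neighbors, population, i):
    """Net flow rate for location i, computed from flows pooled over its neighborhood.

    flows: dict mapping (origin, destination) -> flow volume.
    neighbors: dict mapping a location -> list of neighboring locations (assumed given).
    population: dict mapping a location -> population.
    """
    hood = set(neighbors[i]) | {i}
    # Pool flows crossing the neighborhood boundary; internal flows are ignored (assumption).
    inflow = sum(v for (o, d), v in flows.items() if d in hood and o not in hood)
    outflow = sum(v for (o, d), v in flows.items() if o in hood and d not in hood)
    pop = sum(population[j] for j in hood)
    return (inflow - outflow) / pop

def smoothed_outflow_entropy(flows, neighbors, i):
    """Shannon entropy of the destination shares of the neighborhood's pooled outflows."""
    hood = set(neighbors[i]) | {i}
    totals = {}
    for (o, d), v in flows.items():
        if o in hood and d not in hood:
            totals[d] = totals.get(d, 0) + v
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return -sum((v / total) * math.log(v / total) for v in totals.values() if v > 0)
```

Pooling over a neighborhood enlarges the base population and flow counts behind each measure, which is what dampens the spurious variation that small areas produce.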

Identifying disaster-related tweets and their semantic, spatial and temporal context using deep learning, natural language processing and spatial analysis: a case study of Hurricane Irma

Density-based multi-scale flow mapping and generalization

Mapping large volumes of origin-destination flow data (or spatial interactions) has long been a challenging problem because of the conflict between massive location-to-location connections and the limited map space. Current approaches for flow mapping only work with a small dataset or have to use data aggregation, which not only causes a significant loss of information but may also produce misleading maps. In this paper, we present a density-based flow map generalization approach that can extract flow patterns and facilitate the analysis and visualization of big origin-destination flow data at multiple scales. Unlike existing methods that generalize flow data by spatial unit-based aggregation, our new flow map generalization algorithm is based on flow density distribution. To demonstrate the approach and assess its effectiveness, a case study is carried out to map 829,039 taxi trips within New York City. With parameter settings, the proposed method can discover inherent and abstract flow patterns at different map scales and generalization levels, which naturally supports interactive and multi-scale flow mapping.

A web-based geovisual analytics platform for identifying potential contributors to culvert sedimentation

Sediment accumulation at culverts involves large-scale and interlinked environmental processes that are difficult to address with experimental or physical modeling methods. This article presents an alternative data-driven investigation to shed light on these processes. Accordingly, a web-based geovisual analytics application, the IowaDOT platform, was developed, which allows users to explore the complex processes associated with sediment deposition at culverts. The platform provides systematic procedures for (1) collecting and integrating analytical variables into a single dataset, (2) quantifying the degree of culvert sedimentation using time series of aerial images, (3) identifying drivers that contribute to culvert sedimentation processes from a variety of culvert structural and upstream landscape characteristics using a tree-based feature selection algorithm, and (4) facilitating the understanding of complex spatial and relational patterns of culvert sedimentation processes using multivariate geovisualizations supported by a self-organizing map (SOM). As the outcomes of this study, these patterns identify culvert sedimentation-prone regions in Iowa and quantify empirical relationships between the drivers and culvert sedimentation degrees. A simple evaluation of the platform was performed to assess the usefulness of the tool and the satisfaction of professional users, and positive feedback was received.

Mapping Temporal Trends of Parent-Child Migration from Population-Scale Family Trees

AutoCarto 2020 International Research Symposium on Cartography and Geographic Information Science, 2020

User-generated family trees are invaluable for constructing population-scale family networks and studying population dynamics over many generations and far into the past. Family trees contain information on individuals such as birth and death places and years, and kinship ties, e.g., parent-child, spouse, and sibling relationships. Such information about individuals in family trees makes it possible to extract migration networks over time. Despite recent advances, existing spatial and temporal abstraction techniques for time-variant flow data have limitations due to the lack of knowledge on the effect of temporal partitioning on flow patterns. In this study, we extracted state-to-state migration patterns over a period of 150 years between 1776 and 1926 from cleaned, geocoded and connected family trees from Rootsweb.com. We used birthplaces and birth years of parents and children to extract intergenerational migration flows between states. To reveal the temporal trends of migration patterns, we evaluated three temporal partitioning strategies: (1) predefined periods in American history, (2) overlapping time periods with fixed length, and (3) time periods with variable length, which have approximately equal volume of moves per time period. To account for the effect of geographic proximity and flow volumes in migration flows, we transformed the raw flows into modularity flows using a double-constrained gravity model. Our preliminary results revealed longitudinal population mobility in the U.S. on such a large spatial and temporal scale.
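The third partitioning strategy, variable-length periods with roughly equal volumes of moves, can be sketched as follows. This is an illustrative greedy approach, not the study's procedure: the function name, the input format (one year per move), and the rule of closing a period once its running count reaches the target are assumptions.

```python
from collections import Counter

def equal_volume_periods(move_years, k):
    """Split years into at most k consecutive periods with roughly equal numbers of moves.

    move_years: iterable with one year per recorded move (hypothetical input format).
    Returns a list of (start_year, end_year, move_count) tuples, end year inclusive.
    """
    move_years = list(move_years)
    counts = Counter(move_years)
    target = len(move_years) / k  # ideal number of moves per period
    years = sorted(counts)
    periods, start, running = [], None, 0
    for idx, year in enumerate(years):
        if start is None:
            start = year
        running += counts[year]
        is_last_year = idx == len(years) - 1
        # Close the period once it reaches the target, keeping the final period open-ended.
        if is_last_year or (running >= target and len(periods) < k - 1):
            periods.append((start, year, running))
            start, running = None, 0
    return periods
```

Because a single year is never split across periods, the volumes are only approximately equal, and heavily skewed data can yield fewer than `k` periods.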