Using Synthetic Satellite Imagery from Virtual Worlds to Train Deep Learning Models for Object Recognition

Date

2021

Repository Usage Stats

361 views
257 downloads

Abstract

Object segmentation in overhead imagery is a challenging computer vision problem that has been extensively investigated. One challenge with this task is the tremendous visual variability that can be present in real-world overhead imagery due to variations in scene content (e.g., building designs, vegetation types), weather conditions (e.g., cloud cover), time of day (e.g., sun direction and intensity), and imaging hardware. However, existing training datasets for object segmentation algorithms capture only a small fraction of this variability, limiting the robustness of trained segmentation models. In particular, recent evidence in the literature suggests that trained models perform poorly on imagery collected in novel geographic locations (i.e., physical locations that are not present in the training data), or simply at a different time of day, limiting the widespread adoption of these approaches. In this work I make several contributions towards understanding and addressing these challenges. First, I build a deep learning framework, termed MRS, that streamlines the training and validation of deep learning models on large remote sensing datasets. Using MRS, I investigate how well modern deep learning models generalize to imagery collected over novel geographic locations, providing comprehensive experimental evidence that their generalization is indeed poor. To address this problem, I explore the use of synthetic overhead imagery for training deep learning models. Synthetic overhead imagery allows a designer to systematically add and vary many sources of real-world image variability that would be cost-prohibitive to collect and hand-label using real satellites. To accomplish this goal, I develop a new software package for generating synthetic overhead imagery; it offers users simple control over key properties of the imagery and generates it in a fully automatic fashion. I then use this package to create two datasets of synthetic overhead imagery, termed Synthinel-1 and Synthinel-2, and demonstrate experimentally that augmenting real-world training imagery with Synthinel-1 or Synthinel-2 consistently yields more robust deep learning models, especially when the models are applied to novel geographic locations. Finally, I analyze the impact of different design choices for the synthetic imagery and examine potential reasons why it is beneficial. Collectively, my work has elucidated a major limitation of modern deep learning models that has prevented their widespread adoption for practical applications. In addition to elucidating this limitation, my work has taken several major steps towards overcoming it and advancing the state of computer vision in overhead imagery.
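
As an illustrative aside (not code from the dissertation): one common way to realize the augmentation strategy the abstract describes, pooling real hand-labeled tiles with Synthinel-style synthetic tiles before training a segmentation model, is sketched below in PyTorch. The TileDataset class and all tile counts and sizes are hypothetical stand-ins for the actual data loaders.

    # Minimal sketch of real + synthetic data pooling for segmentation training.
    # TileDataset is a hypothetical stand-in that yields (image, mask) pairs.
    import torch
    from torch.utils.data import Dataset, ConcatDataset, DataLoader


    class TileDataset(Dataset):
        """Stand-in loader yielding (RGB tile, per-pixel building mask) pairs."""

        def __init__(self, num_tiles: int):
            self.num_tiles = num_tiles

        def __len__(self):
            return self.num_tiles

        def __getitem__(self, idx):
            image = torch.rand(3, 256, 256)            # RGB overhead tile
            mask = torch.randint(0, 2, (256, 256))     # building / background
            return image, mask


    real_ds = TileDataset(num_tiles=1000)       # real, hand-labeled imagery
    synthetic_ds = TileDataset(num_tiles=2000)  # synthetic (Synthinel-style)

    # Pooling the two sources exposes the model to variability (lighting,
    # building styles, etc.) that the real data alone does not cover.
    train_loader = DataLoader(ConcatDataset([real_ds, synthetic_ds]),
                              batch_size=8, shuffle=True)

    for images, masks in train_loader:
        # ... one training step of a segmentation network would go here ...
        break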

Citation

Huang, Bohao (2021). Using Synthetic Satellite Imagery from Virtual Worlds to Train Deep Learning Models for Object Recognition. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/22951.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.