Geospatial Data APIs
data:image/s3,"s3://crabby-images/4c55e/4c55ebb817a83959972b742c197d9a16feb49159" alt=""
This session in the NextGen Geospatial Data Science workshop series covers data APIs. We will discuss:
- What they are
- Why you would use one
- A hands-on example using satellite imagery from Planet. We will use the Python scripting language in a Google Colab Jupyter Notebook to access and visualize Planet imagery.
- If you come to the session in person, please bring your laptop if you want to code along.
- You need a Google account to access and use Colab. Anyone with an 'arizona.edu' email address should have direct access.
- Please have a Planet account prior to the session. You will need it to find your API key.
The UA Institute for Computation and Data-Enabled Insight (ICDI) has purchased a campus-wide license so all UofA students, staff, and faculty can access Planet satellite data products for free! Please click here to learn how to get your account and start getting imagery.
- Log on to your Planet account at https://www.planet.com/login. You should land in your user dashboard.
data:image/s3,"s3://crabby-images/64b55/64b554c08f42670b5040ce096b367a81503425b6" alt=""
- Click on 'My Settings'. On this page, you should see your API Key. Copy this as you will use it in Colab.
data:image/s3,"s3://crabby-images/1bfd2/1bfd23fa5e69fe62de12c565f5f4c1b14500b4cd" alt=""
...and now to the lesson.
data:image/s3,"s3://crabby-images/bddcc/bddcc51074253c491c7e417f0c2a80139cdac130" alt=""
data:image/s3,"s3://crabby-images/d4387/d43874af9d92f7044ec1edf638c249ccb322ccdd" alt=""
data:image/s3,"s3://crabby-images/a3c89/a3c8985202e639e2f110775075071d9629eda861" alt=""
data:image/s3,"s3://crabby-images/6ac13/6ac13e67f0e004d84bc43cd49b980f381cfc9324" alt=""
GIS and remotely sensed data have proliferated online. From drone imagery, to stream networks, to land cover classifications, to wildfire, to wildlife, to human population, there is spatial data to represent every conceivable phenomenon that can be put on a map. In the earth observation space, there are around 1,600 satellites orbiting Earth and snapping images around the clock in a variety of spectral bands.
The imagery company Planet operates more than 200 satellites, which together cover most of Earth's landmass every single day. Their flagship imagery product, PlanetScope, consists of 4 bands (blue, green, red, near-infrared) and has 3 meter spatial resolution.
data:image/s3,"s3://crabby-images/eac75/eac75d317a75ca1f7c832af22aee7c49bad49ea7" alt=""
data:image/s3,"s3://crabby-images/3593d/3593d66cf7a839ffa92cdc924ef44335e4068deb" alt=""
The vast majority of online geospatial data is downloaded through a graphical website. And for good reason: it's easy! Data repositories such as EarthExplorer, and many others, have powerful and intuitive tools for searching and downloading data. Many of them combine a map interface with filters and search terms. For the vast majority of geospatial data seekers, these tools work perfectly.
If you are reading this material or attending this workshop, you may be in a select group that wants to do things the hard way. But with great challenge comes great reward! Many geospatial datasets can be searched for and downloaded through scripting languages. This allows you to automate repetitive tasks and scale your imagery analysis well beyond what is possible with point-and-click computing.
Increasingly, geospatial data are available through web application programming interfaces (APIs). The term API can be used in several different contexts, but for our purposes here let's define it as:
a set of protocols for communicating between computers over the web.
It's just like a website but specifically for data transfer. An API is served on a computer and has a specific web address (URL) just like a website. But instead of interacting with the web address through a graphical browser, you interact with the API through scripting languages. Both websites and APIs use HyperText Transfer Protocol (HTTP).
For our hands-on example today, we will be using a RESTful API. REST, which stands for Representational State Transfer, is a common API architecture that is favored for its scalability and ease of integration. REST APIs typically use the JSON format for data exchange. They use standard commands for requests (from the user to the API) and responses (from the API back to the user). For example:
- GET: Retrieves data from a server. It’s used for reading information.
- POST: Sends data to a server to create or update a resource. It’s often used for creating new resources.
We will see examples of both of these commands in our Python example.
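To make GET and POST concrete before touching a real service, here is a minimal sketch that runs entirely on your own machine. It is not the Planet API: the `ToyAPI` server, the `SCENES` list, and the `/scenes` address are all made up for illustration, and only the Python standard library is used.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Toy in-memory "catalog" standing in for a real data service (hypothetical).
SCENES = [{"id": "scene-1", "cloud_cover": 0.1}]


class ToyAPI(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # GET: read existing data without changing it.
        self._send_json(200, {"scenes": SCENES})

    def do_POST(self):
        # POST: the client sends a JSON body; here it creates a new scene.
        length = int(self.headers.get("Content-Length", 0))
        new_scene = json.loads(self.rfile.read(length))
        SCENES.append(new_scene)
        self._send_json(201, new_scene)

    def log_message(self, *args):
        pass  # keep the notebook output quiet


# Ignore any system proxy settings, since the server is on this machine.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def call_api(url, method="GET", payload=None):
    """Send one HTTP request and return (status code, decoded JSON response)."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with opener.open(request) as response:
        return response.status, json.loads(response.read())


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), ToyAPI)
threading.Thread(target=server.serve_forever, daemon=True).start()
base_url = f"http://127.0.0.1:{server.server_port}/scenes"

post_status, created = call_api(base_url, "POST", {"id": "scene-2", "cloud_cover": 0.4})
get_status, listing = call_api(base_url)
scene_ids = [scene["id"] for scene in listing["scenes"]]

server.shutdown()
server.server_close()

print(post_status, created)
print(get_status, scene_ids)
```

The request and the response are both plain JSON text, which is why the same `call_api` helper works for either command. A real API differs mainly in its address, its authentication, and the fields it expects in the body.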
Using scripting languages to download geospatial data through APIs makes you a more powerful data science wizard.
Here are the top reasons to use APIs:
- Automation and Scalability: Calling APIs from Python lets you automate data retrieval and processing. For instance, a researcher monitoring deforestation could set up a script to automatically download satellite images of a specific region at regular intervals. This is much more efficient than manually downloading each image through a web interface. In that same script, the data can be analyzed with Python's rich ecosystem of libraries.
- Reproducibility and Sharing of Research: Python scripts can be shared with and rerun by other researchers, which makes the research transparent and reproducible. For example, a script used to analyze urban expansion using satellite images can be shared with the research community, allowing others to replicate the study or build upon it.
- Handling Large Datasets: For large-scale projects that involve huge amounts of data, calling an API from Python can be much more practical. A researcher studying global water resources could write a script to handle and analyze terabytes of satellite data, a task that would be impractical through a website explorer.
- Real-time Data Processing: Some research requires real-time or near-real-time data processing, which scripted API access makes possible. For instance, wildfires or floods can be monitored in near-real time to support a quick response.
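As a rough illustration of the automation point, the sketch below splits a study period into regular date windows and loops over them. The `request_images` function is a hypothetical stand-in that only returns a description; in a real script it would be replaced by an actual API call. The region name and dates are made up.

```python
from datetime import date, timedelta


def date_windows(first_day, last_day, step_days):
    """Split [first_day, last_day] into consecutive windows of step_days days."""
    windows = []
    start = first_day
    while start <= last_day:
        # The final window is shortened so it never runs past last_day.
        end = min(start + timedelta(days=step_days - 1), last_day)
        windows.append((start, end))
        start = end + timedelta(days=1)
    return windows


def request_images(region, start, end):
    """Hypothetical stand-in for a real API call; returns a description only."""
    return f"{region}: images acquired {start.isoformat()} to {end.isoformat()}"


windows = date_windows(date(2024, 1, 1), date(2024, 1, 31), step_days=10)
for start, end in windows:
    print(request_images("study-area", start, end))
```

Changing the study period or the interval is a one-line edit, whereas the same change on a website means repeating every search by hand.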
For our hands-on exercise, we will be tapping into an API from the satellite imagery company Planet. We will be using the Python scripting language in a Google Colab Jupyter Notebook to access and visualize Planet imagery.
Planet API and python analysis tutorials using Jupyter Notebooks
CU Boulder Earth Lab tutorial using R
National Ecological Observatory Network (NEON)
Earth Search public datasets on AWS
API - An Application Programming Interface. For this session's purposes, this is a way to get data over the web using programming languages like Python instead of using websites. APIs often have a web address just like any other website. For example, "https://api.planet.com/data/v1" is the address of the Planet API.
API Endpoint - A specific API web address that has a specific function. In the case of Planet, the endpoint "https://api.planet.com/data/v1/stats" is used to return statistics on imagery assets in their catalog. A different endpoint, "https://api.planet.com/data/v1/quick-search", is used to get imagery names and download them.
API Key - A password-like string of letters and numbers that is issued by the host of the API. For today's session, Planet has issued a unique API key to each user that allows them to access data from the API.
Jupyter Notebook - A browser-based environment for writing computer code in Python, R, or Julia. A Notebook can be served on your local machine or on a remote machine that you access over the web. Notebooks are nice tools for sharing and collaborating on code. They are perfect for a classroom of students to code together.
Google Colab - A Jupyter Notebook environment that is hosted on a Google virtual machine (VM). The free tier gives each user a relatively small allocation of RAM (memory), disk storage, and CPU. This VM allocation is temporary and lasts less than a day, after which it disappears forever. Colab is not a place to store your data. But Colab is a great place to do some coding analysis without having to install software on your local machine.
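The terms API, endpoint, and API key fit together roughly as in the sketch below. It only builds addresses and a request header; nothing is sent over the network. Two things are assumptions rather than facts from this lesson: the environment variable name `PL_API_KEY`, and the idea that the key is passed as the username of an HTTP Basic credential with an empty password. Check Planet's documentation for how keys are actually supplied.

```python
import base64
import os

BASE_URL = "https://api.planet.com/data/v1"  # API address from the glossary


def endpoint(name):
    """Join the base address and an endpoint name into a full URL."""
    return f"{BASE_URL}/{name.strip('/')}"


def basic_auth_header(api_key):
    """Encode the key as an HTTP Basic credential.

    Assumed scheme: the key is the username and the password is empty.
    """
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


# Read the key from an environment variable (assumed name) instead of
# pasting it into the notebook, so it is not shared along with the code.
api_key = os.environ.get("PL_API_KEY", "demo-key-not-real")

headers = {"Authorization": basic_auth_header(api_key)}

print(endpoint("stats"))
print(endpoint("quick-search"))
print("Authorization header set:", "Authorization" in headers)
```

Keeping the key out of the notebook text matters because notebooks are easy to share, and anyone holding your key can use your account's access.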