NASA’s groundbreaking open data policy provides unrestricted access to more than 100 petabytes of Earth science data in NASA’s Earth Observing System Data and Information System (EOSDIS) collection. NASA's Earth Science Data Systems (ESDS) Program ensures that these data are fully available to any user for any purpose, and promotes and facilitates the open sharing of all metadata, documentation, models, images, and research results along with the source code used to generate, manipulate, and analyze these data. The agency's underlying objectives are that openness of data is fundamental, security of these data is essential, and freedom and integrity for using these data are crucial.

Learn more about how ESDS defines open science and the evolving paradigm of open-source science, facilitates the unrestricted use of NASA Earth science data, and supports agency-wide open science initiatives.

Defining Open Science and Open-Source Science

ESDS defines open science as a collaborative culture enabled by technology that empowers the open sharing of data, information, and knowledge within the scientific community and the wider public to accelerate scientific research and understanding. A system based on open science aims to make the scientific process as transparent (or open) as possible by making all elements of a claimed discovery readily accessible, which enables results to be repeated and validated.

Out of this open science concept, an evolving paradigm called open-source science is emerging. Open-source science accelerates discovery by conducting science openly from project initiation through implementation. The result is the inclusion of a wider, more diverse community in the scientific process as close to the start of research activities as possible. This increased level of commitment to conducting the full research process openly and without restriction enhances transparency and reproducibility, which engenders trust in the scientific process. It also represents a cultural shift that encourages collaboration and participation among practitioners of diverse backgrounds, including scientific discipline, gender, ethnicity, and expertise.

A Continuum of Open Science diagram.

NASA open-source science practices place the agency closer to a fully open system (right side of image). New technologies and practices will enable NASA to continue to become more fully open-source. Credit: NASA ESDS.

Since 1994, NASA Earth science data have been available without restriction to all users for any purpose, and since 2015, ESDS has ensured that all data systems software developed through NASA research and technology awards have been made available as open-source software.

Open Data

The unrestricted availability of NASA Earth science data is the foundation of all ESDS activities, and the program has remained at the forefront of technological advances to ensure the efficient delivery and use of these data. As the volume of EOSDIS data continues to grow, ESDS is undertaking a groundbreaking effort to move this Big Data collection into the Earthdata Cloud. Having this data collection in the cloud will enable more efficient use of this vast archive, including the ability to conduct analyses in the cloud and download only the results, which saves considerable computing time and processing resources. New missions that are part of NASA's Earth System Observatory will generate higher volumes of data than any previous missions, all of which will be openly available through the Earthdata Cloud as early in the scientific process as possible.

Chart showing projected data archive growth from ~72 PB in 2022 to an estimated ~600 PB by 2029.

The red area starting to the right of the Fiscal Year 2023 (FY23) dot indicates the enormous volume of data expected from upcoming high-data-volume missions that are projected to grow NASA's Earth science data collection to almost 600 petabytes (PB) by 2030, based on current launch schedules. Credit: NASA EMS.
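The growth figures in the chart can be turned into a rough yearly rate. The sketch below is illustrative only: it assumes steady compound growth, which the chart does not claim (most of the increase is expected after FY23), and it uses the approximate values of 72 PB and 600 PB read from the chart description. The function names are hypothetical.

```python
def implied_annual_growth(start_pb: float, end_pb: float, years: int) -> float:
    """Return the constant yearly growth rate that turns start_pb into end_pb."""
    if start_pb <= 0 or end_pb <= 0 or years <= 0:
        raise ValueError("volumes and year count must be positive")
    return (end_pb / start_pb) ** (1.0 / years) - 1.0


def project(start_pb: float, rate: float, years: int) -> list[float]:
    """Archive size at the end of each year under a constant growth rate."""
    return [start_pb * (1.0 + rate) ** n for n in range(1, years + 1)]


if __name__ == "__main__":
    # Approximate figures taken from the chart description (assumption).
    START_PB, END_PB = 72.0, 600.0
    # The chart description and the caption give different end years,
    # so both are evaluated here.
    for end_year in (2029, 2030):
        years = end_year - 2022
        rate = implied_annual_growth(START_PB, END_PB, years)
        print(f"2022 to {end_year}: about {rate:.0%} per year")
```

Under these assumptions the archive would need to grow by roughly 30 to 35 percent per year, depending on which end year is used.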

The ESDS commitment to open data is also predicated on the collaborative use of these data. Through the Earthdata Cloud as well as efforts such as the cloud-based Multi-Mission Algorithm and Analysis Platform (MAAP), ESDS is enabling a broader base of users to interact with these data early in the scientific process. Using a standard internet connection, users in Arizona and Abu Dhabi, for example, can work together to analyze an EOSDIS dataset in real time without having to download these data.

Open Tools

Along with open data, NASA's ESDS also provides a wide range of tools and applications for working with these data along with the code behind these tools and applications. The Earthdata Data Tools page provides descriptions and links to resources created by EOSDIS Distributed Active Archive Centers (DAACs) for functions such as searching for and subsetting data. In addition, specialized tools such as NASA Worldview and Giovanni enable users to interactively explore hundreds of visualized data layers, overlay multiple layers, do comparisons, create animations, and much more.

Additional resources for using and working with NASA data are listed below.

Algorithm Publication Tool

The Algorithm Publication Tool (APT) developed by NASA's Interagency Implementation and Advanced Concepts Team (IMPACT) enables open, reproducible science by helping scientists write standardized, high-quality algorithm documentation collaboratively. Algorithm Theoretical Basis Documents, or ATBDs, help ensure reproducible science by documenting key scientific assumptions made when writing algorithms and by promoting better understanding of Earth observation data.

Pangeo

The Pangeo project is helping the Earth science community analyze data in the cloud so they can spend less time downloading and managing data. The project is partially funded by NASA's Advancing Collaborative Connections for Earth System Science (ACCESS) Program, which develops technologies to effectively manage, discover, and utilize NASA’s archive of Earth observations for scientific research and applications. Pangeo’s collaborative tools allow researchers to access, process, and analyze NASA data in the commercial cloud without having to download the data. Its ecosystem of interconnected open-source tools uses software from Project Jupyter. Project Jupyter software allows users to create and share collaborative workflows in open-source notebooks that contain code, equations, and visualizations.

Multi-Mission Algorithm and Analysis Platform

The Multi-Mission Algorithm and Analysis Platform (MAAP) is a joint effort of NASA and ESA (European Space Agency), and brings together data, algorithms, and computing capabilities in a common cloud environment to facilitate the sharing and processing of data from field, airborne, and satellite measurements. Key features of MAAP are full and open access to mission data through the MAAP Dashboard, the use of open-source code, and unrestricted access to data and ancillary information.

Geographic Information Systems

A geographic information system (GIS) is a collection of computer-based tools for organizing information from a variety of data sources to map and examine changes on Earth. The ESDS vision is to identify and deliver high-value Earth science data in formats compliant and compatible with GIS standards; to ensure data are interactive, interoperable, accessible, and GIS-enabled through primary GIS platforms; and to provide the maximum impact to research, education, and public user communities requiring visualization and spatial analysis. The ESDS GIS Team (EGIST) was created to provide sustained program-wide support to enable the appropriate use and adoption of GIS technology in support of Earth science research and applied science for EOSDIS data. Learn more about GIS at NASA.

Providing Open Data for NASA’s Earth System Observatory

NASA’s Earth System Observatory (ESO) is a coordinated series of complementary missions designed to obtain measurements of multiple Earth processes to help address and mitigate climate change. In keeping with NASA’s open data policies, ESO data will be available as early in the mission process as feasible.

Developing a Mission Data Processing System (MDPS) that will ensure the most transparent processing and delivery of ESO mission data is vital. The MDPS is the set of algorithms, software, compute infrastructure, operational procedures, documentation, and teams that process raw instrument data into science quality data products. The MDPS also includes the software tools that support the development of processing algorithms and the validation and analysis of processed data.

Three side-by-side blue boxes with outer boxes a darker blue; Left box = raw instrument data in; center box = 4 data processing icons; right box = science data products out

Basic NASA data processing flow. After raw satellite data are downloaded (left box), MDPS elements (center box) facilitate the transformation and processing of instrument data into science data products (right box).

NASA Chief Science Data Officer Kevin Murphy set a challenge to the mission processing community to identify and assess potential architectures that can meet the ESO mission science processing objectives, enable data system efficiencies, promote open science principles, and seek opportunities that support Earth system science.

This challenge is being addressed through the Multi-Mission Data Processing System Study, which is composed of several phases:

  • Phase 1: MDPS Architecture Recommendation (October 2021 to March 2023)
  • Phase 2: Detailed Analysis of Recommended MDPS (March 2023 through September 2024)
  • Phase 3: Baseline Technical Solution and Implementation Plan
  • Phase 4: Implementation

Learn more about Phase 1 and Phase 2 of this ongoing study, and read an article about the challenges of processing massive amounts of data and how NASA plans for readiness.

Agency-Wide Open Science Initiatives and Resources

NASA's Open-Source Science Initiative (OSSI) is a comprehensive program of agency activities to enable and support moving science towards openness, including enhancing existing data policies, supporting open-source software, and enabling cyberinfrastructure. OSSI aims to implement NASA’s Strategy for Data Management and Computing for Groundbreaking Science 2019-2024, which was developed through community input.

NASA's Transform to Open Science (TOPS) mission, which is part of OSSI, is working to accelerate the engagement of the scientific community in open science practices through open science events and activities aimed at:

  • Lowering barriers to entry for historically excluded communities
  • Developing a better understanding of how people use NASA data and code to take advantage of the agency's Big Data collections
  • Increasing opportunities for collaboration while promoting scientific innovation, transparency, and reproducibility

More information about TOPS activities is available on the TOPS GitHub page.

Other Resources

NASA Open Innovation sites provide access to agency-wide data, application programming interfaces (APIs), and code, and are under the Office of the Chief Information Officer:

  • NASA's Open Data Portal is the agency's central open data site for the public
  • NASA's API Portal is a clearinghouse site for information about the agency's APIs and serves as a passthrough site to NASA APIs located elsewhere
  • NASA's Code Portal contains information on and links to all open-sourced NASA code projects

Get Involved

NASA and ESDS offer many ways for you to benefit from and contribute to open science efforts. Since all code used in ESDS applications and tools is open source, you can create your own instance of Worldview to pull imagery from Global Imagery Browse Services (GIBS) or download code from any of the NASA GitHub sites. In addition, ESDS competitive programs such as the Advancing Collaborative Connections for Earth System Science (ACCESS) Program, Citizen Science for Earth Systems Program (CSESP), and Making Earth System Data Records for Use in Research Environments (MEaSUREs) Program provide opportunities for you to contribute data and observations to ongoing scientific investigations or compete for funding opportunities to help advance open science initiatives.
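As a small illustration of pulling imagery from GIBS, the sketch below builds a Web Map Service (WMS) GetMap request URL without sending it. The endpoint, the layer name, and the parameter set are assumptions drawn from public GIBS documentation rather than from this page, and the helper name is hypothetical; check the current GIBS documentation before relying on any of them.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed endpoint and layer name (not stated on this page); verify against
# the current GIBS documentation before use.
GIBS_WMS_ENDPOINT = "https://gibs.earthdata.nasa.gov/wms/epsg4326/best/wms.cgi"
EXAMPLE_LAYER = "MODIS_Terra_CorrectedReflectance_TrueColor"


def build_getmap_url(layer, day, bbox=(-90.0, -180.0, 90.0, 180.0),
                     width=1024, height=512, image_format="image/jpeg"):
    """Build a WMS 1.3.0 GetMap URL for one layer on one day.

    `day` is an ISO date string such as "2023-06-01". `bbox` is given as
    (min_lat, min_lon, max_lat, max_lon), the axis order WMS 1.3.0 uses
    for EPSG:4326.
    """
    date.fromisoformat(day)  # raises ValueError if the date is malformed
    params = {
        "SERVICE": "WMS",
        "REQUEST": "GetMap",
        "VERSION": "1.3.0",
        "LAYERS": layer,
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
        "TIME": day,
    }
    return f"{GIBS_WMS_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Prints the request URL only; no network call is made here.
    print(build_getmap_url(EXAMPLE_LAYER, "2023-06-01"))
```

The resulting URL can be opened in a browser or passed to any HTTP client. Keeping URL construction separate from the download makes the request easy to inspect before any data are transferred.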

Learn more about ESDS collaborations and data processes.