The DAACs currently hold more than 9 Petabytes (PB) of Earth science data. To put this into perspective, 1 PB is equivalent to roughly 20 million four-drawer file cabinets filled with text. Metadata enable users to find specific EOSDIS data products and understand their quality. This vast collection of metadata is the foundation of the EOSDIS Earthdata Search data discovery interface (Figure 1).
NASA is in the process of applying uniform international standards to Earth science metadata with the goal of making it even easier for data users to discover, understand, and use these vast data collections. This is not a simple process. It affects the design of metadata systems for new NASA Earth observing missions as well as existing data files and products, which currently use a variety of metadata standards.
A Closer Look at Metadata and its Uses
Metadata are used to describe a wide range of data attributes, including:
- Data quality, such as how complete the data are, where gaps in the data exist, and characteristics that might affect the reliability of the data
- Data lineage, which includes provenance or processing input as well as information that tracks data through transformations, analyses, and interpretations
- Data acquisition parameters, which may include specific parameters that affect algorithm behavior as well as information about the location of the satellite when data were acquired
Metadata also include documentation that helps users better understand data, as well as pointers to tools that enable data interpretation and analysis. This documentation may include instrumentation details and algorithmic descriptions as well as recommended viewing software and color keys.
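To make these categories concrete, the attributes listed above can be pictured as fields of a single record. The following is a minimal sketch only: every class name, field name, and value is hypothetical and chosen for illustration, and none of it corresponds to an actual NASA or ISO schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class QualityInfo:
    # How complete the data are and where gaps exist (hypothetical fields).
    completeness_percent: float
    gap_descriptions: List[str] = field(default_factory=list)


@dataclass
class LineageStep:
    # One transformation in the processing history of a product.
    description: str
    inputs: List[str] = field(default_factory=list)


@dataclass
class AcquisitionInfo:
    # Satellite location and algorithm parameters at acquisition time.
    platform: str
    latitude_deg: float
    longitude_deg: float
    algorithm_parameters: Dict[str, float] = field(default_factory=dict)


@dataclass
class ProductMetadata:
    title: str
    quality: QualityInfo
    lineage: List[LineageStep]
    acquisition: AcquisitionInfo
    documentation_links: List[str] = field(default_factory=list)

    def has_gaps(self) -> bool:
        """Return True if any data gaps are recorded for this product."""
        return bool(self.quality.gap_descriptions)


# Placeholder values for illustration; not taken from any real product.
record = ProductMetadata(
    title="Example product",
    quality=QualityInfo(completeness_percent=97.5,
                        gap_descriptions=["placeholder gap description"]),
    lineage=[LineageStep(description="calibration",
                         inputs=["example-raw-file"])],
    acquisition=AcquisitionInfo(platform="EXAMPLE-PLATFORM",
                                latitude_deg=0.0, longitude_deg=0.0),
)
```

A user or search tool could then filter on such fields, for example by calling `record.has_gaps()` before deciding whether a product suits an analysis.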
Until recently, different data providers developed their own standards for metadata content and representation. To take advantage of the metadata, users of these products needed to learn the standard in use. With the exponential growth of Earth science data sets, this approach has become problematic. “Currently, different organizations use very different metadata models and structures,” says Weiss. “The Earth science community has come to recognize the need for a common model and representation of metadata.”
Indeed, even NASA uses several metadata standards, many of which were developed specifically for NASA data. Two of the most common NASA standards for Earth science metadata are the Directory Interchange Format (DIF), which is used by the Global Change Master Directory (GCMD), and the ECHO Metadata Standard (which has now been replaced by the Common Metadata Repository).
While the NASA-developed metadata standards do a good job addressing data discovery, this is only one facet of metadata. “Discovering data is important,” says Ted Habermann, the Director of Earth Science at The HDF Group, an independent non-profit organization that develops and manages the Hierarchical Data Format (HDF) set of Earth science metadata conventions. “But we really need standards that go beyond discovery to cover access, use, and understanding of data.”
Re-Evaluating NASA Metadata - The MENDS Project
In 2010, Andrew Mitchell, the ESDIS Project Science Systems Development Manager, initiated the Metadata Evolution for NASA Data Systems (MENDS) Project. Members of the MENDS Project were asked to assess the metadata needs and current practices of EOSDIS datasets, with particular attention to discovery, archive management, citation, provenance and lineage, data quality, semantics, and data services. They also were asked to provide recommendations for determining the optimal path for integrating current NASA Earth science data systems using a common metadata standard. “It was my vision to have a working group that involved the DAACs, the missions, and the current systems to all get together to talk about our metadata issues,” Mitchell says. “The purpose was to look at all things metadata.”
The MENDS Project recommended that NASA Earth science metadata be based on the International Organization for Standardization (ISO) 19100 series of standards, which describe geographic data.
The base metadata model of the ISO 19100 series is represented by ISO 19115, which integrates multiple metadata standards. Other standards in the series describe how the ISO 19115 models are represented; ISO 19139, for example, defines an XML encoding. To avoid confusion, the term “ISO standards” is used here to mean ISO 19115 and its associated standards.
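As a rough illustration of what a represented ISO 19115 model can look like, the sketch below reads the title from a tiny XML fragment written in the style of ISO 19139. It assumes the `gmd` and `gco` namespaces commonly used with that encoding. The fragment is deliberately incomplete and would not validate against the full schema, its text values are placeholders, and the function name `read_title` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes assumed for an ISO 19139-style record.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Illustrative fragment only: a real record carries many more elements.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>Example soil moisture product</gco:CharacterString>
          </gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""


def read_title(xml_text):
    """Return the dataset title from the fragment, or None if it is absent."""
    root = ET.fromstring(xml_text)
    node = root.find(
        "gmd:identificationInfo/gmd:MD_DataIdentification/"
        "gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString",
        NS,
    )
    return node.text if node is not None else None


title = read_title(SAMPLE)
```

Because every provider that follows the same encoding places the title at the same path, one small reader like this can serve records from many sources, which is the practical appeal of a common standard.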
Implementing ISO Standards for NASA Missions and Existing Data
The MENDS Project recommendations are codified in the ESDIS Project’s Metadata Requirements-Base Reference for NASA Earth Science Data Projects. This document establishes, as a NASA Earth Science Division (ESD) base metadata requirement, that science data products created using NASA satellite mission data systems will contain metadata conforming to ISO standards. In addition, the ESDIS Standards Coordination Office (ESCO) provides guidance and vision in the use of NASA standards for data format, metadata content, and required documentation for EOSDIS data.
NASA directed that the recently launched Soil Moisture Active Passive (SMAP) mission would use metadata based on ISO standards (Figure 2). The SMAP mission successfully developed a software architecture that incorporates ISO metadata into SMAP Earth science data products. This success demonstrated the feasibility of implementing ISO standards in other NASA Earth science missions.