Over the past quarter-century, NASA’s Global Change Master Directory (GCMD) has become an integral system facilitating Earth science and global change studies. The metadata and keyword structures of GCMD are pivotal components of NASA’s Earth science data collection. GCMD is also a cornerstone of NASA’s international collaboration and one of NASA’s contributions to the international Committee on Earth Observation Satellites (CEOS), where it is known as the CEOS International Directory Network (IDN).
Since its inception, GCMD and NASA’s Earth Observing System Data and Information System (EOSDIS) have remained separate systems. Now, through the development of the EOSDIS Common Metadata Repository (CMR), the two systems are being unified, with CMR serving as the metadata source for both. For GCMD’s broad base of international data users, this means a more robust system and the ability to drill down even more deeply in their searches for Earth science and environmental collection-level data. “GCMD’s original purpose, and its continuing purpose, is to support the discovery of Earth science and environmental data collections,” says Dr. Stephen Wharton, the former GCMD Project Manager and Chief of NASA’s Global Change Data Center (GCDC).
To put this significant recent evolution of GCMD into perspective, it is worth reviewing the development of GCMD and the many innovations adopted and created by the GCMD team. A look at the future direction of GCMD shows how this directory will remain a premier collaborative international resource linking scientists, researchers, policy makers, and the general public with Earth science and environmental data.
GCMD was established at a fortuitous time, and filled a need for discovering Earth science and environmental data. The 1980s saw not only the development of computers with the required power and cost-effectiveness to support such a directory, but also a literal turning point in Earth’s environmental systems. According to recent research, Earth-observing data from numerous sources indicate that “abrupt, substantial, and persistent changes in the state of natural systems” occurred during this decade. This, in turn, led to a growing need for researchers, scientists, and managers to discover Earth science data related to these changes. However, this was easier said than done. “I think it’s fair to say that these [data] collections were not necessarily searchable online [in the late 1980s],” says Dr. Wharton.
In 1987, NASA released the NASA Master Directory (NMD) as a source for Earth and space data described at the collection level. Although the directory showed users which data collections were available, they still had to obtain the file-level data by ordering media offline, in the form of tapes or the then-new technology of compact discs; there was no easy way to find file-level data. By the early 1990s, NASA Earth science data were separated into their own directory, GCMD. In 1994, GCMD became part of NASA’s Global Change Data Center at NASA’s Goddard Space Flight Center in Greenbelt, MD. Around the same time, EOSDIS was conceived as NASA’s premier system for archiving and disseminating Earth science data at the file level. It was natural that EOSDIS and GCMD would be managed under the same program, yet remain separate entities.
It is important to note the distinction between collection-level and file-level (or what EOSDIS refers to as “granular”) data. A data collection is a set of related data described as a whole, so that people can understand what the data are about. A data granule, on the other hand, is an individual data file that is part of a larger collection. For example, a data collection might comprise 10 years of data, but you might want only one day of data from one month in this 10-year collection; this is the data granule. As originally established, GCMD and EOSDIS served different needs (collection-level data searches vs. file-level/granular data searches). Serving these different needs required that GCMD and EOSDIS each have its own system for describing data and enabling searches. Data used to describe data are called “metadata,” and are what make data discoverable and searchable. As a result, GCMD and EOSDIS remained separate systems.
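The difference between a collection-level search and a granule-level search can be illustrated with a small sketch. This is not how GCMD, EOSDIS, or CMR are implemented; the class names, field names, file names, and the one-file-per-day layout below are all hypothetical and chosen only to mirror the 10-year example above.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Granule:
    """One file in a collection; here each file covers a single day (assumption)."""
    filename: str
    day: date


@dataclass
class Collection:
    """Collection-level metadata plus the granules it describes (hypothetical model)."""
    title: str
    keywords: list[str]
    start: date
    end: date
    granules: list[Granule] = field(default_factory=list)


def search_collections(collections, keyword):
    """Collection-level search: match on descriptive metadata only."""
    wanted = keyword.lower()
    return [c for c in collections if wanted in (k.lower() for k in c.keywords)]


def find_granules(collection, day):
    """Granule-level search: pick the files that cover one day."""
    return [g for g in collection.granules if g.day == day]


# Build a toy 10-year collection with one granule per day.
start, end = date(2001, 1, 1), date(2010, 12, 31)
sst = Collection(
    "Example Sea Surface Temperature",
    ["Oceans", "Sea Surface Temperature"],
    start,
    end,
)
n_days = (end - start).days + 1
sst.granules = [
    Granule(f"sst_{(start + timedelta(days=i)).isoformat()}.nc", start + timedelta(days=i))
    for i in range(n_days)
]

# First discover the collection by keyword, then drill down to one day of data.
matches = search_collections([sst], "oceans")
one_day = find_granules(matches[0], date(2005, 3, 14))
print(matches[0].title, len(sst.granules), one_day[0].filename)
```

In this sketch the keyword search never touches individual files, and the date lookup never consults the keywords, which is one way to picture why the two kinds of search were historically served by separate metadata systems.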
The recent development of the EOSDIS Common Metadata Repository (CMR) created the opportunity to finally unify the separate metadata systems used by GCMD and EOSDIS into a single system. CMR was developed by NASA’s Earth Science Data and Information System (ESDIS) Project to be the authoritative management system for all EOSDIS metadata and facilitate rapid searches through the EOSDIS archive. CMR serves as the metadata source for EOSDIS’ Earthdata Search and now also serves as the metadata source for GCMD.
GCMD staff consider having CMR as the metadata source for GCMD a win-win for data users: CMR not only speeds up GCMD searches, but also enables GCMD users to drill down even more deeply into Earth science data collections. “Prior to this, GCMD had its own backend system for serving data and information on the GCMD website,” says Alicia Aleman, GCMD Senior Science Coordinator. “Once CMR was in place, we migrated all of our content from our own servers and databases to CMR. Now we’re part of this much stronger, more robust infrastructure.”
The use of CMR as the source for GCMD metadata is only the latest evolution of the directory, and builds on many innovations developed by the GCMD team. These include the establishment of Science Keywords and data portals for easy data collection discovery; the adoption of the Directory Interchange Format (DIF) standard for exchanging information about scientific datasets; the development of docBUILDER for ensuring complete dataset metadata; and the implementation of automated quality assurance (QA) rules to ensure the highest quality metadata.