Getting Petabytes to People: How the EOSDIS Facilitates Earth Observing Data Discovery and Use
NASA’s EOSDIS is constantly developing new techniques and strategies that enable users around the world to efficiently use NASA Earth observing data.
Josh Blumenfeld, EOSDIS Science Writer
NASA’s Earth Observation data—collected continuously from satellites, aircraft, and ground-based missions for more than a half-century—constitute an invaluable record of Earth processes and a critical resource for scientists and researchers. The techniques and strategies developed by NASA for processing, organizing, archiving, and disseminating these data have led to a national network of interconnected data repositories along with systems that efficiently and effectively deliver these data in a wide range of formats to users around the world.
Managing NASA Earth observing data is the responsibility of the Earth Observing System Data and Information System (EOSDIS), which provides end-to-end capabilities for managing NASA’s Earth science data. According to EOSDIS metrics for 2014, EOSDIS manages more than 9 petabytes (PB) of data. To put this into perspective, 1 PB is equivalent to about 20 million four-drawer filing cabinets filled with text. Even when you go to the next lower order of magnitude of data, the terabyte (TB), you still are talking about a lot of data—10 TB can hold the entire printed collection of the Library of Congress. The EOSDIS adds about 6.4 TB of data to its archives and distributes almost 28 TB worth of data to an average of 11,000 unique users around the world every day.
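The figures above can be put on a common scale with a little arithmetic. The sketch below is only a back-of-the-envelope illustration: it assumes decimal units (1 PB = 1,000 TB, 1 TB = 1,000 GB), treats the rounded 2014 values quoted above as exact, and assumes the daily rates hold steady for a full year. The function names are hypothetical and are not part of any EOSDIS tool.

```python
# Back-of-the-envelope arithmetic on the 2014 figures quoted in the text.
# Assumptions: decimal units, rounded inputs, constant daily rates.

TB_PER_PB = 1000
GB_PER_TB = 1000

def per_user_gb_per_day(distributed_tb_per_day, users_per_day):
    """Average volume delivered to each unique user per day, in GB."""
    return distributed_tb_per_day * GB_PER_TB / users_per_day

def yearly_growth_pb(ingest_tb_per_day, days=365):
    """Archive growth over a year, in PB, if the daily rate stays constant."""
    return ingest_tb_per_day * days / TB_PER_PB

def distribution_to_ingest_ratio(distributed_tb_per_day, ingest_tb_per_day):
    """How many TB leave the archive for every TB added to it."""
    return distributed_tb_per_day / ingest_tb_per_day

if __name__ == "__main__":
    print(f"Per user per day: {per_user_gb_per_day(28, 11_000):.2f} GB")
    print(f"Yearly archive growth: {yearly_growth_pb(6.4):.2f} PB")
    print(f"Distributed vs. added: {distribution_to_ingest_ratio(28, 6.4):.2f}x")
```

Under these assumptions, each user receives roughly 2.5 GB per day on average, the archive grows by a little over 2.3 PB per year, and more than four times as much data is distributed as is added.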
“You can look at the EOSDIS as a giant library,” says Kevin Murphy, who served as the EOSDIS System Architect and is now the NASA Program Executive for Earth Science Data Systems (ESDS). “This means you need to know where all the data are and then you have to process the data to make sure they are all consistent and make sure you’re not making changes to the data.”
The foundations of this giant data library are NASA’s Distributed Active Archive Centers (DAACs). Due to the size of the data holdings and the breadth of science disciplines represented, EOSDIS data collections are stored in 12 discipline-specific DAACs (Figure 1). For example, the Land Processes DAAC (LP DAAC) located in Sioux Falls, SD, is home to NASA Earth science data related to surface reflectance, radiance and temperature; topography; radiation budget; ecosystem variables; land cover; and vegetation indices. The DAACs provide a “concierge”-type data support service for NASA’s Earth science customers, an important service given the complexities of remote sensing data.
In fact, the sheer size of the EOSDIS data collections and the number of individual data products can make finding the right data sets and products to meet specific research needs a daunting task. It is critical for NASA to make searching for and retrieving EOSDIS data rapid, accurate, and efficient. Earthdata Search is the latest interface developed by the EOSDIS to enable data search and discovery for the entire data collection. It can search the millions of DAAC data records in less than a second.
The EOSDIS team also is in the process of building the Common Metadata Repository (CMR), which will combine several existing metadata systems, such as the EOS Clearing House (ECHO), into a single unified metadata model (UMM). This metadata model will be able to support the growing needs of the EOSDIS in the future.
Another development by EOSDIS data providers to facilitate efficient data use is the ability to convert billions of bytes of sensor data into easy-to-analyze data visualizations, often within three to four hours of a sensor observation. These data visualizations are provided through the EOSDIS Global Imagery Browse Services (GIBS) and can be viewed using the EOSDIS Worldview interactive data visualization interface. To provide this imagery as rapidly and as responsively as possible, GIBS continually ingests imagery from NASA data providers, creates a global mosaic of these data, and then partitions this mosaic into an image tile pyramid. Users request these pregenerated tiles, which are already rescaled and cropped, an innovative technique that saves time and storage space.
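To make the idea of an image tile pyramid concrete, here is a minimal sketch of how a global mosaic might be indexed by zoom level. The layout is an assumption chosen for illustration, not the actual GIBS configuration: level 0 is taken to cover the globe with 2 columns by 1 row of square tiles in a plain latitude/longitude grid, and every deeper level doubles the tile count in each direction. All function names are hypothetical.

```python
# Illustrative tile pyramid over a latitude/longitude grid.
# Assumed layout (NOT the real GIBS tiling scheme): level 0 has 2 x 1 tiles,
# and each deeper level doubles the number of columns and rows.

def tiles_at_level(level):
    """Return (columns, rows) of the tile grid at a pyramid level."""
    rows = 2 ** level
    cols = 2 * rows
    return cols, rows

def tile_for_point(lon, lat, level):
    """Return (column, row) of the tile containing a point.

    Columns count eastward from 180 W, rows count southward from 90 N.
    """
    cols, rows = tiles_at_level(level)
    tile_deg = 180.0 / rows  # each tile spans this many degrees on both axes
    col = min(int((lon + 180.0) / tile_deg), cols - 1)  # clamp lon = 180
    row = min(int((90.0 - lat) / tile_deg), rows - 1)   # clamp lat = -90
    return col, row

def total_tiles(max_level):
    """Count all pregenerated tiles from level 0 through max_level."""
    return sum(c * r for c, r in map(tiles_at_level, range(max_level + 1)))

if __name__ == "__main__":
    # A point near Washington, DC, at pyramid level 2.
    print(tile_for_point(-77.0, 38.9, 2))
    # Whole pyramid versus the most detailed level alone.
    cols, rows = tiles_at_level(2)
    print(total_tiles(2), cols * rows)
```

In a pyramid that quadruples its tile count at each level, the coarser levels together add only about one third on top of the most detailed level, so keeping every zoom level pregenerated costs relatively little extra storage while sparing the server from rescaling and cropping on each request.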
The ability to retrieve satellite data rapidly for viewing is accomplished through the EOSDIS Land, Atmosphere Near real-time Capability for EOS (LANCE) system. LANCE provides more than 100 near real-time data products created from data collected by sensors aboard NASA’s orbiting Aqua, Aura, and Terra satellites, all of which can be interactively browsed using Worldview. Through LANCE, GIBS, and Worldview, data users can conduct near real-time visual analysis of dynamic global processes as they are occurring, such as wildland fires, dust storms, and volcanic eruptions (Figure 2).
The key EOSDIS public-facing interface is the Earthdata website. Earthdata provides open access to NASA’s extensive collection of Earth observing data through links to search and discovery applications, announcements about new data products, and connections to NASA DAACs and the broader Earth data user community.
With the demand for NASA Earth observing data continually growing, the EOSDIS is constantly developing new data delivery strategies and improving existing technologies to enable more efficient use of data products. Improvements over the next few months include a redesigned Earthdata website and the full release of Earthdata Search, which currently is in beta testing. In addition, as NASA launches new Earth observing missions and builds improved sensors to collect data, the EOSDIS also will be creating new services and tools to enable efficient discovery and use of these new data products, including data from the joint NASA/Japan Aerospace Exploration Agency (JAXA) Global Precipitation Measurement (GPM) and NASA’s Soil Moisture Active Passive (SMAP) missions.
As Murphy observes, the success of the EOSDIS has led ever-growing user communities to want more data products. This demand spurs the continuous development of new data applications, better data delivery strategies, and improved technologies to enhance the user experience.
“If data are very expensive, then the only people who can afford to use the data are the folks who already have the tools to process the data,” he says. “Once you make the data free [as is the case with NASA Earth Observation data], you need better processing systems and more processing systems to cater to all these diverse groups that want and need the data.”
Earthdata Search: https://search.earthdata.nasa.gov/
Brennan, J. 2006. “NASA’s EOSDIS Data Centers Offer Something for Everyone.” In Earth Imaging Journal, September/October 2006, Vol. 3, No. 5, pp. 30-35.
Ramapriyan, H., et al. 2013. “Managing Big Data.” In Earth Imaging Journal. http://eijournal.com/2013/managing-big-data
Wanchoo, L., Young-In Won, and Brian Krupp. 2015. “EOSDIS FY2014 Annual Metrics Report.” https://earthdata.nasa.gov/about/system-performance/eosdis-annual-metrics-reports.
Last Updated: Aug 9, 2017 at 12:24 PM EDT