NASA’s Earth Observing System Data and Information System (EOSDIS) currently archives more than 17.5 petabytes of Earth science data collected from satellites, airborne campaigns, and field observations. The more than 11,000 unique data products produced from these data represent one of the largest collections of Earth observing data in the world. Overseeing the ingesting, processing, archiving, and distributing of these products is one of the many responsibilities of Kevin Murphy, the Program Executive for the Earth Science Data Systems (ESDS) Program within NASA’s Earth Science Division. The fact that all of these data are delivered under a free and open data policy entails a number of unique challenges for Mr. Murphy and NASA, along with the prospect of exciting new products being available in the not-too-distant future.
You are the 2016 recipient of the American Geophysical Union’s (AGU) Charles S. Falkenberg Award, which recognizes “an early- to middle-career scientist who has contributed to the quality of life, economic opportunities and stewardship of the planet through the use of Earth science information and to the public awareness of the importance of understanding our planet.” What does this award mean to you in your work developing NASA’s EOSDIS Earth science data collection?
I originally wanted to become a geographer to work on trees and look at land cover and understand how the environment works. One of the things I saw that needed to be done is better access to environmental information so people can be more informed about their decisions. I’m just really honored that the path that led me to this award was able to help people better understand their environment and make better decisions about their environment. This is one of the things I think you see throughout NASA and this whole community—they are really interested in looking at the whole context of decisions and making this information available so people can make informed decisions. Sometimes we don’t have much time to make these important decisions about the environment.
What is the overall significance of NASA’s Earth science data?
Climate signals and long-term global processes happen over decades. You need these long-term data to tease the inter-annual variability from the trend. Earth systems science is an integrated science. You want to be able to go and look at multiple measurements to get a global picture of what’s going on at a variety of spatial and temporal resolutions. Through Earth systems science, we try to understand how the Earth works and how we can understand the trends in these processes. This is not just learning for the sake of learning. We have to live here. We have to understand how hurricanes form, we have to understand how crops are grown, we have to understand how much fresh water there is and how clean this water is, we have to understand aerosols and how these impact human health. These are all things we can do.
How do you see NASA accomplishing this?
The first thing we have to do is to maintain stewardship [of these data]. This means more than just putting something in a box and knowing where it is. You have to make the data available to people as technology changes. When NASA’s Earth Observing System first got started, you would order a data tape from somebody over the phone and they would mail the physical data tape to you. Today we’ve moved well beyond this. All our data are on spinning disks and you can go to an internet address and download these data. We have to maintain sufficient investment to keep the technology current. This is a continuous evolution process. Also, NASA is not the only organization, public or private, that collects these data. So we have to have a lot of relationships with international communities, other government agencies, and with the private sector.
A key component of NASA data is that they are free and open to the public. What are the challenges of this policy?
There are a variety of challenges. Not only are our data free and open, they also are provided as soon as feasible after the launch of a mission or the completion of a ground survey or airborne campaign. This means that our data also have to have a low latency to really make this policy work. We have millions of [data] users, and that’s a lot of products that go out every year. It’s not only to U.S. users; these products go out all over the world. There’s no period of exclusive access for anyone. We have to be able to maintain our systems in a fashion that allows the users to get these data pretty easily. We take very seriously the need to engineer our products and systems so they are robust. They have to have a lot of uptime; they can’t go down.
Also, the algorithms and the software that go into creating [different data products] have to be free and open, or else you don’t know how you got from one [data product] to the next. The calibration information has to be free and open. This becomes very difficult when you’re working across international communities that have different legal frameworks or when you’re working with commercial companies that want to protect their investments.
These data are produced, archived, and distributed by NASA's Distributed Active Archive Centers (DAACs). Does this create any additional challenges?
The good news is that we have a great set of people doing this work at the DAACs. This makes it possible for us to support an undertaking of this size.
Certainly a challenge is that each discipline has its unique requirements. When you want to bring together products from different disciplines to address an integrated science question, some of the formats, projections, or technical aspects of data that one community wants may not meet the needs of another community.
Additionally, each DAAC has a User Working Group comprising scientists in the discipline supported by that DAAC. These working groups serve as a bridge to the scientists in that community. We have to make sure that we balance the requirements, needs, and priorities that the working groups give to the DAACs against our programmatic requirements of stewardship and distribution.
This is all on top of maintaining a technology footing that’s secure, managing things efficiently and effectively so we don’t have duplication of activities, and supporting the user requirements so our data users can do their fundamental job, which is science.
You spoke earlier about international collaborations. Tell me about some of these and how they will affect data products our users might see in the near future.
There are a lot of new international data products coming. In fact, once we bring on some of the international partners, we may almost double our ingest rate from the original [Earth Observing System] missions. We have a bilateral agreement with the European Space Agency (ESA), and we also have a bilateral agreement with the European Commission for Sentinel satellite mission products to be distributed from NASA.
Not only do we get their data, but they get ours; it’s a two-way street. We’ve been providing free and open data for a long time, but we have increasing interoperability among our metadata catalogs. If you search from a portal in Europe, you can find NASA data and vice versa here in the U.S. We’ve supported the repatriation of some of our data products, such as from the Alaska Satellite Facility Distributed Active Archive Center (ASF DAAC) to ESA so they can look at glacier calving events from 1979 or 1980 to the present using synthetic aperture radar (SAR) data.
We work very closely with the Committee on Earth Observation Satellites (CEOS). This effort is really aimed at merging the capabilities of space agencies that are interested in Earth and have assets to provide. There’s interoperability among the metadata catalogs of about 26 different space agencies, so we can look across the variety of different areas and capabilities of these different countries.
We also work with the Group on Earth Observations (GEO) in not only advocating for free and open data, but also looking at infrastructures so our products and other nations’ products can be combined: not only space observations, but also ground-based observations, economic observations, maps, and other observations. You not only need the environmental context, but also the socioeconomic context to provide the full picture of how humans are utilizing the Earth and how the environment influences their usage patterns.
Where do you see NASA Earth science headed?
I think we’re making a lot of effort to see how we can enhance our products so a broader base of users can use our data products in a scientifically valid way. The other thing that I hope to see is the merging of observational data with model output in a more cohesive manner, making it so that the forecast we have and the hindcast we have can be compared with the observations that we take. This is a little difficult since these are totally different things; one is a model and the other is a measurement. It would be nice to see if these two streams of information can be used together.
I think that at some point in the future, I’d like to see environmental information as ingrained in people’s decision making as looking at what today’s weather is going to be, something that you almost take for granted as part of your decision-making process. Maybe that’s not tomorrow, but you can see this happening as our capabilities improve over the next 10 to 15 years.