Reflections on Leading NASA's IMPACT Project

The founding manager of NASA's Interagency Implementation and Advanced Concepts Team (IMPACT) looks back on five years of project accomplishments with the Earth Science Data Systems Program.

It has been almost six years since NASA's Interagency Implementation and Advanced Concepts Team (IMPACT) was chartered as a project. It has evolved from a small initiative under NASA's Earth Science Data Systems (ESDS) Program into an ESDS element known for its innovation and efficiency. As I step away from my leadership role at IMPACT, I want to share some of the significant contributions made by IMPACT to the ESDS Program and the agency.

IMPACT Beginnings

IMPACT started as an innovation hub with a focus on new technologies while improving existing processes in the data lifecycle.

As I began putting together the IMPACT project team, I envisioned IMPACT tackling problems that fell between basic research and pure applied work with a goal of improving existing processes. I also put significant thought into selecting the right people for the team, defining core values, and creating an ideal work environment. The need for speed was crucial if we wanted to stay current with technological changes, even if it meant failing quickly. I instilled a mindset that resisted the urge for uniformity, recognizing that hard challenges require different approaches.

IMPACT has consistently proven its value. I want to highlight some of our specific individual successes and broader accomplishments over the last five years.

IMPACT Highlights and Accomplishments

Improving Metadata

The Analysis and Review for the Common Metadata Repository (ARC) project emerged as a self-reflective endeavor within the ESDS Program. While leading the Climate Data Initiative (CDI), NASA found itself in a paradoxical situation: advising other federal agencies on improving their metadata and stewardship practices while working to improve its own Common Metadata Repository (CMR). ARC reviewed 76% of CMR metadata collection records and one random sample of file-level metadata.

But ARC is much more than just a metadata checking operation. The team developed a systematic metadata quality assessment methodology and Python tools to automate these checks. They also created an automated metadata quality assessment tool designed specifically for data stewards, streamlining the process of maintaining high-quality metadata.

Supporting International Collaboration

The Multi-Mission Algorithm and Analysis Platform (MAAP) represents a groundbreaking collaboration between NASA and ESA (European Space Agency). Designed as a cloud-based science computing platform, MAAP offers a single point of access to biomass data, tools, and resources. IMPACT has been at the forefront of this endeavor, spearheading the data systems component of NASA's platform.

MAAP is an empowering infrastructure that has catalyzed a shift within the research community towards collaborative science computing in the cloud. The platform has already yielded tangible results. It has empowered the University of Maryland's science team to generate a new ABoVE Boreal ICESat-2 Aboveground Biomass Density data product. Additionally, MAAP has showcased cloud-optimized point clouds for the ICESat-2 Canopy Height product using Entwine Point Tiles. It also hosts a dashboard that supports the Committee on Earth Observation Satellites (CEOS) Biomass Harmonization effort, featuring five biomass data products for visualization at conferences and working sessions.

One of MAAP's most utilized features is the Global Ecosystem Dynamics Investigation (GEDI) Subsetter, which has become an indispensable tool for the scientific community. Furthermore, the platform has demonstrated the feasibility of running MAAP algorithms on NASA's Ames High-End Computing Platform. MAAP is a testament to IMPACT's ability to foster international collaboration while driving scientific innovation.

Supercharging Scientific Discovery

The Science Discovery Engine (SDE) is a significant step towards enabling open science at NASA. Developed under the umbrella of IMPACT at ESDS, the SDE serves as a one-stop portal for discovering complex scientific data and information across NASA's Science Mission Directorate (SMD).

The journey to create the SDE faced many challenges, the most formidable of which was integrating diverse scientific content into a unified search capability—on a tight schedule. Traditional aggregation methods were quickly ruled out due to time constraints, complexity, and limited resources. The solution? The SDE team extensively researched cutting-edge search technologies and chose an insight engine to develop an SMD-wide search capability rapidly. The result was the beta release of the SDE in December 2022 with over 84,000 metadata records and 700,000 documents.

The SDE faced another challenge when a leadership change occurred with the agency's Chief Information Office (CIO) Enterprise Data Platform (EDP) team. Just months before the SDE's scheduled launch, the new EDP leadership recommended migrating the SDE to a new environment. The IMPACT SDE team collaborated closely with the EDP and the Mission Cloud Platform (MCP) teams to ensure a smooth transition. 

This rapid pivot exemplifies IMPACT's ethos of adapting and fostering successful cross-organizational partnerships to meet mission-critical needs. The SDE's beta launch is a testament to the continuous collaboration and active engagement among SMD, EDP, and MCP, and marks another milestone in IMPACT's journey of innovation and excellence.

Cataloging Airborne and Field Campaigns

The Airborne Data Management Group (ADMG) at IMPACT was created to serve the diverse needs of scientists, modelers, and interdisciplinary researchers in the airborne and field campaign community. 

One of ADMG's flagship initiatives is the Catalog of Archived Suborbital Earth Science Investigations (CASEI). Since most airborne and field data are collected during field investigations or campaigns, CASEI content is organized around these events. Working with NASA's Earth Science Data and Information System (ESDIS) Project, the ADMG team meticulously constructed a detailed metadata catalog to streamline data discovery, expedite access, and facilitate appropriate reuse of these observations.

But ADMG's work isn't just about the present; it's also a rescue mission for the past. In cataloging over a half-century of airborne and field activities, ADMG found that a significant portion of pre-digital era data had been lost to time. To address this, ADMG spearheaded data rescue activities, which involve converting physical records to digital formats, navigating federal record retention requirements, and preserving invaluable institutional memory.

Beyond these specific projects, ADMG has been instrumental in elevating the overall standards of data stewardship for the airborne and field data community and has introduced organizational and procedural improvements that have streamlined data archival, discovery, access, and use.

Assessing Satellite Needs

The Satellite Needs Working Group (SNWG) biennial assessment is a critical exercise to evaluate the satellite data needs of federal civilian agencies. While IMPACT's initial involvement was limited to detailed analysis of the surveys, the 2020 assessment cycle marked a turning point. With an unexpected 50% surge in survey submissions, IMPACT ramped up its participation, developing tools in real time and offering multifaceted support to NASA Headquarters and other participants. After the assessment, IMPACT led a lessons-learned retrospective and proposed a revamped, scalable process for the 2022 assessment cycle. This new approach eased the workload at NASA Headquarters and fostered greater collaboration and transparency across NASA, NOAA, and the USGS.

Recognizing the need for streamlined workflow management to efficiently complete the SNWG assessment, the IMPACT team identified Asana, an off-the-shelf technology, as the ideal tool for the job. Despite initial hurdles—Asana was on NASA's unapproved software list—IMPACT navigated the complexities to secure its use. Customized with automation scripts, Asana became an invaluable asset in the 2022 assessment cycle. IMPACT also addressed the need for a collaborative report-writing tool. Building on IMPACT's previous work with the Algorithm Publication Tool (APT), the Report Generation Tool (RGT) was developed and integrated with Asana. This tool streamlined the creation of 115 reports, significantly reducing manual effort through effective boilerplate copy management.

Lastly, IMPACT introduced automation into the thematic area assignment process. Leveraging machine learning algorithms, IMPACT provided top thematic area suggestions. IMPACT's heightened involvement in the SNWG assessment has optimized the process and showcased the team's ability to handle pressure and increased workloads.

Managing Massive Data Production in the Cloud

The Harmonized Landsat and Sentinel-2 (HLS) project is a significant effort in cloud-based data production and management. The project faced the challenge of scaling its compute infrastructure to produce around 10 to 20 thousand granules per day for ongoing data production and up to 100 thousand granules per day for historical data. Since July 2021, the system has added nearly 25 million granules and almost 400 million band-separated data files. The data volume for these granules is about 4 petabytes (PB), and is growing at about 1 PB per year.

When the global HLS production project started in 2019, there was no end-to-end cloud-based data production system capable of supporting HLS production at the required scale. IMPACT took the algorithm implementation code provided by the HLS science team and optimized it for cloud implementation. The IMPACT team also developed a framework for executing the algorithm components with detailed logging. Several interfaces and components were designed to acquire the input and ancillary data needed for production and to stage the data for consumption and ingestion by NASA's Land Processes Distributed Active Archive Center (LP DAAC).

Access to archived Sentinel-2 data was a significant challenge due to restrictions imposed by ESA. In collaboration with ESA, IMPACT developed data transfer software capable of transferring nearly 6 PB of Sentinel-2 data in three to six months. The software was designed to be cloud-agnostic and was successfully deployed and tested in both the Amazon Web Services (AWS) and ESA cloud environments. It transferred 5.6 PB of data in just one month, far exceeding the original target. The HLS project successfully navigated data scaling, software optimization, and large-scale data transfer challenges, setting new standards in each area. 

Making SmallSat Data Accessible

The Commercial SmallSat Data Acquisition (CSDA) program is a pioneering initiative for acquiring and managing Earth observation data from commercial small satellites (SmallSats). Established in 2017 as a pilot program, CSDA aimed to assess the feasibility of using commercial satellite data for NASA's research and applications.

IMPACT's CSDA Data Management Team (DMT) had to quickly organize to ensure that data purchased from vendors like Planet Labs, Inc. were fully utilized. The team developed a cloud-based mirroring solution capable of acquiring large-scale data. It successfully transferred approximately 2.3 PB of data to NASA's managed cloud environment in less than two months.

As of June 2023, the CSDA DMT has moved 11.1 PB of Maxar data from the NASA Center for Climate Simulation (NCCS) to AWS, received 176.1 terabytes (TB) directly from partner organizations, acquired 4.7 TB of new data for NASA researchers, and distributed 18.3 TB of Maxar data to users. Given the varying access and use restrictions of the disparate data managed by CSDA, the DMT developed a centralized tool for user access requests, data discovery, and distribution.

The DMT has also developed a repeatable process for onboarding new vendors into the program, known as "onramp." As of August 2024, this process supports six vendors/data providers with over 3.5 PB of data mirrored to the CSDA archive, with an additional four vendors in the process of onboarding. The CSDA Program DMT has archived 2 PB of data in the Earthdata Cloud, making these data discoverable through CMR and Earthdata Search. CSDA is the first instance of a non-DAAC becoming a sustained, integrated entity within NASA's ESDIS Project.

Providing a Dashboard into Environmental Changes

The Earth Observing (EO) Dashboard project is a remarkable collaboration between NASA, ESA, and JAXA (Japan Aerospace Exploration Agency). The international team, consisting of Earth scientists, data engineers, communication personnel, and software developers, set out on an ambitious two-month timeline to create a tool for understanding the changes on Earth resulting from the COVID-19 pandemic.

Given the urgency of the pandemic and the rapidly changing global situation, the team worked tirelessly to collate data, create stories, and develop interactive tools around five key themes: economic activities, air quality, greenhouse gas emissions, agricultural activity, and water quality. The team successfully launched the COVID-19 Dashboard in June 2020.

Encouraged by the success of the initial dashboard, the team enriched the tool to create the comprehensive Earth Observing (EO) Dashboard. This included adding new indicators and developing more stories around environmental changes. The EO Dashboard received the International Astronautical Federation (IAF) Special Award on Space for Climate Protection at the Global Space Conference on Climate Change. The success of the EO Dashboard led to the initiation of NASA's Visualization, Exploration, and Data Analysis (VEDA) project, which extends the capability to allow full analytics at scale on the cloud.

Innovation Now and in the Future

IMPACT's contributions to NASA's ESDS Program go beyond the individual projects it manages. IMPACT has provided 15 independent cost estimates for various missions, which have enabled NASA Headquarters to make informed decisions about the viability of proposed budgets, ensuring that resources are allocated efficiently.

IMPACT has made its mark in the research sphere with more than 56 peer-reviewed publications. Additionally, IMPACT staff have developed five major open-source software capabilities, including HLS, MAAP, and Image Labeler, which are being used by researchers worldwide.

However, IMPACT's influence extends beyond research and development. The project has brokered non-reimbursable Space Act Agreements with five leading commercial cloud companies—AWS, Microsoft, IBM, Google, and NVIDIA. These partnerships aim to advance open science by making high-value NASA datasets accessible. The AWS collaboration has been particularly fruitful, resulting in storing about 2 PB of diverse NASA science data and providing computing resources for educational events.

A groundbreaking initiative began in 2023 when IMPACT partnered with IBM Research to develop an artificial intelligence (AI) model for HLS data capable of many downstream applications. Within six months, the collaboration yielded a geospatial foundation model with potential disaster management applications, such as flood detection, burn scar analysis, and environmental monitoring.

IMPACT has also played pivotal leadership roles in professional societies like the Institute of Electrical and Electronics Engineers Geoscience and Remote Sensing Society (IEEE GRSS) and the American Geophysical Union (AGU). IMPACT members have organized educational initiatives to engage the next generation of scientists, including summer schools, Space Apps challenges, and webinars. In recognition of its multifaceted contributions, IMPACT members have received eight individual agency awards, including the NASA Silver Achievement Award and Exceptional Service Medal, and eight NASA group awards.

Reflecting on these productive years, I take immense pride in the IMPACT project's numerous contributions to the ESDS Program. The time is right for a new voice and new leadership for IMPACT. As I pass the baton, I look forward to seeing the project move in new directions and continue to support the agency in its science mission. I will always be proud of the project, the team, and the core values they embody.
