Principal Investigator (PI): Christopher Lynnes, NASA's Goddard Space Flight Center
We propose to establish a federation of cooperating data centers, each maintaining its own Geospatial Interactive Online Visualization and Analysis Interface (Giovanni) while sharing datasets with one another to support data inter-comparison. Giovanni has long been a popular tool among remote sensing data users of NASA's Goddard Earth Sciences Data and Information Services Center (GES DISC). It supports about two dozen visualization and analysis services that enable interactive exploration of the data, a key early step in data analysis.
Giovanni offers ease of use, a low learning curve, and visualizations that do not require downloading large volumes of data to the user's location. The trend toward more multi-sensor analysis has further elevated the importance of Giovanni's comparison services (scatterplots, correlation maps, difference maps, map overlays, time series overlays). However, one of Giovanni's limitations is the relative dearth of datasets from Distributed Active Archive Centers (DAACs) other than GES DISC. Providing Giovanni's capabilities to other data centers through federation could dramatically increase the number of datasets available for interactive data exploration and inter-comparison. This would be particularly beneficial to researchers studying data from multiple sensors and satellites.
Most data variables in Giovanni have come from datasets archived at GES DISC. However, requests for external datasets in Giovanni, one of the more common user requests, have led to the inclusion of several datasets from other DAACs (e.g., the Level-1 and Atmosphere Archive and Distribution System DAAC (LAADS DAAC) and the Atmospheric Science Data Center (ASDC)). Unfortunately, GES DISC support of external datasets is difficult to sustain over the long run for reasons of efficiency, funding and policy.
To ensure that multi-instrument, multi-mission data comparisons in Giovanni can be sustained in the long term, it is essential to enable other DAACs to employ Giovanni themselves for their own datasets and to share those datasets with one another through the Giovanni infrastructure. This Federated Giovanni would answer the Advancing Collaborative Connections for Earth System Science (ACCESS) call for "Tools and Technologies That Improve Users' Ability to Efficiently Discover, Find, Access, and Readily Use Multimission, Multiinstrument Earth Science Data" (sec. 1.2.1).
The recent evolution of Giovanni has led to a modular system based on common community standards, giving rise to the possibility of a Federated Giovanni. Federated Giovanni would allow each DAAC to configure and deploy its own Giovanni, while also allowing the various Giovanni deployments to incorporate datasets from one another for data inter-comparisons. To meet the different goals and needs of the DAACs, Giovanni federation would be enabled at three support tiers, with increasing levels of investment and autonomy in offering Giovanni services:
- Tier 1 would have GES DISC host a Giovanni portal for a DAAC looking to support only a small amount of data; that DAAC, however, would configure, test and maintain the dataset and portal configuration.
- Tier 2 would allow a DAAC to easily deploy a Giovanni Virtual Machine on the hardware of its choice.
- Tier 3 would allow a DAAC to customize or otherwise enhance the source code, and contribute the enhancements back to the Giovanni source baseline.
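The federation model and its three tiers can be pictured with a minimal Python sketch. It is illustrative only and does not reflect the Giovanni code base: the names `Tier`, `Deployment` and `federate`, the dataset labels, and the assignment of a tier to each DAAC are all hypothetical, and dataset sharing is reduced to a simple set union.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    """Support tiers as described above (enum names are hypothetical)."""
    HOSTED_PORTAL = 1    # GES DISC hosts the portal; the DAAC maintains its configuration
    VIRTUAL_MACHINE = 2  # the DAAC deploys a Giovanni Virtual Machine on its own hardware
    SOURCE_CODE = 3      # the DAAC enhances the source and contributes changes back


@dataclass
class Deployment:
    """One Giovanni deployment in the federation (hypothetical model)."""
    daac: str
    tier: Tier
    local_datasets: set = field(default_factory=set)
    peers: list = field(default_factory=list)

    def available_datasets(self):
        """Return local datasets plus those shared by federated peers."""
        available = set(self.local_datasets)
        for peer in self.peers:
            available |= peer.local_datasets
        return available


def federate(deployments):
    """Make every deployment a peer of every other deployment."""
    for deployment in deployments:
        deployment.peers = [p for p in deployments if p is not deployment]


# Hypothetical tier assignments and dataset labels, for illustration only.
ges_disc = Deployment("GES DISC", Tier.SOURCE_CODE, {"ges_disc_dataset"})
laads = Deployment("LAADS DAAC", Tier.VIRTUAL_MACHINE, {"laads_dataset"})
asdc = Deployment("ASDC", Tier.HOSTED_PORTAL, {"asdc_dataset"})

federate([ges_disc, laads, asdc])
print(sorted(asdc.available_datasets()))
```

In this sketch each deployment remains responsible only for its own `local_datasets`, while inter-comparison can draw on the union of datasets across the federation.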
In the Federated Giovanni, long-term sustainability is attained by distributing the ability and responsibility for supporting datasets in Giovanni to the designated DAACs. These DAACs are thus able to decide how much effort to expend on supporting a given dataset in Giovanni (vs. other services, or other datasets). They can also support the dataset more efficiently, because they have better knowledge of the domain, the dataset and the user community for that data.