Easier Access to NASA Earth Science Data in the Cloud

A NASA-funded effort created a way to streamline access to NASA cloud-based data for users working in the R programming language.

Accessing cloud-based NASA Earth science data is now easier for researchers working in the R programming language thanks to the creation of the NASA-funded earthdatalogin package in R.

This new login package improves access to cloud-native computing for users working in R. The package also makes it easier to use NASA Earth science data that have been migrated to the Earthdata Cloud, which is hosted by Amazon Web Services (AWS). Development of the earthdatalogin package in R was funded as part of a NASA grant provided through the Transform to Open Science Training (TOPST) solicitation.

The earthdatalogin package in R builds on a NASA-funded Python application programming interface (API) library called earthaccess, which was created for easier access to NASA Earth science data for users of the Python programming language. earthaccess is a community-driven library whose development is being led through NASA Openscapes. NASA Openscapes supports researchers using data distributed by NASA Distributed Active Archive Centers (DAACs) as these researchers migrate their workflows to the cloud.

“It quickly became apparent that the R community lacked appropriate tooling to streamline cloud-native use of NASA data, especially in an educational context where users were not already familiar with navigating the Earthdata ecosystem,” says Dr. Carl Boettiger, co-author of earthdatalogin and leader of the Boettiger Group in the Department of Environmental Science, Policy, and Management at the University of California, Berkeley. “Through interactions in the NASA TOPS community, we were introduced to the Openscapes community and the development team of the earthaccess Python package.”

Boettiger led the development of earthdatalogin with co-authors Luis López, Yuvi Panda, and Bri Lind. Additional collaboration and feedback came from the Openscapes community, particularly Dr. Eli Holmes at NOAA Fisheries. The earthdatalogin package in R provides public access to cloud-based NASA Earth science data from any networked device by using the NASA Earthdata Login API for authentication behind the scenes. This makes it straightforward to access a wide range of Earth observation data from any location, in a way that is compatible with the R packages widely used for cloud-native Earth observation data.

“Our approach to earthdatalogin seeks to use community-standards-driven mechanisms that are highly interoperable with existing tools, minimizing the learning curve for new and established users,” says Boettiger. “earthdatalogin operates primarily behind the scenes, interacting with lower-level system libraries, such as gdal, to enable authentication and virtual filesystem access whether a user is operating inside the AWS center hosting NASA Earthdata or working from anywhere else.”
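The mechanism Boettiger describes, authentication plus virtual filesystem access through GDAL, can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the earthdatalogin implementation. The helper names (vsicurl_path, gdal_auth_options), the file locations, and the example URL are hypothetical. The /vsicurl/ prefix and the GDAL_HTTP_* configuration options are standard GDAL features, but which of them earthdatalogin actually sets is an assumption here.

```python
import os


def vsicurl_path(url: str) -> str:
    """Prefix an HTTP(S) URL so GDAL reads it through its /vsicurl/ virtual filesystem."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("expected an http(s) URL")
    return "/vsicurl/" + url


def gdal_auth_options(netrc_file: str, cookie_file: str) -> dict:
    """Build GDAL configuration options for netrc- and cookie-based HTTP authentication.

    Assumption: this mirrors the general idea of letting GDAL handle the
    login redirect itself; it is not copied from earthdatalogin.
    """
    return {
        "GDAL_HTTP_NETRC": "YES",             # let GDAL's HTTP layer read credentials from a netrc file
        "GDAL_HTTP_NETRC_FILE": netrc_file,   # where that netrc file lives
        "GDAL_HTTP_COOKIEFILE": cookie_file,  # cookies to send with requests
        "GDAL_HTTP_COOKIEJAR": cookie_file,   # where to store cookies received after login
    }


# Hypothetical file locations and URL, used only to show the shape of the result.
options = gdal_auth_options(
    netrc_file=os.path.expanduser("~/.netrc"),
    cookie_file=os.path.expanduser("~/.urs_cookies"),
)
os.environ.update(options)  # GDAL-based tools started from this process inherit these settings

path = vsicurl_path("https://example.com/data/scene.tif")
print(path)
```

Once settings like these are in place, any GDAL-backed tool can treat the prefixed path like a local file and read only the byte ranges it needs, which is why the approach works the same way inside or outside the cloud region hosting the data.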

The earthdatalogin package in R can be accessed through the NASA Earthdata Cloud Cookbook. The Cookbook was created by NASA Openscapes mentors from across NASA DAACs to support scientific researchers using cloud-based NASA Earth science data. 

Boettiger notes that earthdatalogin works nicely with NASA Cloud-Optimized GeoTIFF (COG) products and most netCDF products. The team looks forward to extending support for netCDF and Zarr-formatted data, and exploring large vector and discrete global grid (h3, s2) products.
