A Self-Supervised Learning (SSL) Framework for Discovery

Enabling a machine to find phenomena of interest in vast quantities of satellite imagery begins with training the machine on what it’s seeking.

(Editor’s note: NASA’s Interagency Implementation and Advanced Concepts Team [IMPACT] is a component of NASA’s Earth Science Data Systems [ESDS] Program, and works to further the ESDS goal of overseeing the lifecycle of Earth science data to maximize the scientific return of NASA’s missions and experiments for scientists, decision makers, and society.)

Machine learning can achieve impressive results when trained on human-labeled datasets. However, hard problems remain in domains where labeled data are sparse, such as Earth science. Self-supervised learning (SSL) is a method designed to address this challenge. Using techniques that range from representation clustering to comparisons of randomly transformed images, SSL for computer vision is a growing area of machine learning with a simple goal: learn meaningful vector representations of images, without human labels, such that similar images have similar representations.

Image
SSL demonstrated with an image of a hurricane (left) and an image of a desert coastal area (right). The SimCLR architecture trains a Convolutional Neural Network (CNN) to represent an image similarly across various photo augmentations (such as rotation, cropping, or changes in brightness) while simultaneously repelling the embeddings of other transformed images, yielding useful representations without any labeled data. MLP denotes a Multi-Layer Perceptron neural network. SpaceML SSL team image based on an image from the SimCLR blog.
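At its core, the contrastive objective illustrated above pulls two augmented views of the same image together in embedding space while pushing apart the embeddings of other images. The snippet below is a minimal, illustrative PyTorch version of that idea (the NT-Xent loss used by SimCLR); the function name and temperature value are placeholders, and this is not the code from the SSL package itself.

import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    # z1, z2: [batch, dim] embeddings of two augmented views of the same images.
    # Each pair (z1[i], z2[i]) is pulled together; all other pairs are pushed apart.
    batch_size = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # [2B, dim], unit length
    sim = torch.matmul(z, z.T) / temperature              # scaled cosine similarities
    # Mask out self-similarity so an embedding is never treated as its own positive.
    mask = torch.eye(2 * batch_size, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float('-inf'))
    # The positive for row i is the other augmented view of the same image.
    targets = torch.cat([torch.arange(batch_size, 2 * batch_size),
                         torch.arange(0, batch_size)]).to(z.device)
    return F.cross_entropy(sim, targets)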


In particular, remote sensing is characterized by a huge number of images and, depending on the data survey, a reasonable amount of metadata contextualizing each image, such as location, time of day, temperature, and wind. However, when a phenomenon of interest cannot be found through a metadata search alone, research teams often spend hundreds of hours conducting visual inspections, combing through imagery using applications such as NASA Worldview, which enables interactive exploration of all 197 million square miles of Earth's surface with more than 20 years of daily global satellite imagery.


This is the fundamental challenge addressed by a collaboration between IMPACT and the SpaceML initiative. This collaboration produced the Worldview image search pipeline. A key component of the pipeline is the self-supervised learner, which employs SSL to build the model store. The SSL model sits on top of an unlabeled pool of data and circumvents the random search process. Leveraging the vector representations generated by the SSL, researchers can provide a single reference image and search for similar images, thus enabling rapid curation of datasets of interest from massive unlabeled datasets.
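Once the encoder is trained, the search step itself is conceptually simple: embed the reference image, embed the unlabeled pool, and rank the pool by similarity. The sketch below illustrates that idea with cosine similarity in NumPy; the function and variable names are placeholders, not the pipeline's actual API.

import numpy as np

def find_similar(reference_embedding, corpus_embeddings, top_k=10):
    # reference_embedding: [dim] vector produced by the trained SSL encoder.
    # corpus_embeddings:   [n_images, dim] embeddings for the unlabeled image pool.
    ref = reference_embedding / np.linalg.norm(reference_embedding)
    corpus = corpus_embeddings / np.linalg.norm(corpus_embeddings, axis=1, keepdims=True)
    scores = corpus @ ref                       # cosine similarity to the reference
    return np.argsort(scores)[::-1][:top_k]     # indices of the most similar images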

The impetus behind this collaboration is to streamline and increase the efficiency of Earth science research. Rudy Venguswamy, developer of the SSL package and winner of the IMPACT team's Exceptional Contribution Award, explains:

Machine learning has the potential to radically transform how we find out about things happening in our universe to, proverbially, more quickly find needles in our various haystacks. When I started building the SSL as a package, I wanted to build something for scientists in diverse fields, not just machine learning experts.


The SSL tool was released as an open-source package built on PyTorch Lightning. The Worldview image search pipeline uses compatible Graphics Processing Unit (GPU)-based transforms from the NVIDIA DALI package for augmentation and can be trained across multiple GPUs, leading to a 5-to-10 times increase in the speed of self-supervised training. New transforms in each epoch are critical to model learning, so improvements to the speed of transforms have a direct impact on training speed.
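For illustration, the kind of random augmentations involved can be sketched with torchvision transforms. This CPU-based stand-in is only an assumption for readability, not the GPU-based NVIDIA DALI pipeline the package actually uses, and the specific augmentations and parameters are illustrative.

from torchvision import transforms

# Illustrative SimCLR-style augmentations (crop, flip, color jitter, grayscale);
# fresh random transforms are drawn on every call.
simclr_augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
])

def two_views(pil_image):
    # Produce two independently augmented views of the same image, as the
    # contrastive objective requires; new random views are generated each epoch.
    return simclr_augment(pil_image), simclr_augment(pil_image)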


Image
Mosaic of Worldview imagery tiles: a two-dimensional t-SNE visualization of how the Self-Supervised Learner (SSL) tool represents satellite imagery acquired from NASA's Global Imagery Browse Services (GIBS) and viewable in NASA Worldview. Notice how the machine clusters similar image tiles, such as tiles with ocean, cloud, and land patterns. SpaceML SSL team image.


The package is built with customizability in mind. Currently, the SimCLR and SimSiam SSL architectures are supported, and the package allows users to supply custom encoder architectures and to change model parameters through optional arguments specific to each model. For instance, researchers can specify their own pre-trained encoder or use one of the provided defaults pre-trained on ImageNet data.
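As a rough illustration of what supplying a custom pre-trained encoder looks like, the sketch below builds a ResNet-18 backbone with a SimCLR-style projection head in plain PyTorch. The layer sizes are assumptions, and the actual package exposes encoders and parameters through its own arguments, which may differ.

import torch.nn as nn
from torchvision import models

# ImageNet-pre-trained ResNet-18 backbone (older torchvision versions use pretrained=True).
resnet = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
encoder = nn.Sequential(*list(resnet.children())[:-1])  # drop the classification head

# Small MLP projection head, as in SimCLR; 512 is ResNet-18's feature dimension.
projector = nn.Sequential(
    nn.Flatten(),
    nn.Linear(512, 256),
    nn.ReLU(inplace=True),
    nn.Linear(256, 128),
)
model = nn.Sequential(encoder, projector)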

As an example of the SSL tool's capabilities, the image above represents data from the Worldview website. The team trained SimCLR using the SSL package on a sample of approximately 50,000 images and plotted a t-Distributed Stochastic Neighbor Embedding (t-SNE) visualization, reducing the dimensionality of the embeddings to plot on a 2D plane. With no human labels for the imagery, the machine manages to cluster images intuitively.
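A visualization like this can be produced in a few lines once embeddings are in hand. The snippet below is a generic sketch using scikit-learn's t-SNE with random placeholder embeddings rather than the team's actual data.

import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# embeddings: [n_images, dim] array produced by the trained SSL encoder
# (random placeholder values here; in practice these come from the model).
embeddings = np.random.rand(1000, 128)

# Reduce the embeddings to two dimensions for plotting.
coords = TSNE(n_components=2, init="pca", random_state=0).fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], s=2)
plt.title("t-SNE of self-supervised image embeddings")
plt.show()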

As part of the larger Worldview imagery pipeline, the SSL helps streamline the research process as scientists study phenomena such as wildfires, oil spills, desertification, and the polar vortex.

Article originally published April 15, 2021, on the IMPACT Blog and reprinted with permission.


Additional Resources:

SpaceML
