A Self-Supervised Learning Framework for Discovery

IMPACT and the SpaceML initiative produced the Worldview image search pipeline, a key component of which is the self-supervised learner, a tool that applies self-supervised learning to build the model store.

Machine learning can achieve impressive results using human-labeled training datasets. However, hard problems arise in domains where labeled data is sparse, such as Earth science. Self-supervised learning (SSL) is a method designed to address this challenge. Using techniques that range from representation clustering to random transform comparisons, self-supervised learning for computer vision is a growing area of machine learning whose goal is simple: learn meaningful vector representations of images, without human labels on each image, such that similar images have similar vector representations.

In particular, remote sensing is characterized by huge volumes of imagery and, depending on the data survey, a reasonable amount of metadata contextualizing each image, such as location, time of day, temperature, and wind. However, when a phenomenon of interest cannot be found from a metadata search alone, research teams will often spend hundreds of hours conducting visual inspections, combing through archives such as NASA's Worldview, which covers all 197 million square miles of the Earth's surface every day, with imagery spanning 20 years.

This is the fundamental challenge addressed by the collaboration between IMPACT and the SpaceML initiative. This collaboration produced the Worldview image search pipeline. A key component of that pipeline is the self-supervised learner (SSL) which employs self-supervised learning to build the model store. The SSL model sits on top of an unlabeled pool of data and circumvents the random search process. Leveraging the vector representations generated by the SSL, researchers can provide a single reference image and search for similar images, thus enabling rapid curation of datasets of interest from massive unlabeled datasets.
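The reference-image search described above boils down to nearest-neighbor lookup in embedding space. A minimal NumPy sketch (the names and toy data are illustrative, not the pipeline's real code) shows the idea: rank every stored embedding by cosine similarity to the embedding of the reference image.

```python
import numpy as np

def find_similar(reference: np.ndarray, index: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k stored embeddings most similar to the
    reference embedding, ranked by cosine similarity."""
    ref = reference / np.linalg.norm(reference)
    idx = index / np.linalg.norm(index, axis=1, keepdims=True)
    sims = idx @ ref                   # cosine similarity to every stored image
    return np.argsort(-sims)[:k]      # best matches first

# Toy "model store": 5 four-dimensional embeddings.
rng = np.random.default_rng(0)
store = rng.normal(size=(5, 4))
query = store[2] + 0.01 * rng.normal(size=4)  # near-duplicate of image 2
print(find_similar(query, store, k=1))        # nearest neighbor is image 2
```

In the real pipeline the "store" holds embeddings produced by the trained SSL model, so a single reference image retrieves visually similar scenes from the unlabeled archive.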

Schematic of the SimCLR architecture

The SimCLR architecture trains an encoder (a CNN) to represent an image similarly across various augmentations (such as rotation, cropping, and changes in brightness) while simultaneously repelling the embeddings of other transformed images, producing meaningful representations without any labeled data.
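The attract-and-repel objective the caption describes can be sketched in NumPy. This is a simplified version of the NT-Xent (normalized temperature-scaled cross-entropy) loss SimCLR uses: for a batch of N images, the two augmented views of image i pull toward each other while the other 2N − 2 views push away.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """Simplified NT-Xent loss over two batches of augmented-view embeddings.

    z1[i] and z2[i] embed two augmentations of the same image (a positive
    pair); every other embedding in the batch acts as a negative.
    """
    z = np.concatenate([z1, z2])                      # shape (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-normalize
    sim = z @ z.T / temperature                       # pairwise cosine sims
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    n = len(z1)
    # Row i's positive is its other view: i + n (or i - n for the second half).
    pos = np.concatenate([np.arange(n) + n, np.arange(n)])
    # Cross-entropy: -log softmax probability assigned to the positive pair.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

rng = np.random.default_rng(1)
z1 = rng.normal(size=(8, 16))
aligned = nt_xent_loss(z1, z1 + 0.01 * rng.normal(size=(8, 16)))
shuffled = nt_xent_loss(z1, rng.normal(size=(8, 16)))
# Aligned views yield a lower loss than mismatched ones.
```

Minimizing this loss is what pulls embeddings of augmented views of the same image together, which is exactly the property the search pipeline later exploits.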

The impetus behind this collaboration is to streamline and increase the efficiency of Earth science research. Rudy Venguswamy, SSL developer and winner of the NASA IMPACT team's Exceptional Contribution Award, explains:

"Machine learning has the potential to radically transform how we find out about things happening in our universe to, proverbially, more quickly find needles in our various haystacks. When I started building the SSL as a package, I wanted to build something for scientists in diverse fields, not just machine learning experts."

The SSL tool was released as an open-source package built on PyTorch Lightning. The pipeline uses compatible GPU-based transforms from the NVIDIA DALI package for augmentation and can be trained across multiple GPUs, yielding a five- to ten-fold speedup in self-supervised training. Fresh transforms in each epoch are critical to model learning, so improvements to transform speed have a direct impact on training speed.
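Why fresh per-epoch transforms matter can be illustrated with a minimal NumPy sketch (illustration only; the actual pipeline runs its augmentations on the GPU via NVIDIA DALI): each epoch draws a new random crop and flip, so the model rarely sees the exact same view of an image twice.

```python
import numpy as np

def random_view(image, rng, crop=24):
    """One random augmented view: random crop plus optional horizontal flip."""
    h, w = image.shape
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    view = image[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        view = view[:, ::-1]          # horizontal flip
    return view

rng = np.random.default_rng(42)
image = rng.normal(size=(32, 32))     # stand-in for a satellite image tile
epoch1_view = random_view(image, rng) # a fresh view this epoch...
epoch2_view = random_view(image, rng) # ...and a different one the next
```

Because these random draws happen every epoch for every image, slow augmentation sits directly on the training critical path, which is why moving transforms to the GPU pays off.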

The package is built with customizability in mind. SimCLR and SimSiam are currently supported, and the package allows users to specify custom encoder architectures as well as tune model parameters in depth via optional arguments specific to each model. For instance, researchers can supply their own pretrained encoder or use one of the provided defaults pre-trained on ImageNet data.
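The package's exact interface isn't reproduced here, but the underlying design, accepting a user-supplied encoder and falling back to a pretrained default, can be sketched in plain Python. All names below are hypothetical illustrations of the pattern, not the package's real API.

```python
def default_encoder(image):
    """Stand-in for a pretrained default backbone (hypothetical)."""
    return [sum(image) / len(image)]   # trivial 1-dimensional "embedding"

def build_learner(encoder=None, **model_kwargs):
    """Return an embedding function, using a custom encoder if one is given.

    `model_kwargs` mirrors the idea of optional, model-specific arguments.
    """
    enc = encoder if encoder is not None else default_encoder

    def embed(image):
        return enc(image)

    return embed

# A researcher plugs in their own encoder in place of the default:
custom = build_learner(encoder=lambda img: [min(img), max(img)])
print(custom([0.1, 0.5, 0.9]))   # [0.1, 0.9]
```

The same dependency-injection pattern is what lets researchers swap in domain-specific backbones without modifying the training loop itself.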

A 2-dimensional t-SNE visualization of how the self-supervised learner represents satellite imagery from NASA GIBS

As an example of the SSL's capabilities, the diagram above represents data from the Worldview website. The team trained SimCLR using the SSL on a sample of approximately 50,000 images and plotted a t-Distributed Stochastic Neighbor Embedding (t-SNE) visualization, reducing the dimensionality of the embeddings to a 2D plane. With no labels, the model clusters images intuitively.
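A comparable visualization can be produced with scikit-learn's t-SNE implementation. This is a sketch on random stand-in vectors; the team's plot used roughly 50,000 learned SimCLR embeddings.

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for learned embeddings: 100 vectors in 64 dimensions.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 64))

# Reduce to 2-D for plotting; perplexity must stay below the sample count.
coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(embeddings)
print(coords.shape)   # (100, 2)
```

Each row of `coords` is one image's position on the 2D plane; with real SSL embeddings, visually similar images land near each other and form the clusters seen in the figure.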

As part of the larger pipeline, the SSL helps streamline the research process as scientists work to research phenomena such as wildfires, oil spills, desertification, and the polar vortex.

More information about IMPACT can be found at NASA Earthdata and the IMPACT project website.
