An Open Source Labeler for Machine Learning

The ImageLabeler provides a catalog of candidate events for scientific investigation and gathers the training data required to train machine learning algorithms to automatically detect specific events.

IMPACT has released the open source, web-based tool ImageLabeler that allows users to create tagged images for use in training image-based machine learning (ML) models for Earth science phenomena. Training data for Earth science is as scarce as it is essential. One of the most challenging parts of scientific research is the collection of event cases for both detailed scientific investigation and climatological trend analysis. The ImageLabeler is designed to serve two purposes: 1) to provide a catalog of candidate events for scientific investigation, and 2) to gather, in a central location, the training data required to train machine learning algorithms to automatically detect specific events. Both purposes help to reduce the time it takes for researchers to identify candidate events and data suitable for scientific research.

The ImageLabeler supports GeoTIFFs and shapefiles along with the Web Map Service Interface Standard (WMS) and allows researchers to draw bounding boxes over images. An administrator can upload images and create teams whose members are responsible for tagging the images for the presence or absence of a phenomenon. Sharing the work among multiple members speeds up the time-consuming activity of image labeling.
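To make the idea of a tagged image concrete, the following is a minimal sketch of what a single label record could look like: a presence/absence tag plus zero or more bounding boxes. It is an illustration only and does not reflect ImageLabeler's actual data model or export format; the class names, field names, and file name are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class BoundingBox:
    # Pixel coordinates, with the origin at the top-left corner of the image.
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def area(self) -> int:
        """Box area in pixels; degenerate boxes count as zero."""
        return max(0, self.x_max - self.x_min) * max(0, self.y_max - self.y_min)


@dataclass
class ImageLabel:
    # All fields are assumptions made for illustration.
    image_name: str   # hypothetical file name of the uploaded image
    phenomenon: str   # event being tagged, e.g. "tropical cyclone"
    present: bool     # presence/absence tag
    labeler: str      # team member who tagged the image
    boxes: list = field(default_factory=list)  # BoundingBox entries, if any

    def to_json(self) -> str:
        """Serialize the record, including nested boxes, to a JSON string."""
        return json.dumps(asdict(self), sort_keys=True)


label = ImageLabel(
    image_name="scene_001.tif",
    phenomenon="tropical cyclone",
    present=True,
    labeler="member_a",
    boxes=[BoundingBox(10, 20, 110, 220)],
)
record = json.loads(label.to_json())
```

Keeping the presence/absence tag separate from the boxes means an image tagged as "absent" is still a valid record with an empty box list, which is useful as a negative training example.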

The ImageLabeler website

Dr. Brian Freitag expands on the utility of ImageLabeler:

"The beauty of the ImageLabeler is that it is relevant to a broad range of research topics. As an atmospheric scientist, my research interests include a number of events tied to clouds and cloud formations that can be detected in satellite imagery (thunderstorms, tropical cyclones, extratropical cyclones, etc.). The ImageLabeler not only provides an opportunity to gain a better understanding of spatiotemporal characteristics of these events, but also to more deeply understand the physical processes that drive these events."

ImageLabeler provides researchers with existing training sets and allows for the creation of new ones. Though it was initially built for use in Earth science contexts, the images loaded into ImageLabeler don’t need to originate from Earth science or be intended for Earth science applications. Researchers in any field can modify and deploy ImageLabeler to create and manage labeled image datasets for their own ML models.

Documentation for ImageLabeler is available on GitHub.

More information about IMPACT can be found at NASA Earthdata and the IMPACT project website.
