NASA embraces open science. NASA’s Interagency Implementation and Advanced Concepts Team (IMPACT) works to enable open data for NASA tools such as Worldview, which gives users access to over 450 terabytes of satellite imagery. Open data is critical to research. Before embarking on a scientific study of a particular phenomenon, such as wildfires, scientists need to collect numerous examples of it. Locating examples worldwide requires searching satellite imagery that covers 197 million square miles and spans more than 20 years. Such an effort can produce a valuable trove of data, but searching the data manually is cumbersome and laborious. Making large amounts of data more discoverable and usable for specific parameter extraction is a hard problem. A question such as “Can we use new techniques, such as self-supervised learning, to tackle our data discovery problem?” contains a number of hidden questions:
• Can we find a needle in a haystack?
• Can we teach a machine to search fine-grained data without labels?
• Can we get artificial intelligence (AI) to present examples to a human when it gets confused?
• Can we scale up the search from gigabytes to terabytes to petabytes?
• Can we create tools that make it simple to ingest the data?
• Can we learn to represent rare events?
• Can we teach AI to focus on the interesting parts?
• Can we search several years of data covering the entire planet in under a second?