Scientists wish handling science data were as easy as surfing the Internet, so they could focus on algorithms and analyses, rather than on the bookkeeping and format conversions that take so much of their time. Getting data is a chore. Not only must scientists sift through terabytes of data to obtain a few essential bytes, but after sifting they must often convert these data to a format their software can read.
But what if you could sift through the huge amounts of Earth science data available and pull whatever small pieces you need directly into your computer program for processing, as easily as you browse the web? What if all that work were handled behind the scenes, as adroitly as your web browser fetches and presents pages without asking you to orchestrate the details, even though those details (text with its myriad headlines, fonts, tables, and columns; imagery in different file formats; sounds; animations; and interactive virtual reality) are seriously complex? Would we be in Kansas anymore if science data files, no matter the complexity and no matter the provenance, were each handled swiftly and invisibly by a program in your computer, all with no help from you?
"Instead, data are typically written in different formats and subsetted according to data providers' whims," said Peter Cornillon, oceanographer at the University of Rhode Island (URI). "In one instance, I downloaded 300 megabytes of data to get the 13 megabytes I needed. Then, I had to acquire a program capable of interpreting the data, after which I had to transfer the data to my own software application."
In an effort to overcome such problems and enhance his own research collaboration capabilities, Cornillon, James Gallagher of URI, and Glenn Flierl, professor of oceanography at MIT, developed the Distributed Oceanographic Data System (DODS) to provide access to each other's oceanographic data. DODS permitted them effortless data access and interpretation despite differences in software applications. It also facilitated extraction of data subsets, such as air temperature and wind speed on a particular day.
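To give a feel for what extracting a subset can look like in practice, here is a minimal Python sketch that builds a request URL for one slice of one variable. It assumes the bracketed index-range constraint syntax commonly used by DODS-style (DAP2) servers, which the article itself does not spell out. The server address, the file name, the variable name `air_temp`, and the index ranges are all hypothetical, and the helper `subset_url` is an illustration rather than part of any DODS library. The sketch only assembles the URL and makes no network request.

```python
def subset_url(dataset_url, variable, index_ranges, response="ascii"):
    """Build a DAP2-style URL asking a server for a slice of one variable.

    index_ranges is a list of (start, stride, stop) tuples, one per
    dimension of the variable. All names here are illustrative assumptions.
    """
    # Each dimension becomes a bracketed [start:stride:stop] index range.
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in index_ranges
    )
    # The suffix (for example "ascii") selects the response type; the part
    # after "?" is the constraint that limits what the server sends back.
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


# Hypothetical request: one time step, and a small latitude/longitude window.
url = subset_url(
    "http://example.org/dods/ocean.nc",  # hypothetical server and file
    "air_temp",                          # hypothetical variable name
    [(0, 1, 0), (10, 1, 20), (30, 1, 40)],
)
print(url)
```

The point of the design is that the selection travels with the request, so only the requested slice needs to cross the network instead of the whole file.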
"We wanted to move data easily from one place on a network to another without having to worry about what format the data were in," said Cornillon. "DODS allows you to open your software application, such as Matlab or IDL, and access the data you need using the World Wide Web's client/server technology."