|
A key challenge facing climate scientists is the difficulty of "scaling up" their statistical analyses to cover time periods of years to decades. To compare model grids from one or more models (e.g., atmosphere models) against L2 or L3 retrievals (of temperature, water vapor, aerosol optical depth, etc.) from multiple EOS instruments, or simply to inter-compare the instruments, millions of EOS granules must be located and then staged onto disk before the analysis can run. Inevitably, the process must be repeated as the models or comparison algorithms are refined. Currently, the data must be "ordered" (staged onto disk at the DAACs) by a human filling out a web form; order sizes are usually limited to a week or two at a time; and the response to the user comes via email. The EOS Clearinghouse (ECHO) provides services for space and time query, order entry, and automated delivery of granules via FTP push or pull. (Each of these service requests is forwarded to the appropriate data provider.) We propose to develop an ECHO client that will take the human out of the loop and enable transparent, machine-to-machine, automated data query and access to multiple EOS datasets on a large scale.
Each of our client services (query, locate, order, fetch, etc.) will be a composite service that automates and hides the complexity of the multiple ECHO (SOAP) calls required to accomplish the task. Each service will have a simple SOAP interface described in standard WSDL and will be published (callable) at multiple web servers. Once these services are available, they will be assembled into an automated workflow to perform the desired scientific analysis (steps 1-8). The Order/Fetch/Analyze/Repeat service cycle will automatically adapt the size of the time "chunk" to the disk space available for staging data (at the DAACs and the client site). Using future Grid virtualization, the storage and compute resources required for a particular analysis job might be discovered and allocated on the fly, and paid for later on a utility bill. |
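
The adaptive time "chunk" in the Order/Fetch/Analyze/Repeat cycle can be illustrated with a minimal sketch. This is not the proposed client and makes no ECHO calls; the function name `plan_chunks` and the parameters `bytes_per_day`, `free_bytes`, and `max_days` are hypothetical, and the 14-day default cap is only an assumption echoing the "week or two" order limit mentioned above. The sketch assumes the data volume per day is roughly constant and that free staging space can be queried before each order.

```python
from datetime import date, timedelta
from typing import Callable, Iterator, Tuple


def plan_chunks(
    start: date,
    end: date,
    bytes_per_day: int,
    free_bytes: Callable[[], int],
    max_days: int = 14,  # assumed cap on a single order, not an ECHO constant
) -> Iterator[Tuple[date, date]]:
    """Yield (first_day, last_day) chunks sized to the free staging space."""
    current = start
    while current <= end:
        # Re-check free space on every cycle, since it can change between orders.
        days = free_bytes() // bytes_per_day
        if days < 1:
            raise RuntimeError("not enough staging space for one day of data")
        # Never exceed the order cap or run past the end of the period.
        days = min(days, max_days, (end - current).days + 1)
        last = current + timedelta(days=days - 1)
        yield current, last
        current = last + timedelta(days=1)


if __name__ == "__main__":
    # Hypothetical numbers: 100 bytes of granules per day, 450 bytes free.
    for first, last in plan_chunks(date(2005, 1, 1), date(2005, 1, 10), 100, lambda: 450):
        # An order/fetch/analyze step for [first, last] would go here.
        print(first, last)
```

Because `free_bytes` is called again on each pass, a chunk shrinks or grows as staged data is analyzed and deleted, which is the behavior the cycle is meant to automate.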

