Review of netCDF version 3 implementation and operational suitability

NASA's Earth Science Data Systems Standards Process Group (SPG) is considering the network Common Data Form version 3 (netCDF classic) for adoption as a community standard. You are invited to review this Request For Comment (RFC) in the context of your implementation experience with this data format specification and its suitability for operational use. Only answer questions applicable to your experience. Please send your completed review to: spg-rfc-011@lists.nasa.gov. My answers are added in Arial font.

Implementation Experience questions:

- (Your background) Describe in a sentence or two your overall implementation experience related to the proposed specification. (e.g., specification implementer, tools developer, data provider, scientific analyst, science user, etc.) Have you directly implemented the netCDF classic format specification or modified a netCDF classic library using the specification? Did you use pre-existing software, and if so, what did you use?
I develop tools for a netCDF-based database and also use various products provided in netCDF, including those provided by NASA and NOAA. In both cases I have used the netCDF library to read and write the data. I help develop GMT (which reads and writes netCDF grids), and use various pre-existing software packages: ncBrowse, ncview, Panoply (NASA).

- (Completeness) Does the specification provide all the detail you need to implement it in software? (e.g., to read or write a data file; to implement or modify the library, a profile or extension; or develop a tool such as a format translator) If not, describe what is missing in the specification.
Yes.

- (Accuracy) Do any parts of the specification contain inaccuracies, or internal inconsistencies? If so, please provide details.
See question 4 (Clarity).

- (Clarity) Is any part of the specification ambiguous, or poorly explained? If so, please provide details.
The library manpage says that data will be converted between data types where needed. I interpreted this to mean that when I write a floating-point value into an integer field, or read a floating-point field into an integer value, the number will be rounded. This is not done: the floating-point values are truncated instead. The documentation was clarified on my recommendation in version 3.6.2 and later, but the behavior remains the same.

- (Balance) Does the standard describe the right set of concepts and data types, and enable the appropriate data operations for its intended users? Is this set of concepts and data types an overly broad set (requiring excessive complexity) or narrowly simplistic set?
Yes. No.

- (Usefulness) How well does this specification meet your information sharing needs? (e.g., does it work well with the data types and data manipulations in your application? Does it properly represent your datasets? What are the pros and cons of this data format?)
The netCDF data format is very useful for my data exchange needs. Binary data are much more compact and faster to read than ASCII. The netCDF library removes the problems of exchanging binary data between machine and processor types. The built-in metadata is very useful for explaining the data content, data history, etc.

- (Implementation) What implementation challenges does the proposed standard present? (e.g., does it require advanced processing power, large amounts of memory, complex configuration, etc.? Does it scale to a production environment?)
On the contrary: the netCDF library allows subsampling, so when only a subset of the data is needed, there is no need to spend memory on reading the entire data file. It works very well in a production environment, as many NOAA data products demonstrate.

- (Flexibility) In what software environment(s) have you used netCDF classic (e.g., Solaris, Linux, Windows, Mac OS X)? Have you implemented, tested or deployed netCDF classic or packages other than those provided by the original netCDF classic and developers?
I implemented netCDF classic tools on Linux and Mac OS X, and others have deployed the same tools on other Unix systems as well as Windows, all with success.

Operational Suitability questions:

- Do you currently use or plan to use netCDF classic in a production setting? What types of applications do you use with netCDF classic? Is netCDF classic applicable to your applications (e.g., Does it work well with the data types and data manipulations in your application?)
NOAA will continue to produce Jason-2 data in netCDF classic, as well as a database of all altimeter data to the present. These are generally along-track satellite data, but more complex data models are included as well. NetCDF is very suitable for distributing this type of binary data.

- Why do you choose to use netCDF classic over other data formats for your applications?
Flexibility to expand into the future. Easy exchange. Many existing tools already available.

- Have you or your users encountered any difficulty when using some of the data access or visualization tools (e.g., IDL, GrADS, etc.) on netCDF classic data files? If you have, please provide a brief description of your experience.
No. On the contrary, I have heard from less experienced users in developing countries that netCDF is very practical.

- Does the netCDF file format meet your requirements for storing and accessing data? (e.g., Can it handle the data types in your applications?)
Yes.

- What operational challenges or limitations does netCDF classic present? (e.g., Does it take a long time to learn how to use it? Does it require advanced processing power, large amounts of memory, complex configuration, etc)
No. In particular, learning to read a data file using the netCDF library (available in C, C++ and Fortran) is fairly easy. Writing datasets with the proper metadata is a bit more complex, but that is not a fault of netCDF; it is due to metadata standards devised outside the netCDF development team, such as COARDS and CF-1.0.

- What benefits does netCDF classic present? Do the benefits of netCDF classic outweigh the challenges? (e.g., Does it offer the flexibility you want to package the data types in your applications? Does it facilitate interdisciplinary studies?)
The advantages of netCDF are numerous: availability of generic tools, wide use in the atmospheric and climate communities, ease of exchange of binary data, and the ability of OPeNDAP servers to subsample data on demand.

- How much data do/will you provide or archive in netCDF classic? (number of distinct data products or data sets, total data volume, number of files.)
I download numerous netCDF data files daily (approx. 50 with a total size of approx. 300 MB). I produce a similar number of netCDF files daily and maintain/manage a large altimeter database (RADS) of about 150 GB of netCDF data. The operational Jason-2 production facility at NESDIS produced approximately 1 GB of netCDF data per day.

- How many users do you have or expect to have for data in netCDF classic, and what is your expected user community?

Currently there are about 100 users of the RADS altimeter database.
There are about 100 registered users of the Jason-2 operational data. The GMT software that I help develop has auxiliary data in netCDF and has over 1000 users worldwide.

- (User comments) Any additional comments, observations or criticisms of netCDF classic and the RFC can be provided here.
The most complicated issue with netCDF data is the establishment of conventions. Currently the most widespread are COARDS and CF-1.0. Using one of those conventions for setting up metadata extends the usability of the data and makes it easier for third-party and generic netCDF tools to read the data.
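
As an illustration of the conversion issue raised under Clarity, the following is a minimal sketch in plain Python of the difference between truncating and rounding a floating-point value on its way to an integer field. It does not call the netCDF library; the function names `truncate_to_int` and `round_to_int` and the sample values are hypothetical and serve only to show why the two interpretations give different stored values.

```python
import math


def truncate_to_int(value: float) -> int:
    """Drop the fractional part (toward zero), the behavior observed in the review."""
    return math.trunc(value)


def round_to_int(value: float) -> int:
    """Round to the nearest integer, halves away from zero (the behavior I had expected)."""
    return int(math.copysign(math.floor(abs(value) + 0.5), value))


# Hypothetical sample values, not taken from any real data product.
samples = [2.7, -2.7, 0.4, 1.5]

truncated = [truncate_to_int(v) for v in samples]
rounded = [round_to_int(v) for v in samples]

for v, t, r in zip(samples, truncated, rounded):
    print(f"{v:5.1f} -> truncated {t:2d}, rounded {r:2d}")
```

A data producer who wants rounded values in an integer field would therefore, under this assumption, need to round explicitly before handing the values to the library.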