Review of HDF5 / HDF-EOS5 specifications

- Please indicate which RFC this response applies to:
_X_ ESE-RFC-007 HDF5    ___ ESE-RFC-008 HDF-EOS5

- (Your background) Describe in a sentence or two your overall implementation experience related to the proposed specification. (e.g., specification implementer, tools developer, data provider, scientific analyst, science user, etc.) Have you directly implemented or modified an HDF5 or HDF-EOS5 library using the specification? Did you use pre-existing software, and if so, what did you use?
I have been working with HDF since approximately 1995, though not continuously. When I started, few tools supported HDF, so I had to work in either C or Fortran. Since then, more vendors have offered HDF(5) support in their high-level language products, and I have used those as well. I have been using HDF(5) since June 2004 in relation to the Data Center.

- (Compatibility) The latest versions of the HDF and associated HDF-EOS specifications were submitted to the standards process; these are HDF5 and HDF-EOS5. HDF5 represents a significant departure from previous versions of HDF. HDF-EOS5 supports HDF5, but maintains the previous HDF-EOS interface to the extent possible. What version(s) of the specification(s) have you evaluated or are you using now? If you are not using the latest version(s), why not?
Our data processing model uses HDF4, HDF5, HDF-EOS and HDF-EOS5. The earliest components were designed before HDF-EOS5 became stable and remain in HDF/HDF-EOS. Later components (Level 2 and higher products) use HDF-EOS5. Thus I have experience with both major HDF and HDF-EOS versions.

- (Completeness) Does the specification provide all the detail you need to implement it in software? (e.g., to read or write a data file; to implement the library, a profile or extension; or develop a tool such as a format translator) If not, describe what is missing in the specification.
I do not know whether the standard provides enough information to implement libraries that can read and write the data. This is a large undertaking, and re-implementing the libraries has never been a goal for me.

- (Accuracy) Do any parts of the specification contain inaccuracies, or internal inconsistencies? If so, please provide details.
I do not know of any inaccuracies in the standards.

- (Clarity) Is any part of the specification ambiguous, or poorly explained? If so, please provide details.
The API documentation describes input variables to functions as being either IN or OUT (or perhaps both). For variables that are passed by reference (e.g., an array) and are labeled as IN parameters, it has never been clear whether the functions will modify the referent. Are the referents of IN parameters constants? This question applies to both HDF and HDF5.

- (Balance) Does the standard describe the right set of concepts, behavior, data types, and data operations for its intended users? An overly broad set (requiring excessive complexity)? A narrowly simplistic set?
No complaints.

- (Usefulness) How well does this specification meet your information sharing needs? (e.g., does it work well with the data types and data manipulations in your application? Does it properly represent your datasets? What are the pros and cons of this data format?)
HDF5 provides an effective method for managing scientific data.

- (Implementation) What implementation challenges does the proposed standard present? (e.g., does it require advanced processing power, large amounts of memory, complex configuration, etc.? Does it scale to a production environment?)
The implementation challenge is the large amount of time needed to write codes that use HDF or HDF-EOS (either version). Programming in either C or Fortran is not particularly fast. For scientific codes that need to run for long periods of time, the development time can be amortized over the long execution time. For analysis or data management needs, using these programming languages is costly. Consistent (i.e., funded) support for an open source scripting language would be helpful. A second implementation challenge is the need to have codes that support both HDF and HDF5. Being able to write single applications that can handle both data formats would be valuable, particularly for projects such as OMI where both formats are used, or where a transition is being made from HDF to HDF5; such projects benefit from codes that only need to be written once. I think an object-oriented programming API, such as one for C++ or perhaps some modern version of Fortran, would make this feasible. Apparently one exists for HDF5, but does one exist for HDF?

- (Flexibility) Into what software environment(s) have you integrated HDF5 or HDF-EOS5 (e.g., Solaris, Linux, Windows, Mac OS X)? Have you implemented, tested or deployed HDF5 or HDF-EOS5 packages other than those provided by the original HDF5 and HDF-EOS5 developers?
I have used HDF under Solaris, Linux, Mac OS X, and the Cray T3E. It has performed well under all of these. I have used HDF5 under Solaris and Linux and expect to use it soon under Mac OS X. The OMI SIPS group has also used HDF4 as provided with the SDP toolkit.
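
As an illustration of the point made under Implementation about single applications that handle both formats, the first step such an application needs is to tell HDF4 and HDF5 files apart. The following is only a minimal sketch in a scripting language, not part of either library. It assumes the published file signatures (the 4-byte HDF4 magic number at the start of the file, and the 8-byte HDF5 superblock signature at offset 0 or, when a user block is present, at 512, 1024, 2048, and so on). The function name `detect_hdf_version` is hypothetical.

```python
import os

# Assumed file signatures, taken from the published format descriptions.
HDF4_MAGIC = b"\x0e\x03\x13\x01"       # first 4 bytes of an HDF4 file
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte HDF5 superblock signature


def detect_hdf_version(path):
    """Return "HDF4", "HDF5", or None, judging only by signature bytes."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(len(HDF5_SIGNATURE))
        if head.startswith(HDF4_MAGIC):
            return "HDF4"
        if head == HDF5_SIGNATURE:
            return "HDF5"
        # An HDF5 file may begin with a user block, in which case the
        # signature sits at offset 512, 1024, 2048, ... (doubling each time).
        offset = 512
        while offset + len(HDF5_SIGNATURE) <= size:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return "HDF5"
            offset *= 2
    return None
```

An application written once could call a function like this and then hand the file to whichever reader matches the result. The readers themselves are outside the scope of this sketch.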