Today's information modeler can rely on well-established methodologies for information model development and expressive languages for their formalization. When it comes to implementing ISO information models, the picture appears much less settled. Implementation methods such as databases and XML assist the implementer with sophisticated tool chains and machinery, adding to their appeal. Unfortunately, the information model is only one of two key ingredients on the road to a valid implementation. The second, equally important ingredient, which is not part of the information model, is usage. A particular implementation method may deliver a verified implementation that nevertheless turns out not to be valid (i.e., not fit for its intended purpose). Or the implementation may be valid, yet the community may not accept it, or a hostile ecosystem may prevent it from gaining a foothold.
Since its initial release in 1998, HDF5 has been in the sights of implementers in search of suitable targets for their models. HDF5 distinguishes itself from competitors through a unique combination of features related to its data model, its software stack, and its file format:
These and other capabilities have led to the widespread adoption of HDF5 in many fields, ranging from earth science to product models.
The HDF5 ecosystem includes a wide variety of APIs, tools, and applications. An HDF5 implementation of any ISO information model will be constrained by community traditions and preferences, and by how widely certain HDF5 features are supported across that ecosystem.
For example, if support for user-defined data types (UDT) is generally limited or divergent throughout the ecosystem, a solution chosen solely for its technical excellence may not find wide adoption. Even with the best support for UDTs, such a solution might be a poor choice if it was made without regard to its intended use, as a result of taking a formal specification too literally.
What looks like a perfect match on paper may incur a hefty performance penalty when the implementation is brought into its intended environment. To make matters worse, a successful implementation may have to opt for different HDF5 representations of formally similar concepts in the same information model in order to meet its performance targets. For example, a sparsely populated record type might be more efficiently represented by an HDF5 group decorated with HDF5 attributes or, depending on use, a collection of HDF5 datasets.
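The contrast between a literal mapping and a usage-driven mapping of a sparse record type can be sketched as follows. This is a minimal illustration, not a recommended design: it assumes the Python h5py and NumPy packages, and the record fields, sample values, and object names (`dense/records`, `sparse/record_N`) are all hypothetical. Option A stores every field of every record in a compound-typed dataset; Option B creates one group per record and attaches only the fields that are actually set as attributes.

```python
import h5py
import numpy as np

# Hypothetical sparse record type: most instances set only a few of these fields.
FIELDS = ("id", "mass", "length", "width", "height")

# Hypothetical instances; an absent key means "field not set".
records = [
    {"id": 1, "mass": 2.5},
    {"id": 2, "length": 0.4, "height": 1.2},
    {"id": 3},
]

# In-memory file (core driver, no backing store), so nothing is written to disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    # Option A: one compound-typed dataset. Every record stores every field,
    # with NaN standing in for "not set".
    dtype = np.dtype([("id", "i8")] + [(name, "f8") for name in FIELDS[1:]])
    table = np.zeros(len(records), dtype=dtype)
    for i, rec in enumerate(records):
        table[i] = tuple([rec["id"]] + [rec.get(name, np.nan) for name in FIELDS[1:]])
    dense = f.create_dataset("dense/records", data=table)

    # Option B: one group per record. Only the fields that are set
    # become attributes of that group.
    sparse = f.create_group("sparse")
    for rec in records:
        g = sparse.create_group(f"record_{rec['id']}")
        for name, value in rec.items():
            g.attrs[name] = value

    # Collect a few facts about both layouts before the file is closed.
    dense_fields = dense.dtype.names
    dense_bytes_per_record = dense.dtype.itemsize
    sparse_attr_counts = {name: len(g.attrs) for name, g in sparse.items()}

print("compound fields:", dense_fields)
print("bytes per record in compound layout:", dense_bytes_per_record)
print("attributes per record group:", sparse_attr_counts)
```

Which layout performs better depends on how the data is used: the compound dataset favors bulk reads of many records, whereas the group-per-record layout avoids storing unset fields but creates many small objects. Measuring both against the intended access patterns is the only reliable guide.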
Just as there is no "royal road to geometry" (Euclid's alleged reply to King Ptolemy's inquiry about a shortcut to learning the subject), there is no "easy way" to implement rich ISO information models in HDF5. On the upside, you can always count on advice and a helping hand from the dedicated individuals at The HDF Group and the wider HDF community. Talk to them at mailto:help@hdfgroup.org or the HDF-Forum!