Hierarchical Data Format (HDF) development began in 1988 at the National Center for Supercomputing Applications (NCSA) to manage scientific data. The project's stated mission is "To develop, promote, deploy, and support open and free technologies that facilitate scientific data exchange, access, analysis, archiving, and discovery." HDF is a file format and subroutine library for sharing data in a distributed environment, oriented especially toward storing scientific data. Because it is self-describing and portable, it is well suited to sharing data within heterogeneous computing environments.
In 1998, after careful consideration, HDF underwent a dramatic version upgrade with the release of HDF5. As computer systems evolved rapidly in the 1990s, data storage needs grew more demanding, and HDF began to reach limits of a design that had seemed adequate when the format was first conceived. Based on these limitations, as well as lessons learned, the next version (HDF5) was made a complete rewrite. As a result, HDF5 is not backward compatible with HDF4 and earlier versions, although it remains conceptually related to them. NCSA recommends that new projects use HDF5: it is more efficient, and in addition to the HDF4 capabilities it offers new features, is thread safe, and has full parallel I/O support.
HDF is a library-based standard that specifically targets the storage of scientific data. It consists of a data format specification and a supporting library, along with tools for analyzing, visualizing, and converting scientific data. HDF is a self-describing data format: all information about the data is stored within the file itself, and is available to the file's user through the HDF Application Program Interface (API). Tools are also available that let a user inspect an unknown HDF file and determine its contents.
In 1993, after comparing over a dozen different formats, NASA's Earth Science Data Information Systems (ESDIS) project chose the Hierarchical Data Format (HDF) as the format for EOS standard data products. However, while HDF was well suited to scientific, cross-platform data in general, its specification was too broad to assure the product interoperability needed for interdisciplinary science use. Within the domain of satellite-based Earth observation, conventions would be required for associating spatial and temporal information, as well as descriptive metadata, with the science data. The EOSDIS Core System contractor developed the EOS profile of HDF to model data structures commonly encountered in Earth observation analysis.
HDF-EOS provides standard conventions for storing geo-referencing and temporal information, data organization, and metadata storage. HDF-EOS implements three data models using the HDF library: Point, Swath, and Grid. Like HDF, HDF-EOS is accessed through a software library, callable from FORTRAN, C, and C++, which provides subroutines for reading, writing, querying, subsetting, and subsampling data files. The Point, Swath, and Grid model objects are composites of native HDF objects. The Point format stores data that is irregularly spaced in time and geolocation; geolocation and temporal information is stored explicitly for each data point. The Swath format stores data ordered by a single variable track parameter such as time, latitude, or index; geolocation and temporal information is stored explicitly and is related to the science data by internal structural metadata. The Grid format stores gridded data in one of many projections; geolocation information in Grid data is stored implicitly, by a mathematical projection formula.
As HDF evolved from HDF4 to HDF5, HDF-EOS evolved as well. One benefit users of HDF-EOS have discovered is that moving from HDF-EOS based on HDF4 (HE4) to HDF-EOS based on HDF5 (HE5) does not require a significantly new interface. Because HDF-EOS acts as a wrapper for HDF, C and Fortran users are insulated from many of the changes required to migrate code from HDF4 to HDF5. For the most part, HE5 calls are almost identical to the corresponding HE4 calls: at most one new parameter has been added to an HE5 calling sequence, and the majority of calls require no parameter changes at all. HE4 and HE5 subroutines perform the same logical function in both versions.
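As an illustration of how close the two interfaces are, compare the HE4 and HE5 forms of the call that defines a swath data field. This is a fragment only, shown to the best of our understanding of the two libraries: it assumes the HDF-EOS2 and HDF-EOS5 headers and an already-attached swath handle `swid`, and the one added HE5 parameter is a maximum-dimension list (used for appendable fields):

```
/* HDF-EOS2 (HE4): define a 2-D float data field on an attached swath */
SWdefdatafield(swid, "Temperature", "Track,Xtrack",
               DFNT_FLOAT32, HDFE_NOMERGE);

/* HDF-EOS5 (HE5): same logical call; one parameter added (maxdimlist,
   NULL here for a non-appendable field) and HDF5-style type constants */
HE5_SWdefdatafield(swid, "Temperature", "Track,Xtrack", NULL,
                   H5T_NATIVE_FLOAT, HE5_HDFE_NOMERGE);
```

Apart from the `HE5_` prefix, the renamed constants, and the extra argument, the calling sequence carries over unchanged.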
HE5 exposes several features not present in HE4 that take advantage of new HDF5 capabilities, such as multiple hierarchies of metadata and a thread-safe library. HE5 also provides a new interface for storing profile data, which comes from a profiling or sounding instrument that measures physical parameters at various altitudes in the atmosphere. The profile data structure is based on the swath structure.