Hyrax Data Server Installation and Configuration Guide: Chapter 3—Hyrax Installation

3. Hyrax Installation

  • Download the latest Hyrax release, which is usually composed of:
    • 2 RPM files (one for libdap, one for the BES).
    • The OLFS binary distribution file.
  • Install the libdap RPM.
  • Install the BES RPM.
  • Unpack the OLFS distribution file, and install the opendap.war file into your Tomcat instance’s webapps directory.
Tip: The detailed download and installation instructions for Hyrax are published on the download page for each release of the server. Find the latest release and its associated installation details on the Hyrax downloads page.

3.1. BES Installation

3.1.1. Download

You must download and install both the libdap and BES binaries.

  1. Visit the Hyrax Data Server Page.
  2. Select the most recent in the list of Available Versions.
  3. Scroll down the following page until you reach the section entitled Binaries for Hyrax x.x.x, then continue scrolling until you see the heading titled BES.
  4. You need to download both the libdap and BES RPMs, which should be named libdap-x.x.x and bes-x.x.x.
  5. The downloaded files should be named something like libdap-x.x.x.el6.x86_64.rpm and bes-x.x.x.static.el6.x86_64.rpm.
Warning: To install the RPMs on your system, you must be running a 64-bit OS. If you are running a 32-bit OS, attempting to install the libdap and BES RPMs will result in errors.
3.1.2. Install
  1. Use yum to install the libdap and bes RPMs:
    sudo yum install libdap-3.x.x.rpm bes-3.x.x.rpm
  2. At this point you can test the BES by typing the following into a terminal:
    1. start it:
      sudo service besd start
      (Or use the script in /etc/init.d with sudo: /etc/init.d/besd start)
    2. connect using a simple client:
      bescmdln
    3. get version information:
      BESClient> show version
    4. exit from bescmdln:
      BESClient> exit
Note: If you are upgrading to Hyrax 1.13.4 or newer from an existing installation older than 1.13.0, the bes.conf keys BES.CacheDir, BES.CacheSize, and BES.CachePrefix have been replaced with BES.UncompressCache.dir, BES.UncompressCache.size, and BES.UncompressCache.prefix respectively. Other changes include the gateway cache configuration (gateway.conf), which now uses the keys Gateway.Cache.dir, Gateway.Cache.size, and Gateway.Cache.prefix to configure its cache. Changing the names enabled the BES to use separate parameters for each of its several caches, which fixes the problem of 'cache collisions.'
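The key renaming described above can be applied to an existing bes.conf mechanically. The following is a minimal Python sketch, not a tool shipped with Hyrax: the function name migrate_cache_keys is hypothetical, and it assumes each parameter sits on its own Name=Value line. Commented-out lines are left untouched.

```python
# Old bes.conf cache keys and their replacements, as listed in the note above.
RENAMED = {
    "BES.CacheDir": "BES.UncompressCache.dir",
    "BES.CacheSize": "BES.UncompressCache.size",
    "BES.CachePrefix": "BES.UncompressCache.prefix",
}


def migrate_cache_keys(conf_text):
    """Return conf_text with the old cache keys renamed (hypothetical helper)."""
    out = []
    for line in conf_text.splitlines():
        name, sep, value = line.partition("=")
        key = name.strip()
        # Only rewrite active Name=Value lines whose name is an old key.
        if sep and key in RENAMED:
            line = f"{RENAMED[key]}={value}"
        out.append(line)
    return "\n".join(out)


old = "BES.CacheDir=/tmp\nBES.CachePrefix=bes_cache#\nBES.ServerPort=10022"
print(migrate_cache_keys(old))
```

Review the result by hand before replacing the live file, since a real bes.conf may contain constructs this sketch does not anticipate.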

3.2. OLFS Installation

3.2.1. Introduction

The OLFS comes with a default configuration that is compatible with the default configuration of the BES. If you perform a default installation of both, you should get a running Hyrax server that will be pre-populated with test data suitable for running integrity tests.

3.2.2. Install Tomcat
  1. Use yum to install tomcat.noarch: sudo yum install tomcat.noarch.
  2. Create the directory /etc/olfs, change its group to tomcat, and set it group writable:
    mkdir /etc/olfs
    chgrp tomcat /etc/olfs
    chmod g+w /etc/olfs
        
    Alternatively, get Apache Tomcat-8.x from the Apache Software Foundation and install it wherever you’d like—for example, /usr/local/.
3.2.3. Download

Follow the steps below to download the latest OLFS distribution:

  1. Visit the Hyrax Data Server Page.
  2. Select the most recent in the list of Available Versions.
  3. Scroll down the following page until you reach the section titled Binaries for Hyrax x.x.x.
  4. Directly underneath, you should see the OLFS download link, named something like OLFS_x.x.x._Web_Archive_File. Click to download.
  5. The downloaded file will be named something like: olfs-x.x.x-webapp.tgz.
3.2.4. Unpack

Unpack the archive with the command tar -xvf olfs-x.x.x-webapp.tgz, which will unpack the files into a directory called olfs-x.x.x-webapp.

3.2.5. Install

Inside of the olfs-x.x.x-webapp directory, locate opendap.war and copy it into Tomcat’s webapps directory:

cp opendap.war /usr/share/tomcat/webapps

Or, if you installed Tomcat from the Apache Software Foundation (ASF) distribution, copy it into that installation’s web application directory, for example…

/usr/local/apache-tomcat-8.5.34/webapps
CentOS-7/SELinux and Yum installed Tomcat

Recent versions of CentOS-7 are shipped with default SELinux settings that prohibit Tomcat from reading or opening the opendap.war file. This can be addressed by issuing the following two commands:

sudo semanage fcontext -a -t tomcat_var_lib_t /var/lib/tomcat/webapps/opendap.war
sudo restorecon -rv /var/lib/tomcat/webapps/

After this you will need to restart Tomcat:

sudo service tomcat restart
3.2.6. Starting and Stopping the OLFS/Tomcat

If you followed this tutorial and are using a YUM-installed Tomcat, it should already be integrated into the system with a tomcat entry in /etc/init.d and you should be able to…

  • Start Tomcat: sudo service tomcat start
  • Stop Tomcat: sudo service tomcat stop

You can verify that the server is running by visiting http://localhost:8080/opendap/. If you have installed Hyrax on a virtual machine, replace localhost with the virtual machine’s IP address.

3.3. WCS Installation

The WCS 2 service comes bundled as part of Hyrax-1.14.0 and newer. For more information about configuring WCS with your installation of Hyrax, please refer to the WCS Installation Guide that appears later in this document.

3.4. Source Code Builds

If you are interested in working on Hyrax or want to build the server from source code (as opposed to using the prebuilt binaries that we provide), you can get signed source distributions from the download page referenced above. See the For Software Developers section.

3.5. Configuration

When you install Hyrax for the first time it is pre-configured to serve test data sets that come with each of the installed data handlers. This will allow you to test the server and make sure it is functioning correctly. After that you can customize it for your data.

3.5.1. Deploying Robots for Hyrax

Deploying a robots.txt file for Hyrax is synonymous with deploying it for Tomcat. This means that your robots.txt file must be accessible here:

   http://your.host:port/robots.txt

For example:

   http://www.opendap.org/robots.txt

Note: Placing robots.txt lower in the URL path does not seem to work.

In order to get Tomcat to serve the file from that location you must place it in $CATALINA_HOME/webapps/ROOT.

If you find that your system is still burdened with robot traffic, you might want to try the BotBlocker handler for the OLFS.

3.5.2. BES Configuration

Building the BES and its data handlers from source (or installing from the Linux RPMs) will provide the default installation with data and a valid configuration. This is suitable for testing. The following details how you go about customizing it for your data.

Location of the BES Configuration File

The BES configuration file is called bes.conf and can be found in $prefix/etc/bes/ if you built the software from source or in /etc/bes/ if you used our RPM packages. By default, $prefix is /usr/local.

Basic format of parameters

Parameters set in the BES configuration file have the following format:

Name=Value

If you wish to add to the value of a parameter, then you would use += instead of =

Name=Value1
Name+=Value2

With the above, the software sees both Value1 and Value2 as values of Name.

And if you would like to include another configuration file you would use the following:

BES.Include=/path/to/configuration/file/blee.conf

The bes.conf file includes all .conf files in the modules directory with the following:

BES.Include=modules/.*\.conf$

Note: Regular expressions can be used in the Include parameter to match a set of files.
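To make the Name=Value and Name+=Value semantics concrete, here is a minimal Python sketch. It is not the BES's own parser: the function name parse_bes_conf is hypothetical, and the treatment of whitespace and of lines starting with # as comments is an assumption based on the examples in this guide. It does not follow BES.Include directives.

```python
def parse_bes_conf(text):
    """Parse Name=Value and Name+=Value lines into {name: [values]} (sketch)."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and lines assumed to be comments.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue
        if name.endswith("+"):
            # Name+=Value appends to any values already set.
            params.setdefault(name[:-1].strip(), []).append(value.strip())
        else:
            # Name=Value replaces any earlier value.
            params[name.strip()] = [value.strip()]
    return params


example = """
Name=Value1
Name+=Value2
# BES.ServerUnixSocket=/tmp/opendap.socket
BES.ServerPort=10022
"""
params = parse_bes_conf(example)
print(params["Name"])
```

A helper like this can be handy for checking which values a set of configuration lines would produce before restarting the BES.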

Administration and Logging

In the bes.conf file, the BES.ServerAdministrator parameter is the address used in various mail messages returned to clients. Set this so that the email’s recipient will be able to fix problems and/or respond to user questions. Also set the log file and log level. If the BES.LogName is set to a relative path, it will be treated as relative to the directory where the BES is started. (That is, if the BES is installed in /usr/local/bin but you start it in your home directory using the parameter value below, the log file will be bes.log in your home directory.)

BES.ServerAdministrator=webmaster@some.place.edu
BES.LogName=./bes.log
BES.LogVerbose=no

Because the BES is a server in its own right, you will need to tell it which network port and interface to use. Assuming you are running the BES and OLFS (i.e., all of Hyrax) on one machine, do the following:

User and Group Parameters

The BES must be started as root. One of the first things the BES does is start a listener that waits for requests to the BES. This listener is started as root, but the user and group of the process are then set using parameters in the bes.conf file:

BES.User=user_name
BES.Group=group_name

You can also set these to a user id and a group id. For example:

BES.User=#172
BES.Group=#14

Setting the Networking Parameters

In the bes.conf configuration file, we have settings for how the BES should listen for requests:

BES.ServerPort=10022
# BES.ServerUnixSocket=/tmp/opendap.socket

BES.ServerPort tells the BES which TCP/IP port to use when listening for commands. Ports numbered below 1024 are reserved for privileged services; otherwise, you can use any number under 65536. Stick with the default unless you know you need to change it.

In the default bes.conf file we have commented the ServerUnixSocket parameter, which disables I/O over that device. If you need UNIX socket I/O, uncomment this line, otherwise leave it commented. The fewer open network I/O ports, the easier it is to make sure the server is secure.

If both ServerPort and ServerUnixSocket are defined, the BES listens on both the TCP port and the Unix Socket. Local clients on the same machine as the BES can use the unix socket for a faster connection. Otherwise, clients on other machines will connect to the BES using the BES.ServerPort value.

Note: The OLFS always uses only the TCP socket, even if the UNIX socket is present.
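Since the OLFS reaches the BES over TCP, a quick way to confirm that the BES is listening is to attempt a TCP connection to BES.ServerPort. The following Python sketch only checks that something accepts connections on the port; it does not speak the BES protocol. The function name port_is_open is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds (sketch)."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is accepting connections on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example use, assuming the default port from bes.conf:
# port_is_open("localhost", 10022)
```

A False result from the machine that runs the OLFS suggests checking that besd is running and that no firewall blocks the port.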
Debugging Tip

In bes.conf, use the BES.ProcessManagerMethod parameter to control whether the BES acts like a normal Unix server. The default value of multiple causes the BES to accept many connections at once, like a typical server. The value single causes it to accept a single connection (process the commands sent to it and exit), greatly simplifying troubleshooting.

BES.ProcessManagerMethod=multiple

Controlling how compressed files are treated

Compression parameters are configured in the bes.conf configuration file.

The BES will automatically recognize compressed files using the bz2, gzip, and Unix compress (Z) compression schemes. However, you need to configure the BES to accept these file types as valid data by making sure that the filenames are associated with a data handler. For example, if you’re serving netCDF files, you would set BES.Catalog.catalog.TypeMatch so that it includes nc:.*\.(nc|NC)(\.gz|\.bz2|\.Z)?$;. The first part of the regular expression must match both the filename and the '.nc' extension, and the second part must match the suffix, indicating the file is compressed (either .gz, .bz2 or .Z).

When the BES is asked to serve a file that has been compressed, it must first decompress it before passing it to the correct data handler (except for those formats which support 'internal' compression, such as HDF4). The BES.CacheDir parameter (BES.UncompressCache.dir in Hyrax 1.13.4 and newer) tells the BES where to store the uncompressed file. Note that the default value of /tmp is probably less safe than a directory that is used only by the BES for this purpose. You might, for example, want to set this to <prefix>/var/bes/cache.

The BES.CachePrefix parameter (BES.UncompressCache.prefix in Hyrax 1.13.4 and newer) is used to set a prefix for the cached files so that when a directory like /tmp is used, it is easy for the BES to recognize which files are its responsibility.

The BES.CacheSize parameter (BES.UncompressCache.size in Hyrax 1.13.4 and newer) sets the size of the cache in megabytes. When the size of the cached files exceeds this value, the cache will be purged using a least-recently-used approach, where the file’s access time is the 'use time'. Because it is usually impossible to determine the sizes of data files before decompressing them, there may be times when the cache holds more data than this value. Ideally this value should be several times the size of the largest file you plan to serve.

Loading Software Modules

Virtually all of the BES’s functions are contained in modules that are loaded when the server starts up. Each module is a shared-object library. The configuration for each of these modules is contained in its own configuration file and is stored in a directory called modules. This directory is located in the same directory as the bes.conf file: $prefix/etc/bes/modules/.

By default, all .conf files located in the modules directory are loaded by the BES per this parameter in the bes.conf configuration file:

BES.Include=modules/.*\.conf$

So, if you don’t want one of the modules to be loaded, simply change its name to, say, nc.conf.sav and it won’t be loaded.
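The effect of renaming a module's configuration file can be checked against the Include expression. This Python sketch assumes that the part of the value after the last slash is a regular expression applied to each file name in the modules directory, and that Python's re module treats this particular pattern the same way the BES's regex library does. The file names are the ones mentioned in this section.

```python
import re

# Value of BES.Include from bes.conf.
INCLUDE_VALUE = r"modules/.*\.conf$"

# Assumption: directory part and file-name pattern are split at the last slash.
directory, _, name_pattern = INCLUDE_VALUE.rpartition("/")
name_regex = re.compile(name_pattern)

# File names that might be present in the modules directory.
names = ["dap.conf", "dap-server.conf", "nc.conf", "nc.conf.sav"]

# Keep only the names the expression matches.
loaded = [n for n in names if name_regex.match(n)]
print(directory, loaded)
```

Because the expression is anchored with $, nc.conf.sav no longer matches once the .sav suffix is added, which is why renaming a file disables its module.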

For example, if you are installing the general purpose server module (the dap-server module) then a dap-server.conf file will be installed in the modules directory. Also, most installations will include the dap module, allowing the BES to serve OPeNDAP data. This configuration file, called dap.conf, is also included in the modules directory. For a data handler, say netcdf, there will be an nc.conf file located in the modules directory.

Each module’s configuration file should contain lines that tell the BES to load the module at startup:

BES.modules+=nc
BES.module.nc=/usr/local/lib/bes/libnc_module.so

Module specific parameters will be included in its own configuration file. For example, any parameters specific to the netcdf data handler will be included in the nc.conf file.

Pointing to data

There are two parameters that can be used to tell the BES where your data are stored. Which one you use depends on whether you are setting up the BES to work as part of Hyrax (and thus with THREDDS catalogs) or as a standalone server. In either case, set the value of the .RootDirectory parameter to point to the root directory of your data files (only one may be specified). If the BES is being used as part of Hyrax, use BES.Catalog.catalog.RootDirectory in dap.conf, which is stored in the modules directory; otherwise, use BES.Data.RootDirectory in bes.conf itself. So, if you are setting up Hyrax, set the value of BES.Catalog.catalog.RootDirectory but be sure to set BES.Data.RootDirectory to some value or the BES will not start.

In bes.conf set the following:

BES.Data.RootDirectory=/full/path/data/root/directory

In dap.conf (in the modules directory), set the following if using Hyrax (usually the case):

BES.Catalog.catalog.RootDirectory=/full/path/data/root/directory

By default, the RootDirectory parameters are set to point to the test data supplied with the data handlers.

Next, configure the mapping between data source names and data handlers. This is usually taken care of for you already, so you probably won’t have to set this parameter. Each data handler module (netcdf, hdf4, hdf5, freeform, etc…) will have this set depending on the extension of the data files for the data.

For example, in nc.conf, for the netcdf data handler module, you’ll find the line:

BES.Catalog.catalog.TypeMatch+=nc:.*\.nc(\.bz2|\.gz|\.Z)?$;

When the BES is asked to perform some commands on a particular data source, it uses regular expressions to figure out which data handler should be used to carry out the commands. The value of the BES.Catalog.catalog.TypeMatch parameter holds the set of regular expressions. The value of this parameter is a list of handlers and expressions in the form handler:expression;. Note that these regular expressions are like those used by grep on Unix and are somewhat cryptic, but once you see the pattern it’s not that bad. Below, the TypeMatch parameter is being told the following:

  • Any data source with a name that ends in .nc should be handled by the nc (netcdf) handler (see BES.module.nc above)
  • Any file with a .hdf, .HDF or .eos suffix should be processed using the HDF4 handler (note that case matters)
  • Data sources ending in .dat should use the FreeForm handler

Here’s the one for the hdf4 data handler module:

BES.Catalog.catalog.TypeMatch+=h4:.*\.(hdf|HDF|eos)(\.bz2|\.gz|\.Z)?$;

And for the FreeForm handler:

BES.Catalog.catalog.TypeMatch+=ff:.*\.dat(\.bz2|\.gz|\.Z)?$;

If you fail to configure this correctly, the BES will return error messages stating that the type information has to be provided. However, it won’t tell you this when it starts, only when the OLFS (or some other software) makes a data request. This is because it is possible to use BES commands in place of these regular expressions, although Hyrax does not.
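Before relying on a TypeMatch value, it can help to try it against a few file names. The Python sketch below combines the three TypeMatch entries shown above and picks the first handler whose expression matches. It is an illustration only: the function names are hypothetical, the first-match-wins order is an assumption, and Python's re module stands in for whatever regex library the BES uses.

```python
import re

# The three TypeMatch entries from this section, concatenated as += would.
TYPE_MATCH = (
    r"nc:.*\.nc(\.bz2|\.gz|\.Z)?$;"
    r"h4:.*\.(hdf|HDF|eos)(\.bz2|\.gz|\.Z)?$;"
    r"ff:.*\.dat(\.bz2|\.gz|\.Z)?$;"
)


def parse_type_match(value):
    """Split 'handler:expression;' entries into (handler, compiled regex) pairs."""
    rules = []
    for entry in value.split(";"):
        if not entry:
            continue
        handler, _, expression = entry.partition(":")
        rules.append((handler, re.compile(expression)))
    return rules


def handler_for(name, rules):
    """Return the first handler whose expression matches name, or None."""
    for handler, regex in rules:
        if regex.match(name):
            return handler
    return None


rules = parse_type_match(TYPE_MATCH)
for name in ["data/sst.nc.gz", "modis.HDF", "modis.Hdf", "obs.dat.Z"]:
    print(name, "->", handler_for(name, rules))
```

A name that maps to None here, such as modis.Hdf (case matters), corresponds to the situation in which the BES reports that type information has to be provided.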

Including and Excluding files and directories

Finally, you can configure the types of information that the BES sends back when a client requests catalog information. The Include and Exclude parameters provide this mechanism, also using a list of regular expressions (with each element of the list separated by a semicolon). In the example below, files that begin with a dot are excluded. These parameters are set in the dap.conf configuration file.

The Include expressions are applied to the node first, followed by the Exclude expressions. For collections of nodes, only the Exclude expressions are applied.

BES.Catalog.catalog.Include=;
BES.Catalog.catalog.Exclude=^\..*;
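The interaction of Include and Exclude can be sketched in Python as follows. This is an illustration of the rule stated above, not the BES implementation: the function names are hypothetical, an empty Include list is assumed to mean "include everything", and Python's re module stands in for the BES's regex library.

```python
import re


def split_patterns(value):
    """Turn a semicolon-separated list of expressions into compiled regexes."""
    return [re.compile(p) for p in value.split(";") if p]


def is_visible(name, include, exclude, is_collection=False):
    """Apply Include then Exclude to a node; only Exclude to a collection."""
    if not is_collection and include:
        # Assumption: a node must match at least one Include expression.
        if not any(r.search(name) for r in include):
            return False
    # Any match against an Exclude expression hides the entry.
    return not any(r.search(name) for r in exclude)


include = split_patterns(";")         # BES.Catalog.catalog.Include=;
exclude = split_patterns(r"^\..*;")   # BES.Catalog.catalog.Exclude=^\..*;

for name in ["sst.nc", ".hidden"]:
    print(name, is_visible(name, include, exclude))
```

With the default values shown above, every entry is listed except those whose names begin with a dot.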

Symbolic Links

If you would like symbolic links to be followed when retrieving data and when viewing catalog entries, you need to set two parameters: BES.FollowSymLinks and BES.Catalog.catalog.FollowSymLinks. The BES.FollowSymLinks parameter is for non-catalog containers and is used in conjunction with the BES.Data.RootDirectory parameter. It is not a general setting. The BES.Catalog.catalog.FollowSymLinks parameter is for catalog requests and data containers in the catalog. It is used in conjunction with the BES.Catalog.catalog.RootDirectory parameter above. The default is set to No in the installed configuration files. To allow symbolic links to be followed, set the parameter to Yes.

The following is set in the bes.conf file:

BES.FollowSymLinks=No|Yes

And this one is set in the dap.conf file in the modules directory:

BES.Catalog.catalog.FollowSymLinks=No|Yes

Parameters for Specific Handlers

Parameters for specific modules can be added to the BES configuration file for that specific module. No module-specific parameters should be added to bes.conf.

3.6. OLFS Configuration

The OLFS is the outward facing component of the Hyrax server. This section provides OLFS configuration instructions.

Note: The OLFS web application relies on one or more instances of the BES to provide it with data access and basic catalog metadata.

The OLFS web application stores its configuration state in a number of files. You can change the server’s default configuration by modifying the content of one or more of these files and then restarting Tomcat or the web application. These configuration files include the following:

  • olfs.xml: Contains the primary OLFS configuration, such as BES associations, directory view instructions, gateway service location, and static THREDDS catalog behavior. Located at /etc/olfs/olfs.xml. For more information about olfs.xml, please see the olfs.xml configuration section.
  • catalog.xml: Master (top-level) THREDDS catalog content for static THREDDS catalogs. Located at /etc/olfs/catalog.xml.
  • viewers.xml: Contains the localized viewers configuration. Located at /etc/olfs/viewers.xml.

Generally, you can meet your configuration needs by making changes to olfs.xml and catalog.xml. For more information about where these files might be located, please see the following section, OLFS Configuration Files.

3.6.1. OLFS Configuration Files

If the default configuration of the OLFS works for your intended use, there is no need to create a persistent localized configuration; however, if you need to change the configuration, we strongly recommend that you enable a persistent local configuration. This way, updating the web application won’t override your custom configuration.

The OLFS locates its configuration file by looking at the value of the OLFS_CONFIG_DIR user environment variable:

  • If the variable is set and its value is the pathname of an existing directory that is both readable and writable by Tomcat, the OLFS will use it.
  • If the directory /etc/olfs exists and is readable and writable by Tomcat, the OLFS will use it.
  • If the directory /usr/share/olfs exists and is readable and writable by Tomcat, then the OLFS will use it. (This was added for Hyrax 1.14.1.)

If none of the above directories exist or the variable has not been set, the OLFS uses the default configuration bundled in the web application web archive file (opendap.war). In this way, the OLFS can start without a persistent local configuration.
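The lookup described above can be sketched in Python. This is an illustration, not OLFS code: the function name resolve_olfs_config_dir is hypothetical, the checks are assumed to happen in the order listed, and os.access tests permissions for the user running the script, which only reflects Tomcat's view if the script is run as the Tomcat user.

```python
import os


def resolve_olfs_config_dir(env=None, candidates=("/etc/olfs", "/usr/share/olfs")):
    """Return the first usable configuration directory, or None (sketch)."""
    env = os.environ if env is None else env
    paths = []
    # The OLFS_CONFIG_DIR environment variable is considered first.
    if env.get("OLFS_CONFIG_DIR"):
        paths.append(env["OLFS_CONFIG_DIR"])
    paths.extend(candidates)
    for path in paths:
        # The directory must exist and be both readable and writable.
        if os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK):
            return path
    # None stands for "use the default configuration bundled in opendap.war".
    return None


print(resolve_olfs_config_dir())
```

Running this as the Tomcat user gives a quick indication of which directory, if any, the OLFS is likely to pick up.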

3.6.2. Create a Persistent Local Configuration

You can easily enable a persistent local configuration for the OLFS by creating an empty directory and identifying it with the OLFS_CONFIG_DIR environment variable:

export OLFS_CONFIG_DIR="/home/tomcat/hyrax"

Alternately, you can create /etc/olfs or /usr/share/olfs.

Once you have created the directory (and, in the first case, set the environment variable), restart Tomcat. Restarting Tomcat prompts the OLFS to copy its default configuration into the empty directory and then use it. You can then edit the local copy.

Warning: The directory that you create must be both readable and writable by the user who is running Tomcat.
3.6.3. olfs.xml Configuration File

The olfs.xml file contains the core configuration of the Hyrax front-end service. The following subsections detail its contents.

At the document’s root is the <OLFSConfig> element. It contains several elements that supply the configuration for the OLFS. The following is an example OLFS Configuration file:

<?xml version="1.0" encoding="UTF-8"?>
<OLFSConfig>
    <BESManager>
        <BES>
            <prefix>/</prefix>
            <host>localhost</host>
            <port>10022</port>
            <timeOut>300</timeOut>
            <adminPort>11002</adminPort>
            <maxResponseSize>0</maxResponseSize>
            <ClientPool maximum="200" maxCmds="2000" />
        </BES>
        <NodeCache maxEntries="20000" refreshInterval="600"/>
        <SiteMapCache refreshInterval="600" />
    </BESManager>
    <ThreddsService  prefix="thredds" useMemoryCache="true" allowRemote="true" />
    <GatewayService  prefix="gateway" useMemoryCache="true" />
    <UseDAP2ResourceUrlResponse />
    <HttpPost enabled="true" max="2000000"/>
    <!-- AddFileoutTypeSuffixToDownloadFilename / -->
    <!-- AllowDirectDataSourceAccess / -->
    <!-- PreloadNcmlIntoBes -->
    <!-- CatalogCache>
        <maxEntries>10000</maxEntries>
        <updateIntervalSeconds>10000</updateIntervalSeconds>
    </CatalogCache -->
    <!--
       'Bot Blocker' is used to block access from specific IP addresses
       and by a range of IP addresses using a regular expression.
    -->
    <!-- BotBlocker -->
    <!-- <IpAddress>127.0.0.1</IpAddress> -->
    <!-- This matches all IPv4 addresses, work yours out from here.... -->
    <!-- <IpMatch>[012]?\d?\d\.[012]?\d?\d\.[012]?\d?\d\.[012]?\d?\d</IpMatch> -->
    <!-- Any IP starting with 65.55 (MSN bots that don't respect robots.txt) -->
    <!-- <IpMatch>65\.55\.[012]?\d?\d\.[012]?\d?\d</IpMatch>   -->
    <!-- /BotBlocker -->
    <!--
      'Timer' enables or disables the generation of internal timing metrics for the OLFS
      If commented out the timing is disabled. If you want timing metrics to be output
      to the log then uncomment the Timer and set the enabled attribute's value to "true"
      WARNING: There is some performance cost to utilizing the Timer.
    -->
    <!-- Timer enabled="false" / -->
</OLFSConfig>
<BESManager> Element (required)

The BESManager information is used whenever the software needs to access the BES’s services. This configuration is key to the function of Hyrax because it defines each BES instance that is connected to a given Hyrax installation. The following examples show a single BES. For more information on configuring Hyrax to use multiple BESs, see Section 3.6.9 below.

Each BES is identified using a separate <BES> child element inside of the <BESManager> element:

<BESManager>
    <BES>
        <prefix>/</prefix>
        <host>localhost</host>
        <port>10022</port>
        <timeOut>300</timeOut>
        <maxResponseSize>0</maxResponseSize>
        <ClientPool maximum="10" maxCmds="2000" />
        <adminPort>11002</adminPort>
    </BES>
    <NodeCache maxEntries="20000" refreshInterval="600"/>
    <SiteMapCache cacheFile="/tmp/SiteMap.cache" refreshInterval="600" />
</BESManager>

<BES> Child Elements

The <BES> child elements provide the OLFS with connection and control information for a BES. There are three required child elements within a <BES> element and four optional child elements:

  • <prefix> element (required): This element contains the URL prefix that the OLFS will associate with this BES. It also maps this BES to the URI space that the OLFS services. The prefix is a token placed between the host:port/context/ part of the Hyrax URL and the catalog root, and it designates a particular BES instance when multiple BESs are available to a single OLFS. If you have kept the default configuration of a single BES, the prefix must be a forward slash: <prefix>/</prefix>.
    Important: There must be at least one BES element in the BESManager handler configuration whose prefix has a value of /. There may be more than one <BES>, but only this one is required.
    When using multiple BESs, each BES must have an exposed mount point as a directory (aka collection) in the URI space where it is going to appear. The prefix string must always begin with the slash (/) character, for example <prefix>/data/nc</prefix>. For more information, see Section 3.6.9. Configuring With The OLFS To Work With Multiple BESs.
  • <host> element (required): Contains the host name or IP address of the BES, such as <host>test.opendap.org</host>.
  • <port> element (required): Contains the port number on which the BES is listening, such as <port>10022</port>.
  • <timeOut> element (optional): Contains the timeout time, in seconds, for the OLFS to wait for this BES to respond, such as <timeOut>600</timeOut>. Its default value is 300.
  • <maxResponseSize> element (optional): Contains the maximum response size, in bytes, allowed for this BES. Requests that produce a larger response will receive an error. Its default value of zero indicates that there is no imposed limit: <maxResponseSize>0</maxResponseSize>.
  • <ClientPool> element (optional): Configures the behavior of the pool of client connections that the OLFS maintains with this particular BES. These connections are pooled for efficiency and speed: <ClientPool maximum="200" maxCmds="2000" />. Notice that this element has two attributes, maximum and maxCmds:
    • The maximum attribute specifies the maximum number of concurrent BES client connections that the OLFS can make. Its default value is 200.
    • The maxCmds attribute specifies the maximum number of commands that can be issued over a particular BESClient connection. The default is 2000.
    If the <ClientPool> element is missing, the pool (maximum) size defaults to 200 and maxCmds defaults to 2000.
  • <adminPort> element (optional): Contains the port on the BES system that can be used by the Hyrax Admin Interface to control the BES, such as <adminPort>11002</adminPort>. The BES must also be configured to open and use this admin port.
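The rules above for <BES> elements (three required children, prefixes that begin with a slash, and at least one BES with the prefix /) can be checked before restarting Tomcat. The following Python sketch uses the standard library's ElementTree. The function name check_bes_manager is hypothetical, and the checks cover only the rules stated in this section, not everything the OLFS validates.

```python
import xml.etree.ElementTree as ET

# A trimmed-down olfs.xml based on the example in this section.
OLFS_XML = """<OLFSConfig>
    <BESManager>
        <BES>
            <prefix>/</prefix>
            <host>localhost</host>
            <port>10022</port>
            <timeOut>300</timeOut>
            <ClientPool maximum="200" maxCmds="2000" />
        </BES>
    </BESManager>
</OLFSConfig>"""


def check_bes_manager(xml_text):
    """Return a list of problems found in the <BESManager> section (sketch)."""
    root = ET.fromstring(xml_text)
    problems = []
    prefixes = []
    for i, bes in enumerate(root.findall("./BESManager/BES"), start=1):
        # prefix, host and port are the three required child elements.
        for tag in ("prefix", "host", "port"):
            if not (bes.findtext(tag) or "").strip():
                problems.append(f"BES #{i}: missing <{tag}>")
        prefix = (bes.findtext("prefix") or "").strip()
        if prefix and not prefix.startswith("/"):
            problems.append(f"BES #{i}: prefix {prefix!r} must begin with '/'")
        prefixes.append(prefix)
    # One BES must be mounted at the root prefix.
    if "/" not in prefixes:
        problems.append("no <BES> has the required prefix '/'")
    return problems


print(check_bes_manager(OLFS_XML))
```

To check a real installation, read the text of /etc/olfs/olfs.xml (or whichever configuration directory is in use) and pass it to the function.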
<NodeCache> Child Element (optional)

The NodeCache element controls the state of the in-memory LRU cache for BES catalog/node responses. It has two attributes, refreshInterval and maxEntries.

The refreshInterval attribute specifies the time (in seconds) that any particular item remains in the cache. If the underlying system changes frequently (for example, model result output), then making this number smaller will increase the rate at which changes become available through the Hyrax service, at the expense of more cache churn and slower responses. If the underlying system is fairly stable (undergoes little change), then refreshInterval can be larger, which will mean less cache churn and faster responses.

The maxEntries attribute defines the maximum number of entries allowed in the cache. If the serviced collection is large, then making this larger should improve response times for catalogs and similar requests.

Example:

<NodeCache maxEntries="20000" refreshInterval="600"/>
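The trade-off between maxEntries and refreshInterval is easier to see with a small model of an LRU cache whose entries expire. The Python sketch below is a generic illustration of that technique and makes no claim about how the OLFS implements its NodeCache; the class name NodeCacheSketch and its methods are hypothetical.

```python
import time
from collections import OrderedDict


class NodeCacheSketch:
    """LRU cache whose entries expire after refresh_interval seconds."""

    def __init__(self, max_entries, refresh_interval, clock=time.monotonic):
        self.max_entries = max_entries
        self.refresh_interval = refresh_interval
        self.clock = clock
        self._items = OrderedDict()  # key -> (time stored, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at >= self.refresh_interval:
            # Entry is too old: drop it so the caller fetches a fresh copy.
            del self._items[key]
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self.clock(), value)
        self._items.move_to_end(key)
        while len(self._items) > self.max_entries:
            self._items.popitem(last=False)  # evict least recently used


cache = NodeCacheSketch(max_entries=20000, refresh_interval=600)
cache.put("/data/nc", "catalog response")
print(cache.get("/data/nc"))
```

A smaller refresh_interval makes get return None sooner, forcing more trips to the BES; a larger max_entries delays eviction at the cost of memory.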
<SiteMapCache> Child Element (optional)

The SiteMapCache element defines the location and life span of the SiteMap response cache. The BES SiteMap response can be time consuming to produce for larger systems (~4 minutes for a system with 110k directories and 560k files). This configuration element addresses this by providing a location and refresh interval for a SiteMap cache. SiteMapCache has two attributes, cacheFile and refreshInterval.

The optional cacheFile attribute may be used to identify a particular location for the SiteMap cache file. If it is not provided, the file is placed by default into the cache directory located in the active OLFS configuration directory.

The refreshInterval attribute expresses, in seconds, the time that a SiteMap is held in the cache before the system generates a new one.

Example:

<SiteMapCache cacheFile="/tmp/SiteMap.cache" refreshInterval="600" />
<ThreddsService> Element (optional)

This configuration parameter controls the following:

  • The location of the static THREDDS catalog root in the URI space serviced by Hyrax.
  • Whether the static THREDDS catalogs are held in memory or read from disk for each request.
  • Whether the server will broker remote THREDDS catalogs and their data by following thredds:catalogRef links that point to THREDDS catalogs on other systems.

The following is an example configuration for the <ThreddsService> element:

<ThreddsService  prefix="thredds" useMemoryCache="true" allowRemote="false" />

Notice that <ThreddsService> has several attributes:

  • prefix attribute (optional): Sets the name of the static THREDDS catalogs' root in Hyrax. For example, if the prefix is thredds, then http://localhost:8080/opendap/thredds/ will give you the top-level static catalog, which is typically the contents of the file /etc/olfs/opendap/catalog.xml. This attribute’s default value is thredds.
  • useMemoryCache attribute (optional): This is a boolean value with a default value of true.
    • If the value of this attribute is set to true, the servlet will ingest all of the static catalog files at startup and hold their contents in memory, which is faster but more memory intensive.
    • If set to false, each request for a static THREDDS catalog will cause the server to read and parse the catalog from disk, which is slower but uses less memory.
    See Section 3.6.11. for more information about the memory caching operations.
  • allowRemote attribute (optional): If this attribute is present and its value is set to true, then the server will "broker" remote THREDDS catalogs and the data that they serve. This means that the server, not the client, will perform the following steps:
    1. Retrieve the remote catalogs.
    2. Render them for the requesting client.
    3. Provide an interface for retrieving the remote data.
    4. Allow Hyrax to perform any subsequent processing before returning the result to the requesting client.
    This attribute has a default value of false.
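
To illustrate how the documented defaults combine, the following Python sketch parses a <ThreddsService> element and fills in any attribute that was omitted. The helper name thredds_settings is hypothetical and is not part of Hyrax; the OLFS does its own parsing internally, so treat this only as a way to reason about a configuration.

```python
import xml.etree.ElementTree as ET

# Defaults as documented above for the optional <ThreddsService> attributes.
DEFAULTS = {"prefix": "thredds", "useMemoryCache": "true", "allowRemote": "false"}


def thredds_settings(xml_text):
    """Return the effective <ThreddsService> settings (hypothetical helper)."""
    element = ET.fromstring(xml_text)
    if element.tag != "ThreddsService":
        raise ValueError("expected a <ThreddsService> element")
    # Use the attribute when present, otherwise fall back to the default.
    merged = {name: element.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "prefix": merged["prefix"],
        "useMemoryCache": merged["useMemoryCache"].strip().lower() == "true",
        "allowRemote": merged["allowRemote"].strip().lower() == "true",
    }
```

For example, thredds_settings('<ThreddsService allowRemote="true" />') reports the default prefix and memory caching, with remote brokering switched on.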

<GatewayService> (optional)

Directs requests to the Gateway Service:

<GatewayService  prefix="gateway" useMemoryCache="true" />

The following are the attributes of <GatewayService>:

  • prefix attribute (optional): Sets the location of the gateway service in the URI space serviced by Hyrax. For example, if the prefix is gateway, then http://localhost:8080/opendap/gateway/ should give you the Gateway Service page. This attribute’s default value is gateway.
  • useMemoryCache attribute (optional): See the useMemoryCache attribute of <ThreddsService> for more information.

<UseDAP2ResourceUrlResponse /> element (optional)

This element controls the type of response that Hyrax will provide to a client’s request for the data resource URL:

<UseDAP2ResourceUrlResponse />

When this element is present, the server will respond to requests for data resource URLs by returning the DAP2 response (either an error or the underlying data object). Commenting out or removing the <UseDAP2ResourceUrlResponse /> element will cause the server to return the DAP4 DSR response when a dataset resource URL is requested.

Information Icon DAP2 responses are not clearly defined by any specification, whereas DAP4 DSR responses are well-defined by a specification.

This element has no attributes or child elements and is enabled by default.

<AddFileoutTypeSuffixToDownloadFilename /> element (optional)

This optional element controls how the server constructs the download file name that is transmitted in the HTTP Content-Disposition header:

<AddFileoutTypeSuffixToDownloadFilename />

For example, suppose the <AddFileoutTypeSuffixToDownloadFilename /> element is either commented out or not present. When a user requests a data response from somedatafile.hdf in netCDF-3 format, the HTTP Content-Disposition header will be set like this:

Content-Disposition: attachment; filename="somedatafile.hdf"

However, if the <AddFileoutTypeSuffixToDownloadFilename /> element is present, then the response will carry this HTTP Content-Disposition header instead:

Content-Disposition: attachment; filename="somedatafile.hdf.nc"

By default the server ships with this disabled.
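
The effect of this element on the download file name can be sketched in a few lines of Python. This is an illustration of the behavior described above, not the server's implementation; the function name and the idea of passing the suffix (".nc" for a netCDF-3 response) as an argument are assumptions made for the example.

```python
def content_disposition(dataset_name, fileout_suffix, add_suffix):
    """Build a Content-Disposition header value (illustrative only).

    dataset_name   -- name of the source dataset, e.g. "somedatafile.hdf"
    fileout_suffix -- suffix of the requested output format, e.g. ".nc"
    add_suffix     -- True when <AddFileoutTypeSuffixToDownloadFilename />
                      is present in the configuration
    """
    filename = (dataset_name + fileout_suffix) if add_suffix else dataset_name
    return 'attachment; filename="{}"'.format(filename)
```

Calling content_disposition("somedatafile.hdf", ".nc", False) gives attachment; filename="somedatafile.hdf", and passing True for the last argument gives attachment; filename="somedatafile.hdf.nc", which matches the two headers shown above.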

<AllowDirectDataSourceAccess/> element (optional)

The <AllowDirectDataSourceAccess/> element controls the user’s ability to directly access data sources via the Hyrax web interface:

<!-- AllowDirectDataSourceAccess / -->

If this element is present and not commented out, a client can retrieve an entire data source (such as an HDF file) by requesting it through the HTTP URL interface.

This element has no attributes or child elements and is disabled by default. We recommend that you leave it unchanged, unless you want users to be able to circumvent the OPeNDAP request interface and have direct access to the data products stored on your server.

<BotBlocker> (optional)

This optional element can be used to block access from specific IP addresses or a range of IP addresses using regular expressions:

<BotBlocker>
    <IpAddress>128.193.64.33</IpAddress>
    <IpMatch>65\.55\.[012]?\d?\d\.[012]?\d?\d</IpMatch>
</BotBlocker>

<BotBlocker> has the following child elements:

  • <IpAddress> element: The text value of this element should be the IP address of a system that you would like to block from accessing your service. For example, <IpAddress>128.193.64.33</IpAddress> will block the system located at 128.193.64.33 from accessing your server. There can be zero or more <IpAddress> child elements in the <BotBlocker> element.
  • <IpMatch> element: The text value of this element should be the regular expression that will be used to match the IP addresses of clients attempting to access Hyrax. For example, <IpMatch>65\.55\.[012]?\d?\d\.[012]?\d?\d</IpMatch> matches all IP addresses beginning with 65.55, and thus blocks access for clients whose IP addresses lie in that range. There can be zero or more <IpMatch> child elements in the <BotBlocker> element.
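
Before deploying an <IpMatch> expression, it can help to try it against a few sample addresses. The Python sketch below does that for the two entries in the example above. It assumes that the expression must match the whole address; the function and variable names are hypothetical, and Hyrax's own matching logic may differ.

```python
import re

# Values copied from the example <BotBlocker> configuration above.
BLOCKED_ADDRESSES = {"128.193.64.33"}
BLOCKED_PATTERNS = [re.compile(r"65\.55\.[012]?\d?\d\.[012]?\d?\d")]


def is_blocked(client_ip):
    """Return True if client_ip matches an <IpAddress> or <IpMatch> entry.

    Assumption: a pattern must match the entire address (fullmatch).
    """
    if client_ip in BLOCKED_ADDRESSES:
        return True
    return any(pattern.fullmatch(client_ip) for pattern in BLOCKED_PATTERNS)
```

With these values, is_blocked("65.55.12.200") and is_blocked("128.193.64.33") are true, while is_blocked("65.56.1.2") is false.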

Developer Options

These configuration options are intended for developers who are engaged in code development for components of Hyrax. They are not meant to be enabled in any kind of production environment. They are included here for transparency and to help potential contributors to the Hyrax project.

<Timer>

The <Timer> element enables or disables the generation of internal timing metrics for the OLFS:

 <Timer enabled="true"/>

Timer has a single attribute, enabled, which is a boolean value. Uncommenting this element and setting enabled to true will output timing metrics to the log.

yellow triangle icon Enabling the Timer imposes significant performance overhead on the server’s operation. It should only be done to understand the relative times spent in different operations, not as a mechanism for measuring the server’s objective performance.

<ingestTransformFile> child element (developer)

This child element of the ThreddsService element is a special code development option that allows a developer to specify the fully qualified path to an XSLT file that will be used to preprocess each THREDDS catalog file read from disk:

Example:

<ingestTransformFile>/fully/qualified/path/to/transfrm.xsl</ingestTransformFile>

The default version of this file, found in $CATALINA_HOME/webapps/opendap/xsl/threddsCatalogIngest.xsl, processes the thredds:datasetScan elements in each THREDDS catalog so that they contain specific content for Hyrax.

3.6.4. Viewers Service (viewers.xml file)

The Viewers service provides, for each dataset, an HTML page that contains links to Java WebStart applications and to WebServices, such as WMS and WCS, that can be used in conjunction with the dataset. The Viewers service is configured via the contents of the viewers.xml file, typically located at /etc/olfs/viewers.xml.

viewers.xml Configuration File

The viewers.xml file contains a list of two types of elements:

  • <JwsHandler> elements
  • <WebServiceHandler> elements

The details of these are discussed elsewhere in the documentation. The following is an example configuration:

<ViewersConfig>
    <JwsHandler className="opendap.webstart.IdvViewerRequestHandler">
        <JnlpFileName>idv.jnlp</JnlpFileName>
    </JwsHandler>
    <JwsHandler className="opendap.webstart.NetCdfToolsViewerRequestHandler">
        <JnlpFileName>idv.jnlp</JnlpFileName>
    </JwsHandler>
    <JwsHandler className="opendap.webstart.AutoplotRequestHandler" />
    <WebServiceHandler className="opendap.viewers.NcWmsService" serviceId="ncWms">
        <applicationName>Web Mapping Service</applicationName>
        <NcWmsService href="/ncWMS/wms" base="/ncWMS/wms" ncWmsDynamicServiceId="lds" />
    </WebServiceHandler>
    <WebServiceHandler className="opendap.viewers.GodivaWebService" serviceId="godiva">
        <applicationName>Godiva WMS GUI</applicationName>
        <NcWmsService href="http://localhost:8080/ncWMS/wms" base="/ncWMS/wms" ncWmsDynamicServiceId="lds"/>
        <Godiva href="/ncWMS/godiva2.html" base="/ncWMS/godiva2.html"/>
    </WebServiceHandler>
</ViewersConfig>
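
One way to sanity-check a viewers.xml file is to list the handlers it declares. The following Python sketch does so with the standard library; the function name is hypothetical and the script is not part of Hyrax.

```python
import xml.etree.ElementTree as ET


def list_viewer_handlers(xml_text):
    """Return (element name, className, serviceId) for each configured handler."""
    root = ET.fromstring(xml_text)
    if root.tag != "ViewersConfig":
        raise ValueError("expected a <ViewersConfig> root element")
    handlers = []
    for child in root:
        # Only the two handler element types described above are reported.
        if child.tag in ("JwsHandler", "WebServiceHandler"):
            handlers.append((child.tag, child.get("className"), child.get("serviceId")))
    return handlers
```

Applied to the example configuration above, the function returns three JwsHandler entries (which have no serviceId) followed by the ncWms and godiva WebServiceHandler entries.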

3.6.5. Logging

For information about logging, see the <a href="#hyrax3.6.10">Hyrax Logging Configuration Documentation</a>.

3.6.6. Authentication and Authorization

The following subsections detail authentication and authorization.

Apache Web Server (httpd)

If your organization desires secure access and authentication layers for Hyrax, the recommended method is to use Hyrax in conjunction with the Apache Web Server (httpd).

Most organizations that use secure access and authentication for their web presence are already doing so via Apache Web Server, and Hyrax can be integrated with this existing infrastructure.

More about integrating Hyrax with Apache Web Server can be found at these pages:

  • <a href="/collaborate/open-data-services-and-software/api/hyrax-guidehyrax-guide-p5">Integrating Hyrax with Apache Web Server</a>
  • <a href="/collaborate/open-data-services-and-software/api/hyrax-guidehyrax-guide-p7#hyrax7.1.3">Configuring Hyrax and Apache for User Authentication and Authorization</a>

Tomcat

Hyrax may be used with the security features implemented by Tomcat for authentication and authorization services. We recommend that you carefully read and understand the Tomcat security documentation.

For Tomcat 7.x see:

  • <a href="https://tomcat.apache.org/tomcat-7.0-doc/index.html">Tomcat 7.x Documentation</a>
    • <a href="https://tomcat.apache.org/tomcat-7.0-doc/realm-howto.html">Section 7: Realm Configuration HOW-TO</a>
    • <a href="https://tomcat.apache.org/tomcat-7.0-doc/ssl-howto.html">Section 13: SSL/TLS Configuration HOW-TO</a>

For Tomcat 8.5.x see:

  • <a href="http://tomcat.apache.org/tomcat-8.5-doc/index.html">Tomcat 8.5.x Documentation</a>
    • <a href="https://tomcat.apache.org/tomcat-8.5-doc/realm-howto.html">Section 7: Realm Configuration HOW-TO</a>
    • <a href="https://tomcat.apache.org/tomcat-8.5-doc/ssl-howto.html">Section 13: SSL/TLS Configuration HOW-TO</a>

We also recommend that you read chapter 12 of the <a href="http://jcp.org/aboutJava/communityprocess/final/jsr154/index.html">Java Servlet Specification 2.4</a>, which describes how to configure security constraints at the web-application level.

Tomcat security requires fairly extensive additions to the web.xml file located here: ${CATALINA_HOME}/webapps/opendap/WEB-INF/web.xml

yellow triangle icon Altering the <servlet> definitions may render your Hyrax server inoperable.

Examples of security content for the web.xml file can be found in the persistent content directory of the Hyrax server, which by default is located here: $CATALINA_HOME/webapps/opendap/WEB-INF/conf/TomcatSecurityExample.xml

Limitations

Tomcat security officially supports context-level authentication. This means that you can restrict access to the collection of servlets running in a single web application (that is, everything defined in a single web.xml file). You can call out different authentication rules for different <url-pattern> elements within the web application, but only clients that do not cache ANY security information will be able to easily access the different areas.

For example, in your web.xml file you might have the following:

<security-constraint>
    <web-resource-collection>
        <web-resource-name>fnoc1</web-resource-name>
        <url-pattern>/hyrax/nc/fnoc1.txt</url-pattern>
    </web-resource-collection>
    <auth-constraint>
        <role-name>fn1</role-name>
    </auth-constraint>
</security-constraint>
<security-constraint>
    <web-resource-collection>
        <web-resource-name>fnoc2</web-resource-name>
        <url-pattern>/hyrax/nc/fnoc2.txt</url-pattern>
    </web-resource-collection>
    <auth-constraint>
        <role-name>fn2</role-name>
    </auth-constraint>
</security-constraint>
<login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>MyApplicationRealm</realm-name>
</login-config>

Here the security roles fn1 and fn2 (defined in the tomcat-users.xml file) have no common members.

The complete URIs would be:

http://localhost:8080/mycontext/hyrax/nc/fnoc1.txt
http://localhost:8080/mycontext/hyrax/nc/fnoc2.txt

This works for clients that do not cache anything; however, if you access these URLs with a typical internet browser, authenticating for one URI would lock you out of the other URI until you "reset" the browser by purging all caches. This happens because, in the exchange between Tomcat and the client, Tomcat sends the header WWW-Authenticate: Basic realm="MyApplicationRealm", and the client authenticates.

When you access the second URI, Tomcat sends the same authentication challenge with the same WWW-Authenticate header. The client, having recently authenticated to this realm-name (defined in the <login-config> element in the web.xml file, see above), resends the authentication information, and, since it is not valid for that URL pattern, the request is denied.

Persistence

Be sure to back up your modified web.xml file to a location outside of the $CATALINA_HOME/webapps/opendap directory, as newly-installed versions of Hyrax will overwrite it.

You could, for example, use an XML ENTITY and an entity reference in the web.xml. This will cause a local file containing the security configuration to be included in the web.xml. For example:

  1. Add the ENTITY

     [<!ENTITY securityConfig SYSTEM "file:/fully/qualified/path/to/your/security/config.xml">]

     to the !DOCTYPE declaration at the top of the web.xml.
  2. Add an entity reference (&securityConfig;) to the content of the web-app element. This would cause your external security configuration to be included in the web.xml file.

The following is an example ENTITY configuration:

     <?xml version="1.0" encoding="ISO-8859-1"?>
        <!DOCTYPE web-app
            PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN"
            "http://java.sun.com/j2ee/dtds/web-app_2_2.dtd"
            [<!ENTITY securityConfig      SYSTEM "file:/fully/qualified/path/to/your/security/config.xml">]
        >
        <web-app>
            <!--
                Loads a persistent security configuration from the content directory.
                This configuration may be empty, in which case no security constraints will be
                applied by Tomcat.
            -->
            &securityConfig;
            .
            .
            .
        </web-app>
    

    This will not prevent you from losing your web.xml file when a new version of Hyrax is installed, but adding the ENTITY to the new web.xml file is easier than remembering an extensive security configuration.

    3.6.7. Compressed Responses and Tomcat

    Many OPeNDAP clients accept compressed responses. This can greatly increase the efficiency of the client/server interaction by diminishing the number of bytes actually transmitted over "the wire." Tomcat provides native compression support for the GZIP compression mechanism; however, it is NOT turned on by default.

    The following example is based on Tomcat 7.0.76. We recommend that you carefully read the Tomcat documentation related to this topic before proceeding:

      • <a href="http://tomcat.apache.org/">Tomcat Home</a>
      • <a href="https://tomcat.apache.org/tomcat-7.0-doc/config/http.html">Tomcat 7.x documentation for the HTTP Connector</a> (see Standard Implementation section)
      • <a href="https://tomcat.apache.org/tomcat-8.5-doc/config/http.html">Tomcat 8.5.x documentation for the HTTP/1.1 Connector</a> (see Standard Implementation section)

    Details

    To enable compression, you will need to edit the $CATALINA_HOME/conf/server.xml file. Locate the <Connector> element associated with your server. It is typically the only <Connector> element whose port attribute is set equal to 8080. You will need to add or change several of its attributes to enable compression.

    With our Tomcat 7.0.76 distribution, we found this default <Connector> element definition in our server.xml file:

    <Connector
        port="8080"
        protocol="HTTP/1.1"
        connectionTimeout="20000"
        redirectPort="8443"
    />

    You will need to add three attributes:

    compression="force"
    compressionMinSize="2048"
    compressableMimeType="text/html,text/xml,text/plain,text/css,text/javascript,application/javascript,application/octet-stream,application/vnd.opendap.dap4.dataset-services+xml,application/vnd.opendap.dap4.dataset-metadata+xml,application/vnd.opendap.dap4.data,application/vnd.opendap.dap4.error+xml,application/json,application/prs.coverage+json,application/rdf+xml,application/x-netcdf;ver=4,application/x-netcdf,image/tiff;application=geotiff"

    The list of compressible MIME types includes all known response types for Hyrax. The compression attribute may have the following values:

      • compression="off": Nothing is compressed (default if not provided).
      • compression="on": Only the compressible MIME types are compressed.
      • compression="force": Everything gets compressed (assuming the client accepts gzip and the response is bigger than compressionMinSize).

    Information Icon You must set compression="force" for compression to work with the OPeNDAP data transport.

    When you are finished, your <Connector> element should look like the following:

    <Connector
        port="8080"
        protocol="HTTP/1.1"
        connectionTimeout="20000"
        redirectPort="8443"
        compression="force"
        compressionMinSize="2048"
        compressableMimeType="text/html,text/xml,text/plain,text/css,text/javascript,application/javascript,application/octet-stream,application/vnd.opendap.dap4.dataset-services+xml,application/vnd.opendap.dap4.dataset-metadata+xml,application/vnd.opendap.dap4.data,application/vnd.opendap.dap4.error+xml,application/json,application/prs.coverage+json,application/rdf+xml,application/x-netcdf;ver=4,application/x-netcdf,image/tiff;application=geotiff"
    />
            

    Restart Tomcat for these changes to take effect.

    You can verify the change by using curl as follows:

    curl -H "Accept-Encoding: gzip" -I http://localhost:8080/opendap/data/nc/fnoc1.nc.ascii
            
    Information Icon The above URL is for Hyrax running on your local system and accessing a dataset that ships with the server.

    You’ll know that compression is enabled if the response to the curl command contains:

    Content-Encoding: gzip
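
If curl is not available, the same check can be scripted. The following Python sketch sends a HEAD request with an Accept-Encoding: gzip header and inspects the Content-Encoding response header. The function names are hypothetical, and the example URL assumes Hyrax is running locally with the sample data installed.

```python
import urllib.request


def is_gzip_encoded(headers):
    """Return True if the response headers report gzip content encoding."""
    value = headers.get("Content-Encoding", "")
    return "gzip" in [token.strip().lower() for token in value.split(",")]


def check_compression(url):
    """Send a HEAD request that accepts gzip and report whether gzip was used."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"Accept-Encoding": "gzip"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return is_gzip_encoded(response.headers)

# Example call (requires a running server):
# check_compression("http://localhost:8080/opendap/data/nc/fnoc1.nc.ascii")
```

A return value of True from check_compression corresponds to seeing Content-Encoding: gzip in the curl output.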
            
    Information Icon If you are using Tomcat in conjunction with the Apache Web Server (httpd) via AJP, you will need to also

    Localizing the OLFS Configuration under SELinux

    When using a yum-installed Tomcat on CentOS-7.x (or any other Linux environment that is essentially an SELinux variant), neither the /etc/olfs nor the /usr/share/olfs configuration location will work without taking extra steps. You must alter the SELinux access policies to give the Tomcat user permission to read and write to one of these directories.

    The following code block configures the /usr/share/olfs directory for reading and writing by the Tomcat user:

    #!/bin/sh
    # Each command must run as the super user, so each one is prefixed with sudo.
    # Create the location for the local configuration
    sudo mkdir -p /usr/share/olfs
    # Change the group ownership to the tomcat group.
    # (SELinux will not allow you to make the owner tomcat.)
    sudo chgrp tomcat /usr/share/olfs
    # Make it writable by the tomcat group
    sudo chmod g+w /usr/share/olfs
    # Use semanage to change the context of the target
    # directory and any (future) child dirs
    sudo semanage fcontext -a -t tomcat_var_lib_t "/usr/share/olfs(/.*)?"
    # Use restorecon to commit/do the labeling.
    sudo restorecon -rv /usr/share/olfs
                

    For further reading about SELinux and its permissions issues, see the following:

    Tomcat Logs

    In SELinux the yum-installed Tomcat does not produce a catalina.out file; rather, the output is sent to the journal and can be viewed with the following command:

    journalctl -u tomcat
                
    3.6.9. Configuring The OLFS To Work With Multiple BES’s

    Configuring Hyrax to use multiple BES backends is straightforward. It requires that you edit the olfs.xml file and possibly the catalog.xml file.

    Top Level (root) BES

    Every installation of Hyrax requires a top level (or root level) BES. This BES has a prefix of "/" (the forward slash character). The prefix is a URL token between the server address/port and catalog root used to designate a particular BES instance in the case that multiple Back-End-Servers are available to a single OLFS. The default (for a single BES) is no additional tag, designated by "/". The prefix is used to provide a mapping for each BES connected to the OLFS to the URI space serviced by the OLFS.

    In a single BES deployment this BES would contain all of the data resources to be made visible in Hyrax. In the THREDDS catalog.xml file each top level directory/collection would have its own <datasetScan> element.

    Note: The word root here has absolutely nothing to do with the login account called root associated with the super user or system administrator.

    Single BES Example (Default)

    Here is the opendap.bes.BESManager <Handler> element from an olfs.xml file that configures the OLFS to use a single BES, the default configuration for Hyrax:

        <Handler className="opendap.bes.BESManager">
            <BES>
                <prefix>/</prefix>
                <host>localhost</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
        </Handler>
                

    The BES is running on the same system as the OLFS, and its prefix is correctly set to "/". This BES will handle all data requests directed at the OLFS and will expose its top level directory/collection/catalog in the URI space of the OLFS here:

    http://localhost:8080/opendap/
                

    The THREDDS catalog.xml file for this should contain a <datasetScan> element for each of the top level directories | collections | catalogs that the BES exposes at the above URI.

    *Remember*: There must be one (but only one) BES configured with the <prefix> set to "/" in your olfs.xml file.
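
The rule above is easy to check mechanically. The following Python sketch inspects a BESManager <Handler> element and reports a problem if the number of BESs with the prefix "/" is not exactly one. It also flags duplicate prefixes, which is an assumption added for this example rather than a rule stated above. The function name is hypothetical and the check is not part of Hyrax.

```python
import xml.etree.ElementTree as ET


def check_bes_prefixes(handler_xml):
    """Return a list of problems found in a BESManager <Handler> element."""
    handler = ET.fromstring(handler_xml)
    prefixes = [(bes.findtext("prefix") or "").strip() for bes in handler.findall("BES")]
    problems = []
    # Rule from the text: exactly one BES must use the root prefix "/".
    root_count = prefixes.count("/")
    if root_count != 1:
        problems.append(
            "expected exactly one BES with prefix '/', found {}".format(root_count)
        )
    # Assumption for this sketch: every other prefix should be unique as well.
    for prefix in sorted({p for p in prefixes if p != "/" and prefixes.count(p) > 1}):
        problems.append("prefix {!r} is used by more than one BES".format(prefix))
    return problems
```

An empty list means the configuration passes both checks; the single BES example above produces an empty list.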

    Multiple BES examples

    Here is a BESManager <Handler> element that defines two BES’s:

        <Handler className="opendap.bes.BESManager">
            <BES>
                <prefix>/</prefix>
                <host>localhost</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
            <BES>
                <prefix>/sst</prefix>
                <host>comet.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
        </Handler>
                

    The first one is running on the same system as the OLFS, the second on comet.test.org. The second BES is mapped to the prefix /sst. So the URL:

    http://localhost:8080/opendap/
                

    will return the directory view at the top level of the first BES, running on the same system as the OLFS. The URL:

    http://localhost:8080/opendap/sst
                

    will return the directory view at the top level of the second BES, running on comet.test.org.
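
The mapping from request path to BES can be pictured as a longest-prefix match. The Python sketch below illustrates that idea for paths relative to the opendap context (the part of the URL after /opendap). Longest-prefix matching is an assumption made for this illustration; the OLFS's actual routing logic may differ, and the function name is hypothetical.

```python
def select_bes(path, prefixes):
    """Pick the BES prefix that serves a request path (longest-prefix match).

    path     -- request path relative to the opendap context, e.g. "/sst/x.nc"
    prefixes -- the <prefix> values configured in olfs.xml, e.g. ["/", "/sst"]
    """
    if not path.startswith("/"):
        path = "/" + path
    best = None
    for prefix in prefixes:
        normalized = prefix.rstrip("/") or "/"
        # A prefix matches on whole path segments only, so "/sst" does not
        # capture "/sstx"; the root prefix "/" matches every path.
        matches = (
            normalized == "/"
            or path == normalized
            or path.startswith(normalized + "/")
        )
        if matches and (best is None or len(normalized) > len(best)):
            best = normalized
    return best
```

With the two-BES configuration above, select_bes("/sst/data.nc", ["/", "/sst"]) selects "/sst", while a path such as "/data/nc/fnoc1.nc" falls through to the root BES.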

    You can repeat this pattern to add more BES’s to the configuration. This next example shows a configuration with 4 BES’s: The root BES, and 3 others:

        <Handler className="opendap.bes.BESManager">
            <BES>
                <prefix>/</prefix>
                <host>server0.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
            <BES>
                <prefix>/sst</prefix>
                <host>server1.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
            <BES>
                <prefix>/chl-a</prefix>
                <host>server2.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
            <BES>
                <prefix>/salinity</prefix>
                <host>server3.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
        </Handler>
                

    Note that in this example:

    1. The root BES is not necessarily running on the same host as the OLFS.
    2. Every BES has a different prefix.
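    3. The OLFS directs each request to the BES whose prefix matches the request path: paths under /sst go to server1.test.org, paths under /chl-a go to server2.test.org, paths under /salinity go to server3.test.org, and all other paths go to the root BES on server0.test.org.

    Mount Points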

    In a multiple BES installation each additional BES must have a mount point within the exposed hierarchy of collections for it to be visible in Hyrax.

    Suppose you have this configuration:

        <Handler className="opendap.bes.BESManager">
            <BES>
                <prefix>/</prefix>
                <host>server0.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
        </Handler>
                

    And the top level directory for the root BES looks like this:

    [Figure "Hyrax 2": top-level directory listing served by the root BES]

    If you add another BES, like this:

        <Handler className="opendap.bes.BESManager">
            <BES>
                <prefix>/</prefix>
                <host>server0.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
            <BES>
                <prefix>/sst</prefix>
                <host>server5.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
        </Handler>
                

    The new BES will not appear in the top-level directory unless you create a mount point. This means that on the file system served by the root BES you need to create a directory called "sst" at the top of the directory tree that the root BES is exposing. In other words, create a directory called "sst" in the same directory that contains the "Test" and "data" directories on server0.test.org. After you do that, your top-level directory will look like this:

    [Figure "Hyrax 3": the same top-level directory listing, now including the sst mount point]

    This holds true for any arrangement of BESs that you make. The location of the mount point will depend on your configuration, and how you organize things. Here is a more complex example.

    Consider this configuration:

        <Handler className="opendap.bes.BESManager">
            <BES>
                <prefix>/</prefix>
                <host>server0.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
            <BES>
                <prefix>/GlobalTemperature</prefix>
                <host>server1.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
            <BES>
                <prefix>/GlobalTemperature/NorthAmerica</prefix>
                <host>server2.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
            <BES>
                <prefix>/GlobalTemperature/NorthAmerica/Canada</prefix>
                <host>server3.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
            <BES>
                <prefix>/GlobalTemperature/NorthAmerica/USA</prefix>
                <host>server4.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
            <BES>
                <prefix>/GlobalTemperature/Europe/France</prefix>
                <host>server4.test.org</host>
                <port>10022</port>
                <ClientPool maximum="10" />
            </BES>
        </Handler>
                
    • The mount point "GlobalTemperature" must be in the top of the directory tree that the root BES on server0.test.org is exposing.
    • The mount point "NorthAmerica" must be in the top of the directory tree that the BES on server1.test.org is exposing.
    • The mount point "Canada" must be in the top of the directory tree that the BES on server2.test.org is exposing.
    • The mount point "USA" must be in the top of the directory tree that the BES on server2.test.org is exposing.
    • The mount point "France" must be located at "GlobalTemperature/Europe/France" relative to the top of the directory tree that the BES on server0.test.org is exposing.
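
    The routing implied by this configuration can be sketched in a few lines of Python. This is only an illustration, not the OLFS implementation: it assumes that the BES with the longest matching prefix wins, and the function names and request paths are hypothetical.

```python
# Hypothetical sketch of prefix-based routing; not the actual OLFS code.
# The (prefix, host) pairs are copied from the configuration above.
BES_TABLE = [
    ("/", "server0.test.org"),
    ("/GlobalTemperature", "server1.test.org"),
    ("/GlobalTemperature/NorthAmerica", "server2.test.org"),
    ("/GlobalTemperature/NorthAmerica/Canada", "server3.test.org"),
    ("/GlobalTemperature/NorthAmerica/USA", "server4.test.org"),
    ("/GlobalTemperature/Europe/France", "server4.test.org"),
]


def prefix_matches(prefix, path):
    """Return True if path is the prefix itself or lies beneath it."""
    if prefix == "/":
        return path.startswith("/")
    return path == prefix or path.startswith(prefix + "/")


def route(path):
    """Return the host of the BES with the longest matching prefix (assumed rule)."""
    candidates = [(p, h) for p, h in BES_TABLE if prefix_matches(p, path)]
    _, host = max(candidates, key=lambda item: len(item[0]))
    return host


# Example request paths (made up for illustration).
for request_path in (
    "/data/nc/example.nc",
    "/GlobalTemperature/Europe/Spain/example.nc",
    "/GlobalTemperature/NorthAmerica/Canada/example.nc",
):
    print(request_path, "->", route(request_path))
```

    Under this assumed rule, a path such as /GlobalTemperature/Europe/Spain is still answered by the BES on server1.test.org, because only the France subtree has its own prefix.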
    Complete olfs.xml with multiple BES installations example
    <?xml version="1.0" encoding="UTF-8"?>
    <OLFSConfig>
        <DispatchHandlers>
            <HttpGetHandlers>
                <Handler className="opendap.bes.BESManager">
                    <BES>
                        <prefix>/</prefix>
                        <host>server0.test.org</host>
                        <port>10022</port>
                        <ClientPool maximum="10" />
                    </BES>
                    <BES>
                        <prefix>/GlobalTemperature</prefix>
                        <host>server1.test.org</host>
                        <port>10022</port>
                        <ClientPool maximum="10" />
                    </BES>
                    <BES>
                        <prefix>/GlobalTemperature/NorthAmerica</prefix>
                        <host>server2.test.org</host>
                        <port>10022</port>
                        <ClientPool maximum="10" />
                    </BES>
                    <BES>
                        <prefix>/GlobalTemperature/NorthAmerica/Canada</prefix>
                        <host>server3.test.org</host>
                        <port>10022</port>
                        <ClientPool maximum="10" />
                    </BES>
                    <BES>
                        <prefix>/GlobalTemperature/NorthAmerica/USA</prefix>
                        <host>server4.test.org</host>
                        <port>10022</port>
                        <ClientPool maximum="10" />
                    </BES>
                    <BES>
                        <prefix>/GlobalTemperature/Europe/France</prefix>
                        <host>server4.test.org</host>
                        <port>10022</port>
                        <ClientPool maximum="10" />
                    </BES>
                </Handler>
                <Handler className="opendap.coreServlet.SpecialRequestDispatchHandler" />
                <Handler className="opendap.bes.VersionDispatchHandler" />
                <Handler className="opendap.bes.DirectoryDispatchHandler">
                    <DefaultDirectoryView>OPeNDAP</DefaultDirectoryView>
                </Handler>
                <Handler className="opendap.bes.DapDispatchHandler" />
                <Handler className="opendap.bes.FileDispatchHandler" >
                    <!-- <AllowDirectDataSourceAccess /> -->
                </Handler>
                <Handler className="opendap.bes.ThreddsDispatchHandler" />
            </HttpGetHandlers>
            <HttpPostHandlers>
                <Handler className="opendap.coreServlet.SOAPRequestDispatcher" >
                    <OpendapSoapDispatchHandler>opendap.bes.SoapDispatchHandler</OpendapSoapDispatchHandler>
                </Handler>
            </HttpPostHandlers>
        </DispatchHandlers>
    </OLFSConfig>
                
    3.6.10. Logging Configuration Introduction

    We see logging activities falling into two categories:

    • Access Logging - Used to monitor server usage and performance, and to see which resources receive the most attention. Tomcat has a built-in access logging mechanism; all you have to do is turn it on.
    • Informational and debug logging - Most developers rely on embedded "instrumentation" that lets them monitor their code and see which parts are being executed. This instrumentation is typically designed so that it can be enabled or disabled at runtime. Hyrax has this type of debugging instrumentation and ships with it disabled, but you can enable it. If you encounter an internal problem with Hyrax, enable the relevant parts of the instrumentation at your site so that we can review the output and determine the cause.
    Access Logging

    Many people will want to record access logs for their Hyrax server, and we encourage you to do so. The easiest way to get a simple access log for Hyrax is to use the Tomcat/Catalina Valve component.

    AccessLogValve

    Since Hyrax’s public facade is provided by the OLFS running inside of the Tomcat servlet container, you may utilize Tomcat’s handy access logging which relies on the org.apache.catalina.valves.AccessLogValve class. By default Tomcat comes with this turned off.

    To turn it on,

    1. Locate the file $CATALINA_HOME/conf/server.xml.
    2. Find the commented-out section for the access log inside the <Host> element. The server.xml file contains many comments, some instructional and some containing code examples. The part you are looking for is nested inside the <Service> and <Engine> elements. Typically it will look like:
    <Service ...>
        .
        .
        .
        <Engine...>
            .
            .
            .
            <Host name="localhost" appBase="webapps"
                unpackWARs="true" autoDeploy="true"
                xmlValidation="false" xmlNamespaceAware="false">
                .
                .
                .
                <!-- Access log processes all requests for this virtual host.
                     By default, log files are created in the "logs"
                     directory relative to $CATALINA_HOME.  If you wish, you can
                     specify a different directory with the "directory"
                     attribute.  Specify either a relative (to $CATALINA_HOME)
                     or absolute path to the desired directory. -->
                <!--
                <Valve className="org.apache.catalina.valves.AccessLogValve"
                       directory="logs"  prefix="localhost_access_log." suffix=".txt"
                       pattern="common" resolveHosts="false"/>
                -->
                .
                .
                .
            </Host>
            .
            .
            .
        </Engine>
        .
        .
        .
    </Service>
                

    Uncomment the <Valve> element to enable it, and change the values of its attributes to suit your installation. For example:

                <Valve className="org.apache.catalina.valves.AccessLogValve"
                       directory="logs"
                       prefix="access_log."
                       suffix=".log"
                       pattern="%h %l %u %t &quot;%r&quot; %s %b %D"
                       resolveHosts="false"/>
                
    3. Save the file.
    4. Restart Tomcat.
    5. Read your log files.

    Note that the pattern attribute allows you to customize the content of the access log entries. It is documented in the Tomcat/Catalina javadocs for the org.apache.catalina.valves.AccessLogValve class and in the Tomcat Server Configuration Reference. The pattern shown above produces log output like the example below:

    69.59.200.52 - - [05/Mar/2007:16:29:14 -0800] "GET /opendap/data/nc/contents.html HTTP/1.1" 200 13014 234
    69.59.200.52 - - [05/Mar/2007:16:29:14 -0800] "GET /opendap/docs/images/logo.gif HTTP/1.1" 200 8114 2
    69.59.200.52 - - [05/Mar/2007:16:29:51 -0800] "GET /opendap/data/nc/TestPatDbl.nc.html HTTP/1.1" 200 11565 137
    69.59.200.52 - - [05/Mar/2007:16:29:56 -0800] "GET /opendap/data/nc/data.nc.ddx HTTP/1.1" 200 2167 121
                

    The last column is the time in milliseconds it took to service the request and the next to the last column is the number of bytes returned.
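
    If you want to summarize such a log, the entries are easy to split into fields. The following Python sketch assumes the exact pattern shown above ("%h %l %u %t "%r" %s %b %D"); the field names and the parse_line function are our own illustrative choices, not part of Tomcat or Hyrax.

```python
import re

# One log entry written with the pattern: %h %l %u %t "%r" %s %b %D
LOG_LINE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) (?P<millis>\d+)$'
)


def parse_line(line):
    """Return a dict of fields for one access log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Tomcat writes "-" for %b when no bytes were sent.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    entry["millis"] = int(entry["millis"])
    return entry


sample = (
    '69.59.200.52 - - [05/Mar/2007:16:29:14 -0800] '
    '"GET /opendap/data/nc/contents.html HTTP/1.1" 200 13014 234'
)
entry = parse_line(sample)
print(entry["request"], entry["bytes"], entry["millis"])
```

    If you change the pattern attribute, the regular expression has to be adjusted to match.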

    Informational and Debug Logging (Using the Logback implementation of Log4j)

    In general you shouldn’t have to modify the default logging configuration for Hyrax. It may become necessary if you encounter problems, but otherwise we suggest you leave it be.

    Having said that, Hyrax uses the Logback logging package to provide an easily configurable and flexible logging environment. All "console" output is routed through the Logback package and can be controlled using the Logback configuration file.

    There are several logging levels available:

    • TRACE
    • DEBUG
    • INFO
    • WARN
    • ERROR
    • FATAL (a Log4j level; Logback itself has no FATAL level, so ERROR is the most severe level it reports)

    Hyrax ships with a default logging level of ERROR.

    Additionally, Hyrax maintains its own access log using Logback.

    Note: We strongly recommend that you take the time to read about Logback and Log4j before you attempt to change the Logback configuration.

    Configuration File Location

    Logback gets its configuration from an XML file. Hyrax locates this file in the following manner:

    1. Checks the <init-param> list for the hyrax servlet (in the web.xml) for an <init-param> called "logbackConfig". If found, the value of this parameter is assumed to be a fully qualified path name for the file. This can be used to specify alternate Logback config files.
      Note: We do not recommend setting this parameter, because the setting is not persistent: it will be overwritten the next time the Web ARchive file is deployed.
    2. Failing 1: Hyrax then checks in the persistent content directory (set by either the OLFS_CONFIG_DIR environment variable or in /etc/olfs) for the file "logback-test.xml". If this file is present then it will be used to configure logging, and new installations of Hyrax will detect and use this logging configuration automatically.
    3. Failing 2: Hyrax then checks in the persistent content directory (set by either the OLFS_CONFIG_DIR environment variable or in /etc/olfs) for the file "logback.xml". If this file is present then it will be used to configure logging, and new installations of Hyrax will detect and use this logging configuration automatically.
    4. Failing 3: Hyrax falls back to the logback.xml file shipped with the distribution, which is located in the $CATALINA_HOME/webapps/opendap/WEB-INF directory. Changes made to this file will be lost when a new version of Hyrax is installed or the opendap.war Web ARchive file is redeployed.

    So, if you want to customize your Hyrax logging and have the changes persist, copy the distributed logback.xml file ($CATALINA_HOME/webapps/opendap/WEB-INF/logback.xml) to the persistent content directory (set by either the OLFS_CONFIG_DIR environment variable or in /etc/olfs) and edit that copy.
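
    The lookup order described above can be summarized with a short Python sketch. It is an illustration only, not the Hyrax code: the function name and arguments are hypothetical, and it assumes that a "logbackConfig" value pointing at a missing file simply falls through to the next step.

```python
import os


def find_logback_config(init_param, config_dir, webapp_dir, exists=os.path.isfile):
    """Return the first Logback configuration file found, in the order listed above.

    init_param -- value of the "logbackConfig" init parameter, or None if unset
    config_dir -- the persistent content directory
    webapp_dir -- the WEB-INF directory of the deployed web application
    exists     -- predicate used to test for a file (replaceable for testing)
    """
    candidates = []
    if init_param:
        candidates.append(init_param)                                # step 1
    candidates.append(os.path.join(config_dir, "logback-test.xml"))  # step 2
    candidates.append(os.path.join(config_dir, "logback.xml"))       # step 3
    candidates.append(os.path.join(webapp_dir, "logback.xml"))       # step 4
    for path in candidates:
        if exists(path):
            return path
    return None


# Assumption: OLFS_CONFIG_DIR, when set, takes precedence over /etc/olfs.
config_dir = os.environ.get("OLFS_CONFIG_DIR", "/etc/olfs")
webapp_dir = os.path.join(
    os.environ.get("CATALINA_HOME", ""), "webapps", "opendap", "WEB-INF"
)
print(find_logback_config(None, config_dir, webapp_dir))
```

    Running this on the server host prints the file that such a lookup would select, or None if no candidate exists.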

    Configuration

    Did you read about Logback and Log4j? Great!

    There are a number of Appenders defined in the Hyrax logback.xml file:

    • stdout - Loggers using this Appender will send everything to the console/stdout - which in a Tomcat environment will get shunted into the file $TOMCAT_HOME/logs/catalina.out.
    • devNull - Loggers using this Appender will not log. All messages will be discarded. This is the Log4j equivalent of piping your output into /dev/null in a UNIX environment.
    • ErrorLog - Loggers using this Appender will have their log output placed in the error log file in the persistent content directory: $TOMCAT_HOME/content/opendap/logs/error.log.
    • HyraxAccessLog - Loggers using this Appender will have their log output placed in the access log file in the persistent content directory: $TOMCAT_HOME/content/opendap/logs/HyraxAccess.log

    The default configuration pushes ERROR level (and higher) messages into the ErrorLog, and logs accesses using HyraxAccessLog. You can turn on debugging level logging by changing the log level to DEBUG for the software components you are interested in. All of the OPeNDAP code is in the "opendap" package. The following configuration will cause all log messages of ERROR level or higher to be sent to the error log:

        <logger name="opendap" level="error">
            <appender-ref ref="ErrorLog"/>
        </logger>
                

    The following configuration will cause all messages of level INFO or higher to be sent to stdout, which (in Tomcat) means that they will be written to the file $TOMCAT_HOME/logs/catalina.out.

        <logger name="opendap" level="info">
            <appender-ref ref="stdout"/>
        </logger>
                

    Be sure to get in touch if you have further questions about the logging configuration.

    3.6.11. THREDDS Configuration Overview

    Hyrax now uses its own implementation of the THREDDS catalog services and supports most of the THREDDS catalog service stack. The implementation relies on two DispatchHandlers in the OLFS and utilizes XSLT to provide HTML versions (presentation views) for human consumption.

    1. Dynamic THREDDS catalogs for holdings provided by the BES are provided by the opendap.bes.BESThreddsDispatchHandler.
    2. Static THREDDS catalogs are provided by the opendap.threddsHandler.StaticCatalogDispatch. The static catalogs allow catalog "graphs" to be decoupled from the filesystem "graph" of the data holdings, thus allowing data providers the ability to present and organize data collections independently of how they are organized in the underlying filesystem.

    Static THREDDS catalogs are "rooted" in a master catalog file, catalog.xml, located in the (persistent) content directory for the OLFS (typically $CATALINA_HOME/content/opendap). The default catalog.xml that comes with Hyrax contains a simple catalogRef element that points to the dynamic THREDDS catalogs generated from the BES holdings. The default catalog also contains a commented-out datasetScan element that, if enabled, provides a simple demonstration of the datasetScan capabilities. Additional catalog components may be added to the catalog.xml file to build (potentially large) static catalogs.

    Note: THREDDS datasetScan elements are now fully supported and can be used as a tool for altering the catalog presentation of any part of the BES catalog. These alterations include (but are not limited to) renaming, automatic proxy generation, filtering, and metadata injection.
    THREDDS Catalogs using XSLT

    Prior to Hyrax 1.5, THREDDS catalog functionality in Hyrax was provided using an imported implementation of THREDDS. This was a large and complex dependency for Hyrax, and the implementation had significant scalability problems for large catalogs. (Catalogs with 20k or more entries would consume all available memory.)

    In response, we replaced the imported code with two OLFS handlers written for Hyrax.

    BES THREDDS Handler

    The opendap.bes.BESThreddsDispatchHandler provides THREDDS catalogs for all data served from a BES. It requires no configuration. Adding it to the OLFS configuration file ($CATALINA_HOME/content/opendap/olfs.xml) is enough to provide THREDDS catalogs for data served from the BES.

    This handler uses XSL transforms to convert the BES <showCatalog> response into a THREDDS catalog.

    Default Configuration

    <Handler className="opendap.bes.BESThreddsDispatchHandler" />
                

    THREDDS Dispatch Handler

    The opendap.threddsHandler.Dispatch handler provides THREDDS catalog functionality for static THREDDS catalogs located on the system with the OLFS. The handler uses XSL transforms to provide HTML presentation views of both the catalogs and individual datasets within the catalog. Much like the TDS, data access links are available on the dataset pages (if the catalog contains the information for the access links).

    Memory Caching

    The implementation can be configured to use memory caching of THREDDS catalogs to improve speed and reduce disk thrashing.

    When memory caching is enabled, the handler will traverse the local THREDDS catalogs at startup. Each catalog file will be read into a memory buffer and cached. The memory buffer is parsed to verify that the catalog represents valid XML, but the resulting document is not saved. When a thredds:catalogRef element is encountered during the traversal, its href is evaluated:

    • If the href is a relative URL (does not begin with a "/" or "http://") then the catalog is traversed and cached.
    • If the href begins with a "/" character, it is assumed that the catalog is being provided by another service on the same system, and it is not traversed or cached.
    • If the href begins with a "http://", it is assumed to be a remotely hosted catalog provided by another service on a different system, and it is not traversed or cached.
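
    These three rules amount to a simple classification of the href value. The Python sketch below restates them literally; the function names and return labels are hypothetical and are not taken from the handler's source code. Because the rules above mention only "http://", the sketch does not treat other schemes specially.

```python
def classify_catalog_ref(href):
    """Classify a thredds:catalogRef href according to the three rules above."""
    if href.startswith("http://"):
        return "remote"      # hosted on a different system: not traversed or cached
    if href.startswith("/"):
        return "same-host"   # another service on this system: not traversed or cached
    return "relative"        # part of the local catalog tree: traversed and cached


def should_cache(href):
    """Only relative references are followed and cached during the startup traversal."""
    return classify_catalog_ref(href) == "relative"


# Example href values (made up for illustration).
for href in (
    "v27/Landsat/catalog.xml",
    "/otherService/catalog.xml",
    "http://example.org/thredds/catalog.xml",
):
    print(href, "->", classify_catalog_ref(href), should_cache(href))
```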

    When a client asks for an XML catalog response, the entire cached buffer for the catalog is dumped to the client in a single write command. Since an already existing byte buffer is written to the response stream, this should be very fast.

    If the client asks for an HTML view of the catalog, the buffer is parsed and passed through an XSL transform to generate the HTML page. The thinking behind this is as follows: machines traversing the XML files require fast response times. Humans will be traversing the HTML views of the catalog. We figure that the latency generated by parsing and performing transforms will be acceptable to most users.

    If memory caching is disabled, then the startup remains the same, except no data is cached. Subsequent client requests for THREDDS products are handled in the same manner as before, only the catalog content is read from disk each time. While this means that the XML responses will be much slower, it will scale to handle much larger static catalog collections.

    Cache Updates

    Each time a catalog request is processed, the source file’s last modified date is checked. If the catalog in memory was cached prior to the last modified date, it and all of its descendants in the catalog hierarchy are purged from the cache and reloaded.

    prefix element

    This handler requires a prefix element in the configuration: <prefix>thredds</prefix>. The value of the prefix element is used by the handler to identify requests intended for it. Basically, it will claim any request whose path begins with the prefix.

    For example, if the prefix is set to "thredds", then the request http://localhost:8080/opendap/thredds/catalog.xml will be claimed by the handler, while this request: http://localhost:8080/opendap/catalog.xml will not. (Although it would be claimed by the BES THREDDS Handler.)
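
    As a rough illustration of this rule, the following Python sketch decides whether a request URL would be claimed. It is not the handler's code: the function name is hypothetical, the web application context "/opendap" is taken from the example URLs, and matching on whole path segments is our assumption.

```python
from urllib.parse import urlsplit


def handler_claims(url, prefix="thredds", context="/opendap"):
    """Return True if the request path, relative to the web application, begins with prefix."""
    path = urlsplit(url).path
    if not path.startswith(context + "/"):
        return False
    relative = path[len(context) + 1:]
    # Assumption: the prefix must match a complete leading path segment.
    return relative == prefix or relative.startswith(prefix + "/")


print(handler_claims("http://localhost:8080/opendap/thredds/catalog.xml"))
print(handler_claims("http://localhost:8080/opendap/catalog.xml"))
```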

    Presentation View (HTML)

    Replacing the .xml at the end of a catalog’s name with .html will cause the opendap.threddsHandler.Dispatch to return an HTML presentation view of the catalog. This is accomplished by parsing the catalog.xml document (either from memory if cached or from disk if not) and running the resulting document through an XSL transform. All the metadata for all thredds:dataset elements can be inspected in a separate HTML page that details the dataset. This page is also generated by an XSL transform applied to the catalog XML document.

    Default configuration

    <Handler className="opendap.threddsHandler.Dispatch">
        <prefix>thredds</prefix>
        <useMemoryCache>true</useMemoryCache>
    </Handler>
                
    THREDDS Catalog Documentation

    Rather than provide an exhaustive explanation of the THREDDS catalog functionality and configuration, we will appeal to the existing documents provided by our fine colleagues at UNIDATA:

    Configuration Instructions
    • The current default (olfs.xml) file comes with THREDDS configured correctly.
    • The THREDDS master catalog is stored in the file $CATALINA_HOME/content/opendap/catalog.xml. It can be edited to provide additional static catalog access.
    datasetScan Support

    The datasetScan element is a powerful tool that can be used to sculpt the catalog’s presentation of the BES catalog content. The Hyrax implementation has a couple of key points that need to be considered when developing an instance of the datasetScan element in your THREDDS catalog.

    location attribute

    The location attribute specifies the place in the BES catalog graph where the datasetScan will be rooted. This value must be expressed relative to the BES catalog root (BES.Catalog.catalog.RootDirectory) and not in terms of the underlying BES host file system.

    For example, if BES.Catalog.catalog.RootDirectory=/usr/share/hyrax and the data directory to which you wish to apply the datasetScan is (in filesystem terms) located at /usr/share/hyrax/data/nc, then the associated datasetScan element’s location attribute would have a value of /data/nc:

    <datasetScan name="DatasetScanExample" path="hyrax" location="/data/nc">
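
    The location value can be derived mechanically from the two paths. The helper below is a hypothetical Python sketch, not part of Hyrax; it assumes POSIX-style paths and a data directory that lies inside the BES catalog root.

```python
import posixpath


def dataset_scan_location(root_directory, data_directory):
    """Express data_directory relative to the BES catalog root, with a leading slash."""
    relative = posixpath.relpath(data_directory, root_directory)
    if relative == ".":
        return "/"
    if relative == ".." or relative.startswith("../"):
        raise ValueError("data directory is outside the BES catalog root")
    return "/" + relative


# Root directory from the example above; the data directory lies beneath it.
print(dataset_scan_location("/usr/share/hyrax", "/usr/share/hyrax/data/nc"))
```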
                

    name attribute

    The name attribute specifies the name that will be used in the presentation (HTML) view for the catalog containing the datasetScan.

    path attribute

    The path attribute specifies the place in the THREDDS catalog graph that the datasetScan will be rooted. It is effectively a relative URL for the service. If path begins with a "/", then it is an absolute path rooted at the server and port of the web server. The values of the path attribute should never contain "catalog.xml" or "catalog.html". The service will create these endpoints dynamically.

    Relative path example

    Consider a catalog accessed with the URL http://localhost:8080/opendap/thredds/v27/Landsat/catalog.xml and that contains this datasetScan element:

    <datasetScan name="DatasetScanExample" path="hyrax" location="/data/nc" />
                

    In the client catalog, the datasetScan becomes this catalogRef element:

    <thredds:catalogRef
        name="DatasetScanExample"
        xlink:title="DatasetScanExample"
        xlink:href="hyrax/catalog.xml"
        xlink:type="simple"
    />
                

    And the top of the datasetScan catalog graph will be found at the URL http://localhost:8080/opendap/thredds/v27/Landsat/hyrax/catalog.xml.

    Absolute path examples

    Consider a catalog accessed with the URL http://localhost:8080/opendap/thredds/v27/Landsat/catalog.xml and that contains this datasetScan element:

    <datasetScan name="DatasetScanExample" path="/hyrax" location="/data/nc" />
                

    In the client catalog the datasetScan becomes this catalogRef element:

    <thredds:catalogRef
         name="DatasetScanExample"
         xlink:title="DatasetScanExample"
         xlink:href="/hyrax/catalog.xml"
         xlink:type="simple"
    />
                

    Then the top of datasetScan catalog graph will be found at the URL http://localhost:8080/hyrax/catalog.xml, which is probably not what you want! This catalogRef directs the catalog crawler away from the Hyrax THREDDS service and to an undefined (as far as Hyrax is concerned) endpoint, one that will most likely generate a 404 (Not Found) response from the Web Server.

    When using absolute paths you must be sure to prefix the path with the Hyrax THREDDS service path, or you will direct the clients away from the service. In these examples the Hyrax THREDDS service path would be /opendap/thredds/ (look at the URLs in the above examples). If we change the datasetScan path attribute value to /opendap/thredds/myDatasetScan:

    <datasetScan name="DatasetScanExample" path="/opendap/thredds/myDatasetScan" location="/data/nc" />
                

    In the client catalog the datasetScan becomes this catalogRef element:

    <thredds:catalogRef
        name="DatasetScanExample"
        xlink:title="DatasetScanExample"
        xlink:href="/opendap/thredds/myDatasetScan/catalog.xml"
        xlink:type="simple"
    />
                

    Now the top of the datasetScan catalog graph will be found at the URL http://localhost:8080/opendap/thredds/myDatasetScan/catalog.xml, which keeps the URL referencing the Hyrax THREDDS service and not some other part of the web service stack.
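
    The three outcomes above follow from ordinary URL resolution, which can be checked with Python's standard urllib.parse.urljoin. This sketch only reproduces how a client resolves the generated xlink:href against the catalog URL; the resolve function is an illustrative helper, not a Hyrax API.

```python
from urllib.parse import urljoin

# URL of the catalog that contains the datasetScan element (from the examples above).
CATALOG_URL = "http://localhost:8080/opendap/thredds/v27/Landsat/catalog.xml"


def resolve(path):
    """Resolve the catalogRef href generated for a datasetScan path attribute."""
    href = path + "/catalog.xml"
    return urljoin(CATALOG_URL, href)


for path in ("hyrax", "/hyrax", "/opendap/thredds/myDatasetScan"):
    print(path, "->", resolve(path))
```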

    useHyraxServices attribute

    The Hyrax version of the datasetScan element employs the extra attribute useHyraxServices. This allows the datasetScan to automatically generate Hyrax data services definitions and access links for datasets in the catalog. The datasetScan can be used to augment the list of services (when useHyraxServices is set to true) or it can be used to completely replace the Hyrax service stack (when useHyraxServices is set to false).

    Keep the following in mind:

    • If no services are referenced in the datasetScan and useHyraxServices is set to true, then Hyrax will provide catalogs with service definitions and access elements for all the datasets that the BES identifies as data.
    • If no services are referenced in the datasetScan and useHyraxServices is set to false, then the catalogs generated by the datasetScan will have no service definitions or access elements.

    By default useHyraxServices is set to true.

    Functions

    DatasetScan allows you to apply the following functions to the names of the datasets in the datasetScan catalog graph.

    filter

    A datasetScan element can specify which files and directories it will include with a filter element (also see THREDDS server catalog spec for details). The filter element allows users to specify which datasets are to be included in the generated catalogs. A filter element can contain any number of include and exclude elements. Each include or exclude element may contain either a wildcard or a regExp attribute. If the given wildcard pattern or regular expression matches a dataset name, that dataset is included or excluded as specified. By default, includes and excludes apply only to atomic datasets (regular files). You can specify that they apply to atomic and/or collection datasets (directories) by using the atomic and collection attributes.

    <filter>
        <exclude wildcard="*not_currently_supported" />
        <include regExp="/data/h5/dir2" collection="true" />
    </filter>
                

    sort

    Datasets at each collection level are listed in ascending order by name. With a sort element you can specify that they are to be sorted in reverse order:

    <sort>
        <lexigraphicByName increasing="false" />
    </sort>
                

    namer

    If no namer element is specified, all datasets are named with the corresponding BES catalog dataset name. By adding a namer element, you can specify more human readable dataset names.

    <namer>
        <regExpOnName regExp="/data/he/dir1" replaceString="AVHRR" />
        <regExpOnName regExp="(.*)\.h5" replaceString="$1.hdf5" />
        <regExpOnName regExp="(.*)\.he5" replaceString="$1.hdf5_eos" />
        <regExpOnName regExp="(.*)\.nc" replaceString="$1.netcdf" />
    </namer>
                

    addTimeCoverage

    A datasetScan element may contain an addTimeCoverage element. The addTimeCoverage element indicates that a timeCoverage metadata element should be added to each dataset in the collection and describes how to determine the time coverage for each dataset in the collection.

    <addTimeCoverage
        datasetNameMatchPattern="([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})_gfs_211.nc$"
        startTimeSubstitutionPattern="$1-$2-$3T$4:00:00"
        duration="60 hours"
    />
                

    Applied to the dataset named 2005071812_gfs_211.nc, the element above results in the following timeCoverage element:

     <timeCoverage>
        <start>2005-07-18T12:00:00</start>
        <duration>60 hours</duration>
      </timeCoverage>
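
    The substitution can be reproduced with a few lines of Python. This is a sketch of the idea only: the function name is hypothetical, and it assumes the match pattern is searched for anywhere in the dataset name and that $1, $2, ... refer to its capture groups.

```python
import re


def time_coverage_start(dataset_name, match_pattern, substitution_pattern):
    """Return the start time built from the capture groups, or None if the name does not match."""
    match = re.search(match_pattern, dataset_name)
    if match is None:
        return None
    # Replace each $N in the substitution pattern with capture group N.
    return re.sub(
        r"\$(\d+)",
        lambda ref: match.group(int(ref.group(1))),
        substitution_pattern,
    )


start = time_coverage_start(
    "2005071812_gfs_211.nc",
    r"([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})_gfs_211.nc$",
    "$1-$2-$3T$4:00:00",
)
print(start)  # 2005-07-18T12:00:00
```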
                

    addProxies

    For real-time data you may want to have a special link that points to the "latest" data in the collection. Here, "latest" simply means the last filename in a list sorted by name, so it is only the latest if the time stamp is in the filename and the name sorts correctly by time.

    <addProxies>
        <simpleLatest name="simpleLatest" />
        <latestComplete name="latestComplete" lastModifiedLimit="60.0" />
    </addProxies>
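
    The caveat about sorting by name is easy to see in a small Python sketch. The function and the file names below are made up for illustration and do not come from Hyrax.

```python
def simple_latest(names):
    """Return the last name in name-sorted order, or None for an empty collection."""
    return max(names) if names else None


# Zero-padded time stamps sort correctly by time.
padded = ["2005071800_gfs_211.nc", "2005071812_gfs_211.nc", "2005071806_gfs_211.nc"]
print(simple_latest(padded))    # 2005071812_gfs_211.nc

# Without zero padding, name order and time order disagree.
unpadded = ["run_9.nc", "run_10.nc"]
print(simple_latest(unpadded))  # run_9.nc
```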
                

    Last Updated: Sep 30, 2019 at 3:44 PM EDT