Tip: To download an xsd file, right-click and select "Save Link As..."
The term "Ingest" refers to the process by which ECHO loads published metadata files into its metadata holdings for use by the ECHO user community. The term "Ingest job" refers to a distinguishable grouping of submitted metadata files that ECHO Ingest processes and reports on.
This page describes the metadata Ingest process performed by ECHO. It is the responsibility of the Data Partner to regularly monitor the Ingest of their published metadata.
The information provided is broken up into the following topic areas:
- Ingest Workflow - An overview of how ECHO processes submitted metadata.
- Ingest Reporting - An overview of the available methods for tracking the progress of active Ingest jobs and the results of completed jobs.
- Ingest Policies - A listing of ingest-related policies which Data Partners should be aware of.
ECHO continually detects submitted metadata and creates Ingest jobs, which are then processed. Each job is transformed and validated against the ECHO 10.0 metadata schema, and sorted according to metadata action type. After these steps, a job may wait for associated browse image files or for proper sequencing. Once the job is released from a waiting state, it is queued for loading. When the current Ingest job completes loading, Ingest generates an ftp-accessible Ingest report, notifies the Data Partner, and begins loading the next queued job.
For a more detailed description of the internal ECHO Ingest states, please refer to:
- Section 6 of the ECHO Ingest Supplementary Specification - (PDF)
Data Providers submit their metadata files to a configured ftp location on the ECHO system. The ECHO Ingest process regularly detects published metadata files and will bundle a group of detected files or a single package file into an Ingest job.
Providers may deliver metadata files in two ways:
- Package Delivery - A single zip file containing metadata files and an xml manifest file describing the contents of the package.
- Single File - Metadata files are submitted by themselves.
ECHO recommends that Data Providers submit metadata using the package delivery method. This method allows providers to assign a textual job name (e.g. "Ingest Job #3 - 10/21/2008") and a package sequence number. Each Ingest job is associated with a sequence number to ensure that metadata is processed in the correct order. Providers are encouraged to use this mechanism by supplying their own package sequence numbers to manage the processing of their data.
For a more detailed description of metadata delivery methods, please refer to:
- Section 3.2 of the ECHO Data Partner User's Guide - (PDF)
- Package Manifest - (HTML) (XSD)
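The role of package sequence numbers can be illustrated with a minimal sketch. It assumes sequence numbers are consecutive integers starting at 1 and that a package is held until every lower-numbered package has been released. Both are assumptions made for illustration, and the function name `release_in_sequence` is hypothetical. The actual ECHO behavior, including timeouts, is defined in the documents referenced above.

```python
def release_in_sequence(arrivals, next_expected=1):
    """Return (released, still_waiting) for packages arriving in the given order.

    Assumption: sequence numbers are consecutive integers, and a package
    waits until all lower-numbered packages have been released.
    """
    held = set()
    released = []
    for seq in arrivals:
        held.add(seq)
        # Release every package that is now next in line.
        while next_expected in held:
            held.remove(next_expected)
            released.append(next_expected)
            next_expected += 1
    return released, sorted(held)


# Packages arrive out of order; package 5 never arrives, so 6 keeps waiting.
released, waiting = release_in_sequence([2, 1, 4, 3, 6])
print(released, waiting)
```

In this sketch a missing sequence number blocks every later package, which shows why a provider that supplies its own sequence numbers should avoid gaps.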
Each submitted metadata file contains a specific type of metadata item (e.g. collection insert, granule partial update). To ensure that metadata items are ingested correctly, metadata actions are processed in the following order:
- Browse inserts/replacements
- Collection inserts/replacements
- Collection partial deletes
- Collection partial updates
- Collection deletes
- Granule inserts/replacements
- Granule partial deletes
- Granule partial updates
- Granule deletes
- Browse deletes
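The ordering above can be expressed as a simple sort key. The sketch below is illustrative only: the action labels are plain strings chosen for this example, not ECHO identifiers, and `sort_by_action` is a hypothetical helper rather than part of any ECHO tool.

```python
# Processing order as listed above. The labels are illustrative strings.
ACTION_ORDER = [
    "browse insert/replacement",
    "collection insert/replacement",
    "collection partial delete",
    "collection partial update",
    "collection delete",
    "granule insert/replacement",
    "granule partial delete",
    "granule partial update",
    "granule delete",
    "browse delete",
]
RANK = {action: position for position, action in enumerate(ACTION_ORDER)}


def sort_by_action(files):
    """Sort (filename, action) pairs into processing order.

    sorted() is stable, so files with the same action keep their
    original relative order.
    """
    return sorted(files, key=lambda item: RANK[item[1]])


files = [
    ("g1.xml", "granule delete"),
    ("c1.xml", "collection insert/replacement"),
    ("b1.xml", "browse delete"),
    ("g2.xml", "granule insert/replacement"),
]
for name, action in sort_by_action(files):
    print(name, action)
```

Under this ordering a collection is inserted before its granules, and granules are deleted before their browse images.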
Last update usage. How does last update work for collection & granule partial update. How does last update work for browse (not required)
Ingest reports its activity to Data Providers through email and automatically generated report files. These two reporting mechanisms are described below. For a more detailed description of Ingest reporting, please refer to:
- Section 3.14 of the ECHO Data Partner User's Guide - (PDF)
- Ingest Report Schema - (HTML) (XSD)
ECHO Ingest will notify a configurable list of individuals associated with each Data Provider when an ingest job is started and completed. The automatically generated email will include identifying information for the job and the event time. Note that when a job is 'started', this means that it has been added to the queue for a provider. Ingest will process received jobs according to the order of receipt, or sequence number (if provided). Emails will be generated for the following situations:
- Start of Ingest job
- Timeout waiting for browse image
- Timeout waiting for sequenced package
- Completion of Ingest job
- ... Needs verification ...
When an ECHO Ingest job completes, an XML report file is generated and placed in the provider's ftp output directory. This file is created according to the Ingest Report Schema.
The XML format used for the ECHO Ingest reports facilitates automated ingest processing and reconciliation. Each Ingest report includes the following information.
- Job Errors - Errors that occur while creating the job and prevent the job from being processed.
- Job Processing Totals - A count of collection, granule or browse insertions, updates, replacements, deletions, and rejections for the entire job.
- File Processing Totals - A count of collection, granule or browse insertions, updates, replacements, deletions, and rejections for each metadata file.
- File Errors - Errors that occur while processing a metadata file and may prevent the file from being processed.
- Item Errors - Errors that occur while processing a metadata item and cause the item to be rejected.
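As one possible reconciliation check on the totals listed above, a provider could compare the job totals with the sum of the per-file totals. The sketch below works on plain dictionaries that are assumed to have already been extracted from a report. It does not parse the report XML, whose element names are defined by the Ingest Report Schema and are not reproduced here. The count names ("inserted", "rejected") and both function names are hypothetical, as is the assumption that job totals equal the sum of file totals.

```python
from collections import Counter


def sum_file_totals(file_totals):
    """Add up the per-file counts into a single Counter."""
    total = Counter()
    for counts in file_totals.values():
        total.update(counts)
    return total


def find_mismatches(job_totals, file_totals):
    """Return {count_name: (job_value, summed_file_value)} where they differ."""
    summed = sum_file_totals(file_totals)
    names = set(job_totals) | set(summed)
    return {
        name: (job_totals.get(name, 0), summed[name])
        for name in names
        if job_totals.get(name, 0) != summed[name]
    }


file_totals = {
    "granules_a.xml": {"inserted": 3, "rejected": 1},
    "granules_b.xml": {"inserted": 2},
}
job_totals = {"inserted": 5, "rejected": 1}
print(find_mismatches(job_totals, file_totals))
```

An empty result means the two sets of totals agree. Any entry in the result names a count worth checking against the file and item errors in the same report.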
ECHO Ingest Accounting Tool (EIAT)
Data Providers may use the ECHO Ingest Accounting Tool (EIAT) to monitor their active jobs and to view the information included in the XML Ingest reports. A provider can view an overall summary of Ingest jobs being processed and the following information for the provider's Ingest jobs:
- Detailed information for live Ingest jobs
- Detailed information for completed Ingest jobs
- Detailed information for rejected items during Ingest
- Summary Report of completed Ingest jobs
To access EIAT, visit the following links. The ECHO authentication associated with each system is used to grant access to users who have been given the "provider role" for an ECHO Data Provider.
ECHO Ingest Policies
The following policies outline a few guidelines for interacting with ECHO Ingest. Additional policies are outlined in the Data Partner's Operation Agreement, which is signed by both ECHO and the Data Partner. For questions regarding ingest policies, please contact ECHO directly at echo @ echo.nasa.gov.
- ECHO will retain original metadata and Ingest logs for a maximum of 60 days.
- ECHO will retain original browse image files for a maximum of 60 days.
- ECHO will remove all Ingest report files in the provider output area that are more than 60 days old.
- ECHO will remove all reconciliation files in the provider output area that are more than 60 days old.
- ECHO may configure a provider for manual ingest processing if problems are detected with submitted metadata.
- ECHO will not manually edit metadata within the ECHO DB to correct invalid data. Data Partners should resubmit metadata with the valid data values to correct such situations.
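Because report and reconciliation files are removed once they are more than 60 days old, a provider that wants to keep them may find it useful to check file ages before that limit is reached. The sketch below shows the age test only. The names `RETENTION` and `is_expired` are hypothetical, and how ECHO measures a file's age is not stated on this page, so the creation timestamp used here is an assumption.

```python
from datetime import datetime, timedelta

# Retention window taken from the policies above.
RETENTION = timedelta(days=60)


def is_expired(created, now):
    """Return True if a file created at `created` is more than 60 days old at `now`."""
    return now - created > RETENTION


now = datetime(2008, 10, 21)
print(is_expired(datetime(2008, 8, 1), now))   # older than 60 days
print(is_expired(datetime(2008, 10, 1), now))  # within the retention window
```

A file that is exactly 60 days old is treated as still retained here, which matches the "more than 60 days old" wording of the policy.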