Archive Integration¶

VidiCore has built-in integration with a few archive vendors. For other vendors, you can write your own integration scripts, which VidiCore then invokes when a file is to be archived or restored.

How VidiCore archive integrations work
Integrating with an archive using JavaScript
Amazon Glacier
Atempo Digital Archive Integration
Telestream DIVA Integration

How VidiCore archive integrations work ¶

To get a working integration with an external archive, a special storage must be created with type ARCHIVE. The archive integrations present in VidiCore are selected via the <bean> element in the StorageDocument. When integrating an archive system via a script, the JavaScript code is set in the <archiveScript> element.

Additionally, the storage should have a method that points to a location where both VidiCore and the archiving application can read and write files. Below, this is called the “staging area”. Please note that the browse flag of this storage method only applies to the staging area, not to the archive itself.

When VidiCore is about to archive a file, the file is first copied to the staging area. The integration code is then invoked and is expected to read the file from the staging area and archive it.

Similarly, when VidiCore is about to restore a file, the integration code is invoked first and is expected to restore the file to the staging area. The file is then copied from the staging area to the destination storage.

If the archiving application already has access to some of the storages in the system, the usage of the staging area can be disabled. Please refer to the docs below on how to achieve this.

Further configuration of the archive integration is done via key-value metadata on the storage. The vendor-specific options are documented below.

Archiving files¶

To archive a file managed by VidiCore, start a normal copy file job that copies the file to the archive storage. VidiCore will copy the file to the staging area (if applicable) and trigger archival of this file.

By default, the copy job completes once the file is present in the staging area, and the archival operation is executed in the background. The result of the archive operation can be looked up in the Transfer log. Setting the job metadata value archiveDestinationFile to true makes the copy job wait for the archival process to finish.

The state of the file on the archive storage remains CLOSED until the archive operation has completed successfully. Once archival has completed, the file state changes to ARCHIVED.

Restoring files¶

There is no need to restore files manually. VidiCore will automatically restore archived files when they are used in VidiCore jobs. This applies to transcoding, rendering, conforming, copy jobs, etc.

Integrating with an archive using JavaScript ¶

Archive script¶

To enable integration, a JavaScript script must be written that performs the actual archive operation. To be as flexible as possible, this script can both make API calls to VidiCore (the api object) and invoke shell operations (the shell object).

The script also has access to a file object, which corresponds to the file entity on the ARCHIVE storage. This object has the following functions defined:

file.getMetadata(key): If the specified metadata key is set on the file, the value is returned, otherwise null.

file.setMetadata(key, value): Sets the specified key-value pair as metadata on the file.

file.getAllMetadata(): Returns a map of all file metadata.

The last assignment in the script must define an object with the following properties:

archive(uri, id, data) - Invoked when an archive is to be performed.
- uri: URI to the source file, i.e. the file that should be archived. This will be a URI to the staging area, or a URI to a storage in archiveDirectlyFromStorages (see below).
- id: the id of the file entity on the ARCHIVE storage.
- data: storage metadata as a JavaScript object.
restore(uri, id, data) - Invoked when a restore is to be performed.
- uri: URI to the destination file, i.e. the restored file. This will be a URI to the staging area, or a URI to a storage in restoreDirectlyToStorages (see below).
- id: the id of the file entity on the ARCHIVE storage.
- data: storage metadata as a JavaScript object.
remove(id, data) - Invoked when a delete is to be performed.
- id: the id of the file entity on the ARCHIVE storage.
- data: storage metadata as a JavaScript object.
restorePartial(uri, id, offset, length, data) - Invoked when a partial restore is to be performed. This function is optional. If it is missing and a partial restore is requested, the restore function will be invoked.
- uri: URI to the destination file, i.e. the restored file. This will be a URI to the staging area, or a URI to a storage in restoreDirectlyToStorages (see below).
- id: the id of the file entity on the ARCHIVE storage.
- offset: the offset in bytes.
- length: the length in bytes.
- data: storage metadata as a JavaScript object.

If the archive system already has access to some of the storages in the system, the usage of the staging area can be disabled. This is configured using the storage metadata archiveDirectlyFromStorages and restoreDirectlyToStorages. Each can hold either a comma-separated list of storage IDs or the wildcard *. If the source storage for archiving matches a storage in archiveDirectlyFromStorages, the staging area is not used. The same applies to restore operations if the destination storage matches a storage in restoreDirectlyToStorages.
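The matching rule described above can be sketched as a small helper. This is only an illustration of the documented rule, not VidiCore's implementation: the function name is hypothetical, the storage IDs in the usage line are made-up examples, and trimming whitespace around list entries is an assumption.

```javascript
// Illustrative helper, not part of the VidiCore API.
// metadataValue is the value of archiveDirectlyFromStorages or
// restoreDirectlyToStorages: a comma-separated list of storage IDs or "*".
function bypassesStagingArea(storageId, metadataValue) {
  if (typeof metadataValue !== 'string' || metadataValue.trim() === '') {
    return false; // metadata not set: the staging area is used
  }
  // Assumption: whitespace around list entries is not significant.
  const entries = metadataValue.split(',').map((entry) => entry.trim());
  return entries.includes('*') || entries.includes(storageId);
}

// Hypothetical storage IDs, for illustration only.
console.log(bypassesStagingArea('VX-2', 'VX-1,VX-2'));
```

For an archive operation, storageId would be the source storage and metadataValue the archiveDirectlyFromStorages value; for a restore, the destination storage and restoreDirectlyToStorages.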

Example¶

Creating an ARCHIVE storage:

POST /storage

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <capacity>1000000000</capacity>
  <method>
    <uri>file:///mnt/archive-staging-area/</uri>
    <read>true</read>
    <write>true</write>
    <browse>false</browse>
  </method>
  <archiveScript><![CDATA[
...
]]></archiveScript>
</StorageDocument>

The following is a simple archive script that uses shell commands to copy files to and from a local archive directory. The archive directory is configured using the storage metadata archiveDir, and the script uses file metadata (archivePath) to remember where in the archive directory files have been archived. It assumes that the staging area is a local directory, or that archiveDirectlyFromStorages and restoreDirectlyToStorages are set to other local directory storages.

function getFilePath(url) {
  /* Converts the given url to a string that can be used with the cp and rm commands */
  if (url.indexOf('file:///') === 0) {
    url = url.substring(7);
  }
  if (url.indexOf('file:/') === 0) {
    url = url.substring(5);
  }
  if (url.indexOf('/C:/') === 0) {
    url = url.substring(1);
  }
  return url;
}

o = {
  "archive": function(uri, id, data) {
    const sourcePath = getFilePath(uri);
    const sourceFilename = uri.substring(uri.lastIndexOf('/') + 1);
    const archiveDir = getFilePath(data.archiveDir);

    /* Copy the source file to the archive directory */
    logger.log(`Running command "cp ${sourcePath} ${archiveDir}"`);
    const result = shell.exec('cp', sourcePath, archiveDir);
    if (result.exitcode !== 0) {
        throw `Failed to copy file to archive: ${result.err}`;
    } else {
        /* Store the uri where we archived the file so that we can later restore it */
        file.setMetadata('archivePath', `${archiveDir}${sourceFilename}`);
    }
  },
  "restore": function(uri, id, data) {
    /* Read the archivePath metadata, so we know where the archived file is located */
    const archivePath = getFilePath(file.getMetadata('archivePath'));
    const destinationPath = getFilePath(uri);

    /* Copy the file from archive directory to the destination file */
    logger.log(`Running command "cp ${archivePath} ${destinationPath}"`);
    const result = shell.exec('cp', archivePath, destinationPath);
    if (result.exitcode !== 0) {
        throw `Failed to copy file from archive: ${result.err}`;
    }
  },
  "remove": function(id, data) {
    /* Read the archivePath metadata, so we know where the archived file is located */
    const archivePath = getFilePath(file.getMetadata('archivePath'));

    /* Delete the file from archive directory */
    logger.log(`Running command "rm ${archivePath}"`);
    const result = shell.exec('rm', archivePath);
    if (result.exitcode !== 0) {
        throw `Failed to remove file: ${result.err}`;
    }
  }
};
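The example script above does not define the optional restorePartial function. One possible way to add it is sketched below, under several assumptions: that the destination file should contain only the requested byte range, that offset and length arrive as values convertible to JavaScript numbers, and that a POSIX dd command is available on the host. The helper name buildPartialCopyArgs is hypothetical.

```javascript
// Builds the argument list for a POSIX dd call that copies `length` bytes
// starting at byte `offset`. bs=1 makes skip and count byte-exact, but it
// is slow for large ranges.
function buildPartialCopyArgs(archivePath, destinationPath, offset, length) {
  const start = Number(offset);
  const count = Number(length);
  if (!Number.isInteger(start) || start < 0) {
    throw new Error(`Invalid offset: ${offset}`);
  }
  if (!Number.isInteger(count) || count <= 0) {
    throw new Error(`Invalid length: ${length}`);
  }
  return [
    `if=${archivePath}`,
    `of=${destinationPath}`,
    'bs=1',
    `skip=${start}`,
    `count=${count}`,
  ];
}

// How it might be wired into the script object. This assumes the shell and
// file objects and the getFilePath helper from the example above, and that
// the script engine supports spread syntax:
//
//   "restorePartial": function(uri, id, offset, length, data) {
//     const archivePath = getFilePath(file.getMetadata('archivePath'));
//     const args = buildPartialCopyArgs(archivePath, getFilePath(uri), offset, length);
//     const result = shell.exec('dd', ...args);
//     if (result.exitcode !== 0) {
//       throw `Failed to partially restore file: ${result.err}`;
//     }
//   }
```

If restorePartial is left out, a partial restore request falls back to the restore function, as described earlier, so adding it is only worthwhile when the archive can serve byte ranges more cheaply than whole files.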

Amazon Glacier ¶

VidiCore can archive files on Amazon Glacier. There are two different ways this can be achieved:

Use a regular S3 storage and transition objects to one of the Glacier storage classes.
Create a separate Glacier storage and move the files to be archived to it from other storages.

Note

Using a separate Glacier storage is deprecated. Existing installations can continue using it, but when adding Amazon S3 storages for archival for the first time, we recommend regular S3 storages in conjunction with S3 storage classes.

In order to archive and restore files, the user needs all of the following policy actions for the Glacier vault:

glacier:InitiateJob

glacier:GetJobOutput

glacier:DescribeJob

glacier:InitiateMultipartUpload

glacier:ListMultipartUploads

glacier:UploadMultipartPart

glacier:CompleteMultipartUpload

This configuration is made entirely within AWS. Instructions on how to configure it can be found here: https://docs.aws.amazon.com/amazonglacier/latest/dev/access-control-identity-based.html

Regular S3 storage in conjunction with storage classes¶

Configure a regular S3 storage of type LOCAL with an s3:/ URI as described here. All files are kept in this storage, including the archived ones. The only difference between archived and non-archived files is their S3 storage class.

There are several options for defining or changing the storage class of the files in such a storage:

Configure a default storage class via the storageClass storage method metadata
Use S3 lifecycle rules to let AWS transition the files to another storage class - see Object Lifecycle Management
Use S3 object tagging via the VidiCore API to trigger an AWS lifecycle rule.

Regardless of how the storage class is set on a file, VidiCore will automatically detect storage class transitions. If a file is in storage class S3 Glacier Flexible Retrieval or Glacier Deep Archive, the file state will be ARCHIVED. Archived files need to be restored before they can be used. Files in any other storage class can be used directly; this also applies to the Glacier Instant Retrieval storage class.
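As a rough illustration of this rule, the sketch below maps S3 storage class identifiers to whether a restore is needed. GLACIER, DEEP_ARCHIVE and GLACIER_IR are the S3 API identifiers for Glacier Flexible Retrieval, Glacier Deep Archive and Glacier Instant Retrieval; the function name is hypothetical and the code is not taken from VidiCore.

```javascript
// Storage classes for which, per the text above, the file state becomes
// ARCHIVED and a restore is needed before the file can be used.
const ARCHIVED_STORAGE_CLASSES = new Set(['GLACIER', 'DEEP_ARCHIVE']);

// Illustrative helper, not part of the VidiCore API.
function requiresRestore(storageClass) {
  return ARCHIVED_STORAGE_CLASSES.has(storageClass);
}

console.log(requiresRestore('GLACIER_IR')); // Instant Retrieval is directly usable
```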

Creating a dedicated Glacier storage¶

To create a storage used solely for Glacier archiving, you need to create a storage with an XML document like this:

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>https://glacier.us-east-1.amazonaws.com/</value>
    </field>
  </metadata>
</StorageDocument>

Configuring the storage metadata archiveDirectlyFromStorages and restoreDirectlyToStorages is not required for bypassing the staging area.

New in version 21.3.1.

From this version, VidiCore uses an updated client for Glacier. In practice, this means a minor change in how the client operates and how it is created. In addition to supplying which Glacier endpoint to use, it is also advised to supply the signing region (the region used for SigV4 signing of requests, e.g. us-west-1). If it is not supplied, VidiCore will try to guess the signing region from the endpoint itself.

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>https://glacier.us-east-1.amazonaws.com/</value>
    </field>
    <field>
      <key>glacierSigningRegion</key>
      <value>{signing region}</value>
    </field>
  </metadata>
</StorageDocument>
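How VidiCore derives the signing region from the endpoint is not documented here. As a rough idea of what such a guess could look like, the sketch below extracts the region from a hostname of the form glacier.{region}.amazonaws.com. The function name and the hostname pattern are assumptions, not VidiCore's implementation; setting glacierSigningRegion explicitly avoids relying on any guess.

```javascript
// Hypothetical helper: returns the region embedded in a Glacier endpoint
// URL, or null if the endpoint does not follow the assumed pattern.
function guessSigningRegion(endpoint) {
  let hostname;
  try {
    hostname = new URL(endpoint).hostname;
  } catch (e) {
    return null; // not a valid URL
  }
  const match = /^glacier\.([a-z0-9-]+)\.amazonaws\.com/.exec(hostname);
  return match ? match[1] : null;
}

console.log(guessSigningRegion('https://glacier.us-east-1.amazonaws.com/'));
```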

Glacier storages can now use the AWS default credentials provider chain to look for credentials. This means that if no credentials are supplied using either metadata or an AwsCredentials.properties file, VidiCore will automatically try to use any credentials found using this chain.

In practice this means VidiCore looks for Glacier credentials in this order:

From metadata using glacierSecretKey and glacierAccessKeyId (if present).

Read from the AwsCredentials.properties file in the credentials directory (if it exists).

Using the AWS default credentials provider chain.
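The lookup order above can be sketched as follows. The function and its return values are hypothetical and only illustrate the documented precedence; treating the metadata source as present only when both keys are set is an assumption.

```javascript
// Illustrative only: returns which credential source would be chosen,
// following the documented order.
function resolveGlacierCredentialSource(storageMetadata, propertiesFileExists) {
  // Assumption: both metadata keys must be set for this source to apply.
  if (storageMetadata.glacierAccessKeyId && storageMetadata.glacierSecretKey) {
    return 'metadata';
  }
  if (propertiesFileExists) {
    return 'properties-file';
  }
  return 'default-provider-chain';
}

console.log(resolveGlacierCredentialSource({}, false));
```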

Also new in this version is the ability to use IAM roles with Glacier. These can be supplied through metadata and VidiCore will try to assume these if they are set.

Note

When using IAM roles, VidiCore needs to contact Amazon's STS service in a specific region to generate credentials, e.g. eu-west-1. This region can be supplied for Glacier in two ways:

Using the metadata field glacierStsRegion

Setting the VidiCore configuration property stsRegion to the desired region.

Please note that if the metadata field for the STS region is set, it takes precedence over the property value. If neither is set, VidiCore will fall back to the Amazon default region, which is us-west-2.

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierRoleArn</key>
      <value>{role arn}</value>
    </field>
    <field>
      <key>glacierRoleExternalId</key>
      <value>{role external id}</value>
    </field>
    <field>
      <key>glacierStsRegion</key>
      <value>{glacier sts region}</value>
    </field>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>{glacier endpoint}</value>
    </field>
  </metadata>
</StorageDocument>
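The STS region precedence described in the note above can be sketched as below. The function name and the shape of its arguments are hypothetical; only the precedence and the us-west-2 fallback come from the text.

```javascript
// Illustrative only: the storage metadata field wins over the configuration
// property, with us-west-2 as the documented fallback.
function resolveStsRegion(storageMetadata, configProperties) {
  return storageMetadata.glacierStsRegion
    || configProperties.stsRegion
    || 'us-west-2';
}

console.log(resolveStsRegion({ glacierStsRegion: 'eu-west-1' }, { stsRegion: 'eu-north-1' }));
```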

New in version 5.3.

Credentials for the storage can now be added as metadata, as shown in the example below. The secret key will be encrypted when the storage is created or updated.

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierAccessKeyId</key>
      <value>{access key}</value>
    </field>
    <field>
      <key>glacierSecretKey</key>
      <value>{secret key}</value>
    </field>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>{glacier endpoint}</value>
    </field>
  </metadata>
</StorageDocument>

Files can then be moved to this storage either using storage rules or by initiating a copy job (Move/copy or create a hard link to another storage). Restore jobs must be initiated using storage rules.

Note that restore jobs typically take several hours, and the restore job will be put in the WAITING state while the restore initiation is in progress. This is to allow other jobs to run during this time.

New in version 5.1.6.

VidiCore adds metadata to files stored in Amazon Glacier or Glacier Deep Archive. Using the key s3ArchiveStorageClass, it is possible to distinguish what storage class the file has. This metadata is then removed if the file is restored.

Transitioning files from/to Glacier¶

There is no way, using the AWS SDK, to directly initiate a transition to the Glacier storage class for a single object. Instead, Object Lifecycle Management must be used. VidiCore will automatically detect when a transition to the Glacier Flexible Retrieval or Glacier Deep Archive class has happened and set the file state to ARCHIVED.

VidiCore jobs automatically restore ARCHIVED files when needed. To explicitly restore a file so that it can be read directly, you can use the following request:

PUT {file-resource}/restore¶

Triggers a request to Glacier to initiate a restore.

Once the restore is complete, the file will be put in CLOSED state, and will be available for direct access.

The expirationInDays parameter is required and specifies how long the restored file should remain available. Once this period has expired, the file is removed from direct access and returns to the ARCHIVED state.

Query Parameters:
- extraData (string[]) – Additional parameters relevant for the restore, in the form key=value. Specify multiple parameters using multiple query parameters.
  - expirationInDays={number-of-days}: How long the restored files should be available. Deprecated since version 4.9: Use the expirationInDays query parameter instead.
- expirationInDays (integer) – Required. How long the restored files should be available.
- retrievalTier (string) – Required. Sets the Glacier retrieval tier to use when restoring the file. One of Expedited, Standard or Bulk.
Produces:
- text/plain – Informational text.
Status Codes:
- 400 Invalid Input – Amazon rejected the restore request, or a parameter value is invalid.
- 404 Not Found – The file was not found.
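A minimal sketch of assembling such a restore request is shown below. The helper name is hypothetical and the file resource URL is passed in by the caller, since its exact form depends on the installation. The code only builds the method and URL and does not send anything.

```javascript
const RETRIEVAL_TIERS = ['Expedited', 'Standard', 'Bulk'];

// Hypothetical helper: validates the two required parameters and returns
// the method and URL of the restore request.
function buildRestoreRequest(fileResourceUrl, expirationInDays, retrievalTier) {
  if (!Number.isInteger(expirationInDays) || expirationInDays < 1) {
    throw new Error(`expirationInDays must be a positive integer, got ${expirationInDays}`);
  }
  if (!RETRIEVAL_TIERS.includes(retrievalTier)) {
    throw new Error(`retrievalTier must be one of ${RETRIEVAL_TIERS.join(', ')}`);
  }
  const query = new URLSearchParams({
    expirationInDays: String(expirationInDays),
    retrievalTier,
  });
  const base = fileResourceUrl.replace(/\/+$/, ''); // drop trailing slashes
  return { method: 'PUT', url: `${base}/restore?${query.toString()}` };
}
```

Validating the tier and the number of days on the client side surfaces mistakes before the request reaches VidiCore, where they would otherwise come back as a 400 response.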

Atempo Digital Archive Integration ¶

VidiCore can archive from and retrieve files to any storage location which has a corresponding agent set up in an Atempo Digital Archive environment. In such a setup, each archive location would be defined as a separate storage, and any agent would also have a corresponding storage.

Configuring the storage metadata archiveDirectlyFromStorages and restoreDirectlyToStorages is not required for bypassing the staging area.

To set up an Atempo archive storage, use the following storage XML:

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>AtempoDigitalArchiveBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>atempoWebServiceEndpoint</key>
      <value>http://atempo-lto/meta/C9ABD9AAD3883A8CFD3EEB922C90B3F3/721b786531/ADA/WS/</value>
    </field>
    <field>
      <key>atempoRootPath</key>
      <value>/</value>
    </field>
    <field>
      <key>atempoArchiveName</key>
      <value>AMM</value>
    </field>
  </metadata>
</StorageDocument>

For each storage that has a corresponding agent, the storage XML should look like this (note the storage metadata):

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>LOCAL</type>
  <autoDetect>true</autoDetect>
  <method>
    <uri>file:///mnt/storage/</uri>
    <read>true</read>
    <write>true</write>
    <browse>true</browse>
  </method>
  <metadata>
    <field>
      <key>atempoRootPath</key>
      <value>/mnt/atempo-agent1/</value>
    </field>
    <field>
      <key>atempoAgentName</key>
      <value>atempo-agent1</value>
    </field>
  </metadata>
</StorageDocument>

Telestream DIVA Integration ¶

For accessing a DIVA system via VidiCore, we recommend using DIVA’s REST API. The older Java API (version 7.3.0.29.1) is still available but deprecated.

In both cases set up a shared folder which is accessible by both VidiCore and DIVA manager. The VidiCore view on this location is defined in the uri of the storage method of the ARCHIVE storage. DIVA’s view on the shared storage is defined as DIVA unmanaged storage name in the key-value metadata field DIVAServerName of the ARCHIVE storage.

Starting with VidiCore 25.4.1, you can bypass the shared folder by configuring the storage metadata DIVAServerName on the VidiCore storages directly accessible by DIVA. This feature is only available for the REST API integration. Do not configure the storage metadata archiveDirectlyFromStorages and restoreDirectlyToStorages for a DIVA ARCHIVE storage.

For archiving into different DIVA collections (categories) or media names you may create an ARCHIVE storage for each combination of collection and media name. If you prefer working with a single DIVA ARCHIVE storage you may override the DIVA destination media name and category on a per-job basis by setting the job metadata keys DIVAMediaName or DIVACategory on the job doing the archive operation.

By default, VidiCore uses the file ID on the archive storage as the DIVA object ID (object name). Since 25.4.1, you may override this by setting the job metadata key DIVAObjectId to the desired object ID. This feature is only available for the REST API integration.
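The per-job overrides could be collected along the lines of the sketch below. It assumes that job metadata is passed to the copy job as repeated jobmetadata query parameters of the form key=value; that parameter name and format, the helper name, and the example values are assumptions to verify against the API reference for your VidiCore version.

```javascript
// The job metadata keys named in the text above.
const DIVA_JOB_METADATA_KEYS = ['DIVAMediaName', 'DIVACategory', 'DIVAObjectId'];

// Hypothetical helper: turns the given overrides into a query string of
// repeated jobmetadata=key=value parameters (an assumed format). Keys other
// than the DIVA ones, and unset or empty values, are ignored.
function buildDivaJobMetadataQuery(overrides) {
  const query = new URLSearchParams();
  for (const key of DIVA_JOB_METADATA_KEYS) {
    const value = overrides[key];
    if (value !== undefined && value !== null && value !== '') {
      query.append('jobmetadata', `${key}=${value}`);
    }
  }
  return query.toString();
}

// Made-up example values.
console.log(buildDivaJobMetadataQuery({ DIVACategory: 'news', DIVAObjectId: 'clip-42' }));
```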

Timecode-based partial restore operations are supported in the REST API integration starting with VidiCore 25.4.3. When VidiCore requires a specific timecode range of an archived clip for a CONFORM or EXPORT job or a subclip operation, it will issue a partial restore request to DIVA specifying the requested timecode range.

Via DIVA REST API¶

Create a storage like this:

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
    <type>ARCHIVE</type>
    <capacity>1000000000000</capacity>
    <bean>DIVARestApiBean</bean>
    <metadata>
        <field>
            <!-- DIVA REST API endpoint -->
            <key>DIVAEndpoint</key>
            <value>https://our-diva-endpoint:8765</value>
        </field>
        <field>
            <!-- API username -->
            <key>username</key>
            <value>diva</value>
        </field>
        <field>
            <!-- API password -->
            <key>password</key>
            <value>diva</value>
        </field>
        <field>
            <!-- default collection/category -->
            <key>DIVACategory</key>
            <value>default</value>
        </field>
        <field>
            <!-- DIVA group of tape, or an array of disk where the instance has to be created -->
            <key>DIVAMediaName</key>
            <value>default</value>
        </field>
        <field>
            <!-- DIVA unmanaged storage name of shared storage (used for archive and restore) -->
            <key>DIVAServerName</key>
            <value>sharedstorage</value>
        </field>
    </metadata>

    <method>
        <uri>file:///shared/storage/</uri>
        <read>true</read>
        <write>true</write>
        <browse>false</browse>
    </method>
</StorageDocument>

Via DIVA Java API (deprecated)¶

Create a storage like this:

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
     <type>ARCHIVE</type>
     <capacity>1000000000000</capacity>
     <bean>DIVABean</bean>
     <metadata>
          <field>
               <!-- SSH host -->
               <!-- this is the hostname of the DIVA SSH service -->
               <key>hostname</key>
               <value>187.47.11.109</value>
          </field>
          <field>
               <!-- SSH username -->
               <!-- this is the username for the DIVA SSH service -->
               <key>username</key>
               <value>diva</value>
          </field>
          <field>
               <!-- SSH password -->
               <key>password</key>
               <value>diva</value>
          </field>
          <field>
               <!-- SSH port -->
               <key>port</key>
               <value>22</value>
          </field>
          <field>
               <!-- hostname or IP address for the DIVA manager -->
               <key>DIVAHostname</key>
               <value>187.47.11.109</value>
          </field>
          <field>
               <!-- TCP port for the DIVA manager -->
               <key>DIVAPort</key>
               <value>9065</value>
          </field>
          <field>
               <!-- DIVA group of tape, or an array of disk where the instance has to be created -->
               <key>DIVAMediaName</key>
               <value>default</value>
          </field>
          <field>
               <!-- default collection/category -->
               <key>DIVACategory</key>
               <value>default</value>
          </field>
          <field>
               <!-- DIVA unmanaged storage name of shared storage (used for archive) -->
               <key>DIVAServerName</key>
               <value>sharedstorage</value>
          </field>
          <field>
              <!-- DIVA unmanaged storage name of shared storage (used for restore) -->
              <key>DIVARestoreDestination</key>
              <value>sharedstorage</value>
          </field>
          <field>
               <!-- path to shared folder on DIVA server -->
               <key>DIVAFilePathRoot</key>
               <value>C:/shared/storage/</value>
          </field>
     </metadata>

     <method>
          <uri>file:///shared/storage/</uri>
          <read>true</read>
          <write>true</write>
          <browse>false</browse>
     </method>
</StorageDocument>

Information on archive files¶

When archiving a file to DIVA, VidiCore stores the following information on the VidiCore file entity: DIVACategory, DIVAMediaName, DIVAServerName.

New in version 25.4.1.

If the object ID was explicitly set by the DIVAObjectId job metadata value, it is stored in the DIVAObjectId key-value metadata field on the VidiCore file entity to be used for later retrieval.