Archive Integration

VidiCore has built-in integration with a few archive vendors. For other vendors, it is possible to write your own integration scripts, which VidiCore will then invoke when a file is to be archived or restored.

Integrating with an archive using JavaScript

In order to get a working integration with an external archive, a special storage must be created with type ARCHIVE. For this integration to work, a script must be associated with the storage (described below).

Additionally, the storage should have a method that points to a location where both VidiCore and the archiving application can read and write files. Below, this is called the “staging area”.

When VidiCore is about to archive a file, the file will first be copied to the staging area and then the integration script will be invoked, which is supposed to read the file from the staging area and archive it.

Similarly, when VidiCore is about to restore a file, the integration script is first invoked, which is supposed to restore the file to the staging area, and then the file is copied from the staging area to the destination storage.

If the archiving application already has access to some of the storages in the system, the use of the staging area can be disabled. This is configured using the storage metadata fields archiveDirectlyFromStorages and restoreDirectlyToStorages.
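For example, assuming the archiving application can read files directly on storage VX-2, the staging area could be bypassed for archive operations by setting metadata on the ARCHIVE storage (here VX-3). The endpoint is the standard VidiCore storage metadata resource; the storage IDs and the comma-separated value format are illustrative and should be verified against your version:

```shell
curl -X PUT -uadmin:admin 'http://localhost:8080/API/storage/VX-3/metadata/archiveDirectlyFromStorages' -Hcontent-type:text/plain -d 'VX-2'
```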

Archive script

To enable the integration, a JavaScript program must be written that performs the actual archive operation. To be as flexible as possible, this script can both make API calls to VidiCore (the api object) and invoke shell operations (the shell object).

The script also has access to a file object, which corresponds to the file entity on the ARCHIVE storage. This object has the following functions defined:

file.getMetadata(key)

If the specified metadata key is set on the file, the value is returned, otherwise null.

file.setMetadata(key, value)

Sets the specified key-value pair as metadata on the file.

file.getAllMetadata()

Returns a map of all file metadata.

The script must, as its last assignment, define an object with the following properties:

  • archive(uri, id, data) - Invoked when an archive is to be performed.
    • uri: URI to the source file, i.e. the file that should be archived. This will be a URI to the staging area, or a URI to a storage in archiveDirectlyFromStorages.
    • id: the id of the file entity on the ARCHIVE storage.
    • data: storage metadata as a JavaScript object.
  • restore(uri, id, data) - Invoked when a restore is to be performed.
    • uri: URI to the destination file, i.e. the restored file. This will be a URI to the staging area, or a URI to a storage in restoreDirectlyToStorages.
    • id: the id of the file entity on the ARCHIVE storage.
    • data: storage metadata as a JavaScript object.
  • remove(id, data) - Invoked when a delete is to be performed.
    • id: the id of the file entity on the ARCHIVE storage.
    • data: storage metadata as a JavaScript object.
  • restorePartial(uri, id, offset, length, data) - Invoked when a partial restore is to be performed. This function is optional. If it is missing and a partial restore is requested, the restore function will be invoked.
    • uri: URI to the destination file, i.e. the restored file. This will be a URI to the staging area, or a URI to a storage in restoreDirectlyToStorages.
    • id: the id of the file entity on the ARCHIVE storage.
    • offset: the offset in bytes.
    • length: the length in bytes.
    • data: storage metadata as a JavaScript object.

Example

Creating an ARCHIVE storage:

POST /storage
<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <capacity>1000000000</capacity>
  <method>
    <uri>file:///mnt/archive-staging-area/</uri>
    <read>true</read>
    <write>true</write>
    <browse>false</browse>
  </method>
  <archiveScript><![CDATA[
...
]]></archiveScript>
</StorageDocument>

A simple archive script that uses shell commands to copy files to and from a local archive directory. The archive directory is configured using the storage metadata field archiveDir, and the script uses file metadata (archivePath) to remember where in the archive directory files have been archived. It assumes that the staging area is a local directory, or that archiveDirectlyFromStorages and restoreDirectlyToStorages are set to other local directory storages.

function getFilePath(url) {
  /* Converts the given url to a string that can be used with the cp and rm commands */
  if (url.indexOf('file:///') === 0) {
    url = url.substring(7);
  }
  if (url.indexOf('file:/') === 0) {
    url = url.substring(5);
  }
  if (url.indexOf('/C:/') === 0) {
    url = url.substring(1);
  }
  return url;
}

o = {
  "archive": function(uri, id, data) {
    const sourcePath = getFilePath(uri);
    const sourceFilename = uri.substring(uri.lastIndexOf('/') + 1);
    const archiveDir = getFilePath(data.archiveDir);

    /* Copy the source file to the archive directory */
    logger.log(`Running command "cp ${sourcePath} ${archiveDir}"`);
    const result = shell.exec('cp', sourcePath, archiveDir);
    if (result.exitcode !== 0) {
        throw `Failed to copy file to archive: ${result.err}`;
    } else {
        /* Store the uri where we archived the file so that we can later restore it */
        file.setMetadata('archivePath', `${archiveDir}${sourceFilename}`);
    }
  },
  "restore": function(uri, id, data) {
    /* Read the archivePath metadata, so we know where the archived file is located */
    const archivePath = getFilePath(file.getMetadata('archivePath'));
    const destinationPath = getFilePath(uri);

    /* Copy the file from archive directory to the destination file */
    logger.log(`Running command "cp ${archivePath} ${destinationPath}"`);
    const result = shell.exec('cp', archivePath, destinationPath);
    if (result.exitcode !== 0) {
        throw `Failed to copy file from archive: ${result.err}`;
    }
  },
  "remove": function(id, data) {
    /* Read the archivePath metadata, so we know where the archived file is located */
    const archivePath = getFilePath(file.getMetadata('archivePath'));

    /* Delete the file from archive directory */
    logger.log(`Running command "rm ${archivePath}"`);
    const result = shell.exec('rm', archivePath);
    if (result.exitcode !== 0) {
        throw `Failed to remove file: ${result.err}`;
    }
  }
};
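The sample script above does not define restorePartial, so partial restore requests fall back to restore. A minimal sketch of what it could look like, reusing getFilePath and the file and shell objects from the script above; the use of GNU dd with iflag=skip_bytes,count_bytes is an assumption (it requires GNU coreutils):

```javascript
/* Sketch of an optional restorePartial handler. Assumes GNU dd
   (iflag=skip_bytes,count_bytes) and local paths; reuses getFilePath,
   file and shell from the sample script above. */
const partial = {
  restorePartial: function(uri, id, offset, length, data) {
    /* Read the archivePath metadata, so we know where the archived file is located */
    const archivePath = getFilePath(file.getMetadata('archivePath'));
    const destinationPath = getFilePath(uri);

    /* Copy only the requested byte range from the archived file */
    const result = shell.exec('dd',
      `if=${archivePath}`,
      `of=${destinationPath}`,
      'iflag=skip_bytes,count_bytes',
      `skip=${offset}`,
      `count=${length}`);
    if (result.exitcode !== 0) {
      throw `Failed to partially restore file: ${result.err}`;
    }
  }
};
```

As in the restore handler above, a failed shell command is reported by throwing.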

Amazon Glacier

VidiCore can archive files on Amazon Glacier. There are two different ways this can be achieved:

  • Creating a separate Glacier storage and moving files to it from other storages.

  • Using an S3 storage and transitioning objects to the Glacier storage class.

In order to archive and restore files, the user needs all of the following policy permission actions for the Glacier vault:

  • glacier:InitiateJob
  • glacier:GetJobOutput
  • glacier:DescribeJob
  • glacier:InitiateMultipartUpload
  • glacier:ListMultipartUploads
  • glacier:UploadMultipartPart
  • glacier:CompleteMultipartUpload

This configuration is made entirely within AWS, and instructions on how to configure it can be found here: https://docs.aws.amazon.com/amazonglacier/latest/dev/access-control-identity-based.html

Creating a dedicated Glacier storage

To create a storage used solely for Glacier archiving, you need to create a storage with an XML document like this:

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>https://glacier.us-east-1.amazonaws.com/</value>
    </field>
  </metadata>
</StorageDocument>

New in version 21.3.1.

From this version, VidiCore uses an updated client for Glacier. In practice this means there is a minor change in how the client operates and how it is created. In addition to supplying which Glacier endpoint to use, it is also advised to supply which signing region you wish to use (the region to use for SigV4 signing of requests, e.g. us-west-1). If this is not supplied, VidiCore will try to guess the signing region from the endpoint itself and use that.

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>https://glacier.us-east-1.amazonaws.com/</value>
    </field>
    <field>
      <key>glacierSigningRegion</key>
      <value>{signing region}</value>
    </field>
  </metadata>
</StorageDocument>

Glacier storages can now utilize the AWS default credentials provider chain to look for credentials. This means that if no credentials are supplied using either metadata or an AwsCredentials.properties file, VidiCore will automatically try to use any credentials found using this chain.

In practice this means VidiCore looks for Glacier credentials in this order:

  • From metadata, using glacierSecretKey and glacierAccessKeyId (if present).
  • From the AwsCredentials.properties file in the credentials directory (if it exists).
  • Using the AWS default credentials provider chain.
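For the second option, the AwsCredentials.properties file uses the AWS SDK properties format. The key names below (accessKey, secretKey) follow that format but should be verified against your VidiCore version; the values are the standard AWS documentation examples:

```properties
# AwsCredentials.properties (key names assumed from the AWS SDK
# properties format; verify against your VidiCore version)
accessKey=AKIAIOSFODNN7EXAMPLE
secretKey=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```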

Also new in this version is the ability to use IAM roles with Glacier. These can be supplied through metadata, and VidiCore will try to assume them if they are set.

Note

When using IAM roles, VidiCore needs to contact Amazon's STS service in a specific region to generate credentials, e.g. eu-west-1. This region can be supplied for Glacier in two ways:

  • Using the metadata field glacierStsRegion
  • Setting the VidiCore configuration property stsRegion to the desired region.

Please note that if the metadata field for the STS region is set, it takes precedence over the property value. If neither of these is set, VidiCore will fall back to the Amazon default region, which is us-west-2.

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierRoleArn</key>
      <value>{role arn}</value>
    </field>
    <field>
      <key>glacierRoleExternalId</key>
      <value>{role external id}</value>
    </field>
    <field>
      <key>glacierStsRegion</key>
      <value>{glacier sts region}</value>
    </field>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>{glacier endpoint}</value>
    </field>
  </metadata>
</StorageDocument>
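The stsRegion configuration property mentioned above can be set through VidiCore's configuration properties resource. The endpoint below is the standard one, but verify it against your version; the region value is illustrative:

```shell
curl -X PUT -uadmin:admin 'http://localhost:8080/API/configuration/properties' -Hcontent-type:application/xml -d '<ConfigurationPropertyDocument xmlns="http://xml.vidispine.com/schema/vidispine"><key>stsRegion</key><value>eu-west-1</value></ConfigurationPropertyDocument>'
```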

New in version 5.3.

Credentials for the storage can now be added as metadata, as shown in the example below. The secret key will be encrypted when the storage is created or updated.

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierAccessKeyId</key>
      <value>{access key}</value>
    </field>
    <field>
      <key>glacierSecretKey</key>
      <value>{secret key}</value>
    </field>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>{glacier endpoint}</value>
    </field>
  </metadata>
</StorageDocument>

Files can then be moved to this storage either using storage rules or by initiating a copy job (Move/copy a file to another storage). Restore jobs must be initiated using storage rules.
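For example, a copy job that archives a file to the Glacier storage can be started with the standard file copy endpoint (the storage and file IDs here are illustrative):

```shell
# Copy file VX-123 from storage VX-1 to the Glacier ARCHIVE storage VX-2
curl -X POST -uadmin:admin 'http://localhost:8080/API/storage/VX-1/file/VX-123/storage/VX-2'
```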

Note that restore jobs typically take several hours, and the restore job will be put in the WAITING state while the restore initiation is in progress. This is to allow other jobs to run during this time.

New in version 5.1.6.

VidiCore adds metadata to files stored in Amazon Glacier or Glacier Deep Archive. Using the key s3ArchiveStorageClass, it is possible to distinguish what storage class the file has. This metadata is then removed if the file is restored.

Transitioning files From S3 to Glacier

There is no way, using the AWS SDK, to directly initiate a transition to the Glacier storage class for a single object. Instead, Object Lifecycle Management must be used. VidiCore will automatically detect when a transition to the Glacier class has happened, and put the file in the ARCHIVED state.
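The lifecycle rule itself is configured on the S3 bucket within AWS, not in VidiCore. A minimal bucket lifecycle configuration that transitions objects to Glacier after 30 days might look like this (rule ID and timing are illustrative):

```xml
<LifecycleConfiguration>
  <Rule>
    <ID>archive-to-glacier</ID>
    <Filter><Prefix></Prefix></Filter>
    <Status>Enabled</Status>
    <Transition>
      <Days>30</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
  </Rule>
</LifecycleConfiguration>
```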

To restore a file so that it can be read directly, you can use the following request:

PUT {file-resource}/restore

Triggers a request to Glacier to initiate a restore.

Once the restore is complete, the file will be put in CLOSED state, and will be available for direct access.

The expirationInDays parameter has to be set and specifies how long the restored file should be available. Once that period has expired, the file will be removed from direct access and will once again end up in the ARCHIVED state.

Query Parameters:
  • extraData (string[]) –

    Additional parameters relevant for the restore, in the form of key=value. Specify multiple parameters using multiple query parameters.

    expirationInDays={number-of-days}
    How long the restored files should be available.

    Deprecated since version 4.9: Use the expirationInDays query parameter instead.

  • expirationInDays (integer) – Required. How long the restored files should be available.
  • retrievalTier (string) – Sets the Glacier retrieval tier to use when restoring the file. One of Expedited, Standard or Bulk.
Produces:
  • text/plain – Informational text.
Status Codes:
  • 400 Invalid Input – Amazon rejected the restore request, or a parameter value is invalid.
  • 404 Not Found – The file was not found.
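Putting it together, a restore of an archived S3 file could be triggered like this (host, credentials, and IDs are illustrative):

```shell
# Initiate a Glacier restore of file VX-123 on storage VX-1,
# keeping the restored copy available for 7 days
curl -X PUT -uadmin:admin 'http://localhost:8080/API/storage/VX-1/file/VX-123/restore?expirationInDays=7&retrievalTier=Standard'
```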

Atempo Digital Archive Integration

VidiCore can archive files from, and restore files to, any storage location that has a corresponding agent set up in an Atempo Digital Archive environment. In such a setup, each archive location would be defined as a separate storage, and any agent would also have a corresponding storage.

To set up an Atempo archive storage, use the following storage XML:

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>AtempoDigitalArchiveBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>atempoWebServiceEndpoint</key>
      <value>http://atempo-lto/meta/C9ABD9AAD3883A8CFD3EEB922C90B3F3/721b786531/ADA/WS/</value>
    </field>
    <field>
      <key>atempoRootPath</key>
      <value>/</value>
    </field>
    <field>
      <key>atempoArchiveName</key>
      <value>AMM</value>
    </field>
  </metadata>
</StorageDocument>

And for each storage that has a corresponding agent, the storage XML should look like this (note the storage metadata):

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>LOCAL</type>
  <autoDetect>true</autoDetect>
  <method>
    <uri>file:///mnt/storage/</uri>
    <read>true</read>
    <write>true</write>
    <browse>true</browse>
  </method>
  <metadata>
    <field>
      <key>atempoRootPath</key>
      <value>/mnt/atempo-agent1/</value>
    </field>
    <field>
      <key>atempoAgentName</key>
      <value>atempo-agent1</value>
    </field>
  </metadata>
</StorageDocument>

Now, to archive a file residing on the agent storage, all that is needed is to start a normal file copying job. The same mechanism is used to restore from the archive to the agent.

Front Porch Diva Integration

Set up a shared folder that is accessible by both VidiCore and the DIVArchive manager.

POST the following document to /API/storage:

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
 <type>ARCHIVE</type>
 <capacity>1000000000000</capacity>
 <bean>DIVABean</bean>
 <metadata>
  <field>
   <!-- SSH host -->
   <!-- this is the hostname of the DIVA SSH service -->
   <key>hostname</key>
   <value>187.47.11.109</value>
  </field>
  <field>
   <!-- SSH username -->
   <!-- this is the username for the DIVA SSH service -->
   <key>username</key>
   <value>diva</value>
  </field>
  <field>
   <!-- SSH password -->
   <key>password</key>
   <value>diva</value>
  </field>
  <field>
   <!-- SSH port -->
   <key>port</key>
   <value>22</value>
  </field>

  <field>
   <!-- path to the shared folder on vidispine server -->
   <key>storagePath</key>
   <value>/shared/storage/</value>
  </field>
  <field>
   <!-- hostname or IP address for the DIVA manager -->
   <key>DIVAHostname</key>
   <value>187.47.11.109</value>
  </field>
  <field>
   <!-- TCP port for the DIVA manager -->
   <key>DIVAPort</key>
   <value>9065</value>
  </field>
  <field>
   <!-- Media name designates either a group of tape, or an array of disk
        declared in the configuration where the instance has to be created. -->
   <key>DIVAMediaName</key>
   <value>default</value>
  </field>
  <field>
   <!-- category -->
   <key>DIVACategory</key>
   <value>default</value>
  </field>
  <field>
   <!-- restore is not yet implemented -->
   <key>DIVARestoreDestination</key>
   <value></value>
  </field>
  <field>
   <!-- path to shared folder on DIVA server -->
   <key>DIVAFilePathRoot</key>
   <value>C:/shared/storage/</value>
  </field>
  <field>
   <!-- The value of this option is the name of the source/destination to be used
        by the specified command: archive, restore, copy... This server name
        must be a valid name as configured in the DIVA system. -->
   <key>DIVAServerName</key>
   <value>disk</value>
  </field>
 </metadata>

 <method>
  <uri>file:///shared/storage/</uri>
  <read>true</read>
  <write>true</write>
  <browse>true</browse>
 </method>
</StorageDocument>

To archive a file, copy it to the shared folder and wait for VidiCore to detect its presence. Once VidiCore has found the file, import it to trigger archiving.

Example using curl:

curl -X POST -uadmin:admin 'http://localhost:8080/API/storage/VX-4/file/VX-1/import' -Hcontent-type:application/xml -d '<MetadataDocument/>'

Archiving should begin shortly.