Archive Integration

VidiCore has built-in integration with a few archive vendors. For other vendors, it is possible to write your own integration scripts, which VidiCore will then invoke when a file is to be archived or restored.

Integrating with an archive using JavaScript

In order to get a working integration with an external archive, a special storage must be created with type ARCHIVE. For this integration to work, a script must be associated with the storage (described below).

Additionally, the storage should have a method that points to a location where both VidiCore and the archiving application can read and write files. Below, this is called the “staging area”.

When VidiCore is about to archive a file, the file will first be copied to the staging area and then the integration script will be invoked, which is supposed to read the file from the staging area and archive it.

Similarly, when VidiCore is about to restore a file, the integration script is first invoked, which is supposed to restore the file to the staging area, and then the file is copied from the staging area to the destination storage.

If the archiving application already has access to some of the storages in the system, the use of the staging area can be disabled. This is configured using the storage metadata fields archiveDirectlyFromStorages and restoreDirectlyToStorages.
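For example, assuming the archiving application can read files directly on storage VX-2, the staging area could be bypassed for archive operations by setting metadata on the ARCHIVE storage (here VX-3). The endpoint is the standard VidiCore storage metadata resource; the storage IDs and the comma-separated value format are illustrative and should be verified against your version:

```shell
curl -X PUT -uadmin:admin 'http://localhost:8080/API/storage/VX-3/metadata/archiveDirectlyFromStorages' -Hcontent-type:text/plain -d 'VX-2'
```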

Archive script

To enable the integration, a JavaScript program must be written that performs the actual archive operation. To be as flexible as possible, this script can both make API calls to VidiCore (the api object) and invoke shell operations (the shell object).

The script also has access to a file object, which corresponds to the file entity on the ARCHIVE storage. This object has the following functions defined:

file.getMetadata(key)

If the specified metadata key is set on the file, the value is returned, otherwise null.

file.setMetadata(key, value)

Sets the specified key-value pair as metadata on the file.

file.getAllMetadata()

Returns a map of all file metadata.

The script must, as its last assignment, define an object with the following properties:

  • archive(uri, id, data) - Invoked when an archive is to be performed.
    • uri: URI to the source file, i.e. the file that should be archived. This will be a URI to the staging area, or a URI to a storage in archiveDirectlyFromStorages.
    • id: the id of the file entity on the ARCHIVE storage.
    • data: storage metadata as a JavaScript object.
  • restore(uri, id, data) - Invoked when a restore is to be performed.
    • uri: URI to the destination file, i.e. the restored file. This will be a URI to the staging area, or a URI to a storage in restoreDirectlyToStorages.
    • id: the id of the file entity on the ARCHIVE storage.
    • data: storage metadata as a JavaScript object.
  • remove(id, data) - Invoked when a delete is to be performed.
    • id: the id of the file entity on the ARCHIVE storage.
    • data: storage metadata as a JavaScript object.
  • restorePartial(uri, id, offset, length, data) - Invoked when a partial restore is to be performed. This function is optional. If it is missing and a partial restore is requested, the restore function will be invoked.
    • uri: URI to the destination file, i.e. the restored file. This will be a URI to the staging area, or a URI to a storage in restoreDirectlyToStorages.
    • id: the id of the file entity on the ARCHIVE storage.
    • offset: the offset in bytes.
    • length: the length in bytes.
    • data: storage metadata as a JavaScript object.

Example

Creating an ARCHIVE storage:

POST /storage
<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <capacity>1000000000</capacity>
  <method>
    <uri>file:///mnt/archive-staging-area/</uri>
    <read>true</read>
    <write>true</write>
    <browse>false</browse>
  </method>
  <archiveScript><![CDATA[
...
]]></archiveScript>
</StorageDocument>

A simple archive script that uses shell commands to copy files to and from a local archive directory. The archive directory is configured using the storage metadata field archiveDir, and the script uses file metadata (archivePath) to remember where in the archive directory files have been archived. It assumes that the staging area is a local directory, or that archiveDirectlyFromStorages and restoreDirectlyToStorages are set to other local directory storages.

function getFilePath(url) {
  /* Converts the given url to a string that can be used with the cp and rm commands */
  if (url.indexOf('file:///') === 0) {
    url = url.substring(7);
  }
  if (url.indexOf('file:/') === 0) {
    url = url.substring(5);
  }
  if (url.indexOf('/C:/') === 0) {
    url = url.substring(1);
  }
  return url;
}

o = {
  "archive": function(uri, id, data) {
    const sourcePath = getFilePath(uri);
    const sourceFilename = uri.substring(uri.lastIndexOf('/') + 1);
    const archiveDir = getFilePath(data.archiveDir);

    /* Copy the source file to the archive directory */
    logger.log(`Running command "cp ${sourcePath} ${archiveDir}"`);
    const result = shell.exec('cp', sourcePath, archiveDir);
    if (result.exitcode !== 0) {
        throw `Failed to copy file to archive: ${result.err}`;
    } else {
        /* Store the uri where we archived the file so that we can later restore it */
        file.setMetadata('archivePath', `${archiveDir}${sourceFilename}`);
    }
  },
  "restore": function(uri, id, data) {
    /* Read the archivePath metadata, so we know where the archived file is located */
    const archivePath = getFilePath(file.getMetadata('archivePath'));
    const destinationPath = getFilePath(uri);

    /* Copy the file from archive directory to the destination file */
    logger.log(`Running command "cp ${archivePath} ${destinationPath}"`);
    const result = shell.exec('cp', archivePath, destinationPath);
    if (result.exitcode !== 0) {
        throw `Failed to copy file from archive: ${result.err}`;
    }
  },
  "remove": function(id, data) {
    /* Read the archivePath metadata, so we know where the archived file is located */
    const archivePath = getFilePath(file.getMetadata('archivePath'));

    /* Delete the file from archive directory */
    logger.log(`Running command "rm ${archivePath}"`);
    const result = shell.exec('rm', archivePath);
    if (result.exitcode !== 0) {
        throw `Failed to remove file: ${result.err}`;
    }
  }
};
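The sample script above does not define restorePartial, so partial restore requests fall back to restore. A minimal sketch of what it could look like, reusing getFilePath and the file and shell objects from the script above; the use of GNU dd with iflag=skip_bytes,count_bytes is an assumption (it requires GNU coreutils):

```javascript
/* Sketch of an optional restorePartial handler. Assumes GNU dd
   (iflag=skip_bytes,count_bytes) and local paths; reuses getFilePath,
   file and shell from the sample script above. */
const partial = {
  restorePartial: function(uri, id, offset, length, data) {
    /* Read the archivePath metadata, so we know where the archived file is located */
    const archivePath = getFilePath(file.getMetadata('archivePath'));
    const destinationPath = getFilePath(uri);

    /* Copy only the requested byte range from the archived file */
    const result = shell.exec('dd',
      `if=${archivePath}`,
      `of=${destinationPath}`,
      'iflag=skip_bytes,count_bytes',
      `skip=${offset}`,
      `count=${length}`);
    if (result.exitcode !== 0) {
      throw `Failed to partially restore file: ${result.err}`;
    }
  }
};
```

As in the restore handler above, a failed shell command is reported by throwing.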

Amazon Glacier

VidiCore can archive files on Amazon Glacier. There are two different ways this can be achieved:

  • Creating a separate Glacier storage and moving files to it from other storages.

  • Using an S3 storage and transitioning objects to the Glacier storage class.

In order to archive and restore files, the user needs all of the following policy permission actions for the Glacier vault:

  • glacier:InitiateJob
  • glacier:GetJobOutput
  • glacier:DescribeJob
  • glacier:InitiateMultipartUpload
  • glacier:ListMultipartUploads
  • glacier:UploadMultipartPart
  • glacier:CompleteMultipartUpload

This configuration is made entirely within AWS, and instructions on how to configure it can be found here: https://docs.aws.amazon.com/amazonglacier/latest/dev/access-control-identity-based.html

Creating a dedicated Glacier storage

To create a storage used solely for Glacier archiving, you need to create a storage with an XML document like this:

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>https://glacier.us-east-1.amazonaws.com/</value>
    </field>
  </metadata>
</StorageDocument>

New in version 21.3.1.

From this version, VidiCore uses an updated client for Glacier. In practice this means there is a minor change in how the client operates and how it is created. In addition to supplying which Glacier endpoint to use, it is also advised to supply which signing region you wish to use (the region to use for SigV4 signing of requests, e.g. us-west-1). If this is not supplied, VidiCore will try to guess the signing region from the endpoint itself and use that.

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>https://glacier.us-east-1.amazonaws.com/</value>
    </field>
    <field>
      <key>glacierSigningRegion</key>
      <value>{signing region}</value>
    </field>
  </metadata>
</StorageDocument>

Glacier storages can now utilize the AWS default credentials provider chain to look for credentials. This means that if no credentials are supplied using either metadata or an AwsCredentials.properties file, VidiCore will automatically try to use any credentials found using this chain.

In practice this means VidiCore looks for Glacier credentials in this order:

  • From metadata, using glacierSecretKey and glacierAccessKeyId (if present).
  • From the AwsCredentials.properties file in the credentials directory (if it exists).
  • Using the AWS default credentials provider chain.
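For the second option, the AwsCredentials.properties file uses the AWS SDK properties format. The key names below (accessKey, secretKey) follow that format but should be verified against your VidiCore version; the values are the standard AWS documentation examples:

```properties
# AwsCredentials.properties (key names assumed from the AWS SDK
# properties format; verify against your VidiCore version)
accessKey=AKIAIOSFODNN7EXAMPLE
secretKey=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```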

Also new in this version is the ability to use IAM roles with Glacier. These can be supplied through metadata, and VidiCore will try to assume them if they are set.

Note

When using IAM roles, VidiCore needs to contact Amazon's STS service in a specific region to generate credentials, e.g. eu-west-1. This region can be supplied for Glacier in two ways:

  • Using the metadata field glacierStsRegion
  • Setting the VidiCore configuration property stsRegion to the desired region.

Please note that if the metadata field for the STS region is set, it takes precedence over the property value. If neither of these is set, VidiCore will fall back to the Amazon default region, which is us-west-2.

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierRoleArn</key>
      <value>{role arn}</value>
    </field>
    <field>
      <key>glacierRoleExternalId</key>
      <value>{role external id}</value>
    </field>
    <field>
      <key>glacierStsRegion</key>
      <value>{glacier sts region}</value>
    </field>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>{glacier endpoint}</value>
    </field>
  </metadata>
</StorageDocument>
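The stsRegion configuration property mentioned above can be set through VidiCore's configuration properties resource. The endpoint below is the standard one, but verify it against your version; the region value is illustrative:

```shell
curl -X PUT -uadmin:admin 'http://localhost:8080/API/configuration/properties' -Hcontent-type:application/xml -d '<ConfigurationPropertyDocument xmlns="http://xml.vidispine.com/schema/vidispine"><key>stsRegion</key><value>eu-west-1</value></ConfigurationPropertyDocument>'
```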

New in version 5.3.

Credentials for the storage can now be added as metadata, as shown in the example below. The secret key will be encrypted when the storage is created or updated.

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>GlacierBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>glacierAccessKeyId</key>
      <value>{access key}</value>
    </field>
    <field>
      <key>glacierSecretKey</key>
      <value>{secret key}</value>
    </field>
    <field>
      <key>glacierVaultName</key>
      <value>{vault name}</value>
    </field>
    <field>
      <key>glacierEndpoint</key>
      <value>{glacier endpoint}</value>
    </field>
  </metadata>
</StorageDocument>

Files can then be moved to this storage either using storage rules or by initiating a copy job (Move/copy a file to another storage). Restore jobs must be initiated using storage rules.
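For example, a copy job that archives a file to the Glacier storage can be started with the standard file copy endpoint (the storage and file IDs here are illustrative):

```shell
# Copy file VX-123 from storage VX-1 to the Glacier ARCHIVE storage VX-2
curl -X POST -uadmin:admin 'http://localhost:8080/API/storage/VX-1/file/VX-123/storage/VX-2'
```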

Note that restore jobs typically take several hours, and the restore job will be put in the WAITING state while the restore initiation is in progress. This is to allow other jobs to run during this time.

New in version 5.1.6.

VidiCore adds metadata to files stored in Amazon Glacier or Glacier Deep Archive. Using the key s3ArchiveStorageClass, it is possible to distinguish what storage class the file has. This metadata is then removed if the file is restored.

Transitioning files From S3 to Glacier

There is no way, using the AWS SDK, to directly initiate a transition to the Glacier storage class for a single object. Instead, Object Lifecycle Management must be used. VidiCore will automatically detect when a transition to the Glacier class has happened, and put the file in the ARCHIVED state.
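The lifecycle rule itself is configured on the S3 bucket within AWS, not in VidiCore. A minimal bucket lifecycle configuration that transitions objects to Glacier after 30 days might look like this (rule ID and timing are illustrative):

```xml
<LifecycleConfiguration>
  <Rule>
    <ID>archive-to-glacier</ID>
    <Filter><Prefix></Prefix></Filter>
    <Status>Enabled</Status>
    <Transition>
      <Days>30</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
  </Rule>
</LifecycleConfiguration>
```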

To restore a file so that it can be read directly, you can use the following request:

PUT {file-resource}/restore

Triggers a request to Glacier to initiate a restore.

Once the restore is complete, the file will be put in CLOSED state, and will be available for direct access.

The expirationInDays parameter has to be set and specifies how long the restored file should be available. Once that period has expired, the file will be removed from direct access and will once again end up in the ARCHIVED state.

Query Parameters:
  • extraData (string[]) –

    Additional parameters relevant for the restore, in the form of key=value. Specify multiple parameters using multiple query parameters.

    expirationInDays={number-of-days}
    How long the restored files should be available.

    Deprecated since version 4.9: Use the expirationInDays query parameter instead.

  • expirationInDays (integer) – Required. How long the restored files should be available.
  • retrievalTier (string) – Sets the Glacier retrieval tier to use when restoring the file. One of Expedited, Standard or Bulk.
Produces:
  • text/plain – Informational text.
Status Codes:
  • 400 Invalid Input – Amazon rejected the restore request, or a parameter value is invalid.
  • 404 Not Found – The file was not found.
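Putting it together, a restore of an archived S3 file could be triggered like this (host, credentials, and IDs are illustrative):

```shell
# Initiate a Glacier restore of file VX-123 on storage VX-1,
# keeping the restored copy available for 7 days
curl -X PUT -uadmin:admin 'http://localhost:8080/API/storage/VX-1/file/VX-123/restore?expirationInDays=7&retrievalTier=Standard'
```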

Atempo Digital Archive Integration

VidiCore can archive files from, and restore files to, any storage location that has a corresponding agent set up in an Atempo Digital Archive environment. In such a setup, each archive location would be defined as a separate storage, and any agent would also have a corresponding storage.

To set up an Atempo archive storage, use the following storage XML:

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>ARCHIVE</type>
  <bean>AtempoDigitalArchiveBean</bean>
  <capacity>100000000000000</capacity>
  <metadata>
    <field>
      <key>atempoWebServiceEndpoint</key>
      <value>http://atempo-lto/meta/C9ABD9AAD3883A8CFD3EEB922C90B3F3/721b786531/ADA/WS/</value>
    </field>
    <field>
      <key>atempoRootPath</key>
      <value>/</value>
    </field>
    <field>
      <key>atempoArchiveName</key>
      <value>AMM</value>
    </field>
  </metadata>
</StorageDocument>

And for each storage that has a corresponding agent, the storage XML should look like this (note the storage metadata):

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>LOCAL</type>
  <autoDetect>true</autoDetect>
  <method>
    <uri>file:///mnt/storage/</uri>
    <read>true</read>
    <write>true</write>
    <browse>true</browse>
  </method>
  <metadata>
    <field>
      <key>atempoRootPath</key>
      <value>/mnt/atempo-agent1/</value>
    </field>
    <field>
      <key>atempoAgentName</key>
      <value>atempo-agent1</value>
    </field>
  </metadata>
</StorageDocument>

Now, to archive a file residing on the agent storage, all that is needed is to start a normal file copying job. The same mechanism is used to restore from the archive to the agent.

Front Porch Diva Integration

Set up a shared folder that is accessible by both VidiCore and the DIVArchive manager.

POST the following document to /API/storage:

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
 <type>ARCHIVE</type>
 <capacity>1000000000000</capacity>
 <bean>DIVABean</bean>
 <metadata>
  <field>
   <!-- SSH host -->
   <!-- this is the hostname of the DIVA SSH service -->
   <key>hostname</key>
   <value>187.47.11.109</value>
  </field>
  <field>
   <!-- SSH username -->
   <!-- this is the username for the DIVA SSH service -->
   <key>username</key>
   <value>diva</value>
  </field>
  <field>
   <!-- SSH password -->
   <key>password</key>
   <value>diva</value>
  </field>
  <field>
   <!-- SSH port -->
   <key>port</key>
   <value>22</value>
  </field>

  <field>
   <!-- path to the shared folder on vidispine server -->
   <key>storagePath</key>
   <value>/shared/storage/</value>
  </field>
  <field>
   <!-- hostname or IP address for the DIVA manager -->
   <key>DIVAHostname</key>
   <value>187.47.11.109</value>
  </field>
  <field>
   <!-- TCP port for the DIVA manager -->
   <key>DIVAPort</key>
   <value>9065</value>
  </field>
  <field>
   <!-- Media name designates either a group of tape, or an array of disk
        declared in the configuration where the instance has to be created. -->
   <key>DIVAMediaName</key>
   <value>default</value>
  </field>
  <field>
   <!-- category -->
   <key>DIVACategory</key>
   <value>default</value>
  </field>
  <field>
   <!-- restore is not yet implemented -->
   <key>DIVARestoreDestination</key>
   <value></value>
  </field>
  <field>
   <!-- path to shared folder on DIVA server -->
   <key>DIVAFilePathRoot</key>
   <value>C:/shared/storage/</value>
  </field>
  <field>
   <!-- The value of this option is the name of the source/destination to be used
        by the specified command: archive, restore, copy... This server name
        must be a valid name as configured in the DIVA system. -->
   <key>DIVAServerName</key>
   <value>disk</value>
  </field>
 </metadata>

 <method>
  <uri>file:///shared/storage/</uri>
  <read>true</read>
  <write>true</write>
  <browse>true</browse>
 </method>
</StorageDocument>

To archive a file, copy it to the shared folder and wait for VidiCore to detect its presence. Once VidiCore has found the file, import it to trigger archiving.

Example using curl:

curl -X POST -uadmin:admin 'http://localhost:8080/API/storage/VX-4/file/VX-1/import' -Hcontent-type:application/xml -d '<MetadataDocument/>'

Archiving should begin shortly.