S3 Event SQS Notifications

Scanning an S3 storage can be expensive in both time and money. To reduce the cost of accessing an S3 bucket, you can configure Vidispine to poll an Amazon SQS queue for S3 events and increase the interval between regular storage scans, which are more expensive.

Prerequisites

Assuming that you already have an S3 storage set up in Vidispine, the next step is to create an SQS queue and configure the S3 bucket to send events to that queue. This configuration is made entirely within AWS, and instructions on how to configure it can be found here: https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html
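As a rough illustration of the bucket side of this setup, the following Python sketch builds a notification configuration that routes the two event types Vidispine needs (listed in the note below) to a queue. The queue ARN, the account number in it and the helper name build_queue_notification are hypothetical placeholders; the dictionary layout follows the S3 notification configuration format, and the commented-out boto3 call is only one possible way to apply it.

```python
import json


def build_queue_notification(queue_arn):
    """Return an S3 notification configuration that sends object
    create and remove events to the given SQS queue."""
    return {
        "QueueConfigurations": [
            {
                "QueueArn": queue_arn,
                "Events": ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"],
            }
        ]
    }


# Hypothetical ARN, for illustration only.
config = build_queue_notification(
    "arn:aws:sqs:eu-west-1:123456789012:s3-event-queue"
)
print(json.dumps(config, indent=2))

# Applying it requires boto3 and AWS credentials (not run here):
# import boto3
# boto3.client("s3").put_bucket_notification_configuration(
#     Bucket="bucketname", NotificationConfiguration=config
# )
```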

Note

  • The two types of events that Vidispine is interested in are:
      • ObjectCreated:* (All object create)
      • ObjectRemoved:* (All object delete)
  • Vidispine will connect to the SQS queue using the credentials from the S3 storage method URI, so that user must have access to both the bucket and the queue. For the SQS queue, the user needs permission for the following actions on the queue:
      • sqs:GetQueueUrl
      • sqs:ReceiveMessage
      • sqs:DeleteMessage
      • sqs:DeleteMessageBatch
      • sqs:PurgeQueue
  • Use one SQS queue per bucket. Don’t send events from multiple buckets to the same queue, as this is not supported by Vidispine.
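
The queue permissions listed above could be expressed as an IAM policy along the lines of the following Python sketch. The action list is copied verbatim from the note; the queue ARN and the helper name build_sqs_policy are hypothetical, and the policy covers only the queue, not the bucket permissions the same user also needs.

```python
import json

# Actions copied from the note above.
SQS_ACTIONS = [
    "sqs:GetQueueUrl",
    "sqs:ReceiveMessage",
    "sqs:DeleteMessage",
    "sqs:DeleteMessageBatch",
    "sqs:PurgeQueue",
]


def build_sqs_policy(queue_arn):
    """Return an IAM policy document allowing the listed actions on one queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": SQS_ACTIONS,
                "Resource": queue_arn,
            }
        ],
    }


# Hypothetical ARN, for illustration only.
policy = build_sqs_policy("arn:aws:sqs:eu-west-1:123456789012:s3-event-queue")
print(json.dumps(policy, indent=2))
```

Scoping Resource to a single queue ARN matches the one-queue-per-bucket rule above.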

Configure the storage

  1. To have Vidispine poll an SQS queue instead of scanning an S3 bucket, set the storage method metadata sqsName and sqsEndpoint:

    PUT /storage/VX-1/method/VX-2/metadata/sqsName
    Content-Type: text/plain
    
    s3-event-queue
    
    PUT /storage/VX-1/method/VX-2/metadata/sqsEndpoint
    Content-Type: text/plain
    
    sqs.eu-west-1.amazonaws.com
    
    GET /storage/VX-1
    
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
      ...
      <method>
        <uri>s3://bucketname/</uri>
        ...
        <metadata>
          <field>
            <key>sqsName</key>
            <value>s3-event-queue</value>
          </field>
          <field>
            <key>sqsEndpoint</key>
            <value>sqs.eu-west-1.amazonaws.com</value>
          </field>
        </metadata>
      </method>
      ...
    </StorageDocument>
    
  2. Then make sure that the storage metadata scanOnStart is true (this is the default).

    Because Amazon SQS is a distributed service, messages can arrive out of order. On every startup, Vidispine therefore needs to purge the queue and do a full scan of the storage to sync the file list with the database.

  3. Finally, you can configure Vidispine to scan the storage less often by setting the storage property scanInterval. Vidispine performs a storage scan every scanInterval seconds, so setting this to 3600 makes Vidispine scan the storage once every hour. See When are files scanned? for more information.
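
The two PUT requests from step 1 could be scripted roughly as in the following Python sketch. The base URL http://localhost:8080/API, the admin credentials and the helper name build_metadata_put are assumptions made for illustration; the storage ID, method ID, keys and values are the ones used in the example above. The sketch only builds the requests; the line that sends them is left commented out.

```python
import base64
import urllib.request

# Assumed values for illustration; adjust to your installation.
BASE_URL = "http://localhost:8080/API"  # hypothetical API root
USERNAME = "admin"                      # hypothetical credentials
PASSWORD = "admin"


def build_metadata_put(storage_id, method_id, key, value):
    """Build (but do not send) a PUT request that sets one
    storage method metadata key to a plain-text value."""
    url = f"{BASE_URL}/storage/{storage_id}/method/{method_id}/metadata/{key}"
    token = base64.b64encode(f"{USERNAME}:{PASSWORD}".encode()).decode()
    return urllib.request.Request(
        url,
        data=value.encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "text/plain",
            "Authorization": f"Basic {token}",
        },
    )


requests_to_send = [
    build_metadata_put("VX-1", "VX-2", "sqsName", "s3-event-queue"),
    build_metadata_put("VX-1", "VX-2", "sqsEndpoint", "sqs.eu-west-1.amazonaws.com"),
]

for req in requests_to_send:
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req)  # uncomment to actually send the request
```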

You can check the storage method status (lastSuccess, lastFailure, failureMessage) to determine whether the configuration is correct. For example, if a queue that does not exist is specified:

<failureMessage>
   Error polling SQS: The specified queue does not exist for this wsdl version. (...)
</failureMessage>
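
A status check like this could be automated by parsing the storage document returned by GET /storage/VX-1, as in the following Python sketch. It assumes that lastSuccess, lastFailure and failureMessage appear as child elements of each method element; the sample document and the helper name method_status are made up for illustration and are not real output.

```python
import xml.etree.ElementTree as ET

NS = {"v": "http://xml.vidispine.com/schema/vidispine"}

# Hypothetical sample document; the placement of failureMessage
# directly under <method> is an assumption.
SAMPLE = """
<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <method>
    <uri>s3://bucketname/</uri>
    <failureMessage>Error polling SQS: The specified queue does not exist for this wsdl version.</failureMessage>
  </method>
</StorageDocument>
"""


def method_status(xml_text):
    """Return one status dict per storage method; missing fields are None."""
    root = ET.fromstring(xml_text.strip())
    statuses = []
    for method in root.findall("v:method", NS):
        statuses.append({
            "uri": method.findtext("v:uri", default=None, namespaces=NS),
            "lastSuccess": method.findtext("v:lastSuccess", default=None, namespaces=NS),
            "lastFailure": method.findtext("v:lastFailure", default=None, namespaces=NS),
            "failureMessage": method.findtext("v:failureMessage", default=None, namespaces=NS),
        })
    return statuses


statuses = method_status(SAMPLE)
for status in statuses:
    if status["failureMessage"]:
        print(f"{status['uri']}: {status['failureMessage']}")
```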