Storages

Storages are where Vidispine will store any files that are ingested/created in the system. All files on a storage location will get an entry in the Vidispine database, containing state, file size, hash etc. This is to keep track of any file changes.

For information about files in storage, see Files.

Storages

Storage types

A storage must be designated a type, based on what type of operations are to be performed on the contained files. Operations in this context are transcode, move, delete, and destination (that is, placing new files here).

LOCAL
A Vidispine specific storage, suitable for all operations. Note that LOCAL doesn’t necessarily imply that the storage is physically local. It should however be a dedicated Vidispine storage. That is, files on such storages should not be written to/deleted by any external application.
SHARED
A storage shared with another application. Vidispine will not create new files or perform any write operations here.
REMOTE
A storage on a remote computer. Files should be copied to a local storage before being used.
EXTERNAL
A storage placeholder.
ARCHIVE
A storage meant for archiving. It needs a plugin bean or a JavaScript, described in more detail at Archive Integration.
EXPORT
Files are not monitored, but copy operations to here will create a file entry in the database.

Storage states

Storages will have one of the following states:

NONE
Not used.
READY
Operating normally.
OFFLINE
No available storage method could be reached.
FAILED
Currently not used in Vidispine.
DISABLED
Currently not used in Vidispine.
EVACUATING
Storage is being evacuated.
EVACUATED
Evacuation process finished.

For more information about storage evacuation, see section on Evacuating storages.

Storage groups

Storages can be placed in named groups, called storage groups. These storage groups can then be used in Storage rules and Quota rules.

Storage capacity

When a storage is created, a capacity can be specified. This is the total number of bytes that is freely available on the storage. The free capacity is calculated as total capacity - sum(file sizes in database list). Note that this means that the sizes of MISSING and LOST files are included in the used capacity. If you do not expect a file in one of these states to return, it is best to delete the file entity using the API.
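The calculation above can be sketched as follows. This is a minimal illustration, not Vidispine code: FileEntry and free_capacity are hypothetical names, and the sizes are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    """Hypothetical stand-in for a file entry in the database."""
    size: int   # bytes
    state: str  # e.g. "CLOSED", "MISSING", "LOST"


def free_capacity(total_capacity: int, files: list) -> int:
    """Total capacity minus the size of every file entry, whatever its state."""
    return total_capacity - sum(f.size for f in files)


files = [
    FileEntry(size=400, state="CLOSED"),
    FileEntry(size=250, state="MISSING"),  # still counted as used
    FileEntry(size=100, state="LOST"),     # still counted as used
]
remaining = free_capacity(1000, files)  # 1000 - (400 + 250 + 100)
```

Deleting the MISSING and LOST entries from the list would raise the result by their combined size, which is why removing stale file entities frees capacity.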

Auto-detecting the storage capacity

By setting the element autoDetect in the StorageDocument you can make Vidispine read the capacity from the file system. This only works if the storage has a storage method that points to the local file system, that is, a file:// URI.

Warning

Do not enable auto-detection for multiple storages located on the same device, as each storage will then have the capacity of the device. This means that storages may appear to have free space in Vidispine, when there is actually no space left on the device.

Storage cleanup

If you have used storage rules to control the placement of files on storages then you may have noticed that files have been copied to the storages selected by the rules, but that files on the source storages have not been removed.

This is by design. Vidispine prefers to keep multiple copies of a file, and removes files only when a storage is about to become full. The storage high and low watermarks control when files start to be removed and when storage cleanup stops.

For example, for a 1 TB storage with a high watermark at 80% and a low watermark at 40%, Vidispine will keep adding files to the storage until the usage exceeds 800 GB. Once that happens, cleanup starts. Files that are deletable, that is, files that have a copy on another storage and are not required to exist according to the storage rules, will be deleted. Cleanup will stop once the usage has reached 400 GB or when there are no more deletable files.
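The watermark behavior can be sketched as a small simulation. This is an illustration under assumptions: files_to_delete is a hypothetical helper, the order in which deletable files are picked is arbitrary here, and sizes are given in GB for readability.

```python
def files_to_delete(capacity, files, high_pct=80, low_pct=40):
    """Return the names of files a watermark-driven cleanup would remove.

    files is a list of (name, size, deletable) tuples. The iteration order
    is an assumption; the real selection order is not described here.
    """
    used = sum(size for _, size, _ in files)
    if used * 100 <= capacity * high_pct:
        return []  # high watermark not exceeded, cleanup does not start
    deleted = []
    for name, size, deletable in files:
        if used * 100 <= capacity * low_pct:
            break  # low watermark reached, cleanup stops
        if deletable:
            deleted.append(name)
            used -= size
    return deleted


# 900 GB used on a 1000 GB storage: above the 80% high watermark.
example = [("a", 300, True), ("b", 300, False), ("c", 250, True), ("d", 50, True)]
removed = files_to_delete(1000, example)
```

In this example the cleanup removes "a" and "c", which brings the usage to 350 GB, so "d" is kept even though it is deletable.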

If this behavior is not desirable, then there are two options.

  1. Update the storage rules to specify where files should not exist, using the not element. For example, using <not><any/></not>.

    <StorageRuleDocument xmlns="http://xml.vidispine.com/schema/vidispine">
      <storageCount>1</storageCount>
      <storage>VX-122</storage>
      <not><any/></not>
    </StorageRuleDocument>
    
  2. Set the high watermark on the storage to 0%. Updating the storage rules is preferred as storage cleanup will be triggered continuously if the high watermark is set at a low level.

Evacuating storages

If you would like to delete a storage, but you still have files there which are connected to items, you can first trigger an evacuation of the storage. This will cause Vidispine to attempt to delete redundant files, or move files to other storages. Once the evacuation is complete, the storage will get the state EVACUATED.

Storage methods

Methods are the way Vidispine talks to the storage. Every method has a base URL. See Storage method URIs for the list of supported schemes.

Retrieve a storage to check its status. The storage state shows if the storage is accessible to Vidispine. If a storage is not accessible, then its state will be OFFLINE. Check the failureMessage in the storage methods to find out why. The failure message will be the error from when the last attempt to connect to the storage was made, and will be available even when the storage comes back online again. Compare lastSuccess to lastFailure to determine if the error message is current or not.
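The comparison of lastSuccess and lastFailure can be sketched like this. The helper name failure_is_current is hypothetical, and the timestamps are assumed to be ISO 8601 strings with an explicit UTC offset; the format actually returned by your installation may need different parsing.

```python
from datetime import datetime
from typing import Optional


def failure_is_current(last_success: Optional[str], last_failure: Optional[str]) -> bool:
    """True if the most recent connection attempt was the failed one."""
    if last_failure is None:
        return False  # the method has never failed
    if last_success is None:
        return True   # the method has never succeeded
    return datetime.fromisoformat(last_failure) > datetime.fromisoformat(last_success)


# The failure happened after the last success, so failureMessage is current.
current = failure_is_current("2015-08-14T09:30:00+00:00", "2015-08-14T09:45:00+00:00")
```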

If multiple methods are defined for one storage, it is important, in order to avoid inconsistencies, that they all point to the same physical location. E.g. a storage might have one file system method, and one HTTP method. The HTTP URL must point to the same physical location as the file system method.

Storage method examples

Here are some examples of valid storage methods:

  • file:///mnt/vidistorage/
  • ftp://vidispine:pA5sw0rd!?@10.85.0.10/storage/
  • azure://:%2ZmFuODl0MGg0MmJ5ZnZuczc5YmhndjkrZThodnV5Ymhqb2lwbW9lcmN4c2Rmc2Q0NThmdjQ0Mzc4cWF5NGcxNg0Kdjg0NyANCmw3csO2NWk%3D%3D@vsstorage/

Method types

Methods can also be of different types. By default, the type is empty. Only methods with an empty type are used by Vidispine when doing file operations. Other methods are ignored for file operations, but can be returned, for example when requesting URLs in search results.

New in version 4.1: Credentials are encrypted. This means that passwords cannot be viewed through the API/server logs.

Auto method types

One exception is the method type AUTO, or any method type with the prefix AUTO-. When a file URL is requested with such a method type, a no-auth URL will be created (with the method URL as base).

If there is no AUTO method defined, but a file URL is requested with method type AUTO, an implicit one will be used automatically.

GET /item/VX-2406?content=uri&methodType=AUTO
Accept: application/xml
<ItemDocument xmlns="http://xml.vidispine.com/schema/vidispine" id="VX-2406">
  <files>
    <uri>http://vs.example.com:8089/APInoauth/storage/VX-1/file/VX-6537/0.7354486788234469/VX-6537.mp4</uri>
    <uri>http://vs.example.com:8089/APInoauth/storage/VX-1/file/VX-6536/0.7638025887084131/VX-6536.dv</uri>
  </files>
</ItemDocument>

The URL returned is only valid for the duration of fileTempKeyDuration minutes. The expiration timer is reset whenever the URL is used in a new operation (e.g. HEAD or GET).

Method metadata

In addition to selecting method types, method metadata can be given as instructions for the URI returned. Two metadata values are defined:

format

Specifies if any special format of the URI should be returned. By default, the normal URI is returned. Two values are defined:

SIGNED
Returns an HTTP URI that is a signed URI pointing directly to Azure or S3 storage. If a signed URI cannot be generated from the underlying (default) URI, no URI is returned.
SIGNED-AUTO

New in version 4.2.9.

As above, but if no URI can be generated, an AUTO URI (see above) is returned.

expiration
Sets the expiration time of the signed URI, in minutes. If not specified, the expiration time is 60 minutes, unless azureSasValidTime is set.

GET /item/VX-206?content=uri&methodMetadata=format=SIGNED-AUTO
Accept: application/xml
<ItemDocument xmlns="http://xml.vidispine.com/schema/vidispine" id="VX-206">
  <files>
    <uri>https://vstest.s3.amazonaws.com/VX-362.mp4?Expires=1439545041&amp;AWSAccessKeyId=AKIAJCCXQRY2MW4YQUVQ&amp;Signature=UcNdTIm1v1omM%2FaIGaYXf4QNfc%3D</uri>
    <uri>http://vs.example.com:8089/APInoauth/storage/VX-1/file/VX-336/0.7638025117084131/VX-336.dv</uri>
  </files>
</ItemDocument>

Parent directory management

For local file systems (method is using a file:// URI), Vidispine will by default remove empty parent directories when deleting the last file in the directory.

New in version 4.2.5: This can be controlled, either on system level or on storage level. If the storage metadata keepEmptyDirectories is set to true, empty directories are preserved in that storage. Likewise, if the configuration property keepEmptyDirectories is set to true, empty directories are preserved for all storages. Storage configuration overrules system configuration.
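The precedence between the two settings can be sketched as below. The function name and the use of None for "not set" are assumptions made for illustration; only the rule that the storage setting overrules the system setting comes from the text above.

```python
from typing import Optional


def keep_empty_directories(storage_value: Optional[str], system_value: Optional[str]) -> bool:
    """Resolve keepEmptyDirectories; None means the setting is absent."""
    if storage_value is not None:
        return storage_value.lower() == "true"  # storage metadata wins when set
    if system_value is not None:
        return system_value.lower() == "true"   # otherwise use the system property
    return False  # default: empty parent directories are removed


# Storage metadata says false, system property says true: the storage wins.
preserved = keep_empty_directories("false", "true")
```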

Files

When are files scanned?

In order to discover changes made to files, or if any files have been removed/added, Vidispine will scan the storages periodically. It is possible to disable the scanning by not having any methods with browse=true on the storage. The scan interval is also configurable on a per storage basis by setting the scanInterval storage metadata. The value should be in seconds. Setting this to a higher value will lower the I/O load of the device, but any file changes will take longer to be discovered. This also means that file notifications for file changes or file creation will be triggered later for changes occurring outside of Vidispine’s control.

You can force a rescan of a storage by calling POST /storage/(storage-id)/rescan. This will trigger an immediate rescan of a storage if the supervisor is idle. If a supervisor is already busy processing the files then you may notice that the rescan happens some time later.
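As a sketch, the rescan request could be prepared from Python as follows. API_BASE is an assumed base URL, rescan_request is a hypothetical helper, and authentication is left out; the request is only built here, not sent.

```python
import urllib.request

API_BASE = "http://localhost:8080/API"  # assumption: adjust to your installation


def rescan_request(storage_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that forces a rescan."""
    url = f"{API_BASE}/storage/{storage_id}/rescan"
    return urllib.request.Request(url, data=b"", method="POST")


req = rescan_request("VX-2")
```

Passing the request to urllib.request.urlopen, after adding whatever credentials your system requires, would submit it.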

File States

Files can be in one of the following states:

NONE
Just created, not used.
OPEN
Discovered or created, not yet marked as finished.
CLOSED
File no longer grows.
UNKNOWN
The current state is not known.
MISSING
File is missing from the file system/storage.
LOST
File has been missing for a longer period. Candidate for restoration from archive.
TO_APPEAR
File will appear on the file system/storage; the transfer subsystem or transcoder will create it.
TO_BE_DELETED
The file is no longer in use, and will be deleted at the next clean-up sweep.
BEING_READ
File is in use by transfer subsystem or transcoder.
ARCHIVED
File is archived.
AWAITING_SYNC
File will be synchronized by multi-site agent.

Vidispine will mark a file as MISSING when it is first detected that the file no longer exists on the storage. No action is taken for files that are missing. If the file does not appear within the time specified by lostLimit, then the file will be marked as LOST. Lost files will be restored from other copies if such exist.

Items and storages

By default, when creating a new file, Vidispine will choose the LOCAL storage with the highest free capacity. This can be changed in a few different ways:

File hashing

Vidispine will calculate a hash for all files in a storage. This is done by a background process, running continuously. Files are hashed one by one for performance reasons, so if a large number of files are added to the system in a short time span it might take some time for all hashes to be calculated. The default hashing algorithm is SHA-1. This can be changed by setting the configuration property fileHashAlgorithm. See below for a list of supported values.

Additional algorithms

Vidispine can be configured to calculate hashes using additional algorithms by setting the additionalHash metadata field on the storage. It should contain a comma separated list (no spaces) of algorithms. The supported algorithms are:

  • MD2
  • MD5
  • SHA-1
  • SHA-256
  • SHA-384
  • SHA-512
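To verify a hash outside of Vidispine, the same digests can be computed with Python's hashlib. This is a sketch: compute_hashes and the name mapping are hypothetical, and MD2 is left out because hashlib does not guarantee that it is available.

```python
import hashlib

# Map the algorithm names listed above to hashlib names.
HASHLIB_NAMES = {
    "MD5": "md5",
    "SHA-1": "sha1",
    "SHA-256": "sha256",
    "SHA-384": "sha384",
    "SHA-512": "sha512",
}


def compute_hashes(data: bytes, additional_hash: str) -> dict:
    """Hash data with each algorithm in a comma-separated list (no spaces)."""
    result = {}
    for name in additional_hash.split(","):
        result[name] = hashlib.new(HASHLIB_NAMES[name], data).hexdigest()
    return result


hashes = compute_hashes(b"hello", "MD5,SHA-256")
```

For large files, reading in chunks and calling update() on each hash object avoids loading the whole file into memory.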

Throttling storage I/O

Vidispine will retrieve information about files on a storage at the configured scan intervals. If you find that the I/O on your local disk drives is high, even when no transfers or transcodes are being performed, then you can try rate limiting the stat calls performed by Vidispine. Do this by setting the storage metadata statsPerSecond or the configuration property statsPerSecond to a suitable limit. During the file system scan, Vidispine will typically perform one stat per file.
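The effect of such a limit can be illustrated with a simple rate limiter. This is a sketch of the general technique, not of Vidispine's implementation; throttled_stat and its parameters are hypothetical.

```python
import os
import time


def throttled_stat(paths, stats_per_second, clock=time.monotonic, sleep=time.sleep):
    """Stat each path, pausing so that calls stay at or below the given rate."""
    interval = 1.0 / stats_per_second
    next_allowed = clock()
    results = []
    for path in paths:
        wait = next_allowed - clock()
        if wait > 0:
            sleep(wait)  # spread the calls out instead of issuing them in a burst
        results.append(os.stat(path))
        next_allowed = max(next_allowed, clock()) + interval
    return results


stats = throttled_stat([os.getcwd()], stats_per_second=100)
```

With a limit of 100 stats per second, a scan of 60,000 files takes at least ten minutes, so a lower limit trades slower change detection for less I/O.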

An easy way to check if rate limiting the stat calls will have any effect is to disable the storage supervisors in Vidispine. This can be done using PUT /vidispine-service/service/StorageSupervisorServlet/disable. Remember to enable the service afterwards or you will find that Vidispine no longer detects new files on the storages, among other things.

It could also be that it’s the file hashing service that is the cause of the I/O. You should be able to tell which service is behind it by monitoring your disk devices. If there’s high read activity or a large amount of data read from a device, then it could be the file hashing that’s the cause. If the number of read operations per second is high, then it’s more likely the storage supervisor.

Tip

Use tools such as htop, iotop, dstat and iostat to monitor your systems and devices.

Throttling transfer to and from a storage

New in version 4.0.

It is possible to specify a bandwidth on a storage or a specific storage method. This causes any file transfers involving the specified storage or storage method to be throttled. If multiple transfers take place concurrently, the total bandwidth will be allocated between the transfers. If a bandwidth is set on both the storage and its storage methods, the lowest applicable bandwidth will be used.
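The selection of the applicable limit can be sketched as follows. Both function names are hypothetical, and the even split between concurrent transfers is an assumption; the text above only states that the total bandwidth is allocated between them.

```python
from typing import Optional


def effective_bandwidth(storage_bw: Optional[int], method_bw: Optional[int]) -> Optional[int]:
    """Lowest configured limit in bytes per second, or None if unthrottled."""
    limits = [bw for bw in (storage_bw, method_bw) if bw is not None]
    return min(limits) if limits else None


def per_transfer_share(total_bw: int, concurrent_transfers: int) -> float:
    """Share per transfer, assuming the bandwidth is divided evenly."""
    return total_bw / concurrent_transfers


limit = effective_bandwidth(50_000_000, 20_000_000)  # the method limit is lower
share = per_transfer_share(limit, 4)
```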

To set a bandwidth, set the bandwidth element in the StorageDocument or StorageMethodDocument when creating or updating a storage or storage method. The bandwidth is set in bytes per second.

Example

Updating a storage to set a bandwidth of 50,000,000 bytes per second.

PUT /storage/VX-2
Content-Type: application/xml

<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <type>LOCAL</type>
  <capacity>1000000000</capacity>
  <bandwidth>50000000</bandwidth>
</StorageDocument>

Example

Updating a storage method to set a bandwidth of 20,000,000 bytes per second.

PUT /storage/VX-2/method?uri=http://10.5.1.2/shared/&bandwidth=20000000

Temporary storages for transcoder output

New in version 4.2.3.

The Vidispine transcoder requires that the destination (output) file can be partially updated. This is in order to be able to write header files after the essence has been written.

In previous versions, this was solved by the application server storing the intermediate result as a temporary file on the local file system (/tmp). This requires a lot of space on the application server.

With version 4.2.3, another strategy is available. Instead of storing the result as one file on the application server, several small files are stored directly on the destination file system as “segments”. After the transcode has finished, the segments are merged. On S3 storage, this merging can be done with S3 object(s)-to-object copy.

Control of the segment file strategy is via the useSegmentFiles configuration property.
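The merge step can be pictured as a plain concatenation of segment files. The sketch below is illustrative only: merge_segments, the segment file names, and the payloads are hypothetical, and the way Vidispine actually names and orders segments is not described here.

```python
import os
import shutil
import tempfile


def merge_segments(segment_paths, destination):
    """Concatenate segment files, in the given order, into one file."""
    with open(destination, "wb") as out:
        for path in segment_paths:
            with open(path, "rb") as segment:
                shutil.copyfileobj(segment, out)


# Write three small segments, then merge them into the final output file.
workdir = tempfile.mkdtemp()
parts = []
for index, payload in enumerate([b"header", b"essence-1", b"essence-2"]):
    part = os.path.join(workdir, f"output.seg{index}")
    with open(part, "wb") as handle:
        handle.write(payload)
    parts.append(part)

merged = os.path.join(workdir, "output.bin")
merge_segments(parts, merged)
```

Because each segment is a separate file, a segment can be rewritten after later segments already exist, which a single sequentially written file on such a storage would not allow.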

Storage method URIs

The following URI schemes are defined.

file

Syntax: file:///{path}
Example: file:///mnt/storage/, file:///C:/mystorage/
Note: The URI file://mnt/storage/ is not valid! (But file:/mnt/storage/ is.)
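The reason file://mnt/storage/ is rejected becomes visible when the URI is parsed: the part after the two slashes is read as a host name, not as the first path component. The check below is a sketch using Python's standard parser; is_valid_file_uri is a hypothetical helper and not the validation Vidispine performs.

```python
from urllib.parse import urlparse


def is_valid_file_uri(uri: str) -> bool:
    """True if uri has the file scheme, no host, and an absolute path."""
    parsed = urlparse(uri)
    return parsed.scheme == "file" and parsed.netloc == "" and parsed.path.startswith("/")


for uri in ("file:///mnt/storage/", "file:/mnt/storage/", "file://mnt/storage/"):
    parsed = urlparse(uri)
    print(uri, "host:", repr(parsed.netloc), "path:", parsed.path)
```

For file://mnt/storage/ the parser reports the host "mnt" and the path /storage/, which is not the intended directory.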

ftp

Syntax: ftp://{user}:{password}@{host}/{path}
Example: ftp://johndoe:secr3t@example.com/mystorage/

New in version 4.1.2: Add query parameter passive=false to force active mode. To set the client side ports used in active mode, set the configuration property ftpActiveModePortRange. The value should be a range, e.g. 42100-42200.

To set the client IP used in active mode, set the configuration property ftpActiveModeIp.

sftp

Syntax: sftp://{user}:{password}@{host}/{path}
Example: sftp://johndoe:secr3t@example.com/mystorage/

http

Syntax: http://{user}:{password}@{host}/{path}
Example: http://johndoe:secr3t@example.com/mystorage/
Note: Requires WebDAV support in host.

https

Syntax: https://{user}:{password}@{host}/{path}
Example: https://johndoe:secr3t@example.com/mystorage/
Note: Requires WebDAV support in host.

omms

Syntax: omms://{userId}:{userKey}@{hostList}/{clusterId}/{vaultId}/
Example: omms://c2f6a2f4-6927-11e1-cc94-ab94bd11183f:some%20secret@10.0.0.3,10.0.0.4/4255378f-dc73-fca3-e40d-5726008b3dac/0a49472d-15d4-12f1-862e-f9708d49267e/
Note: Object Matrix MatrixStore.

s3

Syntax: s3://{accessKey}:{secretKey}@{bucket}/{path}
Example: s3://KDASODSALSDI8U:RxZYlu23NDSIN293002WdlNyq@mystore/storage1/

Storage method metadata keys can be used to control the interaction with the storage.

storageClass

The default Amazon S3 storage class that will be used for new files created on an Amazon S3 storage. Can be either standard or reduced.

Default: standard

New in version 4.0.3.

azure

Syntax: azure://:{accessKey}@{accountName}/{containerName}
Example: azure://:KLKau23dEE02WdlLiO@companyname/container1/

New in version 4.0.1.

See also

See here for some notes on how to write URIs.
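Credentials that contain reserved characters need to be percent-encoded before they are placed in a URI, as with some%20secret in the omms example. A minimal sketch using Python's standard library follows; storage_uri is a hypothetical helper and the credentials are made up.

```python
from urllib.parse import quote


def storage_uri(scheme, user, password, host, path):
    """Assemble scheme://user:password@host/path with encoded credentials."""
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"{scheme}://{encoded_user}:{encoded_password}@{host}/{path}"


ftp_uri = storage_uri("ftp", "johndoe", "some secret", "example.com", "mystorage/")
azure_uri = storage_uri("azure", "", "a/b==", "companyname", "container1/")
```

Passing safe="" makes quote encode every reserved character, including "/" and "=", which matter for base64-style access keys.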