Storages¶
Storages are where Vidispine will store any files that are ingested/created in the system. All files on a storage location will get an entry in the Vidispine database, containing state, file size, hash etc. This is to keep track of any file changes.
For information about files in storage, see Files.
Storages¶
Storage types¶
A storage must be designated a type, based on what type of operations are to be performed on the contained files. Operations in this context are transcode, move, delete, and destination (that is, placing new files here).
- LOCAL
- A Vidispine specific storage, suitable for all operations. Note that LOCAL doesn’t necessarily imply that the storage is physically local. It should however be a dedicated Vidispine storage. That is, files on such storages should not be written to/deleted by any external application.
- SHARED
- A storage shared with another application. Vidispine will not create new files or perform any write operations here.
- REMOTE
- A storage on a remote computer. Files should be copied to a local storage before being used.
- EXTERNAL
- A storage placeholder.
- ARCHIVE
- A storage meant for archiving. It needs a plugin bean or a JavaScript, as described in more detail at Archive Integration.
- EXPORT
- Files are not monitored, but copy operations to here will create a file entry in the database.
Storage states¶
Storages will have one of the following states:
- NONE
- Not used.
- READY
- Operating normally.
- OFFLINE
- No available storage method could be reached.
- FAILED
- Currently not used in Vidispine.
- DISABLED
- Currently not used in Vidispine.
- EVACUATING
- Storage is being evacuated.
- EVACUATED
- Evacuating process finished.
For more information about storage evacuation, see section on Evacuating storages.
Storage groups¶
Storages can be placed in named groups, called storage groups. These storage groups can then be used in Storage rules and Quota rules.
Storage capacity¶
When a storage is created, a capacity can be specified. This is the total number of bytes that is freely available on the storage. The free capacity is calculated as total capacity - sum(file sizes in database list). Note that this means that the sizes of MISSING and LOST files are included in the used capacity. If you do not expect a file with these states to return, it is best to delete the file entity using the API.
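The capacity formula above can be illustrated with a minimal Python sketch. The `FileEntry` class and `free_capacity` function are hypothetical names used only for illustration; they are not part of the Vidispine API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FileEntry:
    """Hypothetical stand-in for a file entry in the database."""
    size: int   # file size in bytes
    state: str  # for example "CLOSED", "MISSING" or "LOST"


def free_capacity(total_capacity: int, files: Iterable[FileEntry]) -> int:
    """Free capacity = total capacity - sum of all file sizes in the database."""
    # Every entry counts towards the used capacity, including files in
    # the MISSING and LOST states.
    return total_capacity - sum(f.size for f in files)


files = [
    FileEntry(400, "CLOSED"),
    FileEntry(250, "MISSING"),
    FileEntry(100, "LOST"),
]
print(free_capacity(1000, files))      # 250
print(free_capacity(1000, files[:1]))  # 600, once the other entries are deleted
```

The second call shows why deleting stale MISSING and LOST entries frees up capacity as seen by Vidispine.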
Auto-detecting the storage capacity¶
By setting the element autoDetect in the StorageDocument you can make Vidispine read the capacity from the file system. This only works if the storage has a storage method that points to the local file system, that is, a file:// URI.
Warning
Do not enable auto-detection for multiple storages located on the same device, as each storage will then have the capacity of the device. This means that storages may appear to have free space in Vidispine, when there is actually no space left on the device.
Storage cleanup¶
If you have used storage rules to control the placement of files on storages then you may have noticed that files have been copied to the storages selected by the rules, but that files on the source storages have not been removed.
This is by design. Vidispine prefers to keep multiple copies of a file, and only remove the files when a storage is about to become full. The storage high and low watermarks control when files should start to be removed, and when enough files have been removed and storage cleanup should stop.
For example, for a 1 TB storage with a high watermark at 80% and a low watermark at 40%, Vidispine will keep adding files to the storage until the usage exceeds 800 GB. Once that happens, cleanup starts. Files that are deletable, that is, files that have a copy on another storage and that are not required to exist according to the storage rules, will be deleted. Cleanup will stop once the usage has reached 400 GB or when there are no more deletable files.
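The watermark behavior can be sketched in Python as follows. This is only an illustration of the rule described above, not Vidispine's implementation: the function name `run_cleanup` is hypothetical, and the order in which deletable files are removed is an assumption.

```python
from typing import List, Tuple


def run_cleanup(used: int, capacity: int, high_pct: int, low_pct: int,
                deletable_sizes: List[int]) -> Tuple[int, List[int]]:
    """Return the usage after cleanup and the sizes of the deleted files."""
    deleted: List[int] = []
    high_mark = capacity * high_pct // 100
    low_mark = capacity * low_pct // 100
    # Cleanup only starts once the usage exceeds the high watermark.
    if used <= high_mark:
        return used, deleted
    # Delete files (in list order, an assumption) until the usage has
    # reached the low watermark or no deletable files remain.
    for size in deletable_sizes:
        if used <= low_mark:
            break
        used -= size
        deleted.append(size)
    return used, deleted


# 1000 GB storage, watermarks at 80% and 40%, usage at 850 GB.
print(run_cleanup(850, 1000, 80, 40, [200, 200, 200, 200]))  # (250, [200, 200, 200])
```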
If this behavior is not desirable, then there are two options:
- Update the storage rules to specify where files should not exist, using the not element. For example, using <not><any/></not>:
<StorageRuleDocument xmlns="http://xml.vidispine.com/schema/vidispine">
<storageCount>1</storageCount>
<storage>VX-122</storage>
<not><any/></not>
</StorageRuleDocument>
- Set the high watermark on the storage to 0%. Updating the storage rules is preferred, as storage cleanup will be triggered continuously if the high watermark is set at a low level.
Evacuating storages¶
If you would like to delete a storage, but you still have files there which are connected to items, you can first trigger an evacuation of the storage. This will cause Vidispine to attempt to delete redundant files, or move files to other storages. Once the evacuation is complete, the storage will get the state EVACUATED.
Storage methods¶
Methods are the way Vidispine talks to the storage. Every method has a base URL. See Storage method URIs for the list of supported schemes.
Retrieve a storage to check its status. The storage state shows if the storage is accessible to Vidispine. If a storage is not accessible, then its state will be OFFLINE. Check the failureMessage in the storage methods to find out why. The failure message will be the error from when the last attempt to connect to the storage was made, and will be available even when the storage comes back online again. Compare lastSuccess to lastFailure to determine if the error message is current or not.
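The comparison of lastSuccess and lastFailure can be sketched like this. The function name is hypothetical, and the sketch assumes that the two timestamps have already been parsed into `datetime` objects (the wire format is not covered here).

```python
from datetime import datetime
from typing import Optional


def failure_is_current(last_success: Optional[datetime],
                       last_failure: Optional[datetime]) -> bool:
    """Return True if the failure message describes the latest connection attempt."""
    if last_failure is None:
        return False  # the method has never failed
    if last_success is None:
        return True   # the method has never succeeded
    # The message is stale if a successful attempt happened after the failure.
    return last_failure > last_success


print(failure_is_current(datetime(2016, 5, 1, 12, 0), datetime(2016, 5, 1, 11, 0)))  # False
print(failure_is_current(datetime(2016, 5, 1, 11, 0), datetime(2016, 5, 1, 12, 0)))  # True
```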
If multiple methods are defined for one storage, it is important, in order to avoid inconsistencies, that they all point to the same physical location. E.g. a storage might have one file system method, and one HTTP method. The HTTP URL must point to the same physical location as the file system method.
Storage method examples¶
Here are some examples of valid storage methods:
file:///mnt/vidistorage/
ftp://vidispine:pA5sw0rd!?@10.85.0.10/storage/
azure://:%2ZmFuODl0MGg0MmJ5ZnZuczc5YmhndjkrZThodnV5Ymhqb2lwbW9lcmN4c2Rmc2Q0NThmdjQ0Mzc4cWF5NGcxNg0Kdjg0NyANCmw3csO2NWk%3D%3D@vsstorage/
Method types¶
Methods can also be of different types. By default, the type is empty. Only methods with an empty type are used by Vidispine when doing file operations. Methods of other types are ignored for file operations, but can be returned, for example when requesting URLs in search results.
New in version 4.1: Credentials are encrypted. This means that passwords cannot be viewed through the API/server logs.
Auto method types¶
One exception is method type AUTO, or any method type with prefix AUTO-. When a file URL is requested with such a method type, a no-auth URL will be created (with the method URL as base).
If there is no AUTO method defined, but a file URL is requested with method type AUTO, an implicit one will be used automatically.
GET /item/VX-2406?content=uri&methodType=AUTO
Accept: application/xml
<ItemDocument xmlns="http://xml.vidispine.com/schema/vidispine" id="VX-2406">
<files>
<uri>http://vs.example.com:8089/APInoauth/storage/VX-1/file/VX-6537/0.7354486788234469/VX-6537.mp4</uri>
<uri>http://vs.example.com:8089/APInoauth/storage/VX-1/file/VX-6536/0.7638025887084131/VX-6536.dv</uri>
</files>
</ItemDocument>
The URL returned is only valid for the duration of fileTempKeyDuration minutes. The expiration timer is reset whenever the URL is used in a new operation (e.g. HEAD or GET).
Method metadata¶
In addition to selecting method types, method metadata can be given as instructions for the URI returned. Two metadata values are defined:
- format
- Specifies if any special format of the URI should be returned. By default, the normal URI is returned. Two values are defined:
  - SIGNED
  - Returns an http URI that is a signed URI pointing directly to Azure or S3 storage. If a signed URI cannot be generated from the underlying (default) URI, no URI is returned.
  - SIGNED-AUTO
  - New in version 4.2.9. As above, but if no URI can be generated, an AUTO URI (see above) is returned.
- expiration
- Sets the expiration time of the signed URI, in minutes. If not specified, the expiration time is 60 minutes, unless azureSasValidTime is set.
GET /item/VX-206?content=uri&methodMetadata=format=SIGNED-AUTO
Accept: application/xml
<ItemDocument xmlns="http://xml.vidispine.com/schema/vidispine" id="VX-206">
<files>
<uri>https://vstest.s3.amazonaws.com/VX-362.mp4?Expires=1439545041&AWSAccessKeyId=AKIAJCCXQRY2MW4YQUVQ&Signature=UcNdTIm1v1omM%2FaIGaYXf4QNfc%3D</uri>
<uri>http://vs.example.com:8089/APInoauth/storage/VX-1/file/VX-336/0.7638025117084131/VX-336.dv</uri>
</files>
</ItemDocument>
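The difference between the two format values shows in the response above: the first file gets a signed S3 URI, while the second falls back to a no-auth URI. A minimal Python sketch of that selection logic follows. The function `select_uri` and its parameters are hypothetical; `None` stands for "no URI could be generated".

```python
from typing import Optional


def select_uri(fmt: Optional[str], default_uri: str,
               signed_uri: Optional[str], auto_uri: str) -> Optional[str]:
    """Pick the URI to return for a file, given the requested format."""
    if fmt is None:
        return default_uri  # no format requested: the normal URI
    if fmt == "SIGNED":
        return signed_uri   # None if no signed URI can be generated
    if fmt == "SIGNED-AUTO":
        # Fall back to an AUTO (no-auth) URI when signing is not possible.
        return signed_uri if signed_uri is not None else auto_uri
    raise ValueError(f"unknown format: {fmt}")


# A file on a storage that cannot produce signed URIs:
print(select_uri("SIGNED", "file:///mnt/a.dv", None, "http://host/APInoauth/a.dv"))
print(select_uri("SIGNED-AUTO", "file:///mnt/a.dv", None, "http://host/APInoauth/a.dv"))
```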
Parent directory management¶
For local file systems (method is using a file:// URI), Vidispine will by default remove empty parent directories when deleting the last file in the directory.
New in version 4.2.5: This can be controlled, either on system level or on storage level.
If the storage metadata keepEmptyDirectories is set to true, empty directories are preserved in that storage. Likewise, if the configuration property keepEmptyDirectories is set to true, empty directories are preserved for all storages. Storage configuration overrules system configuration.
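The precedence between the two settings can be sketched as follows. The function name is hypothetical, and the sketch assumes that both values are available as optional strings (`None` meaning "not set") and that the comparison with true is case-insensitive.

```python
from typing import Optional


def keep_empty_directories(storage_value: Optional[str],
                           system_value: Optional[str]) -> bool:
    """Resolve keepEmptyDirectories for one storage."""
    # The storage metadata wins whenever it is set.
    if storage_value is not None:
        return storage_value.lower() == "true"
    # Otherwise the system-wide configuration property applies.
    if system_value is not None:
        return system_value.lower() == "true"
    # Default behavior: empty parent directories are removed.
    return False


print(keep_empty_directories("false", "true"))  # False, storage overrules system
print(keep_empty_directories(None, "true"))     # True
```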
Files¶
When are files scanned?¶
In order to discover changes made to files, or if any files have been removed/added, Vidispine will scan the storages periodically. It is possible to disable the scanning by not having any methods with browse=true on the storage. The scan interval is also configurable on a per storage basis by setting the scanInterval storage metadata. The value should be in seconds. Setting this to a higher value will lower the I/O load of the device, but any file changes will take longer to be discovered. This also means that file notifications for file changes or file creation will be triggered later for changes occurring outside of Vidispine’s control.
You can force a rescan of a storage by calling POST /storage/(storage-id)/rescan. This will trigger an immediate rescan of a storage if the supervisor is idle. If a supervisor is already busy processing the files then you may notice that the rescan happens some time later.
Avoiding frequent scan of S3 storages¶
New in version 4.4.
Scanning an S3 storage can be expensive both in terms of time and money. To make it cheaper to access an S3 bucket, you can configure Vidispine to poll Amazon SQS for S3 events.
See S3 Event Notifications for more information.
File States¶
Files can be in one of the following states:
- NONE
- Just created, not used.
- OPEN
- Discovered or created, not yet marked as finished.
- CLOSED
- File no longer grows.
- UNKNOWN
- The current state is not known.
- MISSING
- File is missing from the file system/storage.
- LOST
- File has been missing for a longer period. Candidate for restoration from archive.
- TO_APPEAR
- File will appear on file system/storage, transfer subsystem or transcoder will create it.
- TO_BE_DELETED
- The file is no longer in use, and will be deleted at the next clean-up sweep.
- BEING_READ
- File is in use by transfer subsystem or transcoder.
- ARCHIVED
- File is archived.
- AWAITING_SYNC
- File will be synchronized by multi-site agent.
Vidispine will mark a file as MISSING when it is first detected that the file no longer exists on the storage. No action is taken for files that are missing. If the file does not appear within the time specified by lostLimit, then the file will be marked as LOST. Lost files will be restored from other copies if such exist.
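The MISSING to LOST transition can be sketched in Python as below. This is an illustration only: the function name is hypothetical, and the sketch assumes that both the elapsed time and lostLimit are expressed in the same unit (seconds here).

```python
def missing_or_lost(seconds_missing: float, lost_limit: float) -> str:
    """State of a file that has been absent from the storage for a given time."""
    # The file stays MISSING until it has been gone for lostLimit.
    if seconds_missing < lost_limit:
        return "MISSING"
    # After that it is marked LOST and becomes a candidate for restoration.
    return "LOST"


print(missing_or_lost(60, 3600))    # MISSING
print(missing_or_lost(7200, 3600))  # LOST
```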
Items and storages¶
By default, when creating a new file, Vidispine will choose the LOCAL storage with the highest free capacity. This can be changed in a few different ways:
- Setting the defaultIngestStorage configuration property.
- Supplying the storageId parameter on the import request.
- Using Storage rules.
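The default choice described above (the LOCAL storage with the highest free capacity) can be sketched as follows. The dictionary keys `id`, `type` and `free` are hypothetical names for this example, and none of the three overrides in the list are modeled.

```python
from typing import Dict, List, Optional


def pick_default_storage(storages: List[Dict]) -> Optional[str]:
    """Return the id of the LOCAL storage with the most free capacity."""
    local = [s for s in storages if s["type"] == "LOCAL"]
    if not local:
        return None  # no LOCAL storage to choose from
    return max(local, key=lambda s: s["free"])["id"]


storages = [
    {"id": "VX-1", "type": "LOCAL", "free": 200},
    {"id": "VX-2", "type": "LOCAL", "free": 500},
    {"id": "VX-3", "type": "SHARED", "free": 900},  # ignored: not LOCAL
]
print(pick_default_storage(storages))  # VX-2
```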
File hashing¶
Vidispine will calculate a hash for all files in a storage. This is done by a
background process, running continuously. Files are hashed one by one for
performance reasons, so if a large number of files are added to the system
in a short time span it might take some time for all hashes to be calculated.
The default hashing algorithm is SHA-1. This can be changed by setting the configuration property fileHashAlgorithm. See below for a list of supported values.
Additional algorithms¶
Vidispine can be configured to calculate hashes using additional algorithms by setting the additionalHash metadata field on the storage. It should contain a comma separated list (no spaces) of algorithms. The supported algorithms are:
- MD2
- MD5
- SHA-1
- SHA-256
- SHA-384
- SHA-512
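To verify a hash reported by Vidispine against a local copy of a file, several digests can be computed in one pass with Python's standard `hashlib` module. This is a minimal sketch: the function name and the name mapping are illustrative, and MD2 is left out because it is not guaranteed to be available in `hashlib`.

```python
import hashlib
import io
from typing import BinaryIO, Dict

# Maps the algorithm names used in additionalHash to hashlib names.
HASHLIB_NAMES = {
    "MD5": "md5",
    "SHA-1": "sha1",
    "SHA-256": "sha256",
    "SHA-384": "sha384",
    "SHA-512": "sha512",
}


def hash_stream(stream: BinaryIO, additional_hash: str,
                chunk_size: int = 1 << 20) -> Dict[str, str]:
    """Compute all hashes in a comma separated list (no spaces) in one read pass."""
    names = [n for n in additional_hash.split(",") if n]
    digests = {n: hashlib.new(HASHLIB_NAMES[n]) for n in names}
    # Read the data once and feed every chunk to every digest.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        for digest in digests.values():
            digest.update(chunk)
    return {n: d.hexdigest() for n, d in digests.items()}


result = hash_stream(io.BytesIO(b"abc"), "MD5,SHA-256")
print(result["MD5"])  # 900150983cd24fb0d6963f7d28e17f72
```

For a real file, pass `open(path, "rb")` instead of the in-memory stream.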
Throttling storage I/O¶
Vidispine will retrieve information about files on a storage at the configured scan intervals. If you find that the I/O on your local disk drives is high, even when no transfers or transcodes are being performed, then you can try rate limiting the stat calls performed by Vidispine. Do this by setting statsPerSecond or the configuration property statsPerSecond to a suitable limit.
During the file system scan, Vidispine will typically perform one stat per file.
An easy way to check if rate limiting the stat calls will have any effect is to disable the storage supervisors in Vidispine. This can be done using PUT /vidispine-service/service/StorageSupervisorServlet/disable. Remember to enable the service afterwards or you will find that Vidispine no longer detects new files on the storages, among other things.
The file hashing service could also be the cause of the I/O. You should be able to tell which service is behind it by monitoring your disk devices. If there is high read activity, or a large amount of data read from a device, then file hashing is the likely cause. If the number of read operations per second is high, then it is more likely the storage supervisor.
Tip
Use tools such as htop, iotop, dstat and iostat to monitor your systems and devices.
Throttling transfer to and from a storage¶
New in version 4.0.
It is possible to specify a bandwidth on a storage or a specific storage method. This causes any file transfers involving the specified storage or storage method to be throttled. If multiple transfers take place concurrently, the total bandwidth will be allocated between the transfers. If a bandwidth is set on both the storage and its storage methods, the lowest applicable bandwidth will be used.
To set a bandwidth you can set the bandwidth element in the StorageMethodDocument when creating or updating a storage or storage method. The bandwidth is set in bytes per second.
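The bandwidth rules can be illustrated with a small Python sketch. The function name is hypothetical, `None` stands for "no bandwidth set", and the equal split between concurrent transfers is an assumption; the text above only states that the total bandwidth is allocated between them.

```python
from typing import Optional


def per_transfer_bandwidth(storage_bw: Optional[int], method_bw: Optional[int],
                           concurrent_transfers: int) -> Optional[int]:
    """Bytes per second available to each transfer, or None if unthrottled."""
    limits = [bw for bw in (storage_bw, method_bw) if bw is not None]
    if not limits:
        return None  # no bandwidth configured anywhere
    # The lowest applicable bandwidth is used, then shared between the
    # transfers (equal shares assumed here).
    return min(limits) // max(concurrent_transfers, 1)


# Storage limited to 50,000,000 B/s, method to 20,000,000 B/s, 4 transfers.
print(per_transfer_bandwidth(50_000_000, 20_000_000, 4))  # 5000000
```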
Example¶
Updating a storage to set a bandwidth of 50,000,000 bytes per second.
PUT /storage/VX-2
Content-Type: application/xml
<StorageDocument xmlns="http://xml.vidispine.com/schema/vidispine">
<type>LOCAL</type>
<capacity>1000000000</capacity>
<bandwidth>50000000</bandwidth>
</StorageDocument>
Example¶
Updating a storage method to set a bandwidth of 20,000,000 bytes per second.
PUT /storage/VX-2/method?uri=http://10.5.1.2/shared/&bandwidth=20000000
Temporary storages for transcoder output¶
New in version 4.2.3.
The Vidispine transcoder requires that the destination (output) file can be partially updated. This is in order to be able to write header files after the essence has been written.
In previous versions, this was solved by the application server storing the intermediate result as a temporary file on the local file system (/tmp). This requires a lot of space on the application server.
With version 4.2.3, another strategy is available. Instead of storing the result as one file on the application server, several small files are stored directly on the destination file system as “segments”. After the transcode has finished, the segments are merged. On S3 storage, this merging can be done with S3 object(s)-to-object copy.
Control of the segment file strategy is via the useSegmentFiles configuration property.
Storage method URIs¶
The following URI schemes are defined.
file¶
| Syntax: | file:///{path} |
|---|---|
| Example: | file:///mnt/storage/, file:///C:/mystorage/ |
| Note: | The URI file://mnt/storage/ is not valid! (But file:/mnt/storage/ is.) |
ftp¶
| Syntax: | ftp://{user}:{password}@{host}/{path} |
|---|---|
| Example: | ftp://johndoe:secr3t@example.com/mystorage/ |
New in version 4.1.2: Add query parameter passive=false to force active mode. To set the client side ports used in active mode, set the configuration property ftpActiveModePortRange. The value should be a range, e.g. 42100-42200.
To set the client IP used in active mode, set the configuration property ftpActiveModeIp.
sftp¶
| Syntax: | sftp://{user}:{password}@{host}/{path} |
|---|---|
| Example: | sftp://johndoe:secr3t@example.com/mystorage/ |
http¶
| Syntax: | http://{user}:{password}@{host}/{path} |
|---|---|
| Example: | http://johndoe:secr3t@example.com/mystorage/ |
| Note: | Requires WebDAV support in host. |
https¶
| Syntax: | https://{user}:{password}@{host}/{path} |
|---|---|
| Example: | https://johndoe:secr3t@example.com/mystorage/ |
| Note: | Requires WebDAV support in host. |
omms¶
| Syntax: | omms://{userId}:{userKey}@{hostList}/{clusterId}/{vaultId}/ |
|---|---|
| Example: | omms://c2f6a2f4-6927-11e1-cc94-ab94bd11183f:some%20secret@10.0.0.3,10.0.0.4/4255378f-dc73-fca3-e40d-5726008b3dac/0a49472d-15d4-12f1-862e-f9708d49267e/ |
| Note: | Object Matrix Matrix Store. |
s3¶
| Syntax: | s3://{accessKey}:{secretKey}@{bucket}/{path} |
|---|---|
| Example: | s3://KDASODSALSDI8U:RxZYlu23NDSIN293002WdlNyq@mystore/storage1/ |
The following query parameters are supported:
- endpoint
- The endpoint that the S3 requests will be sent to. See Regions and Endpoints in the Amazon documentation for more information. New in version 4.4.
- region
- The region that will be used in the S3 requests. See Regions and Endpoints in the Amazon documentation for more information. New in version 4.4.
- signer
- The algorithm to use for signing requests. Valid values include S3SignerType for AWS signature v2, and AWSS3V4SignerType for AWS signature v4. New in version 4.5.3. Default: Signature algorithm will be selected by region.
Note
For Version 4 Signature only regions (Beijing and Frankfurt) to work, the endpoint or region parameter must be set. Example:
s3://frankfurt-bucket/?endpoint=s3.eu-central-1.amazonaws.com
s3://frankfurt-bucket/?region=eu-central-1
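An s3 method URI with query parameters can be assembled with Python's standard `urllib.parse` module, as in the sketch below. The helper `build_s3_uri` is hypothetical. Percent-encoding the credentials is an assumption, based on the encoded characters that appear in the omms example on this page.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_s3_uri(bucket: str, path: str = "",
                 access_key: Optional[str] = None,
                 secret_key: Optional[str] = None,
                 **query: Optional[str]) -> str:
    """Build s3://{accessKey}:{secretKey}@{bucket}/{path}?{query}."""
    auth = ""
    if access_key is not None and secret_key is not None:
        # Encode reserved characters such as "/" or "+" in the keys.
        auth = f"{quote(access_key, safe='')}:{quote(secret_key, safe='')}@"
    uri = f"s3://{auth}{bucket}/{path}"
    # Only include query parameters that were actually given.
    params = {key: value for key, value in query.items() if value is not None}
    if params:
        uri += "?" + urlencode(params)
    return uri


print(build_s3_uri("frankfurt-bucket", region="eu-central-1"))
# s3://frankfurt-bucket/?region=eu-central-1
print(build_s3_uri("mystore", "storage1/", "AK", "se/cret"))
# s3://AK:se%2Fcret@mystore/storage1/
```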
Storage method metadata keys can be used to control the interaction with the storage.
- storageClass
- The default Amazon S3 storage class that will be used for new files created on an Amazon S3 storage. Can be one of standard, infrequent or reduced. Default: standard. New in version 4.0.3. Changed in version 4.5: Support for infrequent access was added.
- sseAlgorithm
- The encryption used to encrypt data on the server side. See Server-Side Encryption. By default no encryption will be performed. This sets the x-amz-server-side-encryption header on PUT Object S3 requests. Example: AES256. New in version 4.4.1.
ds3¶
| Syntax: | ds3://{accessKey}:{secretKey}@{bucket}/{path} |
|---|---|
| Example: | ds3://KDASODSALSDI8U:RxZYlu23NDSIN2Nyq@bucketname/?endpoint=http://blackpearl-endpoint |
| Note: | Spectra BlackPearl Deep Storage Gateway. |
New in version 4.5.
The following query parameters are supported:
- endpoint
- The endpoint of the BlackPearl service. This is mandatory.
- chunkReadyTimeout
- The maximum time (in seconds) to wait for BlackPearl to prepare the target data chunk. If the time is exceeded, an EOF will be returned. Default: 1800
- checksumType
- If set, a client-side checksum will be computed and sent to the BlackPearl gateway for data integrity verification. Supported checksum types are: md5, crc32 and crc32c. Default: Empty, no checksum will be sent.