Metadata datasets

A metadata dataset is a set of metadata values that are semantically related to each other. Datasets can be used to validate metadata documents.

Defining the dataset

A dataset is defined using an RDF document. Vidispine supports creating a dataset from either an RDF/XML document or a Turtle document.

For example, the geographical hierarchy of the USA, New York, California, Los Angeles and San Francisco can be defined as follows:

@prefix r: <http://example.com/random/id#> .
@prefix c: <http://example.com/country/#> .
@prefix st: <http://example.com/state/#> .
@prefix city: <http://example.com/city/#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

c:usa   skos:definition  "country" ;
        skos:member      r:bid1 ;
        skos:prefLabel   "USA" .

r:bid1  a       rdf:Bag ;
        rdf:_1  st:ny ;
        rdf:_2  st:ca .

st:ny   skos:definition  "state" ;
        skos:prefLabel   "New York" .

st:ca   skos:definition  "state" ;
        skos:member      r:bid2 ;
        skos:prefLabel   "California" .

r:bid2  a       rdf:Bag ;
        rdf:_1  city:la ;
        rdf:_2  city:sf .

city:la skos:definition  "city" ;
        skos:prefLabel   "Los Angeles" .

city:sf skos:definition  "city" ;
        skos:prefLabel   "San Francisco" .

In the above dataset, five subjects (or resources) are defined: USA, New York, California, Los Angeles and San Francisco. Each subject has its own id (c:usa, st:ca etc.) and two predicates (or properties), skos:prefLabel and skos:definition, which represent the subject’s “display value” and “hierarchical level” respectively. The hierarchical relationships between the subjects are defined by skos:member and the RDF container rdf:Bag.
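
To make the skos:member and rdf:Bag indirection concrete, the following is a minimal Python sketch that walks the hierarchy. It is only an illustration: the triples are hand-copied from the example as plain tuples rather than parsed with an RDF library, the city subjects are written with the city: prefix, and the helper names (objects, children, label) are hypothetical.

```python
# Triples from the first example, as (subject, predicate, object).
# Plain tuples stand in for a real RDF library (assumption for brevity).
TRIPLES = [
    ("c:usa", "skos:prefLabel", "USA"),
    ("c:usa", "skos:member", "r:bid1"),
    ("r:bid1", "rdf:type", "rdf:Bag"),
    ("r:bid1", "rdf:_1", "st:ny"),
    ("r:bid1", "rdf:_2", "st:ca"),
    ("st:ny", "skos:prefLabel", "New York"),
    ("st:ca", "skos:prefLabel", "California"),
    ("st:ca", "skos:member", "r:bid2"),
    ("r:bid2", "rdf:type", "rdf:Bag"),
    ("r:bid2", "rdf:_1", "city:la"),
    ("r:bid2", "rdf:_2", "city:sf"),
    ("city:la", "skos:prefLabel", "Los Angeles"),
    ("city:sf", "skos:prefLabel", "San Francisco"),
]


def objects(subject, predicate):
    """Return all objects for the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def children(subject):
    """Follow skos:member to each rdf:Bag and collect its rdf:_N entries."""
    result = []
    for bag in objects(subject, "skos:member"):
        for s, p, o in TRIPLES:
            if s == bag and p.startswith("rdf:_"):
                result.append(o)
    return result


def label(subject):
    """Return the skos:prefLabel (display value) of a subject."""
    return objects(subject, "skos:prefLabel")[0]


print([label(c) for c in children("c:usa")])
print([label(c) for c in children("st:ca")])
```

A subject without a skos:member property, such as st:ny, simply has no children in this model.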

You can also use self-defined vocabularies. The example below uses self-defined hasState and hasCity properties to represent the geographical relationship:

@prefix c: <http://example.com/country/#> .
@prefix st: <http://example.com/state/#> .
@prefix city: <http://example.com/city/#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

c:usa   skos:definition  "country" ;
        skos:prefLabel   "USA" ;
        c:hasState  st:ca , st:ny .

st:ca   st:hasCity   city:la , city:sf ;
        skos:definition  "state" ;
        skos:prefLabel   "California" .

st:ny   skos:definition  "state" ;
        skos:prefLabel   "New York" .

city:sf skos:definition  "city" ;
        skos:prefLabel   "San Francisco" .

city:la skos:definition  "city" ;
        skos:prefLabel   "Los Angeles" .

Create the dataset

The dataset above can be used to create a metadata dataset in Vidispine:

PUT /metadata/dataset/mytestmodel

Content-Type: text/turtle
@prefix c: <http://example.com/country/#> .
@prefix st: <http://example.com/state/#> .
@prefix city: <http://example.com/city/#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

c:usa   skos:definition  "country" ;
        skos:prefLabel   "USA" ;
        c:hasState  st:ca , st:ny .

st:ca   st:hasCity   city:la , city:sf ;
        skos:definition  "state" ;
        skos:prefLabel   "California" .

st:ny   skos:definition  "state" ;
        skos:prefLabel   "New York" .

city:sf skos:definition  "city" ;
        skos:prefLabel   "San Francisco" .

city:la skos:definition  "city" ;
        skos:prefLabel   "Los Angeles" .

If the display values of a dataset are changed, make sure to reindex any entities whose metadata uses fields constrained by that dataset.
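
The PUT request above could be issued from a script roughly as follows. This is a sketch using only the Python standard library. The API root in BASE_URL is an assumed placeholder, the Turtle payload is shortened, and the request is built but not sent, since authentication depends on the installation.

```python
import urllib.request

# Assumed values: adjust to your own installation.
BASE_URL = "http://localhost:8080/API"  # hypothetical API root
DATASET_ID = "mytestmodel"

# Shortened version of the dataset, for illustration only.
TURTLE = """\
@prefix c: <http://example.com/country/#> .
@prefix st: <http://example.com/state/#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

c:usa   skos:definition  "country" ;
        skos:prefLabel   "USA" ;
        c:hasState  st:ny .

st:ny   skos:definition  "state" ;
        skos:prefLabel   "New York" .
"""


def build_dataset_request(base_url, dataset_id, turtle):
    """Build (but do not send) the PUT request that uploads a Turtle dataset."""
    return urllib.request.Request(
        url=f"{base_url}/metadata/dataset/{dataset_id}",
        data=turtle.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/turtle"},
    )


request = build_dataset_request(BASE_URL, DATASET_ID, TURTLE)
print(request.get_method(), request.full_url)
# To send it, add the authentication your server requires and call
# urllib.request.urlopen(request).
```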

Configure metadata fields

After creating the dataset, the metadata fields need to be configured accordingly:

<MetadataFieldDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <name>rdf_country</name>
  <type>string</type>
  <constraint>
    <dataset>mytestmodel</dataset>
    <levelProperty>skos:definition</levelProperty>
    <levelValue>country</levelValue>
    <value>skos:prefLabel</value>
  </constraint>
</MetadataFieldDocument>
<MetadataFieldDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <name>rdf_state</name>
  <type>string</type>
  <constraint>
    <dataset>mytestmodel</dataset>
    <levelProperty>skos:definition</levelProperty>
    <levelValue>state</levelValue>
    <value>skos:prefLabel</value>
    <parent>rdf_country</parent>
  </constraint>
</MetadataFieldDocument>
<MetadataFieldDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <name>rdf_city</name>
  <type>string</type>
  <constraint>
    <dataset>mytestmodel</dataset>
    <levelProperty>skos:definition</levelProperty>
    <levelValue>city</levelValue>
    <value>skos:prefLabel</value>
    <parent>rdf_state</parent>
  </constraint>
</MetadataFieldDocument>

The configuration above defines three metadata fields: rdf_country, rdf_state and rdf_city, whose values are restricted by the metadata dataset mytestmodel.

  • <dataset>: The dataset that the field value should be validated against.
  • <levelProperty>: The property in the dataset that defines a level value.
  • <levelValue>: The level in the dataset that this metadata field belongs to.
  • <value>: The property in the dataset that holds the display value of the metadata field.
  • <parent>: (Optional) The parent field, if any. This defines the field hierarchy. Fields in the same hierarchy will be validated together.
  • <validationGroup>: (Optional) An ordered, comma-separated list of field names, defining which fields should be validated together and the hierarchical (validation) order of those fields.

Changed in version 4.17.2: The parent element was added. The validationGroup element was deprecated.
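
Since the three field definitions differ only in name, level value and parent, they can be generated instead of written by hand. The sketch below shows one way to do that with Python's xml.etree.ElementTree. The function name field_document and the levels list are illustrative, and the element names are taken from the configuration above.

```python
import xml.etree.ElementTree as ET

NS = "http://xml.vidispine.com/schema/vidispine"


def field_document(name, dataset, level_value, parent=None):
    """Build a MetadataFieldDocument with a dataset constraint."""
    # Serialize with a default namespace instead of an ns0: prefix.
    ET.register_namespace("", NS)
    root = ET.Element(f"{{{NS}}}MetadataFieldDocument")
    ET.SubElement(root, f"{{{NS}}}name").text = name
    ET.SubElement(root, f"{{{NS}}}type").text = "string"
    constraint = ET.SubElement(root, f"{{{NS}}}constraint")
    ET.SubElement(constraint, f"{{{NS}}}dataset").text = dataset
    ET.SubElement(constraint, f"{{{NS}}}levelProperty").text = "skos:definition"
    ET.SubElement(constraint, f"{{{NS}}}levelValue").text = level_value
    ET.SubElement(constraint, f"{{{NS}}}value").text = "skos:prefLabel"
    if parent is not None:
        # Only non-root fields in the hierarchy carry a parent element.
        ET.SubElement(constraint, f"{{{NS}}}parent").text = parent
    return ET.tostring(root, encoding="unicode")


# Field name and level value, ordered from the top of the hierarchy down.
levels = [("rdf_country", "country"), ("rdf_state", "state"), ("rdf_city", "city")]

documents = {}
parent = None
for name, level in levels:
    documents[name] = field_document(name, "mytestmodel", level, parent)
    parent = name  # the next field down uses this one as its parent

print(documents["rdf_city"])
```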

Updating metadata

There are two ways to post metadata documents containing semantically related fields:

  1. Using the display value directly:
<?xml version="1.0" encoding="UTF-8"?>
<MetadataDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <timespan start="-INF" end="+INF">
    <field>
      <name>rdf_country</name>
      <value>USA</value>
    </field>
    <field>
      <name>rdf_state</name>
      <value>New York</value>
    </field>
  </timespan>
</MetadataDocument>
  2. Using the corresponding subject id from the dataset:
<?xml version="1.0" encoding="UTF-8"?>
<MetadataDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <timespan start="-INF" end="+INF">
    <field>
      <name>rdf_country</name>
      <value id="c:usa"/>

      <!-- or the full URI -->
      <!-- <value id="http://example.com/country/#usa"/> -->
    </field>
    <field>
      <name>rdf_state</name>
      <value id="st:ny"/>
    </field>
  </timespan>
</MetadataDocument>

The resulting value will contain both the id and the “display value”:

GET item/(item-id)/metadata
...
<field uuid="783a6bc1-7917-4aa4-9d37-0c1dd4f6787f">
    <name>rdf_country</name>
    <value id="c:usa" uuid="459b3d37-c3a1-4ae6-8ad7-ab4a934f3a42">USA</value>
</field>
<field uuid="41ffab92-e984-4a36-b8ed-2aae41be56e6">
    <name>rdf_state</name>
    <value id="st:ny" uuid="898cb68c-b661-4923-8ded-3e9cb02a200b">New York</value>
</field>
...

The includeConstraintValue query parameter can be used to fetch the “display value” only for the specified fields; other fields are returned with their id only:

GET item/(item-id)/metadata?includeConstraintValue=rdf_country
...
<field uuid="783a6bc1-7917-4aa4-9d37-0c1dd4f6787f">
    <name>rdf_country</name>
    <value id="c:usa" uuid="459b3d37-c3a1-4ae6-8ad7-ab4a934f3a42">USA</value>
</field>
<field uuid="41ffab92-e984-4a36-b8ed-2aae41be56e6">
    <name>rdf_state</name>
    <value id="st:ny" uuid="898cb68c-b661-4923-8ded-3e9cb02a200b"/>
</field>
...

Searching for dataset values

New in version 4.17.3.

Searching for entities with metadata from a dataset can be done via a regular search. See Search.

For example:

<ItemSearchDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <field>
    <name>rdf_state</name>
    <value>New York</value>
  </field>
</ItemSearchDocument>

Or:

<ItemSearchDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <text>New York</text>
</ItemSearchDocument>

Validation of metadata values

Vidispine will try to validate the incoming metadata document according to the constraints configured on the metadata fields. Fields that are defined in the same hierarchy and that belong to the same timespan and metadata group will be validated together.

For example, this is an invalid document because London is not a city in the USA according to the metadata dataset:

<?xml version="1.0" encoding="UTF-8"?>
<MetadataDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <timespan start="-INF" end="+INF">
    <field>
      <name>rdf_city</name>
      <value>London</value>
    </field>
    <field>
      <name>rdf_country</name>
      <value>USA</value>
    </field>
  </timespan>
</MetadataDocument>

This is an invalid document because the field values in my_test_group are inconsistent (Los Angeles is not a city in New York):

<?xml version="1.0" encoding="UTF-8"?>
<MetadataDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <timespan start="-INF" end="+INF">
    <field>
      <name>rdf_city</name>
      <value>Los Angeles</value>
    </field>
    <field>
      <name>rdf_state</name>
      <value>California</value>
    </field>
    <group>
      <name>my_test_group</name>
      <field>
        <name>rdf_city</name>
        <value>Los Angeles</value>
      </field>
      <field>
        <name>rdf_state</name>
        <value>New York</value>
      </field>
    </group>
  </timespan>
</MetadataDocument>

This is a valid document, because rdf_state2 belongs to a different hierarchy than rdf_city and rdf_state, and all fields contain valid values:

<?xml version="1.0" encoding="UTF-8"?>
<MetadataDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <timespan start="-INF" end="+INF">
    <field>
      <name>rdf_city</name>
      <value>Los Angeles</value>
    </field>
    <field>
      <name>rdf_state</name>
      <value>California</value>
    </field>
    <field>
      <name>rdf_state2</name>
      <value>New York</value>
    </field>
  </timespan>
</MetadataDocument>
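
The rule that fields in the same hierarchy are checked against each other can be modelled in a few lines of Python. This is a simplified model of the behaviour described above, not Vidispine's implementation: the DATASET and FIELDS tables are hand-written from the examples, and is_valid takes the field values of a single timespan and group.

```python
# Display value -> (level, parent display value), from the example dataset.
DATASET = {
    "USA": ("country", None),
    "New York": ("state", "USA"),
    "California": ("state", "USA"),
    "Los Angeles": ("city", "California"),
    "San Francisco": ("city", "California"),
}

# Field name -> (level, parent field), mirroring the field configuration.
FIELDS = {
    "rdf_country": ("country", None),
    "rdf_state": ("state", "rdf_country"),
    "rdf_city": ("city", "rdf_state"),
}


def ancestor_at(value, level):
    """Walk up the dataset from value until a subject at `level` is found."""
    while value is not None:
        value_level, parent = DATASET[value]
        if value_level == level:
            return value
        value = parent
    return None


def is_valid(values):
    """Check the field values of one timespan and group against the dataset."""
    for field, value in values.items():
        level, parent_field = FIELDS[field]
        # The value must exist in the dataset at the field's own level.
        if value not in DATASET or DATASET[value][0] != level:
            return False
        # Compare against every ancestor field that is present in the input.
        while parent_field is not None:
            parent_level, next_parent = FIELDS[parent_field]
            if parent_field in values:
                if ancestor_at(value, parent_level) != values[parent_field]:
                    return False
            parent_field = next_parent
    return True


print(is_valid({"rdf_city": "London", "rdf_country": "USA"}))
print(is_valid({"rdf_city": "Los Angeles", "rdf_state": "New York"}))
print(is_valid({"rdf_city": "Los Angeles", "rdf_state": "California"}))
```

In this model, a document with several groups would be checked by calling is_valid once per group, which is why the my_test_group example fails even though the top-level fields are consistent.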

Retrieving allowed values

To get all allowed values of a metadata field:

GET metadata-field/rdf_city/allowed-values

Response:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ConstraintValueListDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <value id="city:man">Manchester</value>
  <value id="city:nyc">New York City</value>
  <value id="city:la">Los Angeles</value>
  <value id="city:buf">Buffalo</value>
  <value id="city:sf">San Francisco</value>
  <value id="city:roc">Rochester</value>
  <value id="city:lnd">London</value>
</ConstraintValueListDocument>

To find the allowed values that are consistent with given values of other fields in the hierarchy:

POST metadata-field/rdf_city/allowed-values
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<MetadataFieldValueConstraintListDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <constraint>
    <field>rdf_country</field>
    <value>USA</value>
    <!-- or use the constraint subject id -->
    <id>http://example.com/country/#usa</id>
  </constraint>
  <constraint>
    <field>rdf_state</field>
    <value>New York</value>
  </constraint>
</MetadataFieldValueConstraintListDocument>

Response:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ConstraintValueListDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <value id="city:nyc">New York City</value>
</ConstraintValueListDocument>
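
A client would typically build the constraint list from the values the user has already chosen and then read the ids and display values out of the response. The following Python sketch shows both steps with xml.etree.ElementTree. The helper names constraint_body and allowed_values are hypothetical, and the response is the sample shown above rather than live server output.

```python
import xml.etree.ElementTree as ET

URI = "http://xml.vidispine.com/schema/vidispine"


def constraint_body(constraints):
    """Build the POST body from (field name, display value) pairs."""
    ET.register_namespace("", URI)
    root = ET.Element(f"{{{URI}}}MetadataFieldValueConstraintListDocument")
    for field, value in constraints:
        constraint = ET.SubElement(root, f"{{{URI}}}constraint")
        ET.SubElement(constraint, f"{{{URI}}}field").text = field
        ET.SubElement(constraint, f"{{{URI}}}value").text = value
    return ET.tostring(root, encoding="unicode")


def allowed_values(xml_text):
    """Map subject id to display value for a ConstraintValueListDocument."""
    # Parse from bytes so the XML declaration's encoding is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {v.get("id"): v.text for v in root.findall(f"{{{URI}}}value")}


# Sample response copied from the example above.
RESPONSE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ConstraintValueListDocument xmlns="http://xml.vidispine.com/schema/vidispine">
  <value id="city:nyc">New York City</value>
</ConstraintValueListDocument>"""

body = constraint_body([("rdf_country", "USA"), ("rdf_state", "New York")])
print(body)
print(allowed_values(RESPONSE))
```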