Fork me on GitHub
Wo sind die Excel Daten die Sie verarbeiten wollen ?
Download example :  Example 1 (simple exemple, in english)
Wählen Ändern Entfernen
(Supported extension : .xls or .xlsx - OpenOffice is not supported !)
(Bitte geben Sie einen Link)
Wählen Sie die Sprache der Excel-Datei, die Sie konvertieren möchten ?



 This development was partly founded by the government of Luxembourg (Ministère d'Etat - Service central de législation)
 General Documentation

What is this tool ?

This is an Excel-to-SKOS converter. It can generate SKOS RDF files from Excel spreadsheets structured in a specific way.
Using the same Excel spreadsheet structure, it is also possible to produce other RDF data than SKOS (lists of foaf:Person, of schema:Event, etc.)
This converter does not require any configuration file to work, only the Excel document to convert.

Can I convert any Excel file in RDF ?

No. The spreadsheet has to follow the specific structure described below.

What should the Excel file look like ?

Start by downloading and looking at one of the provided examples above. You can start from one of these files and adapt it. Look at the documentation below for an explanation on the expected spreadsheet format.

Do you know of any similar tools ?

There are other converters from Excel to SKOS or RDF out there :

 Excel File structure Your excel file MUST follow the structure described below to be converted to RDF. Otherwise you will get an exception or an empty RDF file. Download and look at the examples above.

 Spreadsheet processing

Your file can contain any number of sheets. All the sheets are processed, and the extractor attempts to convert RDF from all of them. If the structure of a sheet doesn't correspond to the expected template, the converter simply moves to the next one.

 Sheet header processing

ConceptScheme URI : To be converted to RDF, a sheet MUST contain a URI in cell B1. This is interpreted as the URI of a skos:ConceptScheme.

ConceptScheme metadata : The header CAN contain descriptive metadata of the ConceptScheme, by specifying a property URI in column A, either using a declared prefix (e.g. dct:title, see below) or as a full URI (starting with 'http');

Prefix declaration : Prefixes can be declared in the header :

Default prefixes are already known and don't have to be declared (see below).

Other lines : the header CAN contain other lines that will be ignored if column A does not contain a known prefixed property or the "PREFIX" keyword.

This is how a typical header can look like :

 Sheet body processing

Title row : The body MUST start by a row that declares the property corresponding to each column (e.g. skos:prefLabel, skos:definition), except column A, that will contain the URI for each resource being generated.

This is how a typical title row can look like :

Line : Each line after the title row generates one resource with the URI read from column A. The column A MUST contain the URI of a resource, either as a full URI (starting with 'http'), or using a declared prefix.

Cell : Each cell in a line is processed, and the value is converted to a literal or object property :

This is how a typical body part can look like :

 Generating multilingual values

You can specify the language to be assigned to a column by appending @en (or another language code) to the property declaration in the title row. This also works in the header part for the metadata of the ConceptScheme.

This is an example of multilingual columns declaration :

 Generating values with datatypes

You can specify the datatype to be assigned to a column by appending ^^xsd:date (or another datatype) to the property declaration in the title row.

This is an example of columns declaration with a datatype :

 Generating multiple values

You can specify a separator on a colum by appending (separator=",") (or another separator) to the property declaration in the title row. This indicates that the values in the cells of that columns will be splitted on that separator, and multiple values will be generated. You can combine this with a language or datatype declaration, for example schema:name@en(separator=",").
The alternative is to create multiple columns with the same property, which is allowed.

 Generating skos:Collection with object-to-subject columns

By default, each line in the body generates an instance of skos:Concept. If you need to generate instances of skos:Collection (or other classes, by the way), do the following :

  1. Add a column with the title rdf:type;
  2. Add a column with the title ^skos:member; note the '^' character at the beginning of the column name; this tells the converter to generate the corresponding property (here, skos:member) from the value given in the cell to the URI of the resource generated for this row;
  3. On the row corresponding to the collection, specify skos:Collection in the rdf:type column; for rows corresponding to skos:Concept, you can leave this column empty or specify skos:Concept explicitely if you want;
  4. On each row of skos:Concept that belongs to the collection, enter the collection URI in the ^skos:member column;

This is an example of expressing collections using object-to-subject column :

 Dealing with skos:OrderedCollection and rdf:Lists

If you need to deal with skos:OrderedCollection, do the following :

  1. Add a column with the title rdf:type;
  2. Add a column with the title skos:memberList;
  3. On the row corresponding to the ordered collection, specify skos:OrderedCollection in the rdf:type column; for rows corresponding to skos:Concept, you can leave this column empty or specify skos:Concept explicitely if you want;
  4. On the row corresponding to the ordered collection, in the skos:memberList column, write the list of values like you would do in the Turtle, that is :
    • Put the whole list between parenthesis;
    • Separate each value with a whitespace character;

The same technique can be used to declare any rdf:List (see below to generate plain RDF).

This is an example of expressing ordered collections using rdf:list syntax :

 Default SKOS post-processings

After each line in the body has been converted, the following SKOS post-processings are applied :

skos:inScheme : a skos:inScheme is added to every instance of skos:Concept and skos:Collection, with the value of the ConceptScheme given in cell B1;

skos:broader and skos narrower inverse : the inverse of skos:broader and skos:narrower are automatically added;

skos:hasTopConcept and skos:topConceptOf : every skos:Concept without skos:broader or not referenced by a skos:narrower is given a skos:topConceptOf and its inverse skos:hasTopConcept;

SKOS-XL generation : if requested by the corresponding parameter, labels are turned into SKOS-XL;

 Generating plain RDF (not SKOS)

The converter can actually generate other RDF vocabularies than SKOS. For this :

This is how this kind of file could look like :

 Advanced features

 Creating blank nodes with [...]

The converter understands the blank node syntax with "[...]" : simply put a cell value between square brackets and write the blank node data inside like you would do in a Turtle file. This can be useful to generate references to reified SKOS definitions or SKOS-XL Labels. For example, if a cell with title skos:definition contains the following value :
[ rdf:value "Definition blah blah"; dcterms:created "2017-02-21"^^xsd:date ], then a reference to a blank node will be created. You need to use the prefixes defined in the file in your blank node content. The blank node is parsed exactly as a piece of Turtle, so it can contain any piece of valid Turtle syntax. If anything goes wrong during the parsing, the converter will generate a Literal with the cell content instead.

 Disabling cell conversion with a strikethrough

When working on a file, if you are unsure about the conversion of a certain cell but you don't want to delete the value, use a strikethrough font : the converter will ignore any cell with such a font style. You can keep uncertain values in the files and simply change the font back to normal once the value is validated.

 Named graph management

The converter actually puts all the triples generated in one sheet in a graph with the URI in cell B1. This is usually the same URI as the URI of the ConceptScheme; but in case of processing generic RDF data, this cell B1 can be used to indicate the URI of the graph, with its associated metadata in the header.

 Changing Subject Column

By default, the property in each column is expressed on the subject URI of the first column of the spreadsheet. It is possible to state that a given column is expressed on a subject URI in a different column on the table. To do this, add a column parameter subjectColumn with a reference to the column letter containing the URI of the subject. For example schema:name(subjectColumn="N") means this column is the name of the URI stored in column N.

This is how such a header could look like :

 Referring to URIs using their labels

SKOS involves creating hierarchies of concepts, and connecting related concepts. This involves making references to other concept URIs, typically in a column skos:broader. But URIs can be opaque, and copy-pasting concept URIs across cells can be tedious. The lookupColumn parameter is a mechanism that allows you to reference a concept URI through one of its label (or other unique key) stored in another column. To use it, add a column parameter lookupColumn with a reference to the column in which the string value of this column will be searched. The reference can be either a reference to the Excel column letter or to the corresponding property in which you want to lookup. A typical example is skos:broader(lookupColumn=skos:prefLabel), which means that you want to create a skos:broader having as a value the URI of the Concept that have in its skos:prefLabel column the value you indicate in your skos:broader column.

This is how it would look like, have a look at example 8 in the included examples :

You can view the lookupColumn parameter as the equivalent of Excel "VLOOKUP" / "RECHERCHEV" function, except easier to write.

 Reconcile / lookup values in other sheets

The lookupColumn parameter described above works only to search for a URI within the same sheet. It may be the case that you would like to create a link to an entity that is defined in another sheet of the same file. In that case :

Warning : The sheet you want to search into needs to be _before_ the sheet you search from; the sheets are converted in order, and the reconcile runs a SPARQL query on the content of the already converted sheets. This means you cannot reconcile on a value that is in a sheet _after_ the sheet that contain your column with a reconcile parameter.

The reconciliation is done on the following properties :

 Ignore values in parenthesis

Add parameter ignoreIfParenthesis="true" to the column header parameters to ignore the values if they are between parenthesis. This can be combined with a separator, e.g. skos:broader(separator=";" ignoreIfParenthesis="true"). In this case, if the cell value is "pollution;(water pollution)" then water pollution will be ignored.

This can be used to maintain temporary values in the spreadsheet, waiting for further validation.

 Default prefixes known in the converter

This is the list of known prefixes in the converter. You don't have to declare them in the header.

 Column header parameters reference