O2A Documentation

Vocabulary Mapping

Author: Peter Konopatzky
Technical Contacts: Peter Konopatzky, Andreas Walter
Version: 0.1

This document formerly lived on AWI Confluence and is only relevant for data products based on the (deprecated) GeoCSV exchange format (.sdi.tab + .sdi.meta.json).

In the context of MareHUB and the Viewer on Marine Data, data from different sources is processed and provided as OGC Web Services. Mapping incoming data to unified names (the target vocabulary) is part of this process. This page provides the technical specifications of the files used for that mapping. It does not describe the process itself; where process information appears in special cases, it is marked as such. For general information on processes, see the page on Standard Operating Procedures.

The most important related reading is the page about Data Harmonization and the Mapping Principle.

Overview

There are two types of files: those that introduce the target vocabulary data can be mapped to, and those that add mapping rules. Mapping rules do not work without the corresponding target vocabulary. Both are tab-separated files that differ only in their column names.

Base specs as follows:

  • tab-separated text file
  • UTF-8-encoding
  • file extension: .sdi.mapping.tab
  • order of columns matters
  • columns without values (e.g. sphere_name) can be dropped
  • custom columns can be appended but will get ignored (might be useful for future verification, for example)
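
The base specs above can be illustrated with a short Python sketch. It is not the parser used in the SDI: the function name `read_sdi_tab` is hypothetical, and the decisions to match columns by header name, to check their relative order, and to fill dropped columns with empty strings are assumptions made for illustration.

```python
import csv
import io


def read_sdi_tab(text, expected_columns):
    """Parse tab-separated text into dicts holding only the expected columns.

    Hypothetical helper, not part of any O2A tooling.
    """
    # QUOTE_NONE treats quote characters in values as ordinary text.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    header = reader.fieldnames or []
    # Known columns must appear in the specified order; custom columns are skipped.
    found = [name for name in header if name in expected_columns]
    allowed = [name for name in expected_columns if name in found]
    if found != allowed:
        raise ValueError(f"unexpected column order: {header}")
    # Dropped columns and missing trailing cells become empty strings.
    return [
        {name: row.get(name) or "" for name in expected_columns}
        for row in reader
    ]


sample = "parameter_string\tparameter_group\nTEMP_13.0\tTemperature\n"
print(read_sdi_tab(sample, ["parameter_string", "parameter_group", "sphere_name"]))
# [{'parameter_string': 'TEMP_13.0', 'parameter_group': 'Temperature', 'sphere_name': ''}]
```

When reading from disk, the file would be opened with `encoding="utf-8"` and its content passed to the function.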

Target Vocabulary Files

Units

| column order | column header | value is mandatory | description |
| --- | --- | --- | --- |
| 1 | unit_name | yes | name of unit |
| 2 | unit_symbol | yes | symbol of unit |
| 3+ | whatever | no | can be used for comments, notes or reminders; will get technically ignored |

Example
tsv
unit_name	unit_symbol
meter	m
square meter	
degree	°
degree Celsius	°C
meter per second	m/s
centimeter per second	cm/s

Parameters

The column header for the parameter vocabulary is parameter_group because it is currently used as a rough grouping rather than a precise mapping.

| column order | column header | value is mandatory | description |
| --- | --- | --- | --- |
| 1 | parameter_group | yes | name of parameter or parameter group |
| 2 | parameter_sdn | no | SDN of parameter |
| 3 | parameter_nerc_uri | no | NERC URI of parameter |
| 4+ | whatever | no | can be used for comments, notes or reminders; will get technically ignored |

Example
tsv
parameter_group
Chlorophyll
Salinity
Sample ID
Temperature

Methods

The column header for the method vocabulary is method_group because it is currently used as a rough grouping rather than a precise mapping.

| column order | column header | value is mandatory | description |
| --- | --- | --- | --- |
| 1 | method_group | yes | name of method or method group |
| 2+ | whatever | no | can be used for comments, notes or reminders; will get technically ignored |

Example
tsv
method_group
ungrouped
unspecified
direct
indirect
count
electric

Spheres

Process information: The status quo is the NERC sphere vocabulary. Please consult AG Seafloor/Ocean Obs (Norbert Anselm) and AG Portal/Viewer (Peter Konopatzky) before introducing another sphere vocabulary.

| column order | column header | value is mandatory | description |
| --- | --- | --- | --- |
| 1 | sphere_name | yes | name of sphere |
| 2 | sphere_sdn | no | SDN of sphere |
| 3 | sphere_nerc_uri | no | NERC URI of sphere |
| 4+ | whatever | no | can be used for comments, notes or reminders; will get technically ignored |

Status Quo
tsv
sphere_name	sphere_sdn	sphere_nerc_uri
atmosphere	SDN:S21::S21S001	http://vocab.nerc.ac.uk/collection/S21/current/S21S001/1/
water body	SDN:S21::S21S027	http://vocab.nerc.ac.uk/collection/S21/current/S21S027/1/
surface ice	SDN:S21::S21S009	http://vocab.nerc.ac.uk/collection/S21/current/S21S009/1/
rock	SDN:S21::S21S038	http://vocab.nerc.ac.uk/collection/S21/current/S21S038/1/
biota	SDN:S21::S21S037	http://vocab.nerc.ac.uk/collection/S21/current/S21S037/2/
not applicable	SDN:S21::S21S017	http://vocab.nerc.ac.uk/collection/S21/current/S21S017/1/
Earth	SDN:S21::S21S006	http://vocab.nerc.ac.uk/collection/S21/current/S21S006/1/
bed	SDN:S21::S21S003	http://vocab.nerc.ac.uk/collection/S21/current/S21S003/1/
cave atmosphere	SDN:S21::S21S033	http://vocab.nerc.ac.uk/collection/S21/current/S21S033/1/
experiment water sample	SDN:S21::S21S011	http://vocab.nerc.ac.uk/collection/S21/current/S21S011/2/
geological sample	SDN:S21::S21S039	http://vocab.nerc.ac.uk/collection/S21/current/S21S039/1/
groundwater	SDN:S21::S21S005	http://vocab.nerc.ac.uk/collection/S21/current/S21S005/1/
peat	SDN:S21::S21S019	http://vocab.nerc.ac.uk/collection/S21/current/S21S019/1/
rainwater	SDN:S21::S21S020	http://vocab.nerc.ac.uk/collection/S21/current/S21S020/1/
sediment	SDN:S21::S21S022	http://vocab.nerc.ac.uk/collection/S21/current/S21S022/2/
sediment pore water	SDN:S21::S21S023	http://vocab.nerc.ac.uk/collection/S21/current/S21S023/1/
snow	SDN:S21::S21S024	http://vocab.nerc.ac.uk/collection/S21/current/S21S024/1/
stalactite	SDN:S21::S21S034	http://vocab.nerc.ac.uk/collection/S21/current/S21S034/2/
stalagmite	SDN:S21::S21S025	http://vocab.nerc.ac.uk/collection/S21/current/S21S025/1/
suspended particulate material	SDN:S21::S21S026	http://vocab.nerc.ac.uk/collection/S21/current/S21S026/1/
water body plus atmosphere	SDN:S21::S21S028	http://vocab.nerc.ac.uk/collection/S21/current/S21S028/1/
wet sediment	SDN:S21::S21S031	http://vocab.nerc.ac.uk/collection/S21/current/S21S031/1/
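
In the rows above, the collection and concept identifiers in sphere_sdn reappear in the path of sphere_nerc_uri. A minimal sketch of a consistency check built on that observation follows. The function name `sdn_matches_uri` is hypothetical, and the assumption that every row follows this pattern is derived only from the listed examples; the trailing version number of the URI is not checked because it cannot be derived from the SDN.

```python
import re

# "SDN:<collection>::<concept>", as seen in the sphere_sdn column.
SDN_PATTERN = re.compile(r"^SDN:(?P<collection>[^:]+)::(?P<concept>[^:]+)$")


def sdn_matches_uri(sdn, uri):
    """Return True if the URI path contains the SDN's collection and concept."""
    match = SDN_PATTERN.match(sdn)
    if match is None:
        return False
    expected = f"/collection/{match['collection']}/current/{match['concept']}/"
    return expected in uri


print(sdn_matches_uri(
    "SDN:S21::S21S001",
    "http://vocab.nerc.ac.uk/collection/S21/current/S21S001/1/",
))  # True
```

Such a check can catch copy-and-paste slips where an SDN and a URI from different rows end up in the same line.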

Mapping Rules Files

Output columns must contain values from the target vocabulary established via the target vocabulary files.
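
This requirement can be checked mechanically before files are handed over. The following is a minimal sketch, assuming both files have already been parsed into lists of dicts keyed by column header; the function name `find_unknown_targets` and the `kn` rule are hypothetical and only serve the illustration.

```python
def find_unknown_targets(mapping_rows, vocabulary_rows, column):
    """Return mapping outputs in `column` that the vocabulary does not define."""
    known = {row[column] for row in vocabulary_rows}
    used = {row[column] for row in mapping_rows if row.get(column)}
    return sorted(used - known)


unit_vocabulary = [
    {"unit_name": "degree Celsius", "unit_symbol": "°C"},
    {"unit_name": "centimeter per second", "unit_symbol": "cm/s"},
]
unit_mapping = [
    {"unit_string": "degC", "unit_name": "degree Celsius"},
    {"unit_string": "kn", "unit_name": "knot"},  # hypothetical rule without vocabulary entry
]
print(find_unknown_targets(unit_mapping, unit_vocabulary, "unit_name"))  # ['knot']
```

An empty result means every mapping output in that column is covered by the target vocabulary.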

Unit Mapping

| column order | column header | value is mandatory | description |
| --- | --- | --- | --- |
| 1 | unit_string | yes | mapping input: any string that should get mapped |
| 2 | unit_name | yes | mapping output: unit name (see Unit Vocabulary) |
| 3+ | whatever | no | can be used for comments, notes or reminders; will get technically ignored |

Example
tsv
unit_string	unit_name	comment
°C	degree Celsius
◦C	degree Celsius	weird alternative degree character, found in GLODAP
?C	degree Celsius	broken encoding, found in COSYNA SOS
degC	degree Celsius
cm/s	centimeter per second
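
How such rules might be applied can be sketched as a plain dictionary lookup. This assumes exact, case-sensitive string matching and an unmapped result of `None`; neither is stated on this page, and `map_unit` is a hypothetical name.

```python
# Rules transcribed from the example above (comment column omitted).
UNIT_RULES = {
    "°C": "degree Celsius",
    "◦C": "degree Celsius",
    "?C": "degree Celsius",
    "degC": "degree Celsius",
    "cm/s": "centimeter per second",
}


def map_unit(unit_string, rules=UNIT_RULES):
    """Return the target unit name, or None if no rule matches exactly."""
    return rules.get(unit_string)


for source in ["degC", "◦C", "K"]:
    print(repr(source), "->", map_unit(source))
```

The string `K` has no rule in the example and therefore stays unmapped in this sketch.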

Parameter Mapping

| column order | column header | value is mandatory | description |
| --- | --- | --- | --- |
| 1 | parameter_string | yes | mapping input: any string that should get mapped |
| 2 | parameter_group | yes | mapping output: any known parameter name/group (see Parameter Vocabulary) |
| 3 | sphere_name | no | mapping output: any known sphere name (see Sphere Vocabulary) |
| 4+ | whatever | no | can be used for comments, notes or reminders; will get technically ignored |

Example
tsv
parameter_string	parameter_group	sphere_name
AirTemperature	Temperature	atmosphere
SeaSurfaceTemperature	Temperature
TEMP_13.0	Temperature
Temperature	Temperature

Method Mapping

| column order | column header | value is mandatory | description |
| --- | --- | --- | --- |
| 1 | parameter_group | yes | mapping input: any known parameter name (see Parameter Vocabulary) |
| 2 | method_string | yes | mapping input: any string that should get mapped |
| 3 | method_group | yes | mapping output: any known method name/group (see Method Vocabulary) |
| 4 | sphere_name | no | mapping output: any known sphere name (see Sphere Vocabulary) |
| 5+ | whatever | no | can be used for comments, notes or reminders; will get technically ignored |

Example
tsv
parameter_group	method_string	method_group
Chlorophyll	High Performance Liquid Chromatography	direct
Chlorophyll	Fluorometry	indirect
Chlorophyll	Acetone extraction (Turner Designs)	indirect
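
Method mapping has two input columns, so a rule is identified by the pair of parameter_group and method_string. A minimal sketch of that lookup follows, assuming exact string matching and `None` for unmapped pairs; `map_method` is a hypothetical name.

```python
# Rules transcribed from the example above: (parameter_group, method_string, method_group).
METHOD_RULES = [
    ("Chlorophyll", "High Performance Liquid Chromatography", "direct"),
    ("Chlorophyll", "Fluorometry", "indirect"),
    ("Chlorophyll", "Acetone extraction (Turner Designs)", "indirect"),
]


def map_method(parameter_group, method_string, rules=METHOD_RULES):
    """Return the method group for an exact (parameter, method) pair, else None."""
    lookup = {(group, string): target for group, string, target in rules}
    return lookup.get((parameter_group, method_string))


print(map_method("Chlorophyll", "Fluorometry"))  # indirect
print(map_method("Temperature", "Fluorometry"))  # None
```

The second call returns nothing because the example rules only cover Chlorophyll; the same method string under another parameter group would need its own rule.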

Combined Example

Imagine you have the following data that you want to integrate into our SDI. It already comes in the handy O2A GeoCSV format (note from 2026: deprecated), including metadata files. There are two data files (and two metadata files) with comparable data but different vocabulary, and almost none of it matches the vocabulary you want.

json
{
  "version": "2.0",
  "events": [
    {
        "name": "Kono's Trip"
    }
  ],
  "parameters": [
    {
      "name": "Caffeine Level",
      "unit": "clicks/minute"
    },
    {
      "name": "Blutalkoholkonzentration",
      "unit": "Promille",
      "method": "ACE Breathalyser AF - 33"
    }
  ]
}
tsv
date_time_start	event_name	Caffeine Level [clicks/minute]	Blutalkoholkonzentration [Promille]	geometry
1982-12-29T11:02:00	Kono's Trip	200	0.20	POINT(-4.3 49.6)
1982-12-29T11:45:00	Kono's Trip	121	1.10	POINT(-4.3 49.6)
1982-12-29T13:21:00	Kono's Trip	84	0.40	POINT(-4.3 49.6)
json
{
  "version": "2.0",
  "events": [
    {
      "name": "Andreas' Adventure"
    }
  ],
  "parameters": [
    {
      "name": "caffeine level",
      "unit": "clicks/min"
    },
    {
      "name": "alcohol concentration",
      "unit": "‰",
      "method": "YOMA Alcohol Tester"
    }
  ]
}
tsv
date_time_start	event_name	caffeine level [clicks/min]	alcohol concentration [‰]	geometry
1982-12-30T11:02:00	Andreas' Adventure	156	0.40	POINT(-1.3 50.6)
1982-12-30T11:45:00	Andreas' Adventure	144	1.00	POINT(-1.3 50.6)
1982-12-30T13:21:00	Andreas' Adventure	112	0.50	POINT(-1.3 50.6)

The following mapping files would be a good solution to properly add the above data to our SDI and have it integrated into VEF-based viewers. The most important part is the parameter mapping: without it, data cannot be integrated into our parameter measurement layers. Unit and method mapping are recommended for proper filtering but can be left out. In any case, both the source strings/names and the mapping results will be shown and accessible in viewers.

tsv
unit_name	unit_symbol
permille	
clicks per minute	cpm
tsv
parameter_group
caffeine level
blood alcohol content
tsv
method_group
breathalyzer
tsv
unit_string	unit_name
clicks/minute	clicks per minute
clicks/min	clicks per minute
Promille	permille
‰	permille
tsv
parameter_string	parameter_group
Caffeine Level	caffeine level
caffeine level	caffeine level
Blutalkoholkonzentration	blood alcohol content
alcohol concentration	blood alcohol content
tsv
parameter_group	method_string	method_group
blood alcohol content	ACE Breathalyser AF - 33	breathalyzer
blood alcohol content	YOMA Alcohol Tester	breathalyzer
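
To show how the pieces fit together, the sketch below splits the data file column headers into parameter and unit strings and looks both up. It is an illustration under assumptions, not the SDI implementation: the header pattern `name [unit]` is inferred from the data files above, matching is exact, the unit strings are taken from the data file headers, and `harmonize_header` is a hypothetical name.

```python
import re

# "name [unit]" pattern inferred from the data file headers above.
HEADER_PATTERN = re.compile(r"^(?P<name>.+?)\s*\[(?P<unit>[^\]]*)\]$")

PARAMETER_MAPPING = {
    "Caffeine Level": "caffeine level",
    "caffeine level": "caffeine level",
    "Blutalkoholkonzentration": "blood alcohol content",
    "alcohol concentration": "blood alcohol content",
}
UNIT_MAPPING = {
    "clicks/minute": "clicks per minute",
    "clicks/min": "clicks per minute",
    "Promille": "permille",
    "‰": "permille",
}


def harmonize_header(header):
    """Return source strings and mapping results, or None for non-parameter columns."""
    match = HEADER_PATTERN.match(header)
    if match is None:
        return None
    name, unit = match["name"], match["unit"]
    return {
        "parameter_string": name,
        "parameter_group": PARAMETER_MAPPING.get(name),
        "unit_string": unit,
        "unit_name": UNIT_MAPPING.get(unit),
    }


for column in ["Blutalkoholkonzentration [Promille]", "alcohol concentration [‰]", "geometry"]:
    print(column, "->", harmonize_header(column))
```

Both alcohol columns end up with the parameter group "blood alcohol content" and the unit name "permille", while the source strings are kept alongside the results.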