streams and dataset IDs – the hidden champions
The O2A STREAMS provide near real-time (NRT) data, either for monitoring in O2A DASHBOARDS or for use in downstream applications such as follow-polarstern or the sea ice portal.
With the release of streams, users can post "their" data to the NRT database themselves. The only constraints are:
- the user must be on the contact list of the REGISTRY item that the codes are written to
- a valid o2a token is used to post data
- submitted data conforms to the NRT standards -> reminder what this standard is about
get metadata
python
import requests
import json
import pandas as pd
from io import StringIO
api_url = "https://ingest.o2a-data.de/rest/"
urn1 = "vessel:meteor:tsg_meteor:tsg_stb_meteor:salinity"
time1 = "2025-08-02T00:00:00"
time2 = "2025-08-02T23:59:59"
# Select datasets that contain the stream and lie within the time window (RSQL filter)
resp = requests.get(
    api_url
    + "datasets?where=streams.code=IN=("
    + urn1
    + ");datetimeMax<='"
    + time2
    + "';datetimeMin>='"
    + time1
    + "'"
)
json.loads(resp.content)
The resulting JSON output is:
json
{
"offset": 0,
"hits": 1,
"totalHits": 1,
"records": [
{
"id": 4839394,
"name": "",
"datetime": "2025-08-04T02:35:51.587745",
"datetimeMin": "2025-08-02T00:00:00",
"datetimeMax": "2025-08-02T23:59:58",
"values": 699686,
"username": ""
}
],
"duration": 41
}
In this case only one record is available for the request.
- id refers to the dataset ID, a unique identifier
- datetime is the time when the dataset entered the database
- datetimeMin is the earliest time of the data itself
- datetimeMax is the latest time of the data itself
- values is the count of all values in the dataset, including the datetime element
- username is the user responsible for the ingestion; if it is '' or i.ngest@awi.de, the dataset was ingested centrally
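The field descriptions can be turned into a small helper that derives a few convenience values from one record. This is only a sketch: `summarise_record` and `CENTRAL_USERS` are hypothetical names introduced here, and the record is the one from the response shown above.

```python
from datetime import datetime

# Usernames that indicate central ingestion, taken from the field description
CENTRAL_USERS = {"", "i.ngest@awi.de"}


def summarise_record(record: dict) -> dict:
    """Derive the covered time span and the ingestion mode of one dataset record."""
    t_min = datetime.fromisoformat(record["datetimeMin"])
    t_max = datetime.fromisoformat(record["datetimeMax"])
    return {
        "id": record["id"],
        # Time span between the earliest and latest data point, in seconds
        "covered_seconds": (t_max - t_min).total_seconds(),
        "ingested_centrally": record["username"] in CENTRAL_USERS,
    }


record = {
    "id": 4839394,
    "name": "",
    "datetime": "2025-08-04T02:35:51.587745",
    "datetimeMin": "2025-08-02T00:00:00",
    "datetimeMax": "2025-08-02T23:59:58",
    "values": 699686,
    "username": "",
}
summary = summarise_record(record)
print(summary)
```

For the record above, the covered span is 86398 seconds (one second short of a full day) and the dataset counts as centrally ingested.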
Be aware that unspecific requests can result in huge server-side responses. It is therefore strongly recommended to filter on the server side with RSQL (some hints might be found here). Larger responses need to be paginated.
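A paging loop could look roughly like the sketch below. It relies only on the `offset`, `totalHits` and `records` fields visible in the response above. How the offset is actually passed to the API is not covered here, so the request is hidden behind a hypothetical callable `fetch_page(offset)`; `iter_records` and `fake_fetch` are likewise illustrative names, and `fake_fetch` merely simulates a server with five records.

```python
from typing import Callable, Iterator


def iter_records(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield records page by page until totalHits is reached or a page is empty."""
    offset = 0
    while True:
        page = fetch_page(offset)
        records = page.get("records", [])
        yield from records
        offset += len(records)
        if not records or offset >= page.get("totalHits", 0):
            break


def fake_fetch(offset: int, page_size: int = 2) -> dict:
    """Stand-in for the real request: serves five records in pages of two."""
    all_records = [{"id": i} for i in range(5)]
    chunk = all_records[offset:offset + page_size]
    return {
        "offset": offset,
        "hits": len(chunk),
        "totalHits": len(all_records),
        "records": chunk,
    }


ids = [record["id"] for record in iter_records(fake_fetch)]
print(ids)  # [0, 1, 2, 3, 4]
```

In real use, `fetch_page` would wrap the `requests.get` call with whatever paging parameter the API expects.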
If you know a specific dataset ID, its context info can be retrieved like this:
python
# Fetch the context info of a single dataset by its ID
resp = requests.get(api_url + "datasets/" + str(4839394))
json.loads(resp.content)
The output looks familiar:
json
{
"id": 4839394,
"name": "",
"datetime": "2025-08-04T02:35:51.587745",
"datetimeMin": "2025-08-02T00:00:00",
"datetimeMax": "2025-08-02T23:59:58",
"values": 699686,
"username": ""
}
There is more to discover: the streams themselves. Basically, each stream represents one parameter URN and the corresponding data. We read the streams for the dataset ID 4839394 and print the first three items of the output (a list).
python
# List the streams of the dataset and show the first three
resp = requests.get(api_url + "datasets/" + str(4839394) + "/streams")
json.loads(resp.content)[0:3]
Each item in the list has these fields:
- itemId: integer, the numeric ID of the item as in REGISTRY
- itemUuid: technically a string, the UUID of the item as in REGISTRY
- itemUrl: the link leading to the item in REGISTRY
- code: string, the parameter code as in REGISTRY
- id: integer, the numeric ID of the data stream
- unit: string, the unit in which the data is measured
json
[
{
"itemId": 6072,
"itemUuid": "0cc456ca-517b-475e-ad41-f0a42b3ac36c",
"itemUrl": "https://registry.o2a-data.de/items/6072",
"code": "vessel:meteor:course",
"id": 4291,
"unit": ""
},
{
"itemId": 6072,
"itemUuid": "0cc456ca-517b-475e-ad41-f0a42b3ac36c",
"itemUrl": "https://registry.o2a-data.de/items/6072",
"code": "vessel:meteor:headt",
"id": 7792,
"unit": "deg"
},
{
"itemId": 6072,
"itemUuid": "0cc456ca-517b-475e-ad41-f0a42b3ac36c",
"itemUrl": "https://registry.o2a-data.de/items/6072",
"code": "vessel:meteor:poslat",
"id": 4287,
"unit": ""
}
]
As can be seen, course and poslat are not properly defined in REGISTRY, since their units are empty (''). Nonetheless, the respective data streams contain numeric content.
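Such a check for missing units can also be scripted. The sketch below filters a streams list for entries whose `unit` is empty; `codes_without_unit` is a hypothetical helper name, and the input is a shortened copy of the three items printed above.

```python
from typing import Dict, List


def codes_without_unit(streams: List[Dict]) -> List[str]:
    """Return the parameter codes of all streams whose unit is empty or missing."""
    return [stream["code"] for stream in streams if not stream.get("unit")]


# Shortened copy of the three stream items from the response above
streams = [
    {"itemId": 6072, "code": "vessel:meteor:course", "id": 4291, "unit": ""},
    {"itemId": 6072, "code": "vessel:meteor:headt", "id": 7792, "unit": "deg"},
    {"itemId": 6072, "code": "vessel:meteor:poslat", "id": 4287, "unit": ""},
]
missing = codes_without_unit(streams)
print(missing)  # ['vessel:meteor:course', 'vessel:meteor:poslat']
```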
get data
With a little extension of the call, the data itself can be downloaded as well. In this case the content is retrieved as JSON, which carries some extra information. Asking for the keys shows what is available.
python
# Download the data of the dataset as JSON
resp = requests.get(
    api_url + "datasets/" + str(4839394) + "/data" + "?format=application/json",
    headers={"accept": "application/json"},
)
a = json.loads(resp.content)
a.keys()
dict_keys(['datetimeMin', 'datetimeMax', 'withQualityFlags', 'sensors', 'data'])
a['sensors']
['vessel:meteor:course', 'vessel:meteor:headt', 'vessel:meteor:poslat', 'vessel:meteor:poslon', 'vessel:meteor:sound_velocity', 'vessel:meteor:speed_over_ground', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:conductivity', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:sound_velocity_external', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:density', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:salinity', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:sound_velocity_internal', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:water_temperature_sbe45', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:water_temperature_sbe38', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:conductivity', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:sound_velocity_external', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:density', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:salinity', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:sound_velocity_internal', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe45', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe38']
a['data'][0:2]
[['2025-08-02T00:00:00.000', 309.1, 309.2, 43.927850416666665, -35.98171421666667, 1525.6, 10.5, None, None, None, None, None, None, None, None, None, None, None, None, None, None], ['2025-08-02T00:00:01.000', 308.5, 309.1, 43.92788051666667, -35.981766633333336, 1525.6, 10.4, None, None, None, None, None, None, None, None, None, None, None, None, None, None]]
A little less structured, but much more performant (from a database perspective), is the retrieval of tab-separated values. The tabular output is converted to a pandas DataFrame.
python
# Download the same dataset as tab-separated values and load it into pandas
resp = requests.get(
    api_url + "datasets/" + str(4839394) + "/data" + "?format=text/tab-separated-values"
)
a = pd.read_csv(StringIO(resp.text), sep="\t")
a.columns
Index(['datetime', 'vessel:meteor:course []', 'vessel:meteor:headt [deg]',
'vessel:meteor:poslat []', 'vessel:meteor:poslon []',
'vessel:meteor:sound_velocity [m/s]',
'vessel:meteor:speed_over_ground [knot]',
'vessel:meteor:tsg_meteor:tsg_bb_meteor:conductivity [S/m]',
'vessel:meteor:tsg_meteor:tsg_bb_meteor:sound_velocity_external [m/s]',
'vessel:meteor:tsg_meteor:tsg_bb_meteor:density [kg/m3]',
'vessel:meteor:tsg_meteor:tsg_bb_meteor:salinity [PSU]',
'vessel:meteor:tsg_meteor:tsg_bb_meteor:sound_velocity_internal [m/s]',
'vessel:meteor:tsg_meteor:tsg_bb_meteor:water_temperature_sbe45 [°C]',
'vessel:meteor:tsg_meteor:tsg_bb_meteor:water_temperature_sbe38 [°C]',
'vessel:meteor:tsg_meteor:tsg_stb_meteor:conductivity [S/m]',
'vessel:meteor:tsg_meteor:tsg_stb_meteor:sound_velocity_external [m/s]',
'vessel:meteor:tsg_meteor:tsg_stb_meteor:density [kg/m3]',
'vessel:meteor:tsg_meteor:tsg_stb_meteor:salinity [PSU]',
'vessel:meteor:tsg_meteor:tsg_stb_meteor:sound_velocity_internal [m/s]',
'vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe45 [°C]',
'vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe38 [°C]'],
dtype='object')
a.head()
datetime vessel:meteor:course [] ... vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe45 [°C] vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe38 [°C]
0 2025-08-02T00:00:00.000 309.1 ... NaN NaN
1 2025-08-02T00:00:01.000 308.5 ... NaN NaN
2 2025-08-02T00:00:02.000 308.2 ... NaN NaN
3 2025-08-02T00:00:03.000 308.4 ... NaN NaN
4 2025-08-02T00:00:04.000 308.8 ... NaN NaN
[5 rows x 21 columns]
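The column headers of the tab-separated output combine the parameter code and its unit in the form `code [unit]`. For further processing it can be handy to separate the two. The following is a minimal sketch, assuming every data column follows that pattern (as in the output above); `split_header` and `HEADER_PATTERN` are hypothetical names introduced for illustration.

```python
import re

# Matches "code [unit]"; the unit part may be empty, as in "vessel:meteor:course []"
HEADER_PATTERN = re.compile(r"^(?P<code>.+?) \[(?P<unit>.*)\]$")


def split_header(header: str) -> tuple:
    """Split a column header into (code, unit); unit is None if no brackets are present."""
    match = HEADER_PATTERN.match(header)
    if match is None:
        return header, None
    return match.group("code"), match.group("unit")


headers = [
    "datetime",
    "vessel:meteor:course []",
    "vessel:meteor:headt [deg]",
    "vessel:meteor:tsg_meteor:tsg_stb_meteor:salinity [PSU]",
]
parsed = [split_header(header) for header in headers]
for code, unit in parsed:
    print(code, repr(unit))
```

Applied to the DataFrame from above, something like `a.rename(columns=lambda c: split_header(c)[0])` would leave only the parameter codes as column names.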