O2A Documentation

streams and dataset IDs – the hidden champions

The O2A STREAMS provide near real-time (NRT) data, either for monitoring purposes in O2A DASHBOARDS or for use in downstream applications such as follow-polarstern or the sea ice portal.

With the release of streams, users are able to post "their" data to the NRT database themselves. The only constraints are:

get metadata

python
import requests
import json
import pandas as pd
from io import StringIO

api_url = "https://ingest.o2a-data.de/rest/"
urn1 = "vessel:meteor:tsg_meteor:tsg_stb_meteor:salinity"
time1 = "2025-08-02T00:00:00"
time2 = "2025-08-02T23:59:59"

resp = requests.get(api_url +
                    "datasets?where=streams.code=IN=(" +
                    urn1 +
                    ");datetimeMax<='" +
                    time2 +
                    "';datetimeMin>='" +
                    time1 +
                    "'"
                    )

json.loads(resp.content)

The resulting json output is:

json
{
  "offset": 0,
  "hits": 1,
  "totalHits": 1,
  "records": [
    {
      "id": 4839394,
      "name": "",
      "datetime": "2025-08-04T02:35:51.587745",
      "datetimeMin": "2025-08-02T00:00:00",
      "datetimeMax": "2025-08-02T23:59:58",
      "values": 699686,
      "username": ""
    }
  ],
  "duration": 41
}

In this case only one record matches the request.

  • id is the dataset ID, a unique identifier
  • datetime is the time when the dataset entered the database
  • datetimeMin is the earliest timestamp of the data itself
  • datetimeMax is the latest timestamp of the data itself
  • values is the count of all values in the dataset, including the datetime elements
  • username identifies who was responsible for the ingestion; if it is empty ('') or i.ngest@awi.de, the dataset was ingested centrally
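As a small illustration of these fields, the ingestion latency of a record can be derived from datetime (time of ingestion) and datetimeMax (last measurement). A sketch using the example record above:

```python
from datetime import datetime

# example record as returned by the datasets endpoint above
record = {
    "id": 4839394,
    "datetime": "2025-08-04T02:35:51.587745",
    "datetimeMin": "2025-08-02T00:00:00",
    "datetimeMax": "2025-08-02T23:59:58",
    "values": 699686,
    "username": "",
}

# time between the last measurement and the ingestion into the database
ingested = datetime.fromisoformat(record["datetime"])
last_obs = datetime.fromisoformat(record["datetimeMax"])
latency = ingested - last_obs

print(latency)  # 1 day, 2:35:53.587745 for this record
```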

Be aware that rather unspecific requests might result in huge server-side responses. It is therefore strongly recommended to apply filtering on the server side via RSQL (some hints can be found here). Larger responses need to be paginated.
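A pagination loop could be sketched as below. Note that the offset and limit query parameters are assumptions here, inferred from the offset field in the response and common REST conventions; verify the exact parameter names against the API reference before relying on them:

```python
api_url = "https://ingest.o2a-data.de/rest/"

def paged_urls(query, limit, total_hits):
    """Build one request URL per page of results.

    offset/limit are assumed query parameters (the response carries an
    'offset' field, but the exact names are not confirmed by this page).
    """
    return [
        f"{api_url}datasets?where={query}&offset={offset}&limit={limit}"
        for offset in range(0, total_hits, limit)
    ]

# e.g. 250 total hits fetched in pages of 100 -> offsets 0, 100, 200
urls = paged_urls("streams.code=IN=(vessel:meteor:headt)", 100, 250)
```

In practice one would first issue a single request, read totalHits from the response, and then fetch the remaining pages.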

If you know a specific dataset ID its context info can be retrieved like this:

python
resp = requests.get(api_url +
                    "datasets/" +
                    str(4839394)
                    )

json.loads(resp.content)

The output looks familiar:

json
{
  "id": 4839394,
  "name": "",
  "datetime": "2025-08-04T02:35:51.587745",
  "datetimeMin": "2025-08-02T00:00:00",
  "datetimeMax": "2025-08-02T23:59:58",
  "values": 699686,
  "username": ""
}

There is more to discover -- the streams themselves. Each stream represents one parameter URN and the corresponding data. We read the streams for dataset ID 4839394 and print the first three items of the output (a list).

python
resp = requests.get(api_url +
                    "datasets/" +
                    str(4839394) +
                    "/streams"
                    )

json.loads(resp.content)[0:3]

json
[
  {
    "itemId": 6072,
    "itemUuid": "0cc456ca-517b-475e-ad41-f0a42b3ac36c",
    "itemUrl": "https://registry.o2a-data.de/items/6072",
    "code": "vessel:meteor:course",
    "id": 4291,
    "unit": ""
  },
  {
    "itemId": 6072,
    "itemUuid": "0cc456ca-517b-475e-ad41-f0a42b3ac36c",
    "itemUrl": "https://registry.o2a-data.de/items/6072",
    "code": "vessel:meteor:headt",
    "id": 7792,
    "unit": "deg"
  },
  {
    "itemId": 6072,
    "itemUuid": "0cc456ca-517b-475e-ad41-f0a42b3ac36c",
    "itemUrl": "https://registry.o2a-data.de/items/6072",
    "code": "vessel:meteor:poslat",
    "id": 4287,
    "unit": ""
  }
]

  • itemId integer, the numeric id of the item as in REGISTRY
  • itemUuid string, the UUID of the item as in REGISTRY
  • itemUrl the link leading to the item in REGISTRY
  • code string, the parameter code as in REGISTRY
  • id integer, the numeric id of the data stream
  • unit string, the unit in which the data is measured

As can be seen, course and poslat are not properly defined in REGISTRY, since their units are empty (''). Nonetheless the respective data streams contain numeric content.
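Such incompletely defined entries can also be spotted programmatically by filtering the stream list for empty units. A minimal sketch using (abridged) items from the output above:

```python
# abridged stream entries as returned by the /streams endpoint above
streams = [
    {"code": "vessel:meteor:course", "id": 4291, "unit": ""},
    {"code": "vessel:meteor:headt", "id": 7792, "unit": "deg"},
    {"code": "vessel:meteor:poslat", "id": 4287, "unit": ""},
]

# collect parameter codes whose unit is not defined in REGISTRY
missing_units = [s["code"] for s in streams if not s["unit"]]

print(missing_units)  # ['vessel:meteor:course', 'vessel:meteor:poslat']
```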

get data

With a little extension of the call the data itself can be downloaded as well. In this case the content is retrieved as JSON, which provides some extra structure: asking for the keys shows what is available.

python
resp = requests.get(api_url +
                    "datasets/" +
                    str(4839394) +
                    "/data" +
                    "?format=application/json",
                    headers = {"accept": "application/json"},
                    )

a=json.loads(resp.content)

a.keys()
dict_keys(['datetimeMin', 'datetimeMax', 'withQualityFlags', 'sensors', 'data'])

a['sensors']
['vessel:meteor:course', 'vessel:meteor:headt', 'vessel:meteor:poslat', 'vessel:meteor:poslon', 'vessel:meteor:sound_velocity', 'vessel:meteor:speed_over_ground', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:conductivity', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:sound_velocity_external', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:density', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:salinity', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:sound_velocity_internal', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:water_temperature_sbe45', 'vessel:meteor:tsg_meteor:tsg_bb_meteor:water_temperature_sbe38', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:conductivity', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:sound_velocity_external', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:density', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:salinity', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:sound_velocity_internal', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe45', 'vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe38']

a['data'][0:2]
[['2025-08-02T00:00:00.000', 309.1, 309.2, 43.927850416666665, -35.98171421666667, 1525.6, 10.5, None, None, None, None, None, None, None, None, None, None, None, None, None, None], ['2025-08-02T00:00:01.000', 308.5, 309.1, 43.92788051666667, -35.981766633333336, 1525.6, 10.4, None, None, None, None, None, None, None, None, None, None, None, None, None, None]]
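Since each data row is a plain list whose first element is the timestamp and whose remaining elements follow the order of the sensors list, a row can be mapped back to its parameter URNs by zipping the two. A sketch with a shortened sensor list:

```python
# shortened versions of a['sensors'] and a['data'] from above
sensors = ["vessel:meteor:course", "vessel:meteor:headt", "vessel:meteor:poslat"]
rows = [
    ["2025-08-02T00:00:00.000", 309.1, 309.2, 43.927850416666665],
    ["2025-08-02T00:00:01.000", 308.5, 309.1, 43.92788051666667],
]

# the first element of each row is the timestamp, the rest follows 'sensors'
records = [dict(zip(["datetime"] + sensors, row)) for row in rows]

print(records[0]["vessel:meteor:headt"])  # 309.2
```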

A little less structured, but much more performant (from a database perspective), is the retrieval of tab-separated values. The tabular output is converted to a pandas DataFrame.

python
resp = requests.get(api_url +
                    "datasets/" +
                    str(4839394) +
                    "/data" +
                    "?format=text/tab-separated-values"
                    )

a = pd.read_csv(StringIO(resp.text), sep="\t")

a.columns
Index(['datetime', 'vessel:meteor:course []', 'vessel:meteor:headt [deg]',
       'vessel:meteor:poslat []', 'vessel:meteor:poslon []',
       'vessel:meteor:sound_velocity [m/s]',
       'vessel:meteor:speed_over_ground [knot]',
       'vessel:meteor:tsg_meteor:tsg_bb_meteor:conductivity [S/m]',
       'vessel:meteor:tsg_meteor:tsg_bb_meteor:sound_velocity_external [m/s]',
       'vessel:meteor:tsg_meteor:tsg_bb_meteor:density [kg/m3]',
       'vessel:meteor:tsg_meteor:tsg_bb_meteor:salinity [PSU]',
       'vessel:meteor:tsg_meteor:tsg_bb_meteor:sound_velocity_internal [m/s]',
       'vessel:meteor:tsg_meteor:tsg_bb_meteor:water_temperature_sbe45 [°C]',
       'vessel:meteor:tsg_meteor:tsg_bb_meteor:water_temperature_sbe38 [°C]',
       'vessel:meteor:tsg_meteor:tsg_stb_meteor:conductivity [S/m]',
       'vessel:meteor:tsg_meteor:tsg_stb_meteor:sound_velocity_external [m/s]',
       'vessel:meteor:tsg_meteor:tsg_stb_meteor:density [kg/m3]',
       'vessel:meteor:tsg_meteor:tsg_stb_meteor:salinity [PSU]',
       'vessel:meteor:tsg_meteor:tsg_stb_meteor:sound_velocity_internal [m/s]',
       'vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe45 [°C]',
       'vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe38 [°C]'],
      dtype='object')

a.head()
                  datetime  vessel:meteor:course []  ...  vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe45 [°C]  vessel:meteor:tsg_meteor:tsg_stb_meteor:water_temperature_sbe38 [°C]
0  2025-08-02T00:00:00.000                    309.1  ...                                                NaN                                                                   NaN
1  2025-08-02T00:00:01.000                    308.5  ...                                                NaN                                                                   NaN
2  2025-08-02T00:00:02.000                    308.2  ...                                                NaN                                                                   NaN
3  2025-08-02T00:00:03.000                    308.4  ...                                                NaN                                                                   NaN
4  2025-08-02T00:00:04.000                    308.8  ...                                                NaN                                                                   NaN

[5 rows x 21 columns]
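For further work in pandas, a typical next step is to parse the timestamp column into a proper index and to drop sensors that carried no data in the requested window. This is a post-processing sketch, not part of the API; a tiny stand-in frame replaces the full download here:

```python
import pandas as pd

# tiny stand-in for the DataFrame read above
a = pd.DataFrame({
    "datetime": ["2025-08-02T00:00:00.000", "2025-08-02T00:00:01.000"],
    "vessel:meteor:course []": [309.1, 308.5],
    "vessel:meteor:tsg_meteor:tsg_stb_meteor:salinity [PSU]": [None, None],
})

# parse timestamps and use them as the index
a["datetime"] = pd.to_datetime(a["datetime"])
a = a.set_index("datetime")

# remove columns that are entirely NaN in this time window
a = a.dropna(axis=1, how="all")

print(a.columns.tolist())  # ['vessel:meteor:course []']
```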