Domain data

Domain data is the set of raw data features that the user has designated as such and that are not used during learning workflows. It is used once models have been trained, and is stored alongside the models so that it can be linked to them.

Representation

The keys _id, name, stepId, domainInformation, dataset, project, processingInfo, creationTS and latestUpdateTS are unchanged and follow the standard representation format. Only the dataSpecification keys differ; they are described below.

Currently, only numerical vectors can be saved during raw data loading, so there is a single template for now.

JSON template for the numerical vector Domain data representation:

{
  "dataSpecification": {
    "keyword": "domainData",
    "valueType": {
      "dataType": "numerical",
      "structureType": "scalar"
    },
    "meaning": "Domain data",
    "view": {
      "id": "637ce534dd85c10875c4fe26",
      "name": "view_11-22-2022_15:05:24"
    },
    "dataLocationId": "62b18d804ae71c6a0025237a"
  }
}
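As an illustration of the template above, the following Python sketch checks that a representation document carries the dataSpecification keys shown in the template and that its keyword is "domainData". The helper name is_domain_data_representation and the choice of required keys are assumptions made for this example; they are not part of the documented system.

```python
# Hypothetical helper, for illustration only: the set of required keys is
# inferred from the JSON template above, not from an official schema.
REQUIRED_SPEC_KEYS = {"keyword", "valueType", "meaning", "view", "dataLocationId"}


def is_domain_data_representation(doc: dict) -> bool:
    """Return True if doc looks like a Domain data representation."""
    spec = doc.get("dataSpecification")
    if not isinstance(spec, dict):
        return False
    # Every key shown in the template must be present.
    if not REQUIRED_SPEC_KEYS.issubset(spec):
        return False
    return spec["keyword"] == "domainData"


# The template from this section, written as a Python dict.
example = {
    "dataSpecification": {
        "keyword": "domainData",
        "valueType": {"dataType": "numerical", "structureType": "scalar"},
        "meaning": "Domain data",
        "view": {
            "id": "637ce534dd85c10875c4fe26",
            "name": "view_11-22-2022_15:05:24",
        },
        "dataLocationId": "62b18d804ae71c6a0025237a",
    }
}

is_domain = is_domain_data_representation(example)
```

A check of this kind could be run before a representation is linked to a trained model, so that incomplete documents are rejected early.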

Observation

The value key always holds the same String value, “DataLocation”. It indicates that the actual data must be looked up through the dataLocation key of the associated representation, which points to the Domain data parquet files.

{
  "observationId": 1,
  "value": "DataLocation",
  "representationId": "62b18d804ae71c6a00252379",
  "dataLocationId": "62b18d804ae71c6a00252371"
}
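The lookup described in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name resolve_data_location_id is hypothetical, representations are assumed to be available as an in-memory dict keyed by their _id, and the representation shown in the template is assumed to be the one referenced by this observation. The sketch follows the text and reads the location from the representation; it does not use the observation's own dataLocationId.

```python
# Hypothetical lookup, for illustration only. In a real system the
# representations would presumably come from a database, not a dict.
def resolve_data_location_id(observation: dict, representations: dict) -> str:
    """Return the data location id of the representation tied to an observation."""
    if observation.get("value") != "DataLocation":
        # Domain data observations are documented to always carry this marker.
        raise ValueError("observation does not use the 'DataLocation' marker")
    representation = representations[observation["representationId"]]
    return representation["dataSpecification"]["dataLocationId"]


# Assumed in-memory store: representation _id -> representation document.
representations = {
    "62b18d804ae71c6a00252379": {
        "_id": "62b18d804ae71c6a00252379",
        "dataSpecification": {
            "keyword": "domainData",
            "dataLocationId": "62b18d804ae71c6a0025237a",
        },
    }
}

observation = {
    "observationId": 1,
    "value": "DataLocation",
    "representationId": "62b18d804ae71c6a00252379",
    "dataLocationId": "62b18d804ae71c6a00252371",
}

location_id = resolve_data_location_id(observation, representations)
```

The returned id identifies the data location record that points to the parquet files; reading the files themselves is outside the scope of this sketch.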