Dataset template
Dataset
A Dataset is an abstraction that describes all information related to the raw input data and everything generated from it, including new columns (vectorization, clustering, …) and links to models.
Keys description:
_id: MongoDB unique identifier.
name: User-defined name of the dataset.
observationCollectionName: Optional user-defined name for the observations collection.
representationCollectionName: Optional user-defined name for the representations collection.
The only required field is the name of the dataset.
Minimal Dataset in MongoDB when the user specifies only the Dataset’s name:
{
  "_id": "62bcf05b2bd0146b93d1b1ed",
  "name": "my_dataset"
}
On first use of the dataset, the associated observations and representations collections receive the default names observations_<name> and representations_<name>, where <name> is the dataset name defined by the user.
Default generated Dataset when the user does not specify the observationCollectionName and representationCollectionName keys:
{
  "_id": "62bcf05b2bd0146b93d1b1ed",
  "name": "my_dataset",
  "observationCollectionName": "observations_my_dataset",
  "representationCollectionName": "representations_my_dataset"
}
The user can also customise the names of both the observations and representations collections.
Fully user-defined Dataset:
{
  "_id": "62bcf05b2bd0146b93d1b1ed",
  "name": "my_dataset",
  "observationCollectionName": "custom_observations_name",
  "representationCollectionName": "custom_representation_name"
}
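The naming rules described above can be illustrated with a short Python sketch. It is not the project's actual implementation: the helper name resolve_collection_names and the validation of the name key are assumptions made for illustration. Only the key names and the default naming pattern come from this page.

```python
from typing import Any, Dict


def resolve_collection_names(dataset: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `dataset` with default collection names filled in.

    Hypothetical helper: the function name and the validation below are
    assumptions; only the keys and default pattern come from the text.
    """
    name = dataset.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError("a dataset requires a non-empty 'name'")

    resolved = dict(dataset)  # shallow copy, the input document is left untouched
    # setdefault keeps any user-defined name and only fills in missing keys.
    resolved.setdefault("observationCollectionName", f"observations_{name}")
    resolved.setdefault("representationCollectionName", f"representations_{name}")
    return resolved


# Minimal dataset: only the name is provided, so defaults are generated.
minimal = {"_id": "62bcf05b2bd0146b93d1b1ed", "name": "my_dataset"}

# Fully user-defined dataset: custom collection names are preserved.
custom = {
    "_id": "62bcf05b2bd0146b93d1b1ed",
    "name": "my_dataset",
    "observationCollectionName": "custom_observations_name",
    "representationCollectionName": "custom_representation_name",
}

print(resolve_collection_names(minimal))
print(resolve_collection_names(custom))
```

Using setdefault means a dataset that defines only one of the two optional keys keeps that custom name and receives the default for the other.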