Representations and observations
This section describes every data representation used in the HephIA product. It covers every kind of input data HephIA can handle, from numerical scalars to binary vectors, including multivariate time series and unsupervised models.
This chapter is divided into the following sections:
- Domain and processing data
- Regular input data
- Unsupervised models schemas
- Predicted Outputs
- Normalization
- PCA
- Denormalization
- Clustering Internal Indices
- Hierarchical Agglomerative Clustering (CAH)
This schema design implies two JSON templates. The first one holds all the information shared by a given type: it is the representation, and exactly ONE representation JSON exists for a given type once it is generated. The second one describes each individual data item: these are the observations, and their number ranges from $0$ to $N$, where $N$ is often the number of data points in a given Dataset or the number of prototypes in models that have them.
See the list of representations and observations.
(representation_mandatory_keys)=
Representation
Currently, the whole schema is described as follows and MUST contain these keys:
- _id: The unique hexadecimal String identifying Mongo objects.
- name: Representation name.
- stepId: Unique identifier of the associated DAG step.
- domainInformation: Information linked to the domain of data application.
  - customer: Customer name.
- dataset: Dictionary with the dataset name and Mongo collection.
  - name: Name of the dataset.
  - collection: Mongo collection where the Dataset object is stored.
- project:
  - id: Unique identifier of the associated customer project.
  - name: Project name.
- processingInfo: Information about this processing.
  - processingId: Long value referring to this processing.
  - processingName: User-defined name of the processing.
  - editionContext: Enumerator which defines rights over the representation.
- creationTS: Timestamp of the JSON creation.
- lastestUpdateTS: Timestamp of the last update on this document.
- dataSpecification: Dictionary which describes the data itself. It differs from one representation to another, even though some sub keys are mandatory.
  - keyword: Unique and meaningful keyword which identifies this representation.
  - valueType: Type description of the observation `value` key.
    - dataType: Data type of the observation `value` key.
    - structureType: Structure description of the `value` key data type.
  - meaning: Meaning of the representation.
  - view: Dictionary containing 2 sub keys.
    - id: Unique view identifier as a String.
    - name: View name (view_TS).
  - dataLocationId: MongoId of the related DataLocation.
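As an illustration only, the key list above can be sketched as a Python dictionary together with a small check for missing top-level keys. Every value below is a made-up placeholder, the helper `missing_keys` is hypothetical and not part of the HephIA product, and the sketch assumes that `dataset` and `project` sit at the top level next to `domainInformation`.

```python
# Top-level keys a representation MUST contain.
# Assumption: dataset and project are top-level siblings of domainInformation.
REPRESENTATION_KEYS = (
    "_id", "name", "stepId", "domainInformation", "dataset", "project",
    "processingInfo", "creationTS", "lastestUpdateTS", "dataSpecification",
)


def missing_keys(document, required):
    """Return the required keys absent from document, in declaration order."""
    return [key for key in required if key not in document]


# Placeholder values only; none of them come from a real HephIA deployment.
representation = {
    "_id": "5f1d7f3e2c9a4b0012345678",
    "name": "example-representation",
    "stepId": "step-0",
    "domainInformation": {"customer": "example-customer"},
    "dataset": {"name": "example-dataset", "collection": "example_collection"},
    "project": {"id": "project-0", "name": "example-project"},
    "processingInfo": {
        "processingId": 1,
        "processingName": "example-processing",
        "editionContext": "EXAMPLE_CONTEXT",  # hypothetical enumerator value
    },
    "creationTS": 0,
    "lastestUpdateTS": 0,
    "dataSpecification": {
        "keyword": "example-keyword",
        "valueType": {
            "dataType": "EXAMPLE_DATA_TYPE",            # hypothetical value
            "structureType": "EXAMPLE_STRUCTURE_TYPE",  # hypothetical value
        },
        "meaning": "example meaning",
        "view": {"id": "view-0", "name": "view_TS"},
        "dataLocationId": "5f1d7f3e2c9a4b0087654321",
    },
}

print(missing_keys(representation, REPRESENTATION_KEYS))
```

A document that lacks one of the keys, for example `stepId`, would make `missing_keys` return that key name instead of an empty list.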
 
(representation_key_mandatory_keys_representations_and_observations)=
dataSpecification sub keys descriptions
The dataSpecification key describes the nature of the saved data and contains the following MANDATORY keys:
- keyword: Unique and meaningful keyword which identifies this representation.
- valueType: Type description of the observation key named `value`.
- dataType: Data type of the observation `value` key.
- structureType: Structure description of the `value` key data type.
- view: Dictionary containing 2 sub keys.
  - id: Unique view identifier as a String.
  - name: View name (view_TS).
- dataLocationId: MongoId of the related DataLocation.
(value_type_mandatory_keys)=
valueType sub keys in details
- dataType: Data type of the observation `value` key.
- structureType: Structure description of the `value` key data type.
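The nesting of these sub keys can be checked with a short sketch such as the one below. It assumes that `dataType` and `structureType` live inside `valueType`, and that `id` and `name` live inside `view`; the function `check_data_specification` and the sample values are hypothetical.

```python
# Mandatory sub keys of dataSpecification and of its nested dictionaries.
# Assumption: dataType and structureType are nested under valueType.
DATA_SPECIFICATION_KEYS = ("keyword", "valueType", "view", "dataLocationId")
NESTED_KEYS = {
    "valueType": ("dataType", "structureType"),
    "view": ("id", "name"),
}


def check_data_specification(spec):
    """Return dotted paths of every mandatory key missing from spec."""
    problems = [key for key in DATA_SPECIFICATION_KEYS if key not in spec]
    for parent, children in NESTED_KEYS.items():
        nested = spec.get(parent)
        if isinstance(nested, dict):
            problems += [f"{parent}.{child}" for child in children
                         if child not in nested]
    return problems


# Placeholder values only.
complete_spec = {
    "keyword": "example-keyword",
    "valueType": {"dataType": "EXAMPLE_DATA_TYPE",
                  "structureType": "EXAMPLE_STRUCTURE_TYPE"},
    "view": {"id": "view-0", "name": "view_TS"},
    "dataLocationId": "5f1d7f3e2c9a4b0087654321",
}
incomplete_spec = {
    "keyword": "example-keyword",
    "valueType": {"dataType": "EXAMPLE_DATA_TYPE"},
    "view": {"id": "view-0", "name": "view_TS"},
}

print(check_data_specification(complete_spec))
print(check_data_specification(incomplete_spec))
```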
(observation_template)=
Observation
The observation schema describes what composes an observation. Here is the list of default MANDATORY keys:
- observationId: A Long value which identifies an observation.
- value: MUST contain all the data defining the observation, as a primitive type or a dictionary. Two cases are possible:
  - The value specifically describes the observation, i.e. it contains its data.
  - The value is the String “DataLocation”; the observation is then just a pointer to its real data, whose location is specified by the `dataLocationId` key.
- representationId: MongoId of the representation associated with this observation.
- dataLocationId: Information about where this observation is stored; it follows the regular dataLocation standard.
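The two cases of the `value` key can be illustrated with a minimal sketch. The function `resolve_value`, the `fetch_from_location` callback and the in-memory `fake_store` are all hypothetical stand-ins; how HephIA actually loads data from a DataLocation is not described here.

```python
DATA_LOCATION_MARKER = "DataLocation"


def resolve_value(observation, fetch_from_location):
    """Return the observation data, following the DataLocation pointer if needed.

    fetch_from_location is a caller-supplied function that takes a
    dataLocationId and returns the stored data (hypothetical interface).
    """
    value = observation["value"]
    if isinstance(value, str) and value == DATA_LOCATION_MARKER:
        # The observation is only a pointer: load the real data elsewhere.
        return fetch_from_location(observation["dataLocationId"])
    # The observation carries its own data.
    return value


# Placeholder observations; identifiers are made up.
inline_observation = {
    "observationId": 0,
    "value": {"x": 1.5},
    "representationId": "5f1d7f3e2c9a4b0012345678",
    "dataLocationId": "loc-0",
}
pointer_observation = {
    "observationId": 1,
    "value": "DataLocation",
    "representationId": "5f1d7f3e2c9a4b0012345678",
    "dataLocationId": "loc-1",
}

# A dictionary standing in for whatever storage a DataLocation refers to.
fake_store = {"loc-1": [0.1, 0.2, 0.3]}

print(resolve_value(inline_observation, fake_store.__getitem__))
print(resolve_value(pointer_observation, fake_store.__getitem__))
```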
(rep_and_obs_list)=
List of representations and observations
- Unsupervised models
- Predicted values
- Regular input data
- Domain and processing data
- Normalization
- Principal Component Analysis
- Denormalization
- Clustering Internal Indices
- Hierarchical Agglomerative Clustering (CAH, Classification Ascendante Hiérarchique)