Core concepts

This chapter is the glossary of HephIA's concepts.

As you work with data using the HephIA product, you will need to handle many concepts. Some are already well established in the community, while others are specific to HephIA.

Some concepts exist on both sides and usually share the same meaning. HephIA simply adds the little extra that lets you dig deeper into your problem. To make this easier to follow, we use the following typographic rules to distinguish them:

  • Well-established concepts, whether theoretical or technical, are set in roman type.
  • HephIA's own concepts are in italic.
  • Concepts that merge well-established and HephIA ones are in bold.

Page organization:

Concepts

  • Id
  • Model
  • Instance
  • Predictor // TODO: check whether a general concept exists for this term (probably not); ask others
  • Evaluator // TODO: check whether a general concept exists for this term (probably not); ask others
  • Evaluation // TODO: check whether a general concept exists for this term (probably not); ask others
  • Pipe // TODO: check whether a general concept exists for this term (probably not); ask others
  • Algorithm
  • Scalability
    • Algorithmic complexity
      • Time
      • Space
  • DAG: Directed Acyclic Graph

Minimal required concepts

The minimal set of required concepts includes, for example:

  • Clustering
    • Cluster
    • Dissimilarity measure
    • ClusterId
  • Dimension reduction
  • Outlier detection
  • Predictor
  • At least the algorithms you want to apply
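
To make the clustering vocabulary above more concrete, here is a minimal Python sketch of how a dissimilarity measure, a ClusterId and a cluster relate to each other. It is only an illustration under stated assumptions: the function names (`euclidean`, `assign_cluster_ids`, `group_by_cluster`), the choice of Euclidean distance and the sample data are hypothetical and are not HephIA's API.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

# Hypothetical names for illustration only; none of this is HephIA's API.

def euclidean(a, b):
    """A dissimilarity measure: the larger the value, the less similar the points."""
    return dist(a, b)

def assign_cluster_ids(points, prototypes, dissimilarity=euclidean):
    """For each point, return the ClusterId (here an index) of its closest prototype."""
    return [
        min(range(len(prototypes)), key=lambda cid: dissimilarity(p, prototypes[cid]))
        for p in points
    ]

def group_by_cluster(points, cluster_ids):
    """Build the clusters: a mapping from ClusterId to the points carrying that id."""
    clusters = {}
    for p, cid in zip(points, cluster_ids):
        clusters.setdefault(cid, []).append(p)
    return clusters

# Assumed toy data: two well-separated groups and one prototype per group.
points = [(0.0, 0.0), (0.5, 0.2), (9.0, 9.0), (8.5, 9.5)]
prototypes = [(0.0, 0.0), (9.0, 9.0)]

cluster_ids = assign_cluster_ids(points, prototypes)
clusters = group_by_cluster(points, cluster_ids)
```

In this sketch a ClusterId is just an integer label, and a cluster is the set of points sharing that label. Swapping `euclidean` for another dissimilarity measure changes the assignment without touching the rest of the code.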

Quick overview of some important concepts

  1. Model
  2. TraversableModel
    1. TraversableMemory
  3. GlobalModel
    1. GlobalMemory
  4. TreeModel
    1. TreeMemory
    2. GTreeMemory
    3. GTreeModel
  5. Instance
    1. InstanceMemory
    2. List, Vector, Array, ParVector, ParArray, RDD InstanceMemory/Instance
    3. GInstanceMemory
    4. GInstance
    5. List, Vector, Array, ParVector, ParArray, RDD InstanceMemory/Instance
  6. Extension
    1. GExtension
  7. Clustering
    1. ClusteringEntity
      1. Discrete Distribution
    2. Hard Clustering
      1. HardClusteringEntity
        1. Constant
      2. Aggregator
    3. Soft Clustering
      1. Categorical Clustering
        1. CategoricalDistribution
  8. Cluster
    1. GInstanceMemory
    2. ProtoCluster
  9. Clusters
    1. ProtoClusters
  10. Pipe
  11. Algorithm
    1. TraversableAlgorithm
      1. GTraversableAlgorithm
    2. InstanceAlgorithm
    3. ClusteringAlgorithm
    4. ClustersAlgorithm
    5. KMeans
    6. IterativeAlgo
  12. Predictor
  13. Evaluator
    1. Evaluation
    2. Internal / External
  14. FuncData
    1. FuncDataExtension
  15. Outlier detection
  16. Dimension reduction
  17. Matrix
  18. NN
    1. KNN
    2. ANN
      1. Hash
  19. Preprocessing
    1. Gradient ascent
    2. Vectorizer