$K$NN Predictor
Labels / Tags
- Predictor
- AnyType
Principle
The K Nearest Neighbors problem, often abbreviated K-NN, consists in retrieving, from a given data set S, the K nearest neighbors of a particular query according to a dissimilarity measure.
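As a minimal illustration of the principle (a hypothetical brute-force sketch, not this component's implementation), a single K-NN query with a Euclidean dissimilarity can be written as:

```python
import heapq
import math

def knn(S, query, k, dissimilarity):
    """Return the k observations of S nearest to `query`
    under the given dissimilarity measure (brute force)."""
    # heapq.nsmallest evaluates the dissimilarity once per observation.
    return heapq.nsmallest(k, S, key=lambda x: dissimilarity(x, query))

def euclidean(a, b):
    # Example dissimilarity for numerical vectors.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

neighbors = knn([(0, 0), (1, 1), (5, 5), (2, 2)], (0, 1), 2, euclidean)
print(neighbors)  # the two points closest to (0, 1): [(0, 0), (1, 1)]
```

Any other dissimilarity (e.g. a time-series distance) can be passed in place of `euclidean`, which is what makes the predictor applicable to any handled value type.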
Scalability
When the complexity of the dissimilarity measure is considered negligible, the computational complexity of a K-NN query is O(K·n), where:
- K is the number of nearest neighbors to look for.
- n is the number of data observations.
Be sure to account for the computational complexity of the chosen dissimilarity measure.
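To see why the dissimilarity cost matters, note that a brute-force query evaluates the measure once per observation, so an O(d) measure on d-dimensional vectors makes the query O(n·d) before neighbor selection. A hypothetical sketch that counts these evaluations:

```python
def knn_with_counter(S, query, k, dissimilarity):
    """Brute-force K-NN that also reports how many times the
    dissimilarity measure was evaluated (hypothetical helper)."""
    calls = 0
    def counted(a, b):
        nonlocal calls
        calls += 1
        return dissimilarity(a, b)
    # sorted() calls the key exactly once per observation: n evaluations.
    neighbors = sorted(S, key=lambda x: counted(x, query))[:k]
    return neighbors, calls

S = [(i, i + 1) for i in range(100)]
manhattan = lambda a, b: sum(abs(x - y) for x, y in zip(a, b))
neighbors, calls = knn_with_counter(S, (0, 0), 5, manhattan)
print(calls)  # one dissimilarity evaluation per observation: 100
```

With an expensive measure (e.g. dynamic time warping on time series), these n evaluations dominate the O(K·n) selection term.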
Input
Single query
A value of any type R.
Multiple queries, i.e. a collection of queries
A collection of values of type R.
Parameters
1 : Instance[R], i.e. a collection of values of any handled type, symbolized by R
In order to predict the $K$NN of a query, we have to search a set of observations. It is represented here by the Instance model, which is a collection of values of type R.
R can be any regular HephIA value type, such as:
- Numerical vector.
- Binary vector.
- Mixed vector.
- Monovariate Time Series.
- Multivariate Time Series.
2 : $K$, the integer number of nearest neighbors to return
Output
Single query
For a given query of type R, the returned value is the set of its $K$NN.
Multiple queries, i.e. a collection of queries
Returns a collection with one element per query, where each element is the set of $K$NN values for the corresponding query.
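The multi-query case can be sketched by mapping a single-query predictor over the collection of queries (hypothetical names, assuming the brute-force single-query function shown):

```python
import heapq

def knn(S, query, k, dissimilarity):
    # Single-query brute-force K-NN.
    return heapq.nsmallest(k, S, key=lambda x: dissimilarity(x, query))

def knn_batch(S, queries, k, dissimilarity):
    # One result per query, in the same order as `queries`.
    return [knn(S, q, k, dissimilarity) for q in queries]

d = lambda a, b: abs(a - b)
S = [1, 4, 9, 16, 25]
results = knn_batch(S, [3, 12], 2, d)
print(results)  # [[4, 1], [9, 16]]
```

The output collection has exactly as many elements as there are queries, matching the description above.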