$K$NN Predictor
Labels / Tags
- Predictor
- AnyType
Principle
The K Nearest Neighbors problem, often abbreviated K-NN, consists in retrieving, from a given data set S, the K nearest neighbors of a particular query according to a dissimilarity measure.
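As a minimal illustration of the principle (a hypothetical brute-force sketch, not this component's implementation), a single K-NN query with a Euclidean dissimilarity can be written as:

```python
import heapq
import math

def knn(S, query, k, dissimilarity):
    """Return the k observations of S nearest to `query`
    under the given dissimilarity measure (brute force)."""
    # heapq.nsmallest evaluates the dissimilarity once per observation.
    return heapq.nsmallest(k, S, key=lambda x: dissimilarity(x, query))

def euclidean(a, b):
    # Example dissimilarity for numerical vectors.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

neighbors = knn([(0, 0), (1, 1), (5, 5), (2, 2)], (0, 1), 2, euclidean)
print(neighbors)  # the two points closest to (0, 1): [(0, 0), (1, 1)]
```

Any other dissimilarity (e.g. a time-series distance) can be passed in place of `euclidean`, which is what makes the predictor applicable to any handled value type.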
Scalability
When the complexity of the dissimilarity measure is considered negligible, the computational complexity of a K-NN query is O(K·n), where:
- K is the number of nearest neighbors to look for.
- n is the number of data observations.
Be sure to account for the computational complexity of the chosen dissimilarity measure.
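To see why the dissimilarity cost matters, note that a brute-force query evaluates the measure once per observation, so an O(d) measure on d-dimensional vectors makes the query O(n·d) before neighbor selection. A hypothetical sketch that counts these evaluations:

```python
def knn_with_counter(S, query, k, dissimilarity):
    """Brute-force K-NN that also reports how many times the
    dissimilarity measure was evaluated (hypothetical helper)."""
    calls = 0
    def counted(a, b):
        nonlocal calls
        calls += 1
        return dissimilarity(a, b)
    # sorted() calls the key exactly once per observation: n evaluations.
    neighbors = sorted(S, key=lambda x: counted(x, query))[:k]
    return neighbors, calls

S = [(i, i + 1) for i in range(100)]
manhattan = lambda a, b: sum(abs(x - y) for x, y in zip(a, b))
neighbors, calls = knn_with_counter(S, (0, 0), 5, manhattan)
print(calls)  # one dissimilarity evaluation per observation: 100
```

With an expensive measure (e.g. dynamic time warping on time series), these n evaluations dominate the O(K·n) selection term.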
Input
Single query
A value of any type R.
Multiple queries, i.e. a collection of queries
A collection of values of type R.
Parameters
1 : Instance[R], i.e. a collection of values of any handled type, symbolized by R
In order to predict the $K$NN of a query, we have to search a set of observations. It is represented here by the Instance model, which is a collection of values of type R.
R can be any regular HephIA value type, such as:
- Numerical vector.
- Binary vector.
- Mixed vector.
- Monovariate Time Series.
- Multivariate Time Series.
2 : $K$, the integer number of nearest neighbors to return
Output
Single query
For a given query of type R, the returned value is the set of its $K$NN.
Multiple queries, i.e. a collection of queries
Returns a collection with one element per query, where each element is the set of $K$NN values for the corresponding query.
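The multi-query case can be sketched by mapping a single-query predictor over the collection of queries (hypothetical names, assuming the brute-force single-query function shown):

```python
import heapq

def knn(S, query, k, dissimilarity):
    # Single-query brute-force K-NN.
    return heapq.nsmallest(k, S, key=lambda x: dissimilarity(x, query))

def knn_batch(S, queries, k, dissimilarity):
    # One result per query, in the same order as `queries`.
    return [knn(S, q, k, dissimilarity) for q in queries]

d = lambda a, b: abs(a - b)
S = [1, 4, 9, 16, 25]
results = knn_batch(S, [3, 12], 2, d)
print(results)  # [[4, 1], [9, 16]]
```

The output collection has exactly as many elements as there are queries, matching the description above.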