$K$NN Predictor

Labels / Tags

  • Predictor
  • AnyType

Principle

The K Nearest Neighbors problem, often abbreviated K-NN, consists, for a given set S of data, in retrieving the K nearest neighbors of a particular query under a given dissimilarity measure.
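A minimal brute-force sketch of this lookup could look as follows; the Euclidean dissimilarity here is chosen purely for illustration, and any measure with the same `(R, R) -> float` shape would do:

```python
import math

def knn(data, query, k, dissimilarity):
    """Return the k observations of `data` nearest to `query`
    under the given dissimilarity measure (smaller = closer)."""
    return sorted(data, key=lambda obs: dissimilarity(obs, query))[:k]

# Illustrative dissimilarity: Euclidean distance on numerical vectors.
def euclidean(a, b):
    return math.dist(a, b)

S = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]
print(knn(S, (0.9, 1.0), 2, euclidean))  # → [(1.0, 1.0), (0.0, 0.0)]
```

Sorting the whole set is the simplest formulation; the Scalability section below discusses the cost more precisely.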

Scalability

When the complexity of the dissimilarity measure is considered negligible, the computational complexity of a KNN query is O(K·n), where:

  • K is the number of nearest neighbors to look for.
  • n is the number of data observations.

Be sure to know the computational complexity of the dissimilarity measure in use: for a costly measure, its evaluation on all n observations dominates the total query cost.
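The O(K·n) bound comes from selecting the K best candidates during a single scan instead of fully sorting the data. A sketch using Python's `heapq.nsmallest` (which keeps only K candidates in memory, in O(n log K), comparable to the linear bound above for small K):

```python
import heapq

def knn_partial(data, query, k, dissimilarity):
    # heapq.nsmallest scans the data once, keeping only the k best
    # candidates: O(n log k) comparisons plus n dissimilarity calls.
    return heapq.nsmallest(k, data, key=lambda obs: dissimilarity(obs, query))

# Illustrative dissimilarity: Manhattan distance on numerical vectors.
# With a costlier measure, its own complexity multiplies the n calls.
manhattan = lambda a, b: sum(abs(x - y) for x, y in zip(a, b))

S = [(0, 0), (1, 1), (2, 2), (5, 5)]
print(knn_partial(S, (0.4, 0.5), 2, manhattan))  # → [(0, 0), (1, 1)]
```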

Input

Single query

A value of any type R.

Multiple queries, i.e. a collection of data observations

A collection of values of type R.

Parameters

1 : Instance[R], i.e. a collection of values of any handled type symbolized by R

In order to predict the $K$NN for a query, we have to search a set of observations. It is represented here by the Instance model, which is a collection of values of type R.

R can be any regular HephIA value type, such as:

  • Numerical vector.
  • Binary vector.
  • Mixed vector.
  • Monovariate Time Series.
  • Multivariate Time Series.

2 : $K$, the integer number of nearest neighbors to return
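The genericity over R can be sketched with a type-parameterized signature: the same predictor works for any value type, provided a matching dissimilarity is supplied. The Hamming distance below is an illustrative choice for binary vectors, not a prescribed one:

```python
from typing import Callable, Sequence, TypeVar

R = TypeVar("R")  # any handled value type, as in the text

def knn(instances: Sequence[R], query: R, k: int,
        dissimilarity: Callable[[R, R], float]) -> list[R]:
    """Generic over R: only the dissimilarity depends on the value type."""
    return sorted(instances, key=lambda obs: dissimilarity(obs, query))[:k]

# Example: binary vectors with Hamming distance.
hamming = lambda a, b: sum(x != y for x, y in zip(a, b))

B = [(0, 0, 1), (1, 1, 1), (1, 1, 0)]
print(knn(B, (0, 0, 0), 1, hamming))  # → [(0, 0, 1)]
```

Swapping in a time-series dissimilarity (e.g. one suited to univariate or multivariate series) changes nothing in the predictor itself.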

Output

Single query

For a given query of type R, the returned value is the set of its KNN values.

Multiple queries, i.e. a collection of queries

Returns a collection with one element per query, where each element is the set of KNN values for the corresponding query.
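The batch case reduces to mapping the single-query case over the collection, preserving query order; a sketch (with a scalar absolute-difference dissimilarity used only for illustration):

```python
def knn(instances, query, k, dissimilarity):
    return sorted(instances, key=lambda obs: dissimilarity(obs, query))[:k]

def knn_batch(instances, queries, k, dissimilarity):
    # One result set per query, in the same order as the input queries.
    return [knn(instances, q, k, dissimilarity) for q in queries]

dist = lambda a, b: abs(a - b)
S = [1.0, 4.0, 9.0]
print(knn_batch(S, [0.0, 10.0], 1, dist))  # → [[1.0], [9.0]]
```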

Associated visualizations

Practical strategies

Business case

Usage