Patch Work Predictor

Labels / Tags

  • Predictor
  • Numerical vector

Principle

PatchWorkPredictor operates on numerical vectors. For a given query, it assigns the ClusterId of the cluster containing the cell, in the PatchWork sense, with which the query is associated.

Scalability

When the cost of the dissimilarity measure is considered negligible, the computational complexity of a PatchWorkPredictor query is O(cells).

Where:

  • cells is the number of cells generated by the PatchWork algorithm.

If there are $n$ queries, the overall complexity is O(cells · n).

The number of cells can grow drastically with the dimensionality of the numerical vectors. Cells define atomic units of space, so they have the same dimensionality as the numerical vectors. With a constant number of values per dimension, the higher the dimensionality of the cells, the more of them can exist.

This is one reason why the PatchWork algorithm is often run on low-dimensional spaces.
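
To make this growth concrete, here is a small Python sketch. It assumes a regular grid with the same number of intervals in every dimension, in which case the number of possible cells is the number of intervals raised to the power of the dimensionality. The names `max_cell_count` and `intervals_per_dimension` are illustrative and not part of the library, and the result is only an upper bound: the number of cells the algorithm actually generates may be smaller.

```python
def max_cell_count(intervals_per_dimension: int, dimensions: int) -> int:
    """Upper bound on the number of grid cells, ASSUMING a regular grid
    with the same number of intervals along every dimension."""
    return intervals_per_dimension ** dimensions


# With 10 intervals per dimension, the bound grows by a factor of 10
# for every extra dimension.
for d in (2, 3, 5, 10):
    print(d, max_cell_count(10, d))
```

Since a query is linear in the number of cells, any growth in the cell count translates directly into longer query times.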

A query is always linear in the number of cells. However, because that number depends on the dimensionality of your numerical vectors, query time can vary a lot.

Input

Single query

A numerical vector.

Multiple queries, i.e. a collection of numerical vectors

A collection of numerical vectors.

Parameters

1 : PatchWorkModel

The model returned by the PatchWork algorithm.

Output

Single query

Returns the ClusterId of the PatchWork cluster that contains the cell in which the query falls.

Multiple queries, i.e. a collection of numerical vectors

Returns the HardClustering associated with the input data.
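
The single and multiple query behaviour can be illustrated with a minimal Python sketch. It is not the library's implementation: the model is represented here as a hypothetical list of (cell center, cluster id) pairs, the matching rule is assumed to be "closest cell center under the Euclidean distance", and a plain list of cluster ids stands in for HardClustering. The names `predict_one` and `predict_many` are likewise illustrative. What the sketch does reflect is the linear scan over cells described under Scalability.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical stand-in for a PatchWorkModel: one (center, cluster_id) pair per cell.
Cell = Tuple[Sequence[float], int]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, used here as the assumed dissimilarity measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_one(cells: List[Cell], query: Sequence[float]) -> int:
    """Scan every cell once and return the cluster id of the closest one."""
    if not cells:
        raise ValueError("the model contains no cells")
    _, cluster_id = min(cells, key=lambda cell: euclidean(cell[0], query))
    return cluster_id


def predict_many(cells: List[Cell], queries: List[Sequence[float]]) -> List[int]:
    """One linear scan per query, so the cost is cells times the number of queries."""
    return [predict_one(cells, query) for query in queries]


# Three cells: two belong to cluster 0, one to cluster 1.
cells = [((0.5, 0.5), 0), ((1.5, 0.5), 0), ((5.5, 5.5), 1)]

print(predict_one(cells, (0.4, 0.7)))                  # 0
print(predict_many(cells, [(1.4, 0.2), (5.0, 6.0)]))   # [0, 1]
```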

Associated visualizations

  • HardClustering

Practical strategies

Business case

Usage