Self-Organizing Map (SOM)
Labels / Tags
- Clustering
- Hard
- Numerical vector
Principle
The self-organizing map (SOM) is a clustering algorithm that projects data onto a low-dimensional space, generally two-dimensional. The basic model proposed by Kohonen consists of a discrete set $C$ of cells called the map. The size of the grid $C$ is denoted by $k$ and must be provided a priori. Many self-organizing models derive from Kohonen's original model. They differ from one another but share the same idea: depict large data sets through simple geometric relationships on a reduced topology (1D or 2D). The grid imposes a topological order on its $k$ cells, and each cell $c$ has its own cluster, represented by a prototype. The self-organizing process relies on a neighbourhood function to preserve the topological relationships between cells: when a prototype is updated, the prototypes of neighbouring cells are updated as well.
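The principle can be illustrated with a minimal online training loop. This is only a sketch under stated assumptions, not the library's implementation: the name train_som, the linear decay of the learning rate and radius, the Gaussian neighbourhood, the Euclidean metric and the random initialization from data points are all illustrative choices.

```python
import math
import random


def train_som(data, width, height, iterations=200, lr0=0.5, sigma0=None, seed=0):
    """Online SOM training on a width x height grid (illustrative sketch)."""
    rng = random.Random(seed)
    # Grid coordinates of each cell; k = width * height cells in total.
    cells = [(i, j) for i in range(height) for j in range(width)]
    # Assumption: each prototype starts as a randomly chosen data point.
    prototypes = [list(rng.choice(data)) for _ in cells]
    sigma0 = sigma0 if sigma0 is not None else max(width, height) / 2.0

    for t in range(iterations):
        # Assumption: learning rate and radius shrink linearly over time.
        frac = t / iterations
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.5)

        x = rng.choice(data)
        # Best matching unit: the cell whose prototype is closest to x.
        bmu = min(range(len(cells)), key=lambda c: math.dist(prototypes[c], x))

        for c, (i, j) in enumerate(cells):
            # Gaussian neighbourhood computed on grid distance, not data distance.
            grid_d2 = (i - cells[bmu][0]) ** 2 + (j - cells[bmu][1]) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            # Move the prototype towards x, weighted by its closeness to the BMU.
            prototypes[c] = [w + lr * h * (xi - w) for w, xi in zip(prototypes[c], x)]
    return cells, prototypes
```

Because the neighbourhood weight depends on grid distance, cells that are close on the map end up with similar prototypes, which is what preserves the topology.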
Scalability
Computational complexity is in $O(n \cdot k \cdot iter \cdot d)$, where:
- n is the number of data points.
- k is the number of prototypes.
- iter is the number of iterations.
- d is the dimensionality of the data.
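To get a feel for this bound, the product can be evaluated for a given configuration. The helper below is hypothetical and only multiplies the four factors; it ignores constant factors and says nothing about actual running time.

```python
def som_cost(n, k, iterations, d):
    """Order-of-magnitude count of elementary operations, n * k * iter * d."""
    return n * k * iterations * d


# 10,000 points, a 10 x 10 grid, 50 iterations, 20 dimensions.
print(som_cost(n=10_000, k=100, iterations=50, d=20))  # prints 1000000000
```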
Input
- A collection of numerical vectors (R^d).
Parameters
- A continuous metric, Euclidean by default.
- Stopping criteria. Many strategies have been developed. Currently A, B, C are available.
- Initialization of the K prototypes, where K = w.h is the size of the grid.
- (Optionally) a list of K prototypes.
- Maximum number of iterations.
- Neighbourhood function.
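As an illustration of the last parameter, two commonly used neighbourhood functions are sketched below. The function names and signatures are hypothetical and are not taken from the library; both take cell coordinates on the grid and return a weight between 0 and 1.

```python
import math


def grid_distance_sq(cell_a, cell_b):
    """Squared Euclidean distance between two cells on the 2D grid."""
    return (cell_a[0] - cell_b[0]) ** 2 + (cell_a[1] - cell_b[1]) ** 2


def gaussian_neighbourhood(cell, bmu, sigma):
    """Weight decays smoothly with grid distance to the best matching unit."""
    return math.exp(-grid_distance_sq(cell, bmu) / (2.0 * sigma ** 2))


def bubble_neighbourhood(cell, bmu, radius):
    """Weight is 1 inside the given grid radius and 0 outside."""
    return 1.0 if grid_distance_sq(cell, bmu) <= radius ** 2 else 0.0
```

The Gaussian variant updates every cell with a decreasing weight, while the bubble variant updates only the cells within the radius, which is cheaper but less smooth.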
Output
SOMModel ref_to_SOMModel_type
SOMModel contains the grid of K = w.h prototypes.
Predictor
If we ignore the grid structure linking the SOMModel prototypes and treat them as a plain collection of prototypes, ClosestPrototypePredictor [ref_to_ClosestPrototype] is a good starting point: for a new data point, it assigns the ClusterId of the closest SOMModel prototype (a numerical vector) according to the dissimilarity measure in use.
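The closest-prototype rule can be sketched as follows. This is an assumption-laden illustration, not the ClosestPrototypePredictor API: the function name, the use of the list index as ClusterId and the default Euclidean metric are choices made for the example.

```python
import math


def closest_prototype_id(point, prototypes, metric=math.dist):
    """Return the index (used here as ClusterId) of the closest prototype."""
    return min(range(len(prototypes)), key=lambda c: metric(prototypes[c], point))


# Three hypothetical prototypes taken from a trained map.
prototypes = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
print(closest_prototype_id([4.0, 4.5], prototypes))  # prints 1
```

Any other dissimilarity measure can be passed through the metric argument, as long as it takes two vectors and returns a number.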
Associated visualization
- SOM-like
- 2D/3D numerical vector
- R^n numerical vector
Practical strategies
Recommended Associations
Business case
Usage
Tools for visualization