
...

Name | Comments on Applicability | Reference
Hierarchical Clustering
  1. Produces a hierarchy of (N-1) merges, giving (N-1) candidate partitions to choose from.
  2. Expensive and slow: an n×n pairwise distance matrix needs to be computed.
  3. Cannot work on very large datasets.
  4. Results are reproducible (the algorithm is deterministic, with no random initialization).
  5. Does not work well with hyper-spherical clusters.
  6. Can provide insight into how the data points are clustered (e.g., via the dendrogram).
  7. Can use various linkage methods (single, complete, average, Ward, etc.), apart from centroid. A short sketch follows this list.
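
A minimal sketch of agglomerative (hierarchical) clustering, assuming scikit-learn is available; the toy dataset, n_clusters=2, and the Ward linkage are illustrative assumptions, not part of the original notes.

import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Illustrative toy data: two well-separated groups of three points each.
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]], dtype=float)

# Deterministic: no random seed is needed, so results are reproducible.
model = AgglomerativeClustering(n_clusters=2, linkage="ward")
labels = model.fit_predict(X)
print(labels)  # the same labeling on every run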

k-means
  1. Pre-specified number of clusters (k).
  2. Less computationally intensive.
  3. Suited to large datasets.
  4. The starting centroids can be random, which leads to a different result each time the algorithm runs (fix the random seed for reproducibility).
  5. Works best on roughly circular (hyper-spherical) clusters.
  6. K-means simply divides data into mutually exclusive subsets without giving much insight into the process of division.
  7. K-means uses the mean of the assigned points to compute the centroid representing each cluster (the k-medians variant uses the median). See the sketch after this list.
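
A minimal k-means sketch, again assuming scikit-learn; the toy data and random_state=0 are illustrative assumptions.

import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]], dtype=float)

# n_init repeats the random initialization several times and keeps the best run;
# fixing random_state makes the result reproducible across runs.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(X)
print(labels)
print(km.cluster_centers_)  # each centroid is the mean of its assigned points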

Gaussian Mixture Models
  1. Pre-specified number of clusters (components).
  2. GMMs are somewhat more flexible: with a full covariance matrix, the cluster boundaries can be elliptical (as opposed to k-means, which makes circular boundaries).
  3. GMMs are also probabilistic: by assigning probabilities to data points, we can express how strong our belief is that a given data point belongs to a specific cluster.
  4. GMMs usually tend to be slower than k-means because they take more iterations to reach convergence. They can also converge quickly to a poor local optimum; to avoid this issue, GMMs are usually initialized with k-means. See the sketch after this list.
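
A minimal Gaussian mixture sketch, assuming scikit-learn; the toy data and parameter choices are illustrative. scikit-learn's GaussianMixture initializes with k-means by default, which matches point 4 above.

import numpy as np
from sklearn.mixture import GaussianMixture

X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]], dtype=float)

# covariance_type="full" allows elliptical cluster boundaries.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(X)
print(gmm.predict(X))        # hard cluster labels
print(gmm.predict_proba(X))  # soft assignments: per-cluster membership probabilities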

DBSCAN
  1. No pre-specified number of clusters.
  2. Computationally somewhat intensive.
  3. Cannot efficiently handle very large datasets.
  4. Suitable for non-compact, mixed-up, arbitrarily shaped clusters.
  5. Uses density-based clustering; does not work well when the density of the data points varies widely.
  6. Not affected by noise or outliers (such points are labeled as noise rather than forced into a cluster). See the sketch after this list.
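
A minimal DBSCAN sketch, assuming scikit-learn; the toy data and the eps/min_samples values are illustrative and dataset-dependent.

import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[1, 2], [2, 2], [2, 3],
              [8, 7], [8, 8], [25, 80]], dtype=float)

# No number of clusters is specified; eps and min_samples define "dense".
db = DBSCAN(eps=3.0, min_samples=2)
labels = db.fit_predict(X)
print(labels)  # e.g. [0 0 0 1 1 -1]; the label -1 marks a point treated as noise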


DIMENSIONALITY REDUCTION ALGORITHMS | APPLICABILITY
Linear Discriminant Analysis

It is used to find a linear combination of features that characterizes or separates two or more classes of objects or events.

LDA is a supervised method: it requires class labels to find the projection that best separates the classes.

LDA is also sometimes used directly as a classifier, and it can outperform logistic regression when the classes are well separated and the class-conditional distributions are approximately Gaussian.
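
A minimal LDA sketch, assuming scikit-learn; the iris dataset is an illustrative choice.

from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# Supervised: the labels y drive the projection. At most (n_classes - 1)
# discriminant components exist, so 2 components for the 3 iris classes.
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)
print(X_lda.shape)  # (150, 2)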

Principal Component Analysis

It performs a linear mapping of the data from a higher-dimensional space to a lower-dimensional space in such a manner that the variance of the data in the low-dimensional representation is maximized.

PCA is an unsupervised method: it ignores any class labels and uses only the variance structure of the features.
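
A minimal PCA sketch, assuming scikit-learn; the iris dataset is an illustrative choice, mirroring the LDA sketch above.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

# Unsupervised: no labels are passed; the mapping maximizes retained variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
print(pca.explained_variance_ratio_)  # fraction of variance kept per component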


...