...

Name | Comments on Applicability | Reference

Unsupervised

  1. Clustering - hierarchical clustering, k-means, mixture models, DBSCAN, and OPTICS
  2. Anomaly Detection - Local Outlier Factor and Isolation Forest
  3. Dimensionality Reduction - principal component analysis, independent component analysis, non-negative matrix factorization, singular value decomposition

Algorithms


Name | Comments on Applicability | Reference
Hierarchical Clustering
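  1. For N data points, N-1 successive merges are performed, producing a hierarchy of clusterings to choose from.
  2. Expensive and slow: an n×n distance matrix has to be computed.
  3. Does not scale to very large datasets.
  4. Results are reproducible: the same data always yields the same hierarchy.
  5. Tends to work less well than k-means when clusters are hyper-spherical.
  6. Can provide insight into how the data points are grouped, since the full merge hierarchy is available.
  7. Can use various linkage methods (for example single, complete, average, or Ward) in addition to centroid.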
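
The points above can be illustrated with a minimal sketch of naive agglomerative clustering in plain Python. This is not a reference implementation: the function name `agglomerative`, the `linkage` parameter, and the sample points are assumptions made for illustration only. It builds the full n×n distance matrix, performs N-1 merges, and returns the same merges on every run.

```python
from itertools import combinations
from math import dist


def agglomerative(points, linkage=min):
    """Naive agglomerative clustering; returns the list of merges.

    `linkage` (hypothetical parameter) reduces the pairwise distances
    between two clusters: min gives single linkage, max gives complete.
    """
    n = len(points)
    # Full n x n distance matrix: the quadratic cost that limits scalability.
    d = [[dist(p, q) for q in points] for p in points]
    clusters = {i: [i] for i in range(n)}  # cluster id -> member indices
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of clusters; ties are broken by cluster id,
        # so the result is deterministic.
        height, a, b = min(
            (linkage(d[i][j] for i in clusters[a] for j in clusters[b]), a, b)
            for a, b in combinations(clusters, 2)
        )
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[next_id] = merged
        merges.append((sorted(merged), height))
        next_id += 1
    return merges


# Illustrative data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
merges = agglomerative(points)
for members, height in merges:
    print(members, round(height, 2))
```

Each entry in `merges` records which points were joined and at what distance, which is the information a dendrogram displays. Cutting the list at a chosen height gives a flat clustering without fixing the number of clusters in advance.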

k-means
  1. The number of clusters must be specified in advance.
  2. Less computationally intensive.
  3. Suited to large datasets.
  4. The starting centroids can be chosen at random, so the result can differ each time the algorithm runs.
  5. K-means works best when clusters are roughly circular (hyper-spherical in higher dimensions).
  6. K-means simply divides the data into mutually exclusive subsets without giving much insight into the process of division.
  7. K-means represents each cluster by its centroid, the mean of the points assigned to it (the median-based variant is known as k-medians).
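
As a rough illustration of the points above, here is a minimal sketch of k-means (Lloyd's iteration) in plain Python. The function name `kmeans`, the `seed` and `max_iter` parameters, and the sample points are assumptions made for this example, not part of any particular library. The number of clusters is passed in up front, the starting centroids are drawn at random, and each centroid is updated to the mean of its assigned points.

```python
import random
from math import dist


def kmeans(points, k, seed=None, max_iter=100):
    """Minimal k-means sketch; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Random starting centroids: a different seed can give a different result.
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Illustrative data: two well-separated, roughly circular groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(points, k=2, seed=0)
print(labels, centroids)
```

On clean data like this, different seeds usually change only which group receives label 0 and which receives label 1. On overlapping or elongated clusters the partition itself can change between runs, which is why implementations commonly repeat the procedure with several random starts and keep the best result.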

Gaussian Mixture Models

...