Eduzan

Unsupervised Learning

Computer Science / AI & Machine Learning tutorial chapter - Published 2025-12-17 - AI & Machine Learning

1. k-Means Clustering:

  • Description: k-Means is a simple and widely used clustering algorithm. It partitions the data into k clusters, where each data point belongs to the cluster with the nearest mean (centroid).
  • How it works:
    1. Initialize k centroids randomly.
    2. Assign each data point to the nearest centroid.
    3. Recalculate the centroids based on the current cluster members.
    4. Repeat steps 2 and 3 until convergence (the cluster assignments stop changing, so the centroids no longer move).
  • Use Case: Customer segmentation, image compression.
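
The four steps above can be sketched in plain Python. This is a minimal illustration rather than a production implementation: the function names, the `max_iters` cap, the fixed `seed`, the choice to draw the initial centroids from the data points themselves, and the toy data are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iters=100, seed=0):
    """Illustrative k-Means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: pick k data points at random as the initial centroids
    # (one common way to "initialize randomly"; an assumption here).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iters):
        # Step 2: assign each point to the index of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 3: recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Step 4: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Toy data (made up): two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
print(centroids)
print(labels)
```

Which group receives label 0 and which receives label 1 depends on the random initialization, so only the grouping itself is meaningful. On less cleanly separated data, different starting centroids can lead to different final clusters.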

2. Hierarchical Clustering:

  • Description: Hierarchical clustering builds a tree of clusters (a dendrogram), in which each node is a cluster containing its child clusters. The tree can be built in an agglomerative (bottom-up) or a divisive (top-down) manner.
  • How it works (Agglomerative):
    1. Start with each data point as a single cluster.
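    2. Merge the two closest clusters, where "closest" is defined by a chosen linkage criterion (for example single, complete, or average linkage).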
    3. Repeat until all points are merged into a single cluster.
  • Use Case: Creating taxonomies, social network analysis.
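
As a rough sketch of the agglomerative procedure, the plain-Python example below records every merge until one cluster remains. It assumes single linkage (the distance between two clusters is the distance between their closest pair of points), which is only one of several possible choices. The function names and toy data are hypothetical, and the brute-force pair search is meant for readability, not speed.

```python
import math


def single_linkage(cluster_a, cluster_b):
    """Distance between the closest pair of points across two clusters."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)


def agglomerative(points):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    # Step 1: start with each data point as its own cluster.
    clusters = [[p] for p in points]
    merges = []
    # Step 3: repeat until all points sit in a single cluster.
    while len(clusters) > 1:
        # Step 2: find the two closest clusters (brute-force search).
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges


# Toy data (made up): two tight pairs and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (20.0, 20.0)]
merges = agglomerative(points)
for cluster_a, cluster_b, distance in merges:
    print(cluster_a, "+", cluster_b, "at distance", round(distance, 3))
```

The merge history is the information a dendrogram draws: reading it from first to last shows the tree being built bottom-up, and cutting it at a chosen distance yields a flat clustering.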

3. DBSCAN (Density-Based Spatial Clustering of Applications with Noise):

  • Description: DBSCAN is a density-based clustering algorithm that groups closely packed points into clusters and marks points in low-density regions as outliers.
  • How it works:
    1. Identify core points: points that have at least a minimum number of neighbors (commonly called minPts) within a given radius (commonly called eps).
    2. Expand clusters from these core points, adding every point that can be reached through a chain of neighboring core points.
    3. Mark points that are not part of any cluster as noise (outliers).
  • Use Case: Clustering in data with noise, spatial data analysis.
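
The three steps can be illustrated with a small plain-Python sketch. The names `eps` and `min_pts`, the helper `region_query`, the use of -1 as the noise label, and the toy data are assumptions chosen for the example. The sketch also assumes that a point counts as its own neighbor, which is one common convention.

```python
import math

NOISE = -1  # label used here for outliers (an illustrative convention)


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Illustrative DBSCAN; returns one cluster label per point."""
    labels = [None] * len(points)  # None means "not yet visited"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        # Step 1: a core point has at least min_pts neighbors within eps.
        if len(neighbors) < min_pts:
            # Step 3: provisionally noise; may become a border point later.
            labels[i] = NOISE
            continue
        # Step 2: grow a new cluster outward from this core point.
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # noise reached from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                # j is itself a core point, so keep expanding through it.
                queue.extend(j_neighbors)
        cluster_id += 1
    return labels


# Toy data (made up): two dense 2x2 squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Unlike k-Means, the number of clusters is not specified in advance: it follows from the density parameters, and the isolated point is reported as noise rather than forced into a cluster.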