K-Medoids Clustering (K-Medoids Clustering)
Definition
K-Medoids (also known as PAM, Partitioning Around Medoids) is a clustering algorithm that uses actual data points as cluster representatives. Unlike K-Means which uses cluster means as centroids, K-Medoids always selects an actual data point from within the cluster as the center point (Medoid), making it more robust to noise and outliers.
Algorithm Principles
Objective Function
K-Medoids minimizes the sum of distances from data points to cluster centers:
Where is the actual center point of the -th cluster and is the distance function.
PAM Algorithm Steps
- Randomly select data points as initial Medoids
- Repeat until convergence:
- Assignment step: Assign each data point to the nearest Medoid
- Swap step: Try replacing current Medoid with a non-Medoid point; accept replacement if it reduces total cost
Comparison with K-Means
| Characteristic | K-Means | K-Medoids |
|---|---|---|
| Center point type | Cluster mean (may not be data point) | Actual data point in cluster |
| Noise robustness | Low | High |
| Distance function | Euclidean distance | Any distance metric |
| Computational complexity |
Application in Cislunar SSA Architecture Analysis
Klonowski (2025) used K-Medoids clustering to classify cislunar space SSA Pareto-optimal architectures:
- K-Medoids center points are actual architecture configurations, facilitating interpretation and decision-making
- Less sensitive to outliers on the Pareto frontier
- Can identify architecture categories with different orbital family configurations
Core Elements
Mathematical Definition
K-Medoids partitions dataset into clusters, with each cluster represented by an actual data point , minimizing total distance cost.
Key Properties
K-Medoids is robust to noise and outliers because center points are actual data points rather than means. Computational complexity is higher than K-Means, but results are more stable.
Application Scenarios
K-Medoids is suitable for scenarios requiring robust clustering, such as anomaly detection, image segmentation, document clustering, and the SSA architecture classification focused on in this document.
Related Concepts
References
- Kaufman L, Rousseeuw P J. Finding groups in data: an introduction to cluster analysis[M]. John Wiley & Sons, 1990.
- Klonowski M, Owens-Fahrner N, Heidrich C, et al. Cislunar space domain awareness architecture design and analysis for cooperative agents[J]. The Journal of the Astronautical Sciences, 2024.
