K-Mode Clustering in Python
Last Updated : 26 Jun, 2025
K-Modes clustering is an unsupervised machine-learning technique used to group categorical data into k clusters (groups). It partitions the data into k mutually exclusive groups. Unlike K-Means, which uses distances between numbers, K-Modes uses the number of mismatches between categorical values to decide how similar two data points are. For example:
- Data point 1: ["red", "small", "round"]
- Data point 2: ["blue", "small", "square"]
Here there are 2 mismatches out of 3 features (color and shape), so these two data points are not very similar.
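The mismatch count can be checked with a few lines of plain Python. This is a minimal sketch; the helper name `count_mismatches` is made up for illustration and is not part of any library.

```python
def count_mismatches(a, b):
    """Count the positions at which two equal-length categorical rows differ."""
    if len(a) != len(b):
        raise ValueError("rows must have the same number of features")
    return sum(x != y for x, y in zip(a, b))


point_1 = ["red", "small", "round"]
point_2 = ["blue", "small", "square"]

# Color and shape differ, size matches.
print(count_mismatches(point_1, point_2))  # 2
```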
When Should You Use K-Modes?
Use K-Modes when:
- Your dataset contains categorical variables like gender, color, brand etc.
- You want to group customers by product preferences
- You're analyzing survey responses such as Yes/No or Male/Female
How Does K-Modes Clustering Work?
Unlike hierarchical clustering, K-Modes requires us to decide the number of clusters (K) in advance. Here's how it works step by step:
- Start by picking clusters: Randomly select K data points from the dataset to act as the starting cluster centers. These are called "modes".
- Assign data to clusters: Measure how similar each data point is to every mode using the total number of mismatches, and assign each data point to the cluster whose mode it matches best (fewest mismatches).
- Update the clusters: For each cluster, find the most common value of every feature among its members and use these values as the new mode.
- Repeat the process: Keep repeating steps 2 and 3 until no data points are reassigned to different clusters.
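The steps above can be sketched without any third-party libraries. This is an illustrative sketch under stated assumptions, not a reference implementation: the function names, the `toy` dataset, the `max_iter` safety cap, and the rule that ties go to the alphabetically smallest value are all choices made for this example.

```python
import random
from collections import Counter


def mismatches(a, b):
    """Number of positions where two categorical rows differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent value in each column; ties go to the smallest value."""
    modes = []
    for column in zip(*rows):
        counts = Counter(column)
        modes.append(min(counts, key=lambda v: (-counts[v], v)))
    return modes


def k_modes(data, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 1: pick k distinct rows as the starting modes.
    modes = [list(row) for row in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each row to the mode with the fewest mismatches.
        new_labels = [
            min(range(k), key=lambda j: mismatches(row, modes[j]))
            for row in data
        ]
        # Step 4: stop as soon as no row changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute the mode of every non-empty cluster.
        for j in range(k):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical toy data, made up for this sketch.
toy = [
    ["red", "small", "round"],
    ["blue", "small", "square"],
    ["red", "large", "round"],
    ["blue", "large", "square"],
    ["red", "small", "square"],
]
labels, modes = k_modes(toy, k=2)
print(labels)
print(modes)
```

The result depends on which rows are drawn as starting modes, so changing `seed` can change the grouping.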
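Let X be a set of n categorical data objects, each described by m attributes:
X = \begin{bmatrix} x_{11} & \cdots & x_{1m}\\ \vdots & \ddots & \vdots\\ x_{n1} & \cdots & x_{nm} \end{bmatrix}
A mode of X is a vector Q = [q_{1},q_{2},\ldots,q_{m}] that minimizes
D(X,Q) = \sum_{i=1}^{n}d(X_{i},Q)
Here d is the simple matching dissimilarity d(X_{i},Q) = \sum_{j=1}^{m}\delta(x_{ij},q_{j}), where \delta(x_{ij},q_{j}) = 0 if x_{ij} = q_{j} and 1 otherwise. Applying this dissimilarity to the data objects gives
D(X,Q) = \sum_{i=1}^{n}\sum_{j=1}^{m}\delta(x_{ij},q_{j})
Suppose we want K clusters. Then each cluster k has its own mode Q_{k} = [q_{k1},q_{k2},\ldots,q_{km}] \in Q, and the cost to minimize is
C(Q) = \sum_{k=1}^{K}\sum_{i=1}^{n}y_{ik}\sum_{j=1}^{m}\delta(x_{ij},q_{kj})
where y_{ik} = 1 if object X_{i} is assigned to cluster k and 0 otherwise, so each object is compared only with the mode of its own cluster.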
Overall the goal of K-modes clustering is to minimize the dissimilarities between the data objects and the centroids (modes) of the clusters using a measure of categorical similarity such as the Hamming distance.
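As a worked example of the cost C(Q), the sketch below adds up the mismatches between each row and the mode of the cluster it is assigned to. The rows, labels, modes, and function names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def delta(a, b):
    """Simple matching dissimilarity for one feature: 0 if equal, 1 otherwise."""
    return 0 if a == b else 1


def total_cost(rows, labels, modes):
    """Sum of mismatches between each row and the mode of its assigned cluster."""
    return sum(
        delta(x, q)
        for row, label in zip(rows, labels)
        for x, q in zip(row, modes[label])
    )


rows = [
    ["red", "small", "round"],
    ["blue", "small", "square"],
    ["red", "large", "round"],
]
labels = [0, 1, 0]  # cluster index of each row
modes = [["red", "small", "round"], ["blue", "small", "square"]]

# Rows 0 and 1 equal their modes; row 2 differs from mode 0 only in size.
print(total_cost(rows, labels, modes))  # 1
```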
Implementation of the k-mode clustering algorithm
K-Modes is a way to group categorical data into clusters. Here's how you can do it step-by-step in Python using just NumPy and Pandas.
Step 1: Prepare Your Data
Start by defining your dataset. Each row is a data point and each column contains categorical values like letters or labels.
Python
import numpy as np
import pandas as pd

# Each row is one data point, each column one categorical feature
data = np.array([
    ['A', 'B', 'C'],
    ['B', 'C', 'A'],
    ['C', 'A', 'B'],
    ['A', 'C', 'B'],
    ['A', 'A', 'B']
])
Step 2: Set Number of Clusters
Decide how many groups you want to divide your data into.
Python
k = 2  # number of clusters
Step 3: Pick Starting Points (Modes)
Randomly choose k rows from the data to be the starting cluster centers.
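Python
np.random.seed(0)  # fix the seed so the run is reproducible
# Pick k distinct row indices and use those rows as the initial modes
modes = data[np.random.choice(data.shape[0], k, replace=False)]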
Step 4: Assign Data to Clusters
For each data point, count how many features are different from each mode. Assign the point to the most similar cluster.
Python
clusters = np.zeros(data.shape[0], dtype=int)
for _ in range(10):  # fixed number of passes
    for i, point in enumerate(data):
        # Number of mismatching features between this point and each mode
        distances = [np.sum(point != mode) for mode in modes]
        clusters[i] = np.argmin(distances)
Step 5: Update Cluster Modes
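After assigning all points, update each cluster’s mode to the most common values in that cluster. This update runs inside the same loop as Step 4, so assignment and update alternate on every pass.
Python
    # Still inside the `for _ in range(10)` loop from Step 4
    for j in range(k):
        if np.any(clusters == j):  # skip clusters with no members
            # Column-wise most frequent value among the cluster's rows
            modes[j] = pd.DataFrame(data[clusters == j]).mode().iloc[0].values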
Step 6: View Final Results
Print out which cluster each data point belongs to and what the final cluster centers (modes) are.
Python
print("Cluster assignments:", clusters)
print("Cluster modes:", modes)
Output:
[Image: K-Mode Clustering output]
The output shows that the first data point belongs to cluster 1 and the rest belong to cluster 0. Each cluster has a common pattern: cluster 0 has mode values ['A', 'A', 'B'] and cluster 1 has ['A', 'B', 'C']. These modes represent the most frequent values in each cluster and are used to group similar rows together.
Cluster with kmodes Library
pip install kmodes
Optimal number of clusters in the K-Mode algorithm
The Elbow method is used to find the optimal number of clusters.
Python
import pandas as pd
import numpy as np
# !pip install kmodes
from kmodes.kmodes import KModes
import matplotlib.pyplot as plt
# Jupyter-only magic; remove the next line when running as a plain script
%matplotlib inline

cost = []
K = range(1, 5)
for k in K:
    kmode = KModes(n_clusters=k, init="random", n_init=5, verbose=1)
    kmode.fit_predict(data)
    cost.append(kmode.cost_)  # total mismatches for this k

plt.plot(K, cost, 'x-')
plt.xlabel('No. of clusters')
plt.ylabel('Cost')
plt.title('Elbow Curve')
plt.show()
Output:
[Image: Elbow Method plot]
As we can see from the graph, there is an elbow-like bend at 2 and 3 clusters, so either 2 or 3 clusters is a reasonable choice. Let's use 2 clusters.
Python
kmode = KModes(n_clusters=2, init="random", n_init=5, verbose=1)
clusters = kmode.fit_predict(data)
clusters
Output:
array([1, 0, 1, 1, 1], dtype=uint16)
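To compare this labelling with the from-scratch result (first data point alone, the rest together), one option is to compute the total mismatch cost of each partition. The sketch below is dependency-free; the helper names are made up for illustration, the two label lists are transcribed from the results reported in this article, and ties between equally frequent values are broken alphabetically as an assumption.

```python
from collections import Counter

data = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
    ["A", "C", "B"],
    ["A", "A", "B"],
]


def column_modes(rows):
    """Most frequent value in each column; ties go to the smallest value."""
    modes = []
    for column in zip(*rows):
        counts = Counter(column)
        modes.append(min(counts, key=lambda v: (-counts[v], v)))
    return modes


def partition_cost(rows, labels):
    """Total mismatches between every row and the mode of its own cluster."""
    cost = 0
    for label in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == label]
        mode = column_modes(members)
        cost += sum(x != q for row in members for x, q in zip(row, mode))
    return cost


from_scratch = [1, 0, 0, 0, 0]  # grouping described for the NumPy version
library_run = [1, 0, 1, 1, 1]   # array returned by KModes above

print(partition_cost(data, from_scratch))  # 5
print(partition_cost(data, library_run))   # 4
```

A lower cost means the rows sit closer to their cluster modes.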
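In the library output, the first, third, fourth and fifth data points share one cluster (label 1) and the second data point forms a cluster of its own (label 0). This is not the same partition as the from-scratch run, where the first data point was the one left on its own. K-Modes starts from randomly chosen modes and this small dataset contains many ties, so different runs can settle on different groupings; the label numbers themselves are arbitrary.
To find the best number of groups we use the Elbow Method, which shows the point at which adding more groups no longer makes a big difference. K-Modes is an easy and effective way to group similar data when working with categories.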