Understanding Decision Boundaries in K-Nearest Neighbors (KNN)

Last Updated : 15 May, 2025

A decision boundary is a line or surface that separates the different classes in a classification task: it shows which regions of the feature space the model assigns to which class. The K-Nearest Neighbors (KNN) algorithm rests on the principle that similar data points lie close together in feature space. The shape of its decision boundary depends on:

  • The value of K (how many neighbors are considered).
  • How the data points are spread out in feature space.

For example, given a dataset with two classes, the decision boundary can be visualized as the line or curve dividing the regions where each class is predicted. For a 1-nearest neighbor (1-NN) classifier, the decision boundary can be visualized with a Voronoi diagram.

Using Voronoi Diagrams to Visualize Decision Boundaries

  • A Voronoi diagram splits space into regions based on which training point is closest.
  • Each region, called a Voronoi cell, contains all the points that are closer to one specific training point than to any other.
  • The lines between regions are the sets of points equally close to two or more training points (the "seeds"). These are exactly the decision boundaries for 1-Nearest Neighbor.
  • If we label the training points by class, the Voronoi diagram shows how 1-NN assigns a new point: by the region it falls into.
  • The boundary between two points p_i and p_j is the perpendicular bisector of the segment joining them, i.e. the line that cuts the segment exactly in half at a right angle.
[Figure: Formation of Decision Boundaries]
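The 1-NN rule can be checked numerically: a query point is classified by whichever training point is nearer, and the perpendicular bisector is exactly where the two distances tie. A minimal sketch with two hypothetical seed points (coordinates made up for illustration):

```python
import numpy as np

# Two hypothetical training points with different class labels
p_i = np.array([0.0, 0.0])   # class 0
p_j = np.array([4.0, 0.0])   # class 1

def one_nn_label(q):
    """1-NN: assign the class of whichever training point is closer."""
    return 0 if np.linalg.norm(q - p_i) < np.linalg.norm(q - p_j) else 1

# The perpendicular bisector of the segment p_i--p_j is the vertical
# line x = 2; queries on opposite sides of it get opposite labels.
print(one_nn_label(np.array([1.0, 3.0])))   # 0 (left of the bisector)
print(one_nn_label(np.array([3.0, -2.0])))  # 1 (right of the bisector)

# A point exactly on the bisector is equidistant from both seeds:
q = np.array([2.0, 5.0])
print(np.isclose(np.linalg.norm(q - p_i), np.linalg.norm(q - p_j)))  # True
```

With many seeds the same tie condition, taken pairwise, carves the plane into the Voronoi cells described above.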

Relationship Between KNN Decision Boundaries and Voronoi Diagrams

In two-dimensional space the decision boundaries of KNN can be visualized as Voronoi diagrams. Here’s how:

  • KNN Boundaries: The decision boundary for KNN consists of the regions where the predicted class changes as the set of k nearest neighbors changes. As k grows, these boundaries become smoother and coarser than the fine-grained Voronoi structure.
  • Voronoi Diagram as a Special Case: When k = 1, KNN's decision boundaries correspond exactly to the class boundaries of the Voronoi diagram of the training points. Each Voronoi cell is the region where one specific training point is the nearest.

How KNN Defines Decision Boundaries

In KNN, decision boundaries are influenced by the choice of k and the distance metric used:

1. Impact of 'K' on Decision Boundaries: The number of neighbors (k) affects the shape and smoothness of the decision boundary.

  • Small k: When k is small, the decision boundary can become very complex, closely tracing the training data. This can lead to overfitting.
  • Large k: When k is large, the decision boundary smooths out and becomes less sensitive to individual data points, potentially leading to underfitting.
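One way to see this trade-off concretely is through training accuracy. A sketch, using sklearn's synthetic make_classification data rather than any particular dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, n_clusters_per_class=1,
                           random_state=42)

scores = {}
for k in [1, 5, 25, 100]:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    scores[k] = knn.score(X, y)  # accuracy on the training set itself
    print(k, scores[k])

# With k=1 each training point is its own nearest neighbor, so training
# accuracy is 1.0: the model memorizes the data. Larger k averages over
# more neighbors, so training accuracy typically drops as the boundary
# smooths out.
```

High training accuracy at k=1 is not a virtue; it is exactly the overfitting the boundary plots below make visible.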

2. Distance Metric: The decision boundary also depends on the distance metric used, such as Euclidean or Manhattan distance. Different metrics lead to different boundary shapes.

  • Euclidean Distance: The most common choice; its equidistant contours are circles, which tends to produce smooth, rounded boundary segments.
  • Manhattan Distance: Measures distance along the coordinate axes, which tends to produce more angular, axis-aligned boundary segments.
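The choice of metric can even flip an individual prediction. A toy sketch, with two points deliberately placed so the metrics disagree about which neighbor is nearest:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# From a query at the origin: the diagonal point is nearer in Euclidean
# distance (sqrt(18) ~ 4.24 vs 5), but the axis point is nearer in
# Manhattan distance (3 + 3 = 6 vs 5).
X = np.array([[3.0, 3.0],    # class 0 (on the diagonal)
              [0.0, 5.0]])   # class 1 (on an axis)
y = np.array([0, 1])
q = np.array([[0.0, 0.0]])

preds = {}
for metric in ['euclidean', 'manhattan']:
    knn = KNeighborsClassifier(n_neighbors=1, metric=metric).fit(X, y)
    preds[metric] = int(knn.predict(q)[0])

print(preds)  # {'euclidean': 0, 'manhattan': 1}
```

Since points that swap nearest neighbors under different metrics sit on different sides of the boundary, disagreements like this are precisely where the two metrics' decision boundaries differ.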

Decision Boundaries for Binary Classification with Varying k

Consider a binary classification problem with two features, where the goal is to visualize how the KNN decision boundary changes as k varies. This example uses synthetic data to illustrate the impact of different k values.

For a two-dimensional dataset, the decision boundary can be plotted by:

  • Creating a Grid: Generate a grid of points covering the feature space.
  • Classifying Grid Points: Use the KNN algorithm to classify each point in the grid based on its neighbors.
  • Plotting: Color the grid points according to their class labels and draw the boundaries where the class changes.
Python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class dataset with two informative features
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, n_clusters_per_class=1,
                           random_state=42)

# Grid of points covering the feature space (with a small margin)
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01),
                     np.arange(y_min, y_max, 0.01))

fig, axs = plt.subplots(2, 2, figsize=(12, 10))
k_values = [1, 3, 5, 10]

for ax, k in zip(axs.flat, k_values):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X, y)

    # Classify every grid point, then reshape back to the grid
    Z = knn.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)

    # Filled contours show the predicted class regions; the scatter
    # overlays the training points colored by their true class
    ax.contourf(xx, yy, Z, alpha=0.3, cmap=plt.cm.Paired)
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolor='k',
               cmap=plt.cm.Paired, marker='o')
    ax.set_title(f'KNN Decision Boundaries (k={k})')
    ax.set_xlabel('Feature 1')
    ax.set_ylabel('Feature 2')

plt.tight_layout()
plt.show()

Output:

[Figure: Binary Classification with Varying k]
  • For small k, the boundary is highly sensitive to local variations and can be irregular.
  • For larger k, the boundary smooths out, reflecting a more generalized view of the data distribution.

Factors That Affect KNN Decision Boundaries

  • Feature Scaling: KNN is sensitive to the scale of data. Features with larger ranges can dominate distance calculations, affecting the boundary shape.
  • Noise in Data: Outliers and noisy data points can shift or distort decision boundaries, leading to incorrect classifications.
  • Data Distribution: How data points are spread across the feature space influences how KNN separates classes.
  • Boundary Shape: A clear and accurate boundary improves classification accuracy, while a messy or unclear boundary can lead to errors.
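The feature-scaling point is easy to demonstrate: if one feature's scale is inflated, it dominates the Euclidean distance and the other feature is effectively ignored. A sketch using sklearn's StandardScaler (the 1000x scale factor is an artificial exaggeration for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X = X * np.array([1000.0, 1.0])  # inflate feature 0 so it dominates distances

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

raw = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
scaled = make_pipeline(StandardScaler(),
                       KNeighborsClassifier(n_neighbors=5)).fit(X_tr, y_tr)

# Standardizing puts both features on comparable scales before the
# distance computation, which usually recovers the signal carried by
# the smaller-scale feature.
print('unscaled test accuracy:', raw.score(X_te, y_te))
print('scaled test accuracy:  ', scaled.score(X_te, y_te))
```

Fitting the scaler inside a pipeline also ensures its statistics come from the training split only, avoiding leakage from the test set.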

Understanding these boundaries helps in optimizing KNN's performance for specific datasets.

