
How to interpret cross validation output from cv.kknn (kknn package)

Last Updated : 19 Jul, 2024

Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. The most common type is k-fold cross-validation, where the dataset is divided into k subsets (folds). The model is trained on k-1 of these folds and validated on the remaining one. This process is repeated k times, with each fold being used exactly once as the validation data. The results are then averaged to produce a single performance metric.
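
The splitting procedure described above can be sketched in base R. This is only an illustration of the general k-fold idea, not the code that kknn runs internally; the seed, the variable names (n_folds, fold_id) and the choice of five folds are assumptions made for the example.

```r
# Minimal base-R sketch of k-fold splitting (illustrative; not kknn's internal code).
set.seed(42)                      # assumed seed, for reproducibility only
n_folds <- 5                      # number of folds
n <- nrow(iris)                   # 150 observations

# Assign each row to a fold by shuffling a repeated 1..n_folds sequence.
fold_id <- sample(rep(seq_len(n_folds), length.out = n))

for (i in seq_len(n_folds)) {
  train_rows <- which(fold_id != i)   # rows used for fitting
  valid_rows <- which(fold_id == i)   # rows held out for validation
  cat("Fold", i, ": train =", length(train_rows),
      ", validate =", length(valid_rows), "\n")
}
```

Every observation lands in exactly one validation fold, so each row receives exactly one out-of-sample prediction over the whole procedure.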

What is cv.kknn?

The cv.kknn function of the kknn package performs k-fold cross-validation of a weighted k-nearest neighbors model. It splits the dataset into folds, trains the model on all folds but one, and predicts the held-out fold. This is repeated until every fold has served once as the test set. The function returns a list with two elements: a matrix of observed (y) and predicted (yhat) values for every observation, and a summary error value. It does not report per-fold confusion matrices or per-fold accuracy; those have to be computed from the returned predictions.

The function syntax is:

cv.kknn(formula, data, kcv = 10, ...)

Where:

  • formula: A symbolic description of the model to be fit.
  • data: The dataset to be used.
  • kcv: The number of folds for cross-validation (default is 10).
  • ...: Further arguments passed on to kknn(), the most common being:
    • k: The number of neighbors considered (default is 7). Note that this is not the number of folds.
    • distance: The parameter of the Minkowski distance (default is 2, which is Euclidean distance).
    • kernel: The kernel function for weighting neighbors (default is "optimal"; "rectangular" gives standard unweighted k-NN).
    • ykernel: Window width of a y-kernel, used for prediction of ordinal classes.
    • scale: Logical, whether to scale the variables to have equal standard deviation (default is TRUE).
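
To make the argument roles concrete, here is a small sketch of a call. In the kknn package the number of folds is set with kcv, while other arguments such as k, distance and kernel are forwarded to kknn(). The seed and the specific values below are arbitrary choices for illustration, and the example assumes the kknn package is installed.

```r
library(kknn)

set.seed(1)  # assumed seed; the fold assignment is random

# kcv = 5 requests five folds.
# k = 7 is the number of neighbours and is passed on to kknn().
res <- cv.kknn(Species ~ ., data = iris, kcv = 5,
               k = 7, distance = 2, kernel = "optimal")

# Inspect the top-level structure: a list of two elements.
str(res, max.level = 1)
```

Keeping kcv and k apart matters: changing k alters the model being evaluated, whereas changing kcv only alters how the evaluation is carried out.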

Let's go through a practical example using the iris dataset to illustrate how to interpret the cross-validation output from cv.kknn in R Programming Language.

Step 1: Load Necessary Libraries and Data

First, load the kknn package and the iris dataset. If kknn is not installed yet, run install.packages("kknn") once beforehand.

R
library(kknn)
data(iris)

Step 2: Perform Cross-Validation

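We will perform 5-fold cross-validation on the iris dataset using cv.kknn. The number of folds is set with the kcv argument; an argument named k would instead be passed on to kknn() as the number of neighbors, leaving the number of folds at its default of 10.

R
set.seed(123)  # For reproducibility
cv_results <- cv.kknn(Species ~ ., data = iris, kcv = 5)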

Step 3: Understanding the Output

The output of cv.kknn is a list with two elements. Let's print it and inspect both:

R
print(cv_results) 

Output:

[[1]]
y yhat
1 1 1
2 1 1
3 1 1
4 1 1
5 1 1
6 1 1
... (remaining rows omitted)

[[2]]
[1] 1

The first element, [[1]], is a matrix of actual and predicted class labels with one row per observation in the iris dataset. The labels are shown as the integer codes of the Species factor levels: 1 = setosa, 2 = versicolor, 3 = virginica. Here's how to interpret it:

  • Column y: The true class labels for the observations.
  • Column yhat: The class labels predicted by the k-NN model while the observation was in the held-out fold.

Reading the rows class by class for this run:

  • Rows with y = 1 (setosa): yhat is also 1 throughout, so all setosa samples were correctly predicted.
  • Rows with y = 2 (versicolor): a few are misclassified, with yhat = 3 (virginica).
  • Rows with y = 3 (virginica): a few are misclassified, with yhat = 2 (versicolor).

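The second element, [[2]], is a single summary value, here 1. It is tempting to read it as a perfect accuracy of 100%, but that contradicts the misclassifications visible in the prediction table. The value appears to be a summary error measure rather than an accuracy, and for a factor response such as Species it does not agree with the table, so it should not be taken at face value. The more reliable approach is to compute the accuracy yourself from the first element, as the proportion of rows in which y equals yhat.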
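
As a minimal sketch of that calculation, the snippet below builds a confusion matrix and an accuracy figure from a two-column matrix shaped like cv_results[[1]]. The small pred matrix is made-up illustrative data (not output from cv.kknn) so that the example runs on its own; in practice you would assign pred <- cv_results[[1]].

```r
# Hypothetical stand-in for cv_results[[1]]: integer class codes,
# 1 = setosa, 2 = versicolor, 3 = virginica.
pred <- cbind(y    = c(1, 1, 2, 2, 2, 3, 3, 3),
              yhat = c(1, 1, 2, 3, 2, 3, 2, 3))

# Confusion matrix: rows are true classes, columns are predicted classes.
conf_mat <- table(actual = pred[, "y"], predicted = pred[, "yhat"])
print(conf_mat)

# Accuracy is the share of rows where the prediction equals the true label.
accuracy <- mean(pred[, "y"] == pred[, "yhat"])
misclass <- 1 - accuracy
cat("Accuracy:", accuracy, " Misclassification rate:", misclass, "\n")
```

The off-diagonal cells of the confusion matrix show which classes are being confused with each other, which a single accuracy number cannot tell you.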

Conclusion

Cross-validation is an essential step in the model evaluation process, providing insights into how well your model generalizes to unseen data. The cv.kknn function in the kknn package offers a convenient way to perform k-fold cross-validation for k-nearest neighbors models in R. The most dependable way to read its output is to compare the y and yhat columns of the first list element and derive accuracy or a confusion matrix from them, rather than relying on the single summary value alone. With that in hand, you can make informed decisions about your model's performance and potential improvements.


nyadavxenc