How to perform 10-fold cross-validation with LibSVM in R?

Last Updated : 30 Jul, 2024

Support Vector Machines (SVMs) are a powerful tool for classification and regression tasks. LibSVM is a widely used library that implements SVMs, and it is accessible in R through the e1071 package. Cross-validation, particularly 10-fold cross-validation, is an essential technique for assessing the performance and generalizability of a model. This article explains the theory behind 10-fold cross-validation and demonstrates how to perform it with LibSVM in R.

Overview of 10-Fold Cross-Validation

Cross-validation is a statistical method used to estimate the skill of a model on unseen data. It is commonly used to assess the effectiveness of machine learning models.

  1. Cross-Validation: A technique to evaluate the performance of a model by partitioning the data into subsets, training the model on some subsets, and validating it on the remaining subsets.
  2. K-Fold Cross-Validation: Involves splitting the dataset into K equally-sized folds. The model is trained K times, each time using a different fold as the validation set and the remaining K-1 folds as the training set.
  3. 10-Fold Cross-Validation: A specific case of K-Fold Cross-Validation where K = 10. This is a widely used choice that balances computational cost against the accuracy of the performance estimate (see the short sketch below).
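
To make the partitioning concrete, here is a minimal base-R sketch (not part of the original steps; the variable names are illustrative) that randomly assigns each of n observations to one of K folds:

R
# Randomly assign each of n observations to one of K folds
set.seed(42)
n <- 150
K <- 10
fold_id <- sample(rep(1:K, length.out = n))  # a fold label for every row
table(fold_id)                               # fold sizes are (nearly) equal

Each fold then serves once as the validation set while the remaining K-1 folds form the training set.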

Now let's walk through the steps to perform 10-fold cross-validation with LibSVM in R.

Step 1: Getting Started with e1071 and libsvm

To perform 10-fold cross-validation with libsvm, we need to install and load the e1071 package:

R
install.packages("e1071")
library(e1071)

Step 2: Prepare the Data

We will use the iris dataset for this example. The dataset will be split into features and labels.

R
# Load the iris dataset
data(iris)

# Split the dataset into features (X) and labels (y)
X <- iris[, -5]
y <- iris$Species
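
Before creating folds, a quick optional sanity check confirms the shape of the feature matrix and the class balance (an addition for illustration, using only base R):

R
# Optional sanity check: 150 rows, 4 feature columns, 50 of each species
dim(X)
table(y)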

Step 3: Define the Cross-Validation Folds

Create 10 folds for cross-validation using the createFolds function from the caret package.

R
# Install and load the caret package
install.packages("caret")
library(caret)

# Create 10 folds
set.seed(123)
folds <- createFolds(y, k = 10, list = TRUE)
folds

Output:

$Fold01
[1] 1 18 26 28 45 57 59 61 98 100 126 128 129 132 143

$Fold02
[1] 5 27 29 39 46 66 74 84 86 97 101 105 106 136 149

$Fold03
[1] 4 6 32 35 47 51 53 82 88 91 114 137 138 140 146

$Fold04
[1] 3 23 33 48 50 71 75 81 85 95 102 103 113 121 130

$Fold05
[1] 2 9 12 24 42 55 58 67 77 78 109 112 125 131 144

$Fold06
[1] 10 21 38 44 49 60 62 64 83 90 119 120 122 127 142

$Fold07
[1] 7 11 19 25 43 54 70 89 92 93 108 115 118 123 124

$Fold08
[1] 8 14 17 37 41 52 56 65 73 96 135 139 141 145 148

$Fold09
[1] 15 16 22 30 40 69 72 79 87 99 116 117 134 147 150

$Fold10
[1] 13 20 31 34 36 63 68 76 80 94 104 107 110 111 133

The dataset is divided into 10 folds using the createFolds function from the caret package.
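
Because y is a factor, createFolds stratifies by class, so each fold should contain roughly equal numbers of each species. You can verify this with a one-liner (an optional check, not part of the original steps):

R
# Count each species within each fold; expect about 5 per class per fold
sapply(folds, function(idx) table(y[idx]))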

Step 4: Perform 10-Fold Cross-Validation

Train and evaluate the SVM model using LibSVM for each fold.

R
# Initialize a vector to store accuracy for each fold
accuracies <- c()

# Loop over each fold
for (i in 1:10) {
  # Split the data into training and test sets
  test_indices <- folds[[i]]
  train_data <- X[-test_indices, ]
  train_labels <- y[-test_indices]
  test_data <- X[test_indices, ]
  test_labels <- y[test_indices]

  # Train the SVM model using LibSVM
  svm_model <- svm(train_data, train_labels, type = "C-classification",
                   kernel = "linear")

  # Print the model summary for this fold (an explicit print() is needed
  # inside a loop; the loop body continues in Step 5)
  print(summary(svm_model))

Output:

Call:
svm.default(x = train_data, y = train_labels, type = "C-classification",
kernel = "linear")


Parameters:
SVM-Type: C-classification
SVM-Kernel: linear
cost: 1

Number of Support Vectors: 28

( 2 15 11 )


Number of Classes: 3

Levels:
setosa versicolor virginica

For each fold, the model is trained on the training data (9 folds) and evaluated on the test data (1 fold).

Step 5: Evaluate the model performance

Now we will evaluate the model's performance: still inside the loop, we predict on the held-out fold, compute the fold accuracy, and finally average the accuracies across all 10 folds.

R
  # Make predictions on the held-out test set
  predictions <- predict(svm_model, test_data)

  # Calculate accuracy for this fold
  accuracy <- sum(predictions == test_labels) / length(test_labels)

  # Store the accuracy
  accuracies <- c(accuracies, accuracy)
}

# Calculate and print the average accuracy across all 10 folds
average_accuracy <- mean(accuracies)
print(paste("Average Accuracy:", round(average_accuracy * 100, 2), "%"))

Output:

[1] "Average Accuracy: 96.67 %"

The average accuracy across all 10 folds is calculated and printed, providing an estimate of the model's performance.
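
As an aside, e1071's svm() can also run the cross-validation internally via its cross argument; the per-fold and total accuracies then appear in the model summary. A minimal sketch (results may differ slightly from the manual loop because the internal fold assignment is random):

R
# Let LibSVM perform 10-fold cross-validation internally
set.seed(123)
cv_model <- svm(Species ~ ., data = iris, kernel = "linear", cross = 10)
summary(cv_model)  # reports per-fold accuracies and total CV accuracy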

Conclusion

Performing 10-fold cross-validation in R is straightforward with the e1071 package (an R interface to libsvm) for model fitting and the caret package for creating folds. By following the steps and examples provided in this article, you can ensure that your models are robustly evaluated.

By integrating cross-validation into your modeling workflow, you can improve the reliability of your performance estimates, whether you're working with support vector machines or other learners.
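
If you also need to tune hyperparameters, e1071 provides tune(), which performs 10-fold cross-validation by default; a minimal sketch (the cost grid below is an arbitrary illustration, not a recommendation):

R
# Grid-search the cost parameter with e1071's built-in 10-fold CV
set.seed(123)
tuned <- tune(svm, Species ~ ., data = iris, kernel = "linear",
              ranges = list(cost = 2^(-2:4)))
summary(tuned)         # cross-validated error for each cost value
tuned$best.parameters  # the cost with the lowest CV error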

