Tuning SVM Hyperparameters Using a Genetic Algorithm

Last Updated : 09 Jul, 2024

The performance of Support Vector Machines (SVMs) depends heavily on hyperparameters such as the regularization parameter (C) and the kernel parameters (e.g., gamma for the RBF kernel). Genetic Algorithms (GAs) leverage evolutionary principles to search for optimal hyperparameter values.

This article explores the use of Genetic Algorithms for tuning SVM parameters, discussing their implementation and advantages.

Hyperparameters of Support Vector Machines (SVMs)

Support Vector Machines (SVMs) are supervised learning models for classification and regression tasks. They work by finding the hyperplane that best separates the data into different classes, maximizing the margin between them.

Key hyperparameters for SVMs include:

  • C (Regularization Parameter): Controls the trade-off between fitting the training data closely and keeping the margin wide; a small C favors a simpler decision boundary, while a large C favors a low training error.
  • Kernel Parameters: Parameters specific to the chosen kernel function, such as gamma for the RBF kernel, which controls how far the influence of a single training example reaches (see the sketch below).
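
To make these concrete, here is a minimal, illustrative sketch (not part of the tutorial's pipeline; the specific values are arbitrary) of how C and gamma are passed to scikit-learn's SVC and how different settings change the cross-validated accuracy:

from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = datasets.load_digits(return_X_y=True)

# Try a few hand-picked (C, gamma) pairs; the GA below automates this search
for C, gamma in [(0.1, 0.001), (1.0, 0.001), (1.0, 0.1)]:
    clf = SVC(C=C, gamma=gamma)  # SVC uses the RBF kernel by default
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(f"C={C}, gamma={gamma}: mean CV accuracy = {score:.3f}")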

Using a Genetic Algorithm to Tune SVM Hyperparameters

For SVMs, the hyperparameters (C and gamma) are encoded as chromosomes. Each gene in the chromosome represents a specific hyperparameter.

The fitness function evaluates the performance of the SVM model with a given set of hyperparameters, typically using cross-validation to measure accuracy or another relevant metric.
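
Concretely, a chromosome can be an ordinary Python list whose genes are the hyperparameter values. A minimal, illustrative sketch of this encoding and of random initialization (the 0.1-10 range is an assumption, matching the search bounds used later in this article):

import random

# One chromosome = one candidate hyperparameter set:
# gene 0 holds C, gene 1 holds gamma
def random_chromosome(low=0.1, high=10.0):
    return [random.uniform(low, high), random.uniform(low, high)]

population = [random_chromosome() for _ in range(5)]
print(population[0])  # one random [C, gamma] candidate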

GA Workflow for Hyperparameter Tuning

  1. Initialization: Generate an initial population of potential hyperparameter sets.
  2. Selection: Choose parent solutions based on their fitness scores.
  3. Crossover: Combine parent solutions to produce offspring with traits from both parents.
  4. Mutation: Introduce random variations to offspring to maintain diversity.
  5. Evaluation: Assess the fitness of the new solutions.
  6. Iteration: Repeat the process for multiple generations until convergence or a stopping criterion is met.

Pseudocode

Initialize population with random hyperparameter sets
Evaluate fitness of each individual in the population
while (termination criteria not met) do:
    Select parents based on fitness
    Apply crossover to produce offspring
    Apply mutation to offspring
    Evaluate fitness of offspring
    Select individuals for the next generation
end while
Return the best hyperparameter set
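
The same loop can be written as plain Python without any framework. The sketch below is illustrative only: it assumes a fitness function that takes a gene list and returns a score to maximize, and it hard-codes simple choices (binary tournament selection, averaging crossover, and Gaussian mutation clipped to the search range):

import random

def run_ga(fitness, n_genes=2, pop_size=20, generations=30,
           cx_prob=0.5, mut_prob=0.2, low=0.1, high=10.0):
    # Initialization: random hyperparameter sets
    pop = [[random.uniform(low, high) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluation: score every individual once per generation
        scored = [(fitness(ind), ind) for ind in pop]
        offspring = []
        while len(offspring) < pop_size:
            # Selection: keep the fitter of two random individuals
            p1 = list(max(random.sample(scored, 2))[1])
            p2 = list(max(random.sample(scored, 2))[1])
            # Crossover: average the parents' genes
            if random.random() < cx_prob:
                p1 = [(a + b) / 2 for a, b in zip(p1, p2)]
            # Mutation: Gaussian noise, clipped back into the valid range
            if random.random() < mut_prob:
                p1 = [min(high, max(low, g + random.gauss(0, 1))) for g in p1]
            offspring.append(p1)
        pop = offspring
    # Return the best hyperparameter set in the final population
    return max(pop, key=fitness)

For example, run_ga(lambda ind: -(ind[0] - 3) ** 2 - (ind[1] - 5) ** 2) drifts toward genes near [3, 5], the maximizer of that toy fitness function.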

Optimizing SVM Hyperparameters with Genetic Algorithms

Step 1: Install Necessary Packages

This step installs the required Python packages deap and scikit-learn using pip. These packages are necessary for running the genetic algorithm and for the machine learning tasks, respectively.

pip install deap scikit-learn

Step 2: Import Libraries

In this step, we import the libraries needed for the genetic algorithm and the machine learning pipeline: random for random number generation, numpy for numerical operations, and datasets, cross_val_score, and SVC from scikit-learn for loading the dataset, running cross-validation, and building the SVM classifier, respectively. The deap library provides the building blocks for the genetic algorithm.

import random
import numpy as np
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from deap import base, creator, tools, algorithms

Step 3: Load Dataset

Here, we load the digits dataset from scikit-learn. This dataset is a collection of handwritten digits and is a good example for demonstrating the use of machine learning classifiers. We then separate the data into features (X) and target labels (y).

# Load dataset
data = datasets.load_digits()
X = data.data
y = data.target

Step 4: Define Evaluation Function with Error Handling

We define a function evaluate that will be used to assess the performance of an individual in the genetic algorithm. The individual represents a set of hyperparameters for the SVM classifier (C and gamma). We ensure the values of C and gamma are at least 0.1 to avoid invalid parameter values. The SVM classifier is trained and evaluated using cross-validation, and the mean score is returned. If any error occurs during evaluation, a poor score is assigned.

# Define evaluation function with error handling
def evaluate(individual):
    # Clamp both genes to at least 0.1 so SVC never receives invalid values
    C = max(0.1, individual[0])
    gamma = max(0.1, individual[1])
    try:
        clf = SVC(C=C, gamma=gamma)
        score = cross_val_score(clf, X, y, cv=5).mean()
    except Exception:
        score = -1  # Assign a poor score if there's an error
    return score,  # DEAP expects fitness as a tuple
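
As a quick, illustrative sanity check (the gene values here are arbitrary), the function can be called directly; note that it returns a 1-tuple, which is the fitness format DEAP expects:

print(evaluate([1.0, 0.5]))  # prints a 1-tuple holding the mean 5-fold CV accuracy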

Step 5: Setup Genetic Algorithm Toolbox

This step involves setting up the DEAP toolbox for the genetic algorithm. We define the fitness function to be maximized and the structure of an individual (a list with a fitness attribute). We then register the functions for creating attributes (random float), individuals (repeated attributes), and populations (repeated individuals). The genetic operators for crossover, mutation, selection, and evaluation are also registered.

# Genetic Algorithm setup
toolbox = base.Toolbox()
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox.register("attr_float", random.uniform, 0.1, 10)
toolbox.register("individual", tools.initRepeat, creator.Individual, toolbox.attr_float, 2)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)

toolbox.register("mate", tools.cxBlend, alpha=0.5)
toolbox.register("mutate", tools.mutGaussian, mu=0, sigma=1, indpb=0.2)
toolbox.register("select", tools.selTournament, tournsize=3)
toolbox.register("evaluate", evaluate)

Step 6: Define Main Function to Run Genetic Algorithm

We define the main function that initializes the random seed for reproducibility and creates the initial population. We set up statistics to be recorded during the genetic algorithm run, including average, standard deviation, minimum, and maximum fitness. The genetic algorithm is then executed using eaSimple, which runs the algorithm for a specified number of generations with given crossover and mutation probabilities.

# Genetic Algorithm execution
def main():
    random.seed(42)

    # Create initial population
    population = toolbox.population(n=50)

    # Define statistics to be recorded
    stats = tools.Statistics(lambda ind: ind.fitness.values)
    stats.register("avg", np.mean)
    stats.register("std", np.std)
    stats.register("min", np.min)
    stats.register("max", np.max)

    # Run genetic algorithm
    population, logbook = algorithms.eaSimple(
        population, toolbox, cxpb=0.5, mutpb=0.2, ngen=40,
        stats=stats, verbose=True
    )

    return population, logbook

Step 7: Execute Main Function and Output Results

In the final step, we execute the main function and retrieve the best individual from the final population. We then extract the best hyperparameters (C and gamma) and print the best individual along with its fitness score and hyperparameters.

if __name__ == "__main__":
    population, logbook = main()

    # Get the best individual
    best_individual = tools.selBest(population, 1)[0]
    best_C = max(0.1, best_individual[0])
    best_gamma = max(0.1, best_individual[1])

    print(f"Best individual: {best_individual}")
    print(f"Best fitness: {best_individual.fitness.values[0]}")
    print(f"Best hyperparameters: C={best_C}, gamma={best_gamma}")

Complete Code

Python
# Install necessary packages (in a notebook):
# !pip install deap scikit-learn

import random
import numpy as np
from sklearn import datasets
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC
from deap import base, creator, tools, algorithms

# Load dataset
data = datasets.load_digits()
X = data.data
y = data.target

# Define evaluation function with error handling
def evaluate(individual):
    C = max(0.1, individual[0])
    gamma = max(0.1, individual[1])
    try:
        clf = SVC(C=C, gamma=gamma)
        score = cross_val_score(clf, X, y, cv=5).mean()
    except Exception:
        score = -1  # Assign a poor score if there's an error
    return score,

# Genetic Algorithm setup
toolbox = base.Toolbox()
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox.register("attr_float", random.uniform, 0.1, 10)
toolbox.register("individual", tools.initRepeat, creator.Individual, toolbox.attr_float, 2)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)

toolbox.register("mate", tools.cxBlend, alpha=0.5)
toolbox.register("mutate", tools.mutGaussian, mu=0, sigma=1, indpb=0.2)
toolbox.register("select", tools.selTournament, tournsize=3)
toolbox.register("evaluate", evaluate)

# Genetic Algorithm execution
def main():
    random.seed(42)

    # Create initial population
    population = toolbox.population(n=50)

    # Define statistics to be recorded
    stats = tools.Statistics(lambda ind: ind.fitness.values)
    stats.register("avg", np.mean)
    stats.register("std", np.std)
    stats.register("min", np.min)
    stats.register("max", np.max)

    # Run genetic algorithm
    population, logbook = algorithms.eaSimple(
        population, toolbox, cxpb=0.5, mutpb=0.2, ngen=40,
        stats=stats, verbose=True
    )

    return population, logbook

if __name__ == "__main__":
    population, logbook = main()

    # Get the best individual
    best_individual = tools.selBest(population, 1)[0]
    best_C = max(0.1, best_individual[0])
    best_gamma = max(0.1, best_individual[1])

    print(f"Best individual: {best_individual}")
    print(f"Best fitness: {best_individual.fitness.values[0]}")
    print(f"Best hyperparameters: C={best_C}, gamma={best_gamma}")

    # Train and test the final model using the clamped hyperparameters
    # (the raw genes can drift negative, which SVC would reject)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    final_model = SVC(C=best_C, gamma=best_gamma)
    final_model.fit(X_train, y_train)
    print(f"Test accuracy: {final_model.score(X_test, y_test)}")

Output:

gen	nevals	avg     	std       	min     	max     
0 50 0.107677 0.00987644 0.101281 0.139164
1 33 0.113994 0.0121825 0.101281 0.140279
2 30 0.124992 0.0120814 0.101838 0.153649
3 32 0.132813 0.00799749 0.110752 0.150864
4 32 0.134774 0.0107546 0.101838 0.154206
5 29 0.139799 0.00769207 0.107409 0.154206
6 23 0.142953 0.0092506 0.101838 0.155877
7 24 0.147688 0.00616172 0.136379 0.155877
8 20 0.151276 0.00503346 0.135822 0.155877
9 34 0.151811 0.00685508 0.121894 0.155877
10 28 0.152724 0.00622722 0.134708 0.155877
11 21 0.154084 0.0080811 0.101281 0.155877
12 31 0.155454 0.00280674 0.135822 0.155877
13 38 0.154763 0.00440217 0.135822 0.155877
14 28 0.155309 0.00301966 0.135822 0.155877
15 27 0.154017 0.00824709 0.101838 0.155877
16 25 0.154072 0.00547047 0.135822 0.155877
17 25 0.155811 0.000467967 0.152535 0.155877
18 33 0.15288 0.00929871 0.101838 0.155877
19 22 0.154752 0.00447696 0.135822 0.155877
20 31 0.15454 0.00458452 0.135822 0.155877
21 40 0.154373 0.00513775 0.135822 0.155877
22 29 0.155265 0.003148 0.135822 0.155877
23 30 0.154396 0.00801579 0.101838 0.155877
24 32 0.152813 0.00940053 0.101838 0.155877
25 32 0.153627 0.00872855 0.101838 0.155877
26 33 0.154295 0.00536714 0.135822 0.155877
27 27 0.155476 0.0028078 0.135822 0.155877
28 30 0.154641 0.00471836 0.135822 0.155877
29 32 0.154396 0.00801579 0.101838 0.155877
30 35 0.154072 0.00531391 0.135822 0.155877
31 31 0.154173 0.00517646 0.135822 0.155877
32 24 0.154429 0.0061524 0.119109 0.155877
33 30 0.153404 0.0107497 0.101838 0.155877
34 32 0.155298 0.00304926 0.135822 0.155877
35 36 0.154674 0.00476297 0.135822 0.155877
36 35 0.154507 0.00768577 0.101838 0.155877
37 21 0.155877 2.77556e-17 0.155877 0.155877
38 27 0.154184 0.00576914 0.131365 0.155877
39 36 0.154474 0.0055802 0.126351 0.155877
40 26 0.155153 0.0034742 0.135822 0.155877
Best individual: [-6.96604485823403, 1.256273035647874]
Best fitness: 0.15587743732590528
Best hyperparameters: C=0.1, gamma=1.256273035647874

The output provided represents the progress and results of the genetic algorithm over 40 generations. Here's a detailed explanation of the key parts:

Generational Statistics

The table shows statistics for each generation:

  1. gen: The generation number.
  2. nevals: The number of individuals evaluated in that generation.
  3. avg: The average fitness value of the population in that generation.
  4. std: The standard deviation of the fitness values, indicating the variability within the population.
  5. min: The minimum fitness value in the population.
  6. max: The maximum fitness value in the population.

This table helps track the genetic algorithm's progress, showing how the fitness of the population improves (or not) over generations.
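
Rather than reading the table row by row, these statistics can be plotted: DEAP's Logbook provides select() for pulling out recorded columns. A minimal sketch, assuming matplotlib is installed and logbook is the value returned by main() above:

import matplotlib.pyplot as plt

gen, avg, max_ = logbook.select("gen", "avg", "max")
plt.plot(gen, avg, label="avg fitness")
plt.plot(gen, max_, label="max fitness")
plt.xlabel("generation")
plt.ylabel("mean cross-validation accuracy")
plt.legend()
plt.show()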

The best individual found by the genetic algorithm is represented as:

Best individual: [-6.96604485823403, 1.256273035647874]

This individual corresponds to the hyperparameters:

  • C = 0.1 (the raw gene value, about -6.97, falls below the floor used in evaluate, so it is clamped to 0.1)
  • gamma = 1.256273035647874

The fitness value of the best individual is:

Best fitness: 0.15587743732590528

This value represents the highest cross-validation score achieved by the SVM classifier with the best hyperparameters during the genetic algorithm run. Note that roughly 0.156 is a low accuracy for the digits dataset: the 0.1 floor on gamma is far too large for these unscaled pixel features, so in practice you would standardize the features or search gamma on a much smaller, logarithmic scale.
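
As an illustrative check of that point (not part of the original tutorial), wrapping the SVM in a pipeline with StandardScaler and using scikit-learn's default gamma="scale" shows what a better-conditioned configuration achieves on the same data:

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Reuses X, y, SVC, and cross_val_score from the code above
scaled_clf = make_pipeline(StandardScaler(), SVC(C=1.0, gamma="scale"))
print(cross_val_score(scaled_clf, X, y, cv=5).mean())  # substantially higher than the GA run above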

Advantages of Using GA for Hyperparameter Tuning

  • Efficient Exploration of the Search Space: GAs focus on promising regions, reducing the time needed to find optimal hyperparameters.
  • Ability to Escape Local Optima: GAs' stochastic nature helps them avoid being trapped in suboptimal solutions.
  • Scalability to Complex Models: GAs are effective even with large, complex hyperparameter spaces.
  • Balancing Exploration and Exploitation: GAs maintain diversity while refining good solutions.

Conclusion

Hyperparameter tuning is essential for optimizing machine learning models, and Genetic Algorithms offer an efficient and effective solution. GAs provide a balance between exploration and exploitation, making them suitable for complex hyperparameter spaces. While they come with computational challenges, their advantages often outweigh the drawbacks. As machine learning continues to evolve, GAs will likely play an increasingly important role in hyperparameter optimization.

