
Regularization in Machine Learning

Last Updated : 23 May, 2025

Regularization is an important technique in machine learning that helps models generalize by preventing overfitting, which happens when a model learns the training data too well, including its noise and outliers, and then performs poorly on new data. By adding a penalty for complexity, it favors simpler models that perform better on new data. In this article, we will see the main types of regularization, i.e. Lasso, Ridge and Elastic Net, and how they help to build more reliable models.

Table of Content

  • Types of Regularization
  • What are Overfitting and Underfitting?
  • What are Bias and Variance?
  • Benefits of Regularization

Types of Regularization

1. Lasso Regression

A regression model that uses the L1 regularization technique is called LASSO (Least Absolute Shrinkage and Selection Operator) regression. It adds the sum of the absolute values of the coefficients as a penalty term to the loss function (L). This penalty can shrink some coefficients to exactly zero, which helps in selecting only the important features and ignoring the less important ones.

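\rm{Cost} = \frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2 +\lambda \sum_{i=1}^{m}{|w_i|}

where

  • m = Number of features
  • n = Number of examples
  • y_i = Actual target value for the ith example
  • \hat{y}_i = Predicted target value for the ith example
  • w_i = Coefficients of the features
  • \lambda = Regularization parameter that controls the strength of regularization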
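To make the cost formula concrete, here is a minimal sketch that evaluates it by hand with NumPy. The helper name lasso_cost, the toy data and the value of lam (standing in for \lambda) are illustrative assumptions, and the linear model is taken without an intercept to keep the arithmetic short.

```python
import numpy as np

def lasso_cost(X, y, w, lam):
    """Mean squared error plus lam times the sum of absolute weights."""
    y_hat = X @ w                         # predictions of a linear model without intercept
    mse = np.mean((y - y_hat) ** 2)       # (1/n) * sum of squared residuals
    l1_penalty = lam * np.sum(np.abs(w))  # lambda * sum of |w_i|
    return mse + l1_penalty

# Toy data: n = 3 examples, m = 2 features (illustrative values only)
X = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0]])
y = np.array([5.0, 4.0, 8.0])
w = np.array([2.0, 1.0])

cost = lasso_cost(X, y, w, lam=0.5)
print(cost)  # residuals are 1, 0, 1, so the cost is 2/3 + 0.5 * 3
```

The squared-error part rewards weights that fit the data, while the penalty part grows with the absolute size of every weight, so the minimizer has to trade one against the other.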

Let's see how to implement this using Python:

  • X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=42) : Generates a regression dataset with 100 samples, 5 features and some noise.
  • X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42) : Splits the data into 80% training and 20% testing sets.
  • lasso = Lasso(alpha=0.1) : Creates a Lasso regression model with regularization strength alpha set to 0.1.
Python
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error

# Synthetic regression data: 100 samples, 5 features
X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit Lasso (L1-regularized linear regression)
lasso = Lasso(alpha=0.1)
lasso.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = lasso.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

print("Coefficients:", lasso.coef_)

Output:

[Image: output of the Lasso Regression example]

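The output shows the model's mean squared error on the test set and the fitted coefficients. L1 regularization shrinks the coefficients and, when the penalty is strong enough relative to a feature's contribution, sets some of them to exactly zero.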
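How many coefficients end up at zero depends on the data and on alpha. The sketch below is an illustration under assumed settings: a synthetic dataset in which only 3 of 10 features are informative, and a hand-picked list of alpha values ending in a deliberately extreme one.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Assumed demo data: only 3 of the 10 features carry signal
X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                       noise=0.1, random_state=42)

alphas = [0.1, 1.0, 10.0, 1e6]   # the last value is deliberately extreme
nonzero_counts = []
for alpha in alphas:
    model = Lasso(alpha=alpha, max_iter=10_000).fit(X, y)
    # Count the coefficients that Lasso did not set to exactly zero
    nonzero_counts.append(int(np.sum(model.coef_ != 0)))
    print(f"alpha={alpha:g}: {nonzero_counts[-1]} non-zero coefficients")
```

With a very large alpha every coefficient is driven to zero and the model predicts only the intercept, which is the underfitting end of the spectrum.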

2. Ridge Regression

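A regression model that uses the L2 regularization technique is called Ridge regression. It adds the sum of the squared coefficients as a penalty term to the loss function (L). Unlike the L1 penalty, this shrinks coefficients toward zero but does not usually make them exactly zero.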

\rm{Cost} = \frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2 + \lambda \sum_{i=1}^{m}{w_i^2}

where

  • n = Number of examples or data points
  • m = Number of features, i.e. predictor variables
  • y_i = Actual target value for the ith example
  • \hat{y}_i = Predicted target value for the ith example
  • w_i = Coefficients of the features
  • \lambda = Regularization parameter that controls the strength of regularization
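Ridge regression has a closed-form solution, which the following sketch checks against scikit-learn. It assumes a model without an intercept. Note also that scikit-learn's Ridge penalizes the plain sum of squared errors rather than the mean, so its alpha corresponds to n times the \lambda in the formula above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=42)
alpha = 1.0

# Closed-form ridge solution without an intercept:
#   w = (X^T X + alpha * I)^(-1) X^T y
w_closed = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# scikit-learn's Ridge with the intercept disabled should agree
w_sklearn = Ridge(alpha=alpha, fit_intercept=False).fit(X, y).coef_

print(np.allclose(w_closed, w_sklearn))
```

Adding alpha to the diagonal of X^T X is what keeps the system well conditioned even when features are strongly correlated.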

Let's see how to implement this using Python:

  • ridge = Ridge(alpha=1.0) : Creates a Ridge regression model with regularization strength alpha set to 1.0.
Python
from sklearn.linear_model import Ridge
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic regression data: 100 samples, 5 features
X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit Ridge (L2-regularized linear regression)
ridge = Ridge(alpha=1.0)
ridge.fit(X_train, y_train)
y_pred = ridge.predict(X_test)

# Evaluate on the held-out test set
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)
print("Coefficients:", ridge.coef_)

Output:

[Image: output of the Ridge Regression example]

The output shows the mean squared error (MSE) on the test set, where a lower MSE means more accurate predictions. The coefficients are the regularized feature weights.
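To see the shrinkage effect of the L2 penalty, the sketch below refits Ridge for several alpha values and records the length (L2 norm) of the coefficient vector. The list of alpha values is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=42)

alphas = [0.01, 1.0, 100.0, 10_000.0]   # illustrative values
norms = []
for alpha in alphas:
    coef = Ridge(alpha=alpha).fit(X, y).coef_
    # The L2 norm summarizes the overall size of the weights
    norms.append(float(np.linalg.norm(coef)))
    print(f"alpha={alpha:g}: ||w||_2 = {norms[-1]:.3f}")
```

The norm decreases as alpha grows: the weights are pulled toward zero without being removed from the model.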

3. Elastic Net Regression

Elastic Net Regression combines L1 and L2 regularization: it adds both the sum of the absolute values of the weights and the sum of their squares to the loss function. An extra hyperparameter controls the ratio between the L1 and L2 penalties.

\rm{Cost} = \frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2 + \lambda\left(\alpha\sum_{i=1}^{m}{|w_i|} + (1-\alpha) \sum_{i=1}^{m}{w_i^2}\right)

where

  • n = Number of examples (data points)
  • m = Number of features (predictor variables)
  • y_i = Actual target value for the ith example
  • \hat{y}_i = Predicted target value for the ith example
  • w_i = Coefficients of the features
  • \lambda = Regularization parameter that controls the strength of regularization
  • \alpha = Mixing parameter where 0 ≤ \alpha ≤ 1. \alpha = 1 corresponds to Lasso (L1) regularization, \alpha = 0 corresponds to Ridge (L2) regularization and values between 0 and 1 provide a balance of both L1 and L2 regularization
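The mixing behaviour described in the list above can be sketched in a few lines. The helper elastic_net_penalty and its argument names lam and mix are assumptions made for this illustration, with mix = 1 meaning pure L1. The second half checks the same limiting case in scikit-learn, where the mixing parameter is called l1_ratio.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

def elastic_net_penalty(w, lam, mix):
    """lam * (mix * sum|w_i| + (1 - mix) * sum w_i^2), with mix in [0, 1]."""
    return lam * (mix * np.sum(np.abs(w)) + (1.0 - mix) * np.sum(w ** 2))

w = np.array([1.0, -2.0, 0.0])
print(elastic_net_penalty(w, lam=1.0, mix=1.0))  # pure L1: |1| + |-2| + |0|
print(elastic_net_penalty(w, lam=1.0, mix=0.0))  # pure L2: 1 + 4 + 0

# In scikit-learn, l1_ratio=1.0 turns Elastic Net into Lasso
X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=42)
lasso = Lasso(alpha=0.1).fit(X, y)
enet_l1 = ElasticNet(alpha=0.1, l1_ratio=1.0).fit(X, y)
same_as_lasso = bool(np.allclose(lasso.coef_, enet_l1.coef_))
print(same_as_lasso)
```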

Let's see how to implement this using Python:

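  • model = ElasticNet(alpha=1.0, l1_ratio=0.5) : Creates an Elastic Net model with regularization strength alpha=1.0 and L1/L2 mixing ratio 0.5. In scikit-learn, alpha is the overall regularization strength (\lambda in the formula) and l1_ratio is the mixing parameter.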
Python
from sklearn.linear_model import ElasticNet
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic regression data: 100 samples, 10 features
X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit Elastic Net (combined L1 and L2 penalty)
model = ElasticNet(alpha=1.0, l1_ratio=0.5)
model.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)

print("Mean Squared Error:", mse)
print("Coefficients:", model.coef_)

Output:

[Image: output of the Elastic Net Regression example]

The output shows the MSE, which measures how far off the predictions are from the actual values (lower is better), and the fitted coefficients, whose magnitudes indicate how strongly each feature influences the prediction.

Learn more about the difference between the regularization techniques here: Lasso vs Ridge vs Elastic Net

What are Overfitting and Underfitting?

Overfitting and underfitting are terms used to describe the performance of machine learning models in relation to their ability to generalize from the training data to unseen data.


Overfitting happens when a machine learning model learns the training data too well, including the noise and random details. This makes the model perform poorly on new, unseen data because it memorizes the training data instead of understanding the general patterns.

For example, if we only study last week’s weather to predict tomorrow’s, our model might focus on one-time events like a sudden rainstorm, which won’t help with future predictions.

Underfitting is the opposite problem which happens when the model is too simple to learn even the basic patterns in the data. An underfitted model performs poorly on both training and new data. To fix this we need to make the model more complex or add more features.

For example, if we use only the average temperature of the year to predict tomorrow’s weather, the model misses important details like seasonal changes, which results in bad predictions.
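Both failure modes can be reproduced on a small synthetic problem. The sketch below is an illustration under assumed settings: a cosine curve with added noise as the ground truth, and polynomial degrees 1, 4 and 15 standing in for a too-simple, a reasonable and a very flexible model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

def true_function(x):
    # Assumed ground-truth curve for this demo
    return np.cos(1.5 * np.pi * x)

# Small noisy training set and a larger test set drawn from the same curve
x_train = np.sort(rng.uniform(0, 1, 30))
y_train = true_function(x_train) + rng.normal(0, 0.1, 30)
x_test = np.linspace(0.05, 0.95, 200)
y_test = true_function(x_test) + rng.normal(0, 0.1, 200)

results = {}
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree, include_bias=False),
                          LinearRegression())
    model.fit(x_train.reshape(-1, 1), y_train)
    train_mse = mean_squared_error(y_train, model.predict(x_train.reshape(-1, 1)))
    test_mse = mean_squared_error(y_test, model.predict(x_test.reshape(-1, 1)))
    results[degree] = (train_mse, test_mse)
    print(f"degree={degree:2d}  train MSE={train_mse:.4f}  test MSE={test_mse:.4f}")
```

The degree-1 model has a high error on both sets (underfitting). The degree-15 model reaches a lower training error than the simpler fits, and a test error noticeably above its training error is the usual sign of overfitting.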

What are Bias and Variance?

  • Bias is the error introduced by approximating a real-world relationship with a model that is too simple to represent it. A model with high bias (underfitting) is unable to learn the patterns in the data at hand and performs poorly on both training and new data.
  • Variance is the error that comes from the model's sensitivity to the particular training set it was given. It shows up as a gap between performance on the training data and on data the model has not seen before. A model with high variance (overfitting) has learned the noise that is present in the training data.

Finding a proper balance between the two is also known as the Bias-Variance Tradeoff which helps us to design an accurate model.

Bias Variance tradeoff

The Bias-Variance Tradeoff refers to the balance between bias and variance which affect predictive model performance. Finding the right tradeoff is important for creating models that generalize well to new data.

  • The bias-variance tradeoff shows the inverse relationship between bias and variance. When one decreases, the other tends to increase and vice versa.
  • Finding the right balance is important. An overly simple model with high bias won't capture the underlying patterns while an overly complex model with high variance will fit the noise in the data.
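The tradeoff can be estimated numerically by refitting the same model on many noisy versions of a dataset. The sketch below is a simulation under assumed settings: a sine curve as the true function, Gaussian noise, and polynomial fits of degree 1 and 10 as the simple and complex models.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
f_true = np.sin(2 * np.pi * x)   # assumed noise-free target
noise_std = 0.3
n_repeats = 200

def bias_variance(degree):
    """Estimate squared bias and variance of a polynomial fit of the given degree."""
    preds = np.empty((n_repeats, x.size))
    for r in range(n_repeats):
        # Fresh noisy training sample on the same inputs
        y_noisy = f_true + rng.normal(0, noise_std, x.size)
        poly = np.polynomial.Polynomial.fit(x, y_noisy, degree)
        preds[r] = poly(x)
    mean_pred = preds.mean(axis=0)
    bias_sq = float(np.mean((mean_pred - f_true) ** 2))  # average fit vs. truth
    variance = float(np.mean(preds.var(axis=0)))         # spread across refits
    return bias_sq, variance

bias_simple, var_simple = bias_variance(1)
bias_complex, var_complex = bias_variance(10)
print(f"degree 1 : bias^2={bias_simple:.4f}  variance={var_simple:.4f}")
print(f"degree 10: bias^2={bias_complex:.4f}  variance={var_complex:.4f}")
```

The straight line is consistently wrong in the same way (high bias, low variance), while the degree-10 polynomial is right on average but changes noticeably from one sample to the next (low bias, higher variance).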

Benefits of Regularization

Now, let’s see various benefits of regularization which are as follows:

  1. Prevents Overfitting: Regularization helps models focus on underlying patterns instead of memorizing noise in the training data.
  2. Improves Interpretability: L1 (Lasso) regularization simplifies models by reducing less important feature coefficients to zero.
  3. Enhances Performance: Prevents excessive weighting of outliers or irrelevant features, which helps improve overall model accuracy on new data.
  4. Stabilizes Models: Reduces sensitivity to minor data changes, which ensures consistency across different data subsets.
  5. Prevents Complexity: Keeps the model from becoming too complex, which is important for limited or noisy data.
  6. Handles Multicollinearity: Reduces the magnitudes of the coefficients of correlated features, which helps improve model stability.
  7. Allows Fine-Tuning: Hyperparameters like alpha and lambda control the regularization strength, which helps in balancing bias and variance.
  8. Promotes Consistency: Ensures reliable performance across different datasets which reduces the risk of large performance shifts.
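In practice the regularization strength is usually chosen by cross-validation rather than by hand. The following is a minimal sketch using scikit-learn's RidgeCV. The candidate grid and the noisier synthetic dataset are assumptions made for this illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

# Assumed demo data with more noise than the earlier examples
X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Try a logarithmic grid of strengths and keep the one with the best 5-fold CV score
candidate_alphas = np.logspace(-3, 3, 13)
ridge_cv = RidgeCV(alphas=candidate_alphas, cv=5).fit(X_train, y_train)

best_alpha = float(ridge_cv.alpha_)
test_r2 = ridge_cv.score(X_test, y_test)   # R^2 on the held-out test set
print(f"Selected alpha: {best_alpha:g}")
print(f"Test R^2: {test_r2:.3f}")
```

LassoCV and ElasticNetCV offer the same kind of search for the other two penalties.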

Mastering regularization techniques helps us to create models that balance complexity and accuracy which leads to better predictions in real-world problems.


AlindGupta