How to Calculate R^2 with Scikit-Learn

Last Updated : 05 Aug, 2024

The coefficient of determination, denoted as R², is an essential metric in regression analysis. It indicates the extent to which the independent variables account for the variation in the dependent variable.

In this article, we will walk you through calculating R² using Scikit-Learn, a powerful Python library for machine learning.

What is R²?

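R² quantifies the proportion of variance in the dependent variable that can be predicted from the independent variables. For a least-squares linear model with an intercept, evaluated on its own training data, it lies between 0 and 1: 0 means the model explains none of the variability and 1 means it explains all of it. In general, and in particular with Scikit-Learn's r2_score, R² can be negative when the predictions fit worse than always predicting the mean of the actual values.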
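The boundary cases can be checked with a small sketch. The arrays below are made-up illustrative values, not data from this article; the only library call is r2_score. Note that the last case falls below 0, which happens whenever predictions are worse than simply predicting the mean.

```python
import numpy as np
from sklearn.metrics import r2_score

# Small hand-made target vector (illustrative values only)
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Perfect predictions: every residual is zero
r2_perfect = r2_score(y_true, y_true)

# Always predicting the mean of y_true: SS_res equals SS_tot
y_mean = np.full_like(y_true, y_true.mean())
r2_mean = r2_score(y_true, y_mean)

# Predictions in reversed order: SS_res is larger than SS_tot
y_bad = y_true[::-1].copy()
r2_bad = r2_score(y_true, y_bad)

print(f"perfect: {r2_perfect}, mean predictor: {r2_mean}, reversed: {r2_bad}")
```

Here SS_tot is 10 and the reversed predictions give SS_res of 40, so the third score is 1 - 40/10 = -3.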

Mathematically, R² is expressed as:

R^2 = 1 - \frac{\text{SS}_{res}}{\text{SS}_{tot}}

Here:

  • SS_{res} is the residual sum of squares: the sum of the squared differences between the actual and predicted values.
  • SS_{tot} is the total sum of squares: the sum of the squared differences between the actual values and their mean.
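As a minimal sketch of the formula above, the two sums can be computed directly with NumPy and compared against r2_score. The helper name r2_manual and the sample arrays are assumptions made for illustration; they are not part of Scikit-Learn.

```python
import numpy as np
from sklearn.metrics import r2_score

def r2_manual(y_true, y_pred):
    """Hypothetical helper: R^2 = 1 - SS_res / SS_tot for 1-D targets."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Illustrative values only
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

manual = r2_manual(y_true, y_pred)
library = r2_score(y_true, y_pred)
print(f"manual: {manual}, r2_score: {library}")
```

The sketch does not guard against a constant y_true, where SS_tot is zero and the ratio is undefined.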

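Calculating R² with Scikit-Learn for Sample Data

Let's go through an example that calculates R² on sample data generated from a simple linear relationship. No model is fitted here: the predictions come from the noise-free line that was used to generate the data.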

Step 1: Import Necessary Libraries

import numpy as np
from sklearn.metrics import r2_score

Step 2: Generate Sample Data

# Generate random data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Assuming a perfect model prediction (just for the sake of demonstration)
y_pred = 4 + 3 * X

Step 3: Compute R² Using Scikit-Learn

# Flatten the arrays to use in r2_score
y = y.flatten()
y_pred = y_pred.flatten()

# Compute R² using Scikit-Learn
R2_sklearn = r2_score(y, y_pred)
print(f"R² (Scikit-Learn Calculation): {R2_sklearn}")

Complete Code

Python
import numpy as np
from sklearn.metrics import r2_score

# Generate random data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Assuming a perfect model prediction (just for the sake of demonstration)
y_pred = 4 + 3 * X

# Flatten the arrays to use in r2_score
y = y.flatten()
y_pred = y_pred.flatten()

# Compute R² using Scikit-Learn
R2_sklearn = r2_score(y, y_pred)
print(f"R² (Scikit-Learn Calculation): {R2_sklearn}")

Output:

R² (Scikit-Learn Calculation): 0.7639751938835576
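The example above scores the noise-free generating line rather than a fitted model. As a sketch of the fitted case, the same data can be passed to LinearRegression; for regressors, the estimator's score method returns R², so it should agree with r2_score on the same inputs. The variable names here are chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Same synthetic data as in the example above
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = (4 + 3 * X + np.random.randn(100, 1)).ravel()

# Fit an ordinary least-squares line and predict on the training data
model = LinearRegression()
model.fit(X, y)
y_fit = model.predict(X)

# Two routes to the same number
r2_from_metric = r2_score(y, y_fit)
r2_from_score = model.score(X, y)
print(f"r2_score: {r2_from_metric}, model.score: {r2_from_score}")
```

Because least squares minimizes the residual sum of squares on the training data, the fitted line's training R² is at least as high as that of the generating line.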

Calculating R² for a Simple Polynomial Regression Problem Using Scikit-Learn

Polynomial regression is a type of regression analysis in which the relationship between the independent variable X and the dependent variable y is modeled as an n-th degree polynomial. We will compute the R² value for a polynomial regression model using Python.

Step 1: Import Libraries

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

Step 2: Generate Sample Data

We'll create a simple nonlinear dataset:

# Generate random data
np.random.seed(42)
X = 6 * np.random.rand(100, 1) - 3
y = 0.5 * X**2 + X + 2 + np.random.randn(100, 1)

Step 3: Prepare Polynomial Features

Transform the input data to include polynomial features up to the desired degree (e.g., degree 2):

poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)
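To see what the transformation produces, here is a small sketch on a two-row input (illustrative values). With degree=2 and include_bias=False, each row [x] becomes [x, x²]; with include_bias=True a leading column of ones is added.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two illustrative samples with a single feature each
X_small = np.array([[2.0], [3.0]])

# Without the bias column: columns are x and x^2
no_bias = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X_small)

# With the bias column: columns are 1, x and x^2
with_bias = PolynomialFeatures(degree=2, include_bias=True).fit_transform(X_small)

print(no_bias)
print(with_bias)
```

The bias column is left out in this article's example because LinearRegression fits its own intercept by default.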

Step 4: Fit the Polynomial Regression Model

Fit a linear regression model to the polynomial features:

model = LinearRegression()
model.fit(X_poly, y)
y_pred = model.predict(X_poly)

Step 5: Calculate R² Using Scikit-Learn

Compute R² with Scikit-Learn's r2_score function, which compares the actual values with the model's predictions:

# Flatten the arrays to use in r2_score
y = y.flatten()
y_pred = y_pred.flatten()

# Compute R² using Scikit-Learn
R2_sklearn = r2_score(y, y_pred)
print(f"R² (Scikit-Learn Calculation): {R2_sklearn}")

Visualizing the Results

It's often helpful to visualize the polynomial regression curve along with the data points:

plt.scatter(X, y, color='blue', label='Actual')
# Sort the values for better plotting
sorted_indices = X.flatten().argsort()
plt.plot(X[sorted_indices], y_pred[sorted_indices], color='red', linewidth=2, label='Predicted')
plt.title('Actual vs Predicted (Polynomial Regression)')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()

Complete Code

Python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Generate random data
np.random.seed(42)
X = 6 * np.random.rand(100, 1) - 3
y = 0.5 * X**2 + X + 2 + np.random.randn(100, 1)

poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)

model = LinearRegression()
model.fit(X_poly, y)
y_pred = model.predict(X_poly)

# Flatten the arrays to use in r2_score
y = y.flatten()
y_pred = y_pred.flatten()

# Compute R² using Scikit-Learn
R2_sklearn = r2_score(y, y_pred)
print(f"R² (Scikit-Learn Calculation): {R2_sklearn}")

plt.scatter(X, y, color='blue', label='Actual')
# Sort the values for better plotting
sorted_indices = X.flatten().argsort()
plt.plot(X[sorted_indices], y_pred[sorted_indices], color='red', linewidth=2, label='Predicted')
plt.title('Actual vs Predicted (Polynomial Regression)')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()

Output:

R² (Scikit-Learn Calculation): 0.8525067519009746
[Figure: Polynomial Regression Curve]
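One way to put the R² value in context is to compare it across polynomial degrees on the same data. The sketch below reuses the data-generation step from this example; the loop over degrees 1 to 3 and the scores dictionary are additions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.preprocessing import PolynomialFeatures

# Same synthetic quadratic data as in the example above
np.random.seed(42)
X = 6 * np.random.rand(100, 1) - 3
y = (0.5 * X**2 + X + 2 + np.random.randn(100, 1)).ravel()

scores = {}
for degree in (1, 2, 3):
    # Expand the single feature into polynomial terms up to this degree
    X_poly = PolynomialFeatures(degree=degree, include_bias=False).fit_transform(X)
    model = LinearRegression().fit(X_poly, y)
    # R² on the training data for this degree
    scores[degree] = r2_score(y, model.predict(X_poly))
    print(f"degree {degree}: R² = {scores[degree]}")
```

Training R² cannot decrease as the degree grows, because each higher-degree model contains the lower-degree one. A higher training score therefore does not by itself show that the model generalizes better; scoring on held-out data is the usual check.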

Conclusion

Calculating R² with Scikit-Learn is straightforward and gives a quick measure of how well predictions match the actual data. Because r2_score needs only the actual and predicted values, it works with predictions from any source, not just a fitted Scikit-Learn estimator. This makes it a convenient way to check the goodness of fit of your predictions in regression analyses.


alka1974