Solving Linear Regression in Python

Last Updated : 18 Apr, 2025

Linear regression is a widely used statistical method for finding the relationship between a dependent variable and one or more independent variables. It is used to make predictions by finding the line that best fits the data we have. The most common approach to fitting a linear regression model is the least-squares method, which minimizes the sum of squared errors between the predicted and actual values. The equation of a straight line is given as:

[Tex]y=mx+b[/Tex]

Where:

  • m is the slope of the line.
  • b is the intercept, i.e., the value of y when x = 0.

To build a simple linear regression model we need to calculate the slope (m) and the intercept (b) that best fit the data points. These parameters can be calculated using mathematical formulas derived from the data. Consider a dataset where the independent attribute is represented by x and the dependent attribute is represented by y.

Slope (m): [Tex]m = \frac{S_{xy}}{S_{xx}}[/Tex]

Where:

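  • [Tex]S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})[/Tex] is the sum of cross-deviations of x and y (the sample covariance multiplied by n - 1).
  • [Tex]S_{xx} = \sum (x_i - \bar{x})^2[/Tex] is the sum of squared deviations of x (the sample variance multiplied by n - 1).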

Intercept (b): [Tex]b = \bar{y} - m \cdot \bar{x}[/Tex]

  • Where [Tex]\bar{x}[/Tex] and [Tex]\bar{y}[/Tex] are the means of x and y, respectively.
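The slope and intercept formulas follow from minimizing the sum of squared errors. The derivation below is a brief sketch rather than a full proof; the symbol E for the total squared error is introduced here for illustration and does not appear elsewhere in the article.

```latex
\begin{aligned}
E(m, b) &= \sum_{i=1}^{n} (y_i - m x_i - b)^2 \\
\frac{\partial E}{\partial b} &= -2 \sum_{i=1}^{n} (y_i - m x_i - b) = 0
  \;\Longrightarrow\; b = \bar{y} - m \bar{x} \\
\frac{\partial E}{\partial m} &= -2 \sum_{i=1}^{n} x_i (y_i - m x_i - b) = 0 \\
\text{substituting } b: \quad
  0 &= \sum_{i=1}^{n} x_i \big[ (y_i - \bar{y}) - m (x_i - \bar{x}) \big] \\
    &= \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
       - m \sum_{i=1}^{n} (x_i - \bar{x})^2
     = S_{xy} - m S_{xx} \\
\Longrightarrow\; m &= \frac{S_{xy}}{S_{xx}}
\end{aligned}
```

Replacing x_i with (x_i - x̄) in the fourth line is allowed because the bracketed terms sum to zero, so subtracting x̄ times that sum changes nothing.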

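Applying these formulas to the dataset used in the code below, x = (1, 2, 3, 4, 5) and y = (7, 14, 15, 18, 19), gives means of 3 and 14.6, with Sxy = 28 and Sxx = 10:

  • Slope = 28/10 = 2.8
  • Intercept = 14.6 - 2.8 * 3 = 6.2

Therefore the desired equation of the regression model is y = 2.8x + 6.2.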

We use these values to predict the values of y for the given values of x.
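As a quick check of the arithmetic, the hand calculation can be reproduced in plain Python without any libraries. This is a minimal sketch: the helper name `fit_line` is made up for illustration, and the data are the x and y values from the article's Step 2.

```python
# Plain-Python sketch of the least-squares formulas (no third-party libraries).
# `fit_line` is an illustrative helper name, not part of any library.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations about the means
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return slope, intercept


xs = [1, 2, 3, 4, 5]
ys = [7, 14, 15, 18, 19]

slope, intercept = fit_line(xs, ys)

# Predict y for each given x using the fitted line
predictions = [slope * x + intercept for x in xs]

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print("predictions:", [round(p, 2) for p in predictions])
```

Rounded to two decimals, the slope and intercept should come out as 2.80 and 6.20, in agreement with the hand calculation.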

Python Implementation

Below is the Python code to confirm the calculations and visualize the results.

Step 1: Import Libraries

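In this step we import the necessary libraries. numpy and matplotlib are used for the calculations and plots that follow; sklearn and statsmodels are imported as well, although the manual calculation in the following steps relies only on numpy and matplotlib.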

Python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import statsmodels.api as sm

Step 2: Define Dataset and Compute Slope and Intercept

Next we calculate the slope (b1, called m above) and the intercept (b0, called b above) of the regression line using the least squares method. We also create a scatter plot of the original data points to visualize the relationship between x and y.

Python
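# Sample data: x is the independent variable, y the dependent variable
x = np.array([1, 2, 3, 4, 5])
y = np.array([7, 14, 15, 18, 19])
n = np.size(x)

x_mean = np.mean(x)
y_mean = np.mean(y)

# Expanded forms of the sums of cross-deviations and squared deviations
Sxy = np.sum(x * y) - n * x_mean * y_mean
Sxx = np.sum(x * x) - n * x_mean * x_mean

b1 = Sxy / Sxx              # slope
b0 = y_mean - b1 * x_mean   # intercept
print('slope b1 is', b1)
print('intercept b0 is', b0)

# Scatter plot of the raw data
plt.scatter(x, y)
plt.xlabel('Independent variable X')
plt.ylabel('Dependent variable y')
plt.show()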

Output:

Figure: Slope and Intercept
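The Step 2 code computes Sxy as Σxy - n·x̄·ȳ and Sxx as Σx² - n·x̄², whereas the formulas earlier in the article are written as sums of deviations from the means. The two forms are algebraically identical. The sketch below checks this numerically for the article's data; the variable names with `_dev` and `_exp` suffixes are introduced here only for the comparison.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([7, 14, 15, 18, 19])
n = x.size
x_mean, y_mean = x.mean(), y.mean()

# Deviation form, as written in the formulas
Sxy_dev = np.sum((x - x_mean) * (y - y_mean))
Sxx_dev = np.sum((x - x_mean) ** 2)

# Expanded form, as used in the Step 2 code
Sxy_exp = np.sum(x * y) - n * x_mean * y_mean
Sxx_exp = np.sum(x * x) - n * x_mean * x_mean

# Both comparisons should be True, up to floating-point rounding
print(np.isclose(Sxy_dev, Sxy_exp), np.isclose(Sxx_dev, Sxx_exp))
```

For small, well-scaled data like this either form works. The deviation form is usually the safer choice when the values are large compared with their spread, because the expanded form subtracts two nearly equal numbers and can lose precision.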

Step 3: Plot Data Points and Regression Line

Now that we have the regression equation we use it to predict the y values for each x. Then we plot both the original points in red and the regression line in green to show the fit.

Python
# Predicted y values from the fitted line
y_pred = b1 * x + b0

plt.scatter(x, y, color='red')        # original data points
plt.plot(x, y_pred, color='green')    # regression line
plt.xlabel('X')
plt.ylabel('y')
plt.show()

Step 4: Evaluate Model Performance

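To evaluate how well our model fits the data we calculate the squared error (the sum of squared residuals), the mean squared error (MSE), the root mean square error (RMSE) and the coefficient of determination (R²). The error metrics tell us how far off our predictions are from the actual values, while R² measures the share of the variance in y that the regression line explains.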

Python
# Residuals and their sum of squares
error = y - y_pred
se = np.sum(error**2)
print('squared error is', se)

mse = se / n
print('mean squared error is', mse)

rmse = np.sqrt(mse)
print('root mean square error is', rmse)

# R^2 = 1 - (residual sum of squares / total sum of squares)
SSt = np.sum((y - y_mean)**2)
R2 = 1 - (se / SSt)
print('R square is', R2)

Output:

Figure: Model Evaluation

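The evaluation metrics suggest that the model fits this data reasonably well. The squared error is about 10.8, the MSE about 2.16 and the RMSE about 1.47, so a typical prediction is off by roughly 1.5 units of y. R² is about 0.88, meaning the line explains roughly 88% of the variance in y.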
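The manual results can also be cross-checked against scikit-learn, which Step 1 imports but the steps above do not use. The following is a minimal sketch, assuming scikit-learn is installed; it refits the same five points with `LinearRegression` and recomputes the metrics with `mean_squared_error` and `r2_score`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

x = np.array([1, 2, 3, 4, 5])
y = np.array([7, 14, 15, 18, 19])

# scikit-learn expects a 2-D feature matrix of shape (n_samples, n_features)
X = x.reshape(-1, 1)

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

slope = model.coef_[0]
intercept = model.intercept_
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"MSE = {mse:.3f}, RMSE = {np.sqrt(mse):.3f}, R^2 = {r2:.3f}")
```

If the manual implementation is correct, these values should agree with the slope, intercept, MSE, RMSE and R² printed in Steps 2 and 4, up to floating-point rounding.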



author
ektamaini