Weighted Least Squares Regression in Python

Last Updated : 12 Apr, 2024

Weighted Least Squares (WLS) regression is a powerful extension of ordinary least squares regression, particularly useful when dealing with data that violates the assumption of constant variance.

In this guide, we give a brief overview of Weighted Least Squares regression and demonstrate how to implement it in Python using the statsmodels library.

What is Least Squares Regression?

Least squares regression is a statistical method for finding the line or curve that best summarizes the relationship between two or more variables. Picture a scatterplot of data points: the method chooses the line that minimizes the sum of squared differences between the observed values and the values the line predicts.
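As a minimal sketch of this idea, the snippet below fits a straight line with the textbook closed-form expressions for the slope and intercept, then reports the sum of squared residuals that the fit minimizes. The variable names are illustrative, and the data values are the same small sample used later in this article.

```python
import numpy as np

# Small sample (same values as the article's later example)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.6, 3.7, 4.3, 5.8, 6.2])

# Closed-form least squares estimates for a straight line y = intercept + slope * x
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Sum of squared residuals: the quantity that least squares minimizes
residuals = y - (intercept + slope * x)
sse = np.sum(residuals ** 2)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, SSE={sse:.3f}")
# prints: slope=0.93, intercept=1.73, SSE=0.219
```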

What is Weighted Least Squares Regression?

Weighted Least Squares (WLS) regression fits a regression line in the same way as ordinary least squares, but it gives some data points more importance (or "weight") than others. Each observation's weight is based on the variance of its error term: observations with lower variability, or higher reliability, receive higher weights, usually proportional to the inverse of the error variance. Points with higher weights have a stronger influence on the fitted line. This makes WLS well suited to heteroscedastic data, where the variability is not the same for all observations, and it can give a more accurate model than treating all points equally.

Formula: \hat{\beta} = (X^T W X)^{-1} X^T W y

Where,

  • \hat{\beta} is the vector of estimated coefficients.
  • X is the matrix of independent variables (with each row representing an observation and each column a different variable).
  • W is a diagonal matrix of weights, one per observation, where larger weights indicate observations with greater importance or reliability. A common choice is w_i = 1 / \sigma_i^2, the inverse of the error variance of observation i.
  • y is the vector of dependent variable observations.
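The formula can be evaluated directly with NumPy. The sketch below is for illustration only: the helper name `wls_closed_form` and the weight values are hypothetical, and it solves the linear system rather than forming the matrix inverse explicitly.

```python
import numpy as np

def wls_closed_form(X, y, w):
    """Solve (X^T W X) beta = X^T W y, where W = diag(w)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    XtW = X.T * w  # equivalent to X.T @ np.diag(w), without building W
    return np.linalg.solve(XtW @ X, XtW @ y)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.6, 3.7, 4.3, 5.8, 6.2])
X = np.column_stack([np.ones_like(x), x])  # first column is the intercept

# Hypothetical weights: the first observations are treated as more reliable
w = np.array([4.0, 4.0, 1.0, 1.0, 0.25])

beta_wls = wls_closed_form(X, y, w)
beta_equal = wls_closed_form(X, y, np.ones_like(x))  # equal weights

print("WLS coefficients (intercept, slope):", beta_wls)
print("Equal-weight coefficients:", beta_equal)
```

With equal weights the result reduces to the ordinary least squares solution, and multiplying all weights by the same constant leaves the coefficients unchanged. Only the relative sizes of the weights matter.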

Weighted Least Squares Regression Implementation in Python

In Python, the statsmodels library is commonly used for statistical modeling tasks, including ordinary least squares (OLS) regression. We will use the same library to implement weighted least squares (WLS) regression.

Steps for Weighted Least Squares Regression Implementation in Python

  • Define your sample data:
    • Create arrays for the independent variable(s) (X) and dependent variable (y).
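    • WLS is appropriate when the variability of the dependent variable (y) is not constant across observations (heteroscedasticity).
  • Calculate weights:
    • Calculate the deviations by subtracting the mean of the dependent variable (y) from each observed value.
    • Compute the variance of these deviations.
    • Calculate the weight as the inverse of this variance. Note that this yields a single number, so every observation receives the same weight and the coefficient estimates coincide with those of OLS. For observations to contribute unequally, supply one weight per observation, typically the inverse of that observation's error variance, so that observations with higher variance contribute less to the estimation.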
  • Add constant term:
    • Include a constant term in the independent variable(s) matrix (X) using sm.add_constant().
  • Fit the model:
    • Use sm.WLS() to specify the weighted least squares regression model.
    • Use .fit() to estimate the parameters of the model.
  • Inspect results
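When per-observation error variances are not known in advance, they have to be estimated. The sketch below shows one common approach, which is an assumption here and not the method used in this article: fit OLS first, model the absolute residuals as a linear function of x, and use the inverse of the squared fitted spread as weights. The synthetic data and all variable names are illustrative.

```python
import numpy as np

# Synthetic heteroscedastic data (illustrative): noise grows with x
rng = np.random.default_rng(42)
x = np.linspace(1, 10, 50)
y = 1.5 + 2.0 * x + rng.normal(scale=0.3 * x)

X = np.column_stack([np.ones_like(x), x])

# Step 1: ordinary least squares fit and its residuals
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_ols

# Step 2: model the residual spread as a linear function of x
gamma, *_ = np.linalg.lstsq(X, np.abs(resid), rcond=None)
sigma_hat = np.clip(X @ gamma, 1e-6, None)  # keep the estimated spread positive

# Step 3: one weight per observation, the inverse of the estimated variance
weights = 1.0 / sigma_hat ** 2

print("First weight:", weights[0], "Last weight:", weights[-1])
```

The resulting `weights` array has one entry per observation and can be passed to `sm.WLS()` through its `weights` argument.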

The code below demonstrates the implementation using the statsmodels library.

Python3
import numpy as np
import statsmodels.api as sm

# Sample data
X = np.array([1, 2, 3, 4, 5])
y = np.array([2.6, 3.7, 4.3, 5.8, 6.2])  # Adjusted y values with more variability

# Calculate the weight as the inverse of the variance of the deviations from the mean.
# Note: this is a single scalar, so every observation gets the same weight and
# the coefficient estimates equal those of OLS.
errors = y - np.mean(y)
error_variance = np.var(errors)
weights = 1 / error_variance

# Fit weighted least squares regression model
X = sm.add_constant(X)  # add the intercept column
model = sm.WLS(y, X, weights=weights)
results = model.fit()

# Print regression results
print(results.summary())

Output:

                            WLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.975
Model:                            WLS   Adj. R-squared:                  0.967
Method:                 Least Squares   F-statistic:                     118.5
Date:                Wed, 10 Apr 2024   Prob (F-statistic):            0.00166
Time:                        12:48:35   Log-Likelihood:                0.72561
No. Observations:                   5   AIC:                             2.549
Df Residuals:                       3   BIC:                             1.768
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          1.7300      0.283      6.105      0.009       0.828       2.632
x1             0.9300      0.085     10.885      0.002       0.658       1.202
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   3.395
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.537
Skew:                           0.600   Prob(JB):                        0.765
Kurtosis:                       1.935   Cond. No.                         8.37
==============================================================================

  • The R-squared value is 0.975, indicating that 97.5% of the variance in the dependent variable is explained by the independent variable. Adjusted R-squared modifies R-squared to account for the number of predictors in the model; here it is 0.967.
  • The F-statistic tests the overall significance of the regression model. Its value is 118.5 with a low p-value of 0.00166, indicating that the model is statistically significant.
  • The estimated intercept (const) is 1.73 and the estimated slope (x1) is 0.93, each with a p-value below 0.05.

Ordinary Least Squares Regression Vs Weighted Least Squares Regression

| Aspect | Ordinary Least Squares (OLS) Regression | Weighted Least Squares (WLS) Regression |
|---|---|---|
| Objective | Minimize the sum of squared differences between observed and predicted values. | Minimize the weighted sum of squared differences between observed and predicted values. |
| Assumption | Assumes constant variance (homoscedasticity) of errors. | Allows for varying variance (heteroscedasticity) of errors. |
| Weighting of Observations | Assigns equal weight to each observation. | Assigns weights to observations based on the variance of the error term associated with each observation. |
| Usage | Suitable for datasets with constant variance of errors. | Suitable for datasets with varying variance of errors. |
| Implementation | sm.OLS() in statsmodels. | sm.WLS() in statsmodels, with a weights argument. |
| Model Evaluation | Gives unbiased and efficient coefficient estimates under homoscedasticity. | Gives more precise coefficient estimates under heteroscedasticity when the weights are correct. |
| Example | Fit a straight line through data points. | Fit a line that adjusts for varying uncertainty in data points. |
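The difference in objectives can be checked numerically. The sketch below uses `np.polyfit`, whose `w` argument multiplies the unsquared residuals, so `w = 1 / sigma` corresponds to WLS weights of `1 / sigma**2`. The `sigma` values and the helper `weighted_sse` are hypothetical and chosen only for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.6, 3.7, 4.3, 5.8, 6.2])
sigma = np.array([0.1, 0.1, 0.2, 0.5, 1.0])  # hypothetical error std per point

# OLS: equal weights. WLS: np.polyfit applies w to the unsquared residuals.
ols_slope, ols_intercept = np.polyfit(x, y, 1)
wls_slope, wls_intercept = np.polyfit(x, y, 1, w=1.0 / sigma)

def weighted_sse(slope, intercept, weights):
    """Weighted sum of squared residuals for the line intercept + slope * x."""
    resid = y - (slope * x + intercept)
    return float(np.sum(weights * resid ** 2))

weights = 1.0 / sigma ** 2
equal = np.ones_like(x)

print("Unweighted SSE: OLS", weighted_sse(ols_slope, ols_intercept, equal),
      "WLS", weighted_sse(wls_slope, wls_intercept, equal))
print("Weighted SSE:   OLS", weighted_sse(ols_slope, ols_intercept, weights),
      "WLS", weighted_sse(wls_slope, wls_intercept, weights))
```

Each fit attains the smaller value of the objective it was designed to minimize: the OLS line has the lower unweighted sum, and the WLS line has the lower weighted sum.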

Advantages of Weighted Least Squares Regression

  • Handles Varying Data Uncertainty: WLS regression accommodates data where the uncertainty (variance) changes across observations, giving more precise estimates than OLS regression when the weights reflect the true error variances.
  • Improved Parameter Estimates: By giving more weight to reliable data points, WLS regression offers more precise coefficient estimates and more trustworthy standard errors in the presence of heteroscedasticity.
  • Robustness to Noisy Observations: Because high-variance observations are down-weighted, a few very noisy measurements have less influence on the fit, which makes WLS suitable for fields where data exhibit heteroscedasticity.

Disadvantages of Weighted Least Squares Regression

  • Need for Correct Weighting: Correctly specifying weights based on error variance is crucial; misspecified weights can make the estimates less precise and the reported standard errors misleading.
  • Complexity in Weight Determination: Determining appropriate weights, especially in complex datasets, can be challenging and may require careful consideration.
  • Computational Overhead: Implementing WLS regression may involve additional computational complexity, especially with large datasets or complex weighting schemes.
  • Sensitivity to Outliers: WLS regression, like OLS, can be sensitive to outliers, which may affect estimation accuracy if not properly addressed.

Conclusion

Weighted Least Squares (WLS) regression offers a valuable enhancement to traditional regression methods by accommodating data with varying levels of uncertainty. By assigning weights based on error variance, WLS regression provides more accurate parameter estimates, making it a powerful tool across diverse fields from finance to healthcare.

pmishra01