
Linear and Quadratic Discriminant Analysis using Sklearn

Last Updated : 20 May, 2024

Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA) are two well-known classification methods in machine learning. They are especially helpful when you have labeled data and want to classify new observations into predefined categories.

In this article, we implement both techniques, Linear and Quadratic Discriminant Analysis, using Sklearn.

Table of Contents

  • Understanding Linear and Quadratic Discriminant Analysis
  • Implementing Linear and Quadratic Discriminant Analysis with Scikit-Learn
    • Applying Linear Discriminant Analysis (LDA)
    • Applying Quadratic Discriminant Analysis (QDA)
    • Visualizing Linear and Quadratic Discriminant Analysis

Understanding Linear and Quadratic Discriminant Analysis

Linear Discriminant Analysis (LDA)

Linear Discriminant Analysis assumes that the data in each class is normally distributed and that all classes share the same covariance matrix. It finds a linear combination of features that best separates the classes, sometimes referred to as Fisher's linear discriminant. The idea is to maximize the separation between classes while projecting the data into a lower-dimensional space.

Under these assumptions, LDA determines the best linear decision boundary by maximizing the ratio of between-class variance to within-class variance (equivalently, minimizing the ratio of within-class to between-class variance).
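For reference, the ratio described above is commonly written as the Fisher criterion. The symbols here are notation introduced for illustration and do not appear in the original text: w is the projection direction, x_i a sample with label y_i, mu_c the mean of class c, mu the overall mean, and n_c the number of samples in class c.

```latex
J(\mathbf{w}) = \frac{\mathbf{w}^\top S_B \,\mathbf{w}}{\mathbf{w}^\top S_W \,\mathbf{w}},
\qquad
S_W = \sum_{c} \sum_{i \,:\, y_i = c} (\mathbf{x}_i - \boldsymbol{\mu}_c)(\mathbf{x}_i - \boldsymbol{\mu}_c)^\top,
\qquad
S_B = \sum_{c} n_c \,(\boldsymbol{\mu}_c - \boldsymbol{\mu})(\boldsymbol{\mu}_c - \boldsymbol{\mu})^\top
```

LDA looks for the directions w that make J(w) as large as possible, so that class means are far apart relative to the spread inside each class.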

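The classical steps to compute LDA are listed below (sklearn's default svd solver reaches an equivalent result without explicitly forming the covariance matrix):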

  • Compute the mean vectors for each class.
  • Compute the within-class and between-class scatter matrices.
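  • Compute the eigenvalues and eigenvectors of the product of the inverse within-class scatter matrix and the between-class scatter matrix.
  • Select the k eigenvectors that correspond to the k largest eigenvalues (k can be at most one fewer than the number of classes) to form a new feature space.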
  • Project the data onto the new feature space.
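The steps above can be sketched directly in NumPy. This is a minimal illustration of the classical procedure, not sklearn's internal implementation; the function name fisher_lda_projection and the toy data are hypothetical and chosen only for this example.

```python
import numpy as np

def fisher_lda_projection(X, y, k):
    """Return an (n_features, k) projection matrix from classical Fisher LDA."""
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)

    S_W = np.zeros((n_features, n_features))  # within-class scatter
    S_B = np.zeros((n_features, n_features))  # between-class scatter
    for c in np.unique(y):
        X_c = X[y == c]
        mean_c = X_c.mean(axis=0)                   # step 1: class mean vector
        centered = X_c - mean_c
        S_W += centered.T @ centered                # step 2: within-class scatter
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_B += X_c.shape[0] * (diff @ diff.T)       # step 2: between-class scatter

    # Step 3: eigen-decomposition of inv(S_W) @ S_B
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    # Step 4: keep the k eigenvectors with the largest eigenvalues
    order = np.argsort(eigvals.real)[::-1][:k]
    return eigvecs[:, order].real

# Hypothetical toy data: three Gaussian blobs in two dimensions
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [3.0, 3.0], [0.0, 4.0]])
X_toy = np.vstack([rng.normal(m, 1.0, size=(50, 2)) for m in means])
y_toy = np.repeat([0, 1, 2], 50)

W = fisher_lda_projection(X_toy, y_toy, k=1)
X_projected = X_toy @ W                             # step 5: project the data
print(X_projected.shape)
```

In practice, LinearDiscriminantAnalysis with its transform method does this work for you; the sketch is only meant to make the individual steps concrete.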

Quadratic Discriminant Analysis (QDA)

QDA is similar to LDA but does not assume that the covariance matrices of the classes are equal. Describing each class with its own covariance matrix lets QDA build more flexible (quadratic) decision boundaries.

The steps to compute QDA are:

  • Compute the mean vector and covariance matrix for each class.
  • Use the quadratic form of the discriminant function to classify new data.
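The two steps above can be sketched in NumPy as follows. This is a simplified illustration under the Gaussian assumption, not sklearn's implementation; the helper names qda_fit and qda_predict and the toy data are hypothetical. Each class is scored with the log of its Gaussian density plus the log of its prior, and the class with the highest score wins.

```python
import numpy as np

def qda_fit(X, y):
    """Step 1: estimate mean, covariance, and prior for each class."""
    params = {}
    for c in np.unique(y):
        X_c = X[y == c]
        params[c] = (
            X_c.mean(axis=0),
            np.cov(X_c, rowvar=False),
            X_c.shape[0] / X.shape[0],
        )
    return params

def qda_predict(X, params):
    """Step 2: evaluate the quadratic discriminant and pick the best class."""
    classes, scores = [], []
    for c, (mean, cov, prior) in params.items():
        diff = X - mean
        # Squared Mahalanobis distance of each row to the class mean
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        classes.append(c)
        scores.append(-0.5 * logdet - 0.5 * maha + np.log(prior))
    best = np.argmax(np.column_stack(scores), axis=1)
    return np.array(classes)[best]

# Hypothetical toy data: two classes with clearly different covariances
rng = np.random.default_rng(1)
X0 = rng.normal([0.0, 0.0], [0.5, 0.5], size=(100, 2))
X1 = rng.normal([4.0, 4.0], [2.0, 0.5], size=(100, 2))
X_toy = np.vstack([X0, X1])
y_toy = np.repeat([0, 1], 100)

params = qda_fit(X_toy, y_toy)
accuracy = np.mean(qda_predict(X_toy, params) == y_toy)
print("Training accuracy:", accuracy)
```

Because each class keeps its own covariance matrix, the score is quadratic in the input, which is where the curved decision boundaries come from.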

Implementing Linear and Quadratic Discriminant Analysis with Scikit-Learn

Scikit-Learn is a well-known Python machine learning package that offers effective implementations of Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA) via the LinearDiscriminantAnalysis and QuadraticDiscriminantAnalysis classes. To use LDA or QDA in Scikit-Learn, follow the steps below.

1. Import the Necessary Modules

Python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

2. Generate Data

Python
# Generate synthetic data: 1000 samples, 2 features, 3 classes
X, y = make_classification(n_samples=1000, n_features=2, n_informative=2, n_redundant=0,
                           n_clusters_per_class=1, n_classes=3, random_state=42)

# Hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

Applying Linear Discriminant Analysis (LDA)

Python
# Initialize and train the LDA model
lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)

# Make predictions on the held-out test set
y_pred_lda = lda.predict(X_test)

# Evaluate the model
print("LDA Accuracy:", accuracy_score(y_test, y_pred_lda))
print("Confusion Matrix (LDA):\n", confusion_matrix(y_test, y_pred_lda))
print("Classification Report (LDA):\n", classification_report(y_test, y_pred_lda))

Output:

LDA Accuracy: 0.8266666666666667
Confusion Matrix (LDA):
 [[ 75   4  22]
 [ 16  71   0]
 [  0  10 102]]
Classification Report (LDA):
               precision    recall  f1-score   support

           0       0.82      0.74      0.78       101
           1       0.84      0.82      0.83        87
           2       0.82      0.91      0.86       112

    accuracy                           0.83       300
   macro avg       0.83      0.82      0.82       300
weighted avg       0.83      0.83      0.83       300

Applying Quadratic Discriminant Analysis (QDA)

Python
# Initialize and train the QDA model
qda = QuadraticDiscriminantAnalysis()
qda.fit(X_train, y_train)

# Make predictions
y_pred_qda = qda.predict(X_test)

# Evaluate the model
print("QDA Accuracy:", accuracy_score(y_test, y_pred_qda))
print("Confusion Matrix (QDA):\n", confusion_matrix(y_test, y_pred_qda))
print("Classification Report (QDA):\n", classification_report(y_test, y_pred_qda))

Output:

QDA Accuracy: 0.93
Confusion Matrix (QDA):
 [[ 96   2   3]
 [ 10  77   0]
 [  4   2 106]]
Classification Report (QDA):
               precision    recall  f1-score   support

           0       0.87      0.95      0.91       101
           1       0.95      0.89      0.92        87
           2       0.97      0.95      0.96       112

    accuracy                           0.93       300
   macro avg       0.93      0.93      0.93       300
weighted avg       0.93      0.93      0.93       300

Visualizing Linear and Quadratic Discriminant Analysis

For visualization, let's plot the decision boundaries. A decision boundary is the curve that separates the regions of feature space that a classifier assigns to different classes. Since a classifier predicts the class of a new data point from its features, the boundaries show the rule it uses to split the classes.

Python
def plot_decision_boundaries(X, y, model, title, subplot_index):
    plt.subplot(subplot_index)

    # Build a dense grid that covers the data with a margin of 1 on each side
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01),
                         np.arange(y_min, y_max, 0.01))

    # Predict a class for every grid point and shade the resulting regions
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.8)

    # Overlay the samples, colored by their true class
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o')
    plt.title(title)
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')


plt.figure(figsize=(10, 4))

# Plot decision boundaries for LDA
plot_decision_boundaries(X_test, y_test, lda, "LDA Decision Boundary", 121)

# Plot decision boundaries for QDA
plot_decision_boundaries(X_test, y_test, qda, "QDA Decision Boundary", 122)

plt.tight_layout()
plt.show()

Output:

[Figure: Decision Boundary Plots for LDA and QDA]

The dots in the plots are the test samples, colored by their true class. A dot that lies inside a region assigned to another class is a sample the model misclassifies.

LDA projects data from a higher-dimensional space onto a lower-dimensional space in a way that maximizes the separation between classes, so its decision boundaries between the three classes are straight lines. QDA fits a separate covariance matrix per class, so its boundaries are quadratic curves. The QDA decision boundary is therefore more flexible than the LDA decision boundary, which helps it fit the data better when the class covariances differ, as the test accuracies above suggest (0.93 for QDA versus about 0.83 for LDA).
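The linearity of the LDA rule can be checked numerically. The sketch below regenerates the same synthetic data, fits LDA on all of it for brevity (an assumption made here, the article fits on the training split), and compares the model's decision scores with a plain linear expression built from the fitted coef_ and intercept_ attributes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Same synthetic data as in the article
X, y = make_classification(n_samples=1000, n_features=2, n_informative=2, n_redundant=0,
                           n_clusters_per_class=1, n_classes=3, random_state=42)

lda = LinearDiscriminantAnalysis().fit(X, y)

# One linear score per class: X @ coef_.T + intercept_
manual_scores = X @ lda.coef_.T + lda.intercept_
manual_pred = lda.classes_[np.argmax(manual_scores, axis=1)]

scores_match = np.allclose(manual_scores, lda.decision_function(X))
preds_match = np.array_equal(manual_pred, lda.predict(X))
print("Scores match:", scores_match)
print("Predictions match:", preds_match)
```

Since each class score is linear in the features, the set of points where two scores are equal is a straight line in this two-feature setting. QuadraticDiscriminantAnalysis has no comparable coef_ attribute because its scores contain quadratic terms.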

Conclusion

In conclusion, Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA) are effective methods for supervised classification problems. LDA assumes that all classes share the same covariance matrix, while QDA relaxes this condition by allowing each class to have its own covariance matrix. Both approaches are practical and have their merits: LDA is simpler and has fewer parameters to estimate, whereas QDA can capture curved boundaries, as it did on the synthetic data used here. Scikit-Learn offers handy implementations that make integrating them into machine learning pipelines simple.
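As one possible illustration of the pipeline integration mentioned above, the sketch below wraps each model in a scikit-learn pipeline with a StandardScaler step and scores it with 5-fold cross-validation. The scaling step and the choice of five folds are assumptions made for this example, not part of the original walkthrough.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same synthetic data as in the article
X, y = make_classification(n_samples=1000, n_features=2, n_informative=2, n_redundant=0,
                           n_clusters_per_class=1, n_classes=3, random_state=42)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
}

cv_scores = {}
for name, model in models.items():
    # Scaling is fitted inside each fold, so no test-fold information leaks in
    pipeline = make_pipeline(StandardScaler(), model)
    cv_scores[name] = cross_val_score(pipeline, X, y, cv=5).mean()
    print(f"{name} mean CV accuracy: {cv_scores[name]:.3f}")
```

Cross-validation gives a steadier comparison than a single train/test split, which is useful when deciding between the two models on a new dataset.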


sushmaa1ii