Naive Bayes Classifiers

Last Updated : 21 May, 2025

Naive Bayes is a classification algorithm that uses probability to predict which category a data point belongs to, under the assumption that all features are independent of one another. This article gives an overview of Naive Bayes as well as its more advanced uses and implementation in machine learning.

[Figure] Illustration behind the Naive Bayes algorithm: P(x_α|y) is estimated independently in each dimension (middle two images), and an estimate of the full data distribution is obtained by assuming conditional independence, P(x|y) = ∏_α P(x_α|y) (rightmost image).

Key Features of Naive Bayes Classifiers

The main idea behind the Naive Bayes classifier is to use Bayes' Theorem to classify data based on the probabilities of the different classes given the features of the data. It is mostly used for high-dimensional text classification.

  • The Naive Bayes classifier is a simple probabilistic classifier with very few parameters, so the resulting models can make predictions faster than many other classification algorithms.
  • It is a probabilistic classifier because it assumes that each feature in the model is independent of the existence of every other feature; in other words, each feature contributes to the prediction with no relation to the others.
  • The Naive Bayes algorithm is used in spam filtering, sentiment analysis, article classification and many other tasks.

Why Is It Called Naive Bayes?

It is called "Naive" because it assumes that the presence of one feature does not affect the other features. The "Bayes" part of the name refers to its basis in Bayes' Theorem.

Consider a fictional dataset that describes the weather conditions for playing a game of golf. Given the weather conditions, each tuple classifies the conditions as fit ("Yes") or unfit ("No") for playing golf. Here is a tabular representation of our dataset.

 #   Outlook    Temperature   Humidity   Windy   Play Golf
 0   Rainy      Hot           High       False   No
 1   Rainy      Hot           High       True    No
 2   Overcast   Hot           High       False   Yes
 3   Sunny      Mild          High       False   Yes
 4   Sunny      Cool          Normal     False   Yes
 5   Sunny      Cool          Normal     True    No
 6   Overcast   Cool          Normal     True    Yes
 7   Rainy      Mild          High       False   No
 8   Rainy      Cool          Normal     False   Yes
 9   Sunny      Mild          Normal     False   Yes
10   Rainy      Mild          Normal     True    Yes
11   Overcast   Mild          High       True    Yes
12   Overcast   Hot           Normal     False   Yes
13   Sunny      Mild          High       True    No

The dataset is divided into two parts, namely, feature matrix and the response vector.

  • The feature matrix contains all the vectors (rows) of the dataset, where each vector holds the values of the independent features. In the dataset above, the features are 'Outlook', 'Temperature', 'Humidity' and 'Windy'.
  • The response vector contains the value of the class variable (the prediction or output) for each row of the feature matrix. In the dataset above, the class variable is 'Play Golf'.
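This split is a one-liner in pandas. The sketch below (assuming pandas is available; only the first three rows of the dataset are shown for brevity) separates the feature matrix X from the response vector y:

```python
import pandas as pd

# First three rows of the golf dataset from the table above
df = pd.DataFrame({
    "Outlook":     ["Rainy", "Rainy", "Overcast"],
    "Temperature": ["Hot", "Hot", "Hot"],
    "Humidity":    ["High", "High", "High"],
    "Windy":       [False, True, False],
    "Play Golf":   ["No", "No", "Yes"],
})

X = df.drop(columns="Play Golf")  # feature matrix (independent features)
y = df["Play Golf"]               # response vector (class variable)
print(X.shape, y.shape)           # (3, 4) (3,)
```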

Assumption of Naive Bayes

The fundamental assumptions of Naive Bayes are:

  • Feature independence: This means that when we are trying to classify something, we assume that each feature (or piece of information) in the data does not affect any other feature.
  • Continuous features are normally distributed: If a feature is continuous, then it is assumed to be normally distributed within each class.
  • Discrete features have multinomial distributions: If a feature is discrete, then it is assumed to have a multinomial distribution within each class.
  • Features are equally important: All features are assumed to contribute equally to the prediction of the class label.
  • No missing data: The data should not contain any missing values.

Introduction to Bayes' Theorem

Bayes’ Theorem provides a principled way to reverse conditional probabilities. It is defined as:

P(y|X) = \frac{P(X|y) \cdot P(y)}{P(X)}

Where:

  • P(y|X): Posterior probability, probability of class y given features X
  • P(X|y): Likelihood, probability of features X given class y
  • P(y): Prior probability of class y
  • P(X): Marginal likelihood or evidence
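As a quick sanity check, the theorem can be written as a one-line function; the numbers below are hypothetical, chosen only to illustrate the formula:

```python
def bayes_posterior(likelihood, prior, evidence):
    """Bayes' Theorem: P(y|X) = P(X|y) * P(y) / P(X)."""
    return likelihood * prior / evidence

# Hypothetical values: P(X|y) = 0.6, P(y) = 0.5, P(X) = 0.4
print(bayes_posterior(0.6, 0.5, 0.4))  # 0.75
```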

How Naive Bayes Works

1. Terminology

Consider a classification problem (like predicting if someone plays golf based on weather). Then:

  • y is the class label (e.g. "Yes" or "No" for playing golf)
  • X = (x_1, x_2, ..., x_n) is the feature vector (e.g. Outlook, Temperature, Humidity, Wind)

A sample row from the dataset:

X = \text{(Rainy, Hot, High, False)}, \quad y = \text{No}

This represents:

What is the probability that someone will not play golf given that the weather is Rainy, Hot, High humidity, and No wind?

2. The Naive Assumption

The "naive" in Naive Bayes comes from the assumption that all features are independent given the class. That is:

P(x_1, x_2, ..., x_n | y) = P(x_1 | y) \cdot P(x_2 | y) \cdots P(x_n | y)

Thus, Bayes' theorem becomes:

P(y|x_1, ..., x_n) = \frac{P(y) \cdot \prod_{i=1}^{n} P(x_i | y)}{P(x_1, x_2, ..., x_n)}

Since the denominator is constant for a given input, we can write:

P(y|x_1, ..., x_n) \propto P(y) \cdot \prod_{i=1}^{n} P(x_i | y)

3. Constructing the Naive Bayes Classifier

We compute the posterior for each class y and choose the class with the highest probability:

\hat{y} = \arg\max_{y} P(y) \cdot \prod_{i=1}^{n} P(x_i | y)

This becomes our Naive Bayes classifier.
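The decision rule above can be sketched in a few lines of Python. The probability tables here are hypothetical stand-ins; in practice they would be estimated from training data:

```python
def nb_predict(priors, conditionals, x):
    """Pick the class y maximizing P(y) * prod_i P(x_i | y).

    priors: {class: P(y)}
    conditionals: {class: list of {feature value: P(value | y)}, one dict per feature}
    """
    def score(y):
        s = priors[y]
        for i, xi in enumerate(x):
            s *= conditionals[y][i].get(xi, 0.0)
        return s
    return max(priors, key=score)

# Hypothetical two-feature example
priors = {"A": 0.5, "B": 0.5}
conditionals = {
    "A": [{"hot": 0.8, "cold": 0.2}, {"dry": 0.7, "wet": 0.3}],
    "B": [{"hot": 0.1, "cold": 0.9}, {"dry": 0.4, "wet": 0.6}],
}
print(nb_predict(priors, conditionals, ("hot", "dry")))  # "A": 0.28 beats 0.02
```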

4. Example: Weather Dataset

Let’s take a dataset used for predicting if golf is played based on:

  • Outlook: Sunny, Rainy, Overcast
  • Temperature: Hot, Mild, Cool
  • Humidity: High, Normal
  • Wind: True, False
[Figure] Example tables for Naive Bayes

Example Input: X = (Sunny, Hot, Normal, False)

Goal: Predict if golf will be played (Yes or No).

5. Pre-computation from Dataset

Class Probabilities:

From the dataset of 14 rows:

  • P(\text{Yes}) = \frac{9}{14}
  • P(\text{No}) = \frac{5}{14}

Conditional Probabilities (Tables 1–4), counted from the dataset:

Feature        Value    P(Value | Yes)   P(Value | No)
Outlook        Sunny    3/9              2/5
Temperature    Hot      2/9              2/5
Humidity       Normal   6/9              1/5
Wind           False    6/9              2/5

6. Calculate Posterior Probabilities

For Class = Yes:

P(\text{Yes | today}) \propto \frac{3}{9} \cdot \frac{2}{9} \cdot \frac{6}{9} \cdot \frac{6}{9} \cdot \frac{9}{14}

P(\text{Yes | today}) \approx 0.02116

For Class = No:

P(\text{No | today}) \propto \frac{2}{5} \cdot \frac{2}{5} \cdot \frac{1}{5} \cdot \frac{2}{5} \cdot \frac{5}{14}

P(\text{No | today}) \approx 0.00457

7. Normalize Probabilities

To compare:

P(\text{Yes | today}) = \frac{0.02116}{0.02116 + 0.00457} \approx 0.822

P(\text{No | today}) = \frac{0.00457}{0.02116 + 0.00457} \approx 0.178

8. Final Prediction

Since:

P(\text{Yes | today}) > P(\text{No | today})

The model predicts: Yes (Play Golf)
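The whole worked example can be reproduced by counting directly from the 14-row dataset; this sketch estimates every probability from the table above and normalizes the two scores:

```python
# Each row: (Outlook, Temperature, Humidity, Windy, Play Golf)
data = [
    ("Rainy", "Hot", "High", False, "No"),
    ("Rainy", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Sunny", "Mild", "High", False, "Yes"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Rainy", "Mild", "High", False, "No"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", False, "Yes"),
    ("Rainy", "Mild", "Normal", True, "Yes"),
    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),
    ("Sunny", "Mild", "High", True, "No"),
]

def posterior_probs(x):
    """Estimate P(y | x) by counting, then normalize over the two classes."""
    scores = {}
    for label in ("Yes", "No"):
        rows = [r for r in data if r[-1] == label]
        score = len(rows) / len(data)  # prior P(y)
        for i, xi in enumerate(x):
            # conditional P(x_i | y), estimated by counting matching rows
            score *= sum(r[i] == xi for r in rows) / len(rows)
        scores[label] = score
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

probs = posterior_probs(("Sunny", "Hot", "Normal", False))
print(probs)  # P(Yes) ≈ 0.82, P(No) ≈ 0.18 -> predict "Yes"
```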

Naive Bayes for Continuous Features

For continuous features, we assume a Gaussian distribution:

P(x_i | y) = \frac{1}{\sqrt{2\pi\sigma^2_y}} \exp\left( -\frac{(x_i - \mu_y)^2}{2\sigma^2_y} \right)

Where:

  • \mu_y is the mean of feature x_i for class y
  • \sigma^2_y is the variance of feature x_i for class y

This leads to what is called Gaussian Naive Bayes.
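As a minimal sketch, the Gaussian likelihood can be computed directly; μ_y and σ²_y would normally be estimated per class from training data, so the values below are hypothetical:

```python
import math

def gaussian_likelihood(x, mu, var):
    """P(x_i | y) under a Gaussian with class mean mu and class variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

# Feature value 1.0 under a hypothetical class with mean 0 and unit variance
print(gaussian_likelihood(1.0, 0.0, 1.0))  # ≈ 0.2420
```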

Types of Naive Bayes Model

There are three types of Naive Bayes models:

1. Gaussian Naive Bayes

In Gaussian Naive Bayes, the continuous values associated with each feature are assumed to follow a Gaussian distribution. A Gaussian distribution is also called a Normal distribution; when plotted, it gives a bell-shaped curve that is symmetric about the mean of the feature values.

2. Multinomial Naive Bayes

Multinomial Naive Bayes is used when features represent the frequency of terms (such as word counts) in a document. It is commonly applied in text classification, where term frequencies are important.

3. Bernoulli Naive Bayes

Bernoulli Naive Bayes deals with binary features, where each feature indicates whether a word appears in a document or not. It is suited to scenarios where the presence or absence of terms matters more than their frequency. Both the multinomial and Bernoulli models are widely used in document classification tasks.
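All three variants are available in scikit-learn (assumed installed here); the toy data below is made up purely to match each model's assumptions:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

# Gaussian NB: continuous features, one cluster per class
X_cont = np.array([[0.1, -0.2], [-0.3, 0.1], [0.0, 0.3],
                   [3.1, 2.9], [2.8, 3.2], [3.0, 3.0]])
y_cont = [0, 0, 0, 1, 1, 1]
print(GaussianNB().fit(X_cont, y_cont).predict([[2.8, 3.1]]))  # [1]

# Multinomial NB: count features, e.g. word counts per document
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]])
y_counts = [0, 0, 1, 1]
print(MultinomialNB().fit(X_counts, y_counts).predict([[0, 5, 1]]))  # [1]

# Bernoulli NB: binary presence/absence features
X_bin = (X_counts > 0).astype(int)
print(BernoulliNB().fit(X_bin, y_counts).predict([[0, 1, 1]]))  # [1]
```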

Advantages of Naive Bayes Classifier

  • Easy to implement and computationally efficient.
  • Effective in cases with a large number of features.
  • Performs well even with limited training data.
  • Performs well in the presence of categorical features.
  • For numerical features, only a per-class mean and variance need to be estimated (under the assumption that the data comes from a normal distribution), which keeps training simple.

Disadvantages of Naive Bayes Classifier

  • Assumes that features are independent, which may not always hold in real-world data.
  • Can be influenced by irrelevant attributes.
  • May assign zero probability to feature values unseen during training, leading to poor generalization (this is typically mitigated with Laplace smoothing).
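The zero-probability issue from the last point can be seen, and fixed, with Laplace smoothing. This is a sketch of the smoothed estimate (alpha=1 is also scikit-learn's default):

```python
def smoothed_prob(count, total, num_values, alpha=1.0):
    """Laplace-smoothed estimate of P(value | y).

    count:      times this value occurred with class y in training data
    total:      number of training rows with class y
    num_values: number of distinct values the feature can take
    """
    return (count + alpha) / (total + alpha * num_values)

# A value never seen with class y still gets a small non-zero probability:
print(smoothed_prob(0, 9, 3))           # 1/12 ≈ 0.083
print(smoothed_prob(0, 9, 3, alpha=0))  # 0.0 -> would zero out the whole product
```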

Applications of Naive Bayes Classifier

  • Spam Email Filtering: Classifies emails as spam or non-spam based on features.
  • Text Classification: Used in sentiment analysis, document categorization, and topic classification.
  • Medical Diagnosis: Helps in predicting the likelihood of a disease based on symptoms.
  • Credit Scoring: Evaluates creditworthiness of individuals for loan approval.
  • Weather Prediction: Classifies weather conditions based on various factors.

Author: kartik