How CatBoost algorithm works

Last Updated : 24 Apr, 2025

CatBoost is short for "Categorical Boosting". It is a gradient boosting algorithm designed for classification and regression tasks. One of its primary advantages is that it handles categorical variables without manual encoding: it converts each categorical feature into a numerical representation using target statistics computed over random permutations of the training data, which copes with difficulties such as high cardinality. This saves the user time and effort.

CatBoost also implements a technique called Ordered Boosting, which applies the same permutation idea to the boosting procedure itself, so that the model used for a given training example is not fitted on that example's own target value. Together, these methods preserve the category information while letting the model use the gradient boosting technique.

What is CatBoost?

CatBoost is an open-source gradient boosting library developed by Yandex for classification and regression tasks. It builds an ensemble of decision trees and uses its Ordered Boosting scheme to reduce overfitting during training. This article explains how the CatBoost algorithm works.

Key features related to CatBoost:

The key features are as follows:

  1. Gradient boosting: It is an ensemble learning technique that combines weak prediction models, often decision trees, to construct a strong predictive model. It works by iteratively adding new models to the ensemble, each one trained to correct the errors made by the previous models. CatBoost uses gradient boosting to increase model accuracy by fitting each new tree to the errors (gradients of the loss) that remain after the previous trees.
  2. Categorical features: Categorical features, such as colour or type, are variables that represent qualitative data. CatBoost handles categorical features without the need for substantial preprocessing or one-hot encoding, which makes it a practical tool for real-world datasets.
  3. Learning rate: The learning rate controls the step size at which the model learns during the boosting phase. If no learning rate is specified, CatBoost chooses a default value based on properties of the dataset and the number of iterations; it can also be set manually to balance learning speed and accuracy.
  4. L2 regularization: Also known as ridge regularization, it introduces a penalty term into the loss function to prevent overfitting and improve the generalization ability of the model. In CatBoost, L2 regularization helps control the complexity of the boosted trees by penalizing large leaf values during training; its strength is set with the l2_leaf_reg parameter.
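To make the effect of L2 regularization concrete, here is a minimal sketch of how a penalty shrinks the value stored in a tree leaf. It uses the textbook formula for squared-error loss (sum of residuals divided by the count plus the penalty) and is not taken from CatBoost's internals; the function name `leaf_value` and the parameter `l2_reg` are illustrative assumptions.

```python
def leaf_value(residuals, l2_reg=0.0):
    """Leaf value for squared-error loss with an L2 penalty.

    Textbook form: sum(residuals) / (n + l2_reg).
    With l2_reg = 0 this is just the mean of the residuals.
    """
    return sum(residuals) / (len(residuals) + l2_reg)


# Residuals of the training examples that fall into one leaf (made-up numbers).
residuals = [2.0, 1.0, 3.0]

unregularized = leaf_value(residuals)            # plain mean of the residuals
regularized = leaf_value(residuals, l2_reg=3.0)  # pulled towards zero

print("without penalty:", unregularized)
print("with penalty:   ", regularized)
```

A larger penalty pulls every leaf value closer to zero, so each individual tree makes smaller corrections and the ensemble is less likely to fit noise.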

Workings of CatBoost

CatBoost is a gradient-boosting technique designed for machine learning tasks, particularly those involving structured (tabular) data. It builds on gradient boosting, which is an ensemble learning method. The algorithm starts by making an initial guess, often the mean of the target variable. It then gradually constructs an ensemble of decision trees, with each tree aiming to reduce the errors or residuals left by the previous trees.

One of the key strengths of CatBoost is its ability to handle categorical features effectively. It processes categorical data directly by replacing each category with an ordered target statistic: the training examples are placed in a random order, and the encoded value for each example is computed only from the target values of the examples that precede it. This keeps an example's own target out of its encoding, which helps reduce overfitting. The related "ordered boosting" scheme applies the same principle when fitting the trees.
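The categorical encoding can be illustrated with a simplified sketch of an ordered target statistic. It assumes the data is already in its (random) processing order and uses a single prior with weight `prior_weight`; the function name and parameters are illustrative assumptions, and CatBoost's actual implementation differs in detail (for example, it uses several permutations).

```python
def ordered_target_statistics(categories, targets, prior, prior_weight=1.0):
    """Encode each category using only the targets of earlier rows.

    For row k the encoding is
        (sum of earlier targets with the same category + prior_weight * prior)
        / (number of earlier rows with the same category + prior_weight)
    """
    running_sum = {}
    running_count = {}
    encoded = []
    for category, target in zip(categories, targets):
        seen_sum = running_sum.get(category, 0.0)
        seen_count = running_count.get(category, 0)
        encoded.append((seen_sum + prior_weight * prior) / (seen_count + prior_weight))
        # Only now does this row's target become visible to later rows.
        running_sum[category] = seen_sum + target
        running_count[category] = seen_count + 1
    return encoded


colours = ["red", "blue", "red", "red", "blue"]
labels = [1, 0, 1, 0, 1]
prior = sum(labels) / len(labels)  # overall target mean used as the prior

encoded = ordered_target_statistics(colours, labels, prior)
for colour, value in zip(colours, encoded):
    print(colour, round(value, 3))
```

The first occurrence of each category falls back to the prior because no earlier rows are available, and later occurrences move towards the category's observed target mean.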

To prevent overfitting, CatBoost incorporates regularization techniques. These techniques introduce penalties or constraints during the training process to discourage the model from becoming too complex and fitting the training data too closely. Regularization helps to generalize the model and make it more robust to unseen data.

The algorithm iteratively constructs the ensemble of trees by minimizing the loss function using gradient descent. At each iteration, it calculates the negative gradient of the loss function with respect to the current predictions and fits a new tree to the negative gradient. The learning rate determines the step size taken during gradient descent. The process is repeated until a predetermined number of trees have been added or a convergence criterion has been met. When making predictions, CatBoost combines the predictions from all the trees in the ensemble, adding their scaled outputs to the initial guess.
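The iterative procedure described above can be sketched in plain Python. This is a generic gradient boosting loop for squared-error loss on a single numeric feature, using one-split trees (stumps); it is not CatBoost's implementation, and all function names and the toy data are assumptions made for illustration.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (stump) to residuals on a 1-D feature."""
    best = None
    # Candidate thresholds: every distinct value except the largest,
    # so that both sides of the split are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_value, right_value)


def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Start from the target mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # For squared-error loss the negative gradient is the residual y - F(x).
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
toy_model = boost(xs, ys)

baseline_error = mse(ys, [sum(ys) / len(ys)] * len(ys))
boosted_error = mse(ys, [predict(toy_model, x) for x in xs])
print("MSE of the initial guess:", baseline_error)
print("MSE after boosting:      ", boosted_error)
```

Each round moves the predictions a small step (scaled by the learning rate) in the direction that reduces the loss, which is why the training error after boosting is lower than the error of the initial guess.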

Mathematically, CatBoost can be represented as follows:

Given a training dataset with N samples, where each sample is denoted as (x_i, y_i), x_i is a vector of features, and y_i is the corresponding target variable, CatBoost aims to learn a function F(x) that predicts the target variable y.

F(x) = F_0(x) + \eta \sum_{m=1}^{M} f_m(x)

where,

  • F(x) represents the overall prediction function that CatBoost aims to learn. It takes an input vector x and predicts the corresponding target variable y.
  • F_0(x) is the initial guess or the baseline prediction. It is often set as the mean of the target variable in the training dataset. This term captures the overall average behavior of the target variable.
  • \eta is the learning rate, which scales the contribution of each tree.
  • Σ_{m=1}^M represents the summation over the ensemble of trees. M denotes the total number of trees in the ensemble.
  • f_m(x) represents the prediction of the m-th tree for the input x. Each tree is fitted using the N training samples (x_i, y_i), but at prediction time it is evaluated only on the input x.

The equation states that the overall prediction F(x) is obtained by adding the initial guess F_0(x) to the sum of the predictions f_m(x) of all M trees, each scaled by the learning rate. The summation runs over the trees only; the training samples are used to fit the trees, not to form the prediction for a new input.

Getting started with CatBoost :

Install the Packages

!pip install catboost

Step 1: Importing Necessary Libraries

Before we begin coding, we must first import the appropriate libraries. We'll use the pandas package for data manipulation and the CatBoost library for algorithm implementation.

Python3
import pandas as pd
from catboost import CatBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split  # needed for the split in Step 3
from sklearn.preprocessing import LabelEncoder
from matplotlib import pyplot as plt

Step 2: Loading the Dataset

Dataset link: Titanic Dataset

Python3
titanic_data = pd.read_csv('titanic.csv')
# Drop text columns that are not used as features
titanic_data = titanic_data.drop(['Name', 'Ticket', 'Cabin'], axis=1)

Step 3: Preprocessing the Dataset

The dataset is preprocessed in three steps: missing values are filled in, categorical variables are converted to numeric representations, and the data is split into training and testing sets.

Python
# Handle missing values (assign back instead of using inplace=True on a column)
titanic_data['Age'] = titanic_data['Age'].fillna(titanic_data['Age'].mean())
titanic_data['Embarked'] = titanic_data['Embarked'].fillna(titanic_data['Embarked'].mode()[0])

# Convert categorical variables to numeric
le = LabelEncoder()
titanic_data[['Sex', 'Embarked']] = titanic_data[['Sex', 'Embarked']].apply(le.fit_transform)

# Split the data into features and target
X = titanic_data[['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare']]
y = titanic_data['Survived']

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Step 4: Setup and Training the CatBoost Model

We'll now initialize the CatBoostClassifier and define the training hyperparameters. We'll determine the number of iterations, learning rate, and tree depth. Finally, the model will be fitted to the training data.

Python3
model = CatBoostClassifier(iterations=100, learning_rate=0.1, depth=6)
model.fit(X_train, y_train)  # fit the model to the training data

Step 5: Assessing the Model's Performance

We can evaluate the model's performance on the testing data after it has been trained. To understand the precision, recall, and F1-score of the model, we'll compute the accuracy score and provide a classification report.

Python3
y_pred = model.predict(X_test)  # predict on the testing data

accuracy = accuracy_score(y_test, y_pred)  # overall accuracy
report = classification_report(y_test, y_pred)  # separate name, so the imported function is not overwritten

print("Accuracy:", accuracy)
print("Classification Report:\n", report)

Output:

98:    learn: 0.3625223    total: 257ms    remaining: 2.59ms
99:    learn: 0.3621516    total: 259ms    remaining: 0us
Accuracy: 0.7988826815642458
Classification Report:
               precision    recall  f1-score   support

           0       0.79      0.89      0.84       105
           1       0.81      0.68      0.74        74

    accuracy                           0.80       179
   macro avg       0.80      0.78      0.79       179
weighted avg       0.80      0.80      0.80       179
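The F1 values in the report can be cross-checked by hand, since F1 is the harmonic mean of precision and recall. The short sketch below recomputes them from the rounded precision, recall and support figures shown in the output; the function name `f1_from` is an illustrative assumption, and small differences from scikit-learn's own values are possible because the inputs are already rounded.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded figures copied from the classification report above.
f1_class0 = f1_from(0.79, 0.89)
f1_class1 = f1_from(0.81, 0.68)
support0, support1 = 105, 74

# Macro average: plain mean over classes.
macro_f1 = (f1_class0 + f1_class1) / 2
# Weighted average: mean weighted by the number of test samples per class.
weighted_f1 = (f1_class0 * support0 + f1_class1 * support1) / (support0 + support1)

print("F1 class 0: ", round(f1_class0, 2))
print("F1 class 1: ", round(f1_class1, 2))
print("macro F1:   ", round(macro_f1, 2))
print("weighted F1:", round(weighted_f1, 2))
```

The gap between the two classes comes mostly from the lower recall for class 1: the model misses roughly a third of the passengers who survived.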

Step 6: Feature Importance with CatBoost

CatBoost includes an in-built feature importance approach for determining the importance of each feature in the model. A bar plot can be used to show the feature significance scores.

Python
feature_importance = model.get_feature_importance()
feature_names = X.columns

plt.bar(feature_names, feature_importance)
plt.xlabel("Feature")      # the x-axis holds the feature names
plt.ylabel("Importance")   # the bar height is the importance score
plt.title("CatBoost Feature Importance")
plt.show()
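A bar plot is easier to read when the features are sorted by score. The helper below is a small sketch of that step; `rank_features` is a hypothetical name, and the scores are placeholder numbers chosen for illustration, not values produced by the model in this article.

```python
def rank_features(names, scores):
    """Pair each feature name with its score and sort from most to least important."""
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)


feature_names = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare"]
placeholder_scores = [15.0, 40.0, 20.0, 5.0, 3.0, 17.0]  # placeholders, not real output

ranked = rank_features(feature_names, placeholder_scores)
for name, score in ranked:
    print(f"{name:<8}{score:>6.1f}")
```

With a trained model, the same helper could be called with `X.columns` and the array returned by `model.get_feature_importance()`.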

Output:

[Figure: bar plot of CatBoost feature importance, generated with matplotlib]

Conclusion

To summarize, CatBoost is a powerful and user-friendly gradient boosting library that suits a wide range of applications. Whether you're a beginner looking for a simple way into machine learning or an experienced practitioner looking for strong performance, CatBoost is a useful tool to have in your toolbox. However, as with any tool, its success depends on the individual problem and dataset, so it is a good idea to experiment with it and compare it with other techniques.


shivankeagle