Data Normalization with Pandas

Last Updated : 12 May, 2025

Data normalization is the process of scaling numeric features to a standard range, which prevents features with large values from dominating the learning process in machine learning models. It is an important step in machine learning and data analysis because it ensures that numerical features are on a similar scale. Normalization is particularly helpful for distance-based models such as K-Nearest Neighbors (KNN) and Support Vector Machines (SVM). It is important because it:

  • Avoids numerical instability in models
  • Speeds up convergence in gradient-based algorithms
  • Ensures all features contribute equally to the analysis
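
To make the last point concrete, here is a minimal sketch of how one large-valued feature can dominate a Euclidean distance. The two samples, the feature names (income, age) and the assumed feature ranges are hypothetical values chosen only for illustration, and `contribution` is a helper defined here, not a library function.

```python
import numpy as np

# Two hypothetical samples described by (income, age); the numbers are made up.
a = np.array([50_000.0, 25.0])
b = np.array([51_000.0, 65.0])

def contribution(u, v):
    """Share of the squared Euclidean distance contributed by each feature."""
    sq = (u - v) ** 2
    return sq / sq.sum()

# On the raw values, almost all of the distance comes from income.
raw_share = contribution(a, b)

# Assumed feature ranges used for min-max scaling (also illustrative).
lo = np.array([20_000.0, 18.0])
hi = np.array([120_000.0, 80.0])
scaled_share = contribution((a - lo) / (hi - lo), (b - lo) / (hi - lo))

print("raw:   ", raw_share.round(4))
print("scaled:", scaled_share.round(4))
```

On the raw values the 40-year age gap barely registers next to a small income difference; once both features are scaled to their ranges, the age gap accounts for most of the distance.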

Steps for Data Normalization in Pandas

In this article we apply several normalization techniques and discuss each with the help of an example. The overall workflow for data normalization with Pandas is:

  1. Import the required libraries
  2. Load or create a dataset
  3. Apply different normalization techniques
  4. Visualize the results

Let's create a sample dataset using Pandas and visualize it.

Python
import pandas as pd
import matplotlib.pyplot as plt

# Sample dataset whose columns sit on very different scales
df = pd.DataFrame([
    [180000, 110, 18.9, 1400],
    [360000, 905, 23.4, 1800],
    [230000, 230, 14.0, 1300],
    [60000, 450, 13.5, 1500]
], columns=['Col A', 'Col B', 'Col C', 'Col D'])

print(df)

df.plot(kind='bar')
plt.show()

Output:

Normalization Techniques in Pandas

1. Maximum Absolute Scaling

This technique rescales each feature to the range [-1, 1] by dividing all values by the maximum absolute value in that column. If a column holds only non-negative values, the result lies in [0, 1]. Because the method only divides and never shifts the data, zero entries stay zero, which makes it useful when you want to preserve the data's sparsity. We can apply maximum absolute scaling in Pandas using the .abs() and .max() methods as shown below.

Python
max_scaled = df.copy()

# Divide each column by its largest absolute value
for column in max_scaled.columns:
    max_scaled[column] = max_scaled[column] / max_scaled[column].abs().max()

print(max_scaled)

max_scaled.plot(kind='bar')
plt.show()

Output :

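As we can see in the above output, all values now lie between -1 and 1. Because this dataset contains only positive numbers, they actually fall between 0 and 1, and the largest value in each column becomes exactly 1. Each value is expressed as a fraction of the largest absolute value in its column.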
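
The column loop can also be written as a single vectorized expression. The sketch below rebuilds the sample DataFrame so that it runs on its own; it relies on Pandas aligning the Series returned by `df.abs().max()` with the DataFrame's column labels during division.

```python
import pandas as pd

df = pd.DataFrame([
    [180000, 110, 18.9, 1400],
    [360000, 905, 23.4, 1800],
    [230000, 230, 14.0, 1300],
    [60000, 450, 13.5, 1500]
], columns=['Col A', 'Col B', 'Col C', 'Col D'])

# df.abs().max() is a Series with one maximum per column;
# dividing the DataFrame by it scales every column at once.
max_scaled = df / df.abs().max()

print(max_scaled)
```

This should give the same values as the loop version while avoiding explicit iteration over the columns.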

2. The min-max feature scaling

The min-max approach, often simply called normalization, rescales each feature to the fixed range [0, 1] by subtracting the feature's minimum value and then dividing by its range (maximum minus minimum). It works well for models like K-Nearest Neighbors (KNN), which compare distances between data points. We can apply min-max scaling in Pandas using the .min() and .max() methods.

Python
scaled = df.copy()

# Subtract each column's minimum, then divide by its range (max - min)
for column in scaled.columns:
    scaled[column] = (scaled[column] - scaled[column].min()) / (scaled[column].max() - scaled[column].min())

print(scaled)

scaled.plot(kind='bar')
plt.show()

Output :

After scaling, the smallest value in each column becomes 0 and the largest becomes 1. All other values lie between these two. This puts the features on a common scale, so no single feature dominates the model simply because of its units.
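
Min-max scaling divides by the column range, so a column whose values are all identical would produce a division by zero. The following is a minimal sketch of a vectorized helper that guards against this case; the function name `min_max_scale` and the choice to map constant columns to 0 are assumptions made for illustration, not part of Pandas.

```python
import pandas as pd

def min_max_scale(frame):
    """Scale each column to [0, 1]; constant columns are mapped to 0 (an assumed convention)."""
    col_min = frame.min()
    col_range = frame.max() - col_min
    # Replace a zero range with 1 so constant columns give 0 instead of NaN.
    col_range = col_range.replace(0, 1)
    return (frame - col_min) / col_range

df = pd.DataFrame([
    [180000, 110, 18.9, 1400],
    [360000, 905, 23.4, 1800],
    [230000, 230, 14.0, 1300],
    [60000, 450, 13.5, 1500]
], columns=['Col A', 'Col B', 'Col C', 'Col D'])

scaled = min_max_scale(df)
print(scaled)
```

For the sample data the guard never triggers, so the result should match the loop version; it only matters when a column has no variation.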

3. The z-score method

The z-score method, often called standardization, transforms the values in each column so that they have a mean of 0 and a standard deviation of 1. This technique works best when your data roughly follows a normal distribution or when you want to express values in terms of how far they are from the average.

Python
z_scaled = df.copy()

# Subtract each column's mean, then divide by its standard deviation
for column in z_scaled.columns:
    z_scaled[column] = (z_scaled[column] - z_scaled[column].mean()) / z_scaled[column].std()

print(z_scaled)

z_scaled.plot(kind='bar')
plt.show()

Output : 

After applying this method, each feature is centered around zero and its spread is standardized. This helps models such as logistic regression, SVM and neural networks perform better.
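
One detail worth knowing is that the Pandas `.std()` method computes the sample standard deviation (`ddof=1`) by default, whereas some other tools, such as scikit-learn's `StandardScaler`, divide by the population standard deviation (`ddof=0`). The sketch below shows both variants in vectorized form on the same sample data; the variable names `z_sample` and `z_population` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame([
    [180000, 110, 18.9, 1400],
    [360000, 905, 23.4, 1800],
    [230000, 230, 14.0, 1300],
    [60000, 450, 13.5, 1500]
], columns=['Col A', 'Col B', 'Col C', 'Col D'])

# Sample standard deviation (Pandas default, ddof=1)
z_sample = (df - df.mean()) / df.std()

# Population standard deviation (ddof=0)
z_population = (df - df.mean()) / df.std(ddof=0)

print(z_sample)
print(z_population)
```

With only four rows the two results differ noticeably; on large datasets the difference becomes negligible, but it can explain small mismatches when comparing results across libraries.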

deepanshu_rustagi