
Pandas Dataframe Difference

Last Updated : 16 Dec, 2024

When working with multiple DataFrames, you might want to compute the differences between them, such as identifying rows that are in one DataFrame but not in another. Pandas provides various ways to compute the difference between DataFrames, whether it's comparing rows, columns, or entire DataFrames. This is useful in data analysis, especially when you need to track changes between datasets over time or compare two similar datasets.

In this article, we will explore methods to find the difference between DataFrames using Pandas.

Python
import pandas as pd

# Create DataFrames for Dataset 1 and Dataset 2
data1 = {'Name': ['John', 'Alice', 'Bob', 'Eve'],
         'Age': [25, 30, 22, 35],
         'Gender': ['Male', 'Female', 'Male', 'Female']}
df1 = pd.DataFrame(data1)

data2 = {'Name': ['John', 'Alice', 'Charlie', 'Eve'],
         'Age': [25, 32, 28, 35],
         'Gender': ['Male', 'Female', 'Male', 'Female']}
df2 = pd.DataFrame(data2)


Finding Rows in One DataFrame but Not in Another

The most common way to find the difference between DataFrames is to identify rows that are in one DataFrame but not in the other. This can be done using the merge() method with the indicator=True option or by using the isin() method.

  • Use merge() with indicator=True to identify differences.
Python
# Merge the DataFrames with the 'indicator' flag to track the source of each row
merged_df = pd.merge(df1, df2, how='outer', indicator=True)

# Find rows that are only in df1 but not in df2
diff_df1 = merged_df[merged_df['_merge'] == 'left_only']
print(diff_df1)

# Find rows that are only in df2 but not in df1
diff_df2 = merged_df[merged_df['_merge'] == 'right_only']
print(diff_df2)



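The merge() method is used with the indicator=True flag to add a new column (_merge) that shows whether a row is only in df1 ('left_only'), only in df2 ('right_only'), or in both ('both'). Because no on argument is given, the merge joins on every column the two DataFrames share, so a row counts as shared only when its Name, Age, and Gender all match. We then filter for rows where _merge is 'left_only' (rows unique to df1) or 'right_only' (rows unique to df2).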
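The filtering step above can be wrapped into a small reusable anti-join. The following is a minimal sketch, not a pandas built-in: the helper name rows_only_in_left and its on parameter are assumptions made for illustration. It shows how restricting the comparison to a key column changes which rows count as "missing".

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Eve'],
                    'Age': [25, 30, 22, 35],
                    'Gender': ['Male', 'Female', 'Male', 'Female']})
df2 = pd.DataFrame({'Name': ['John', 'Alice', 'Charlie', 'Eve'],
                    'Age': [25, 32, 28, 35],
                    'Gender': ['Male', 'Female', 'Male', 'Female']})


def rows_only_in_left(left, right, on=None):
    """Hypothetical helper: rows of `left` that have no match in `right`.

    `on` is a list of key columns; by default every shared column is used.
    """
    keys = list(on) if on is not None else [c for c in left.columns if c in right.columns]
    # Keep only the key columns of `right`, de-duplicated, so the merge
    # can neither add suffixed columns nor duplicate rows of `left`.
    right_keys = right[keys].drop_duplicates()
    merged = left.merge(right_keys, how='left', on=keys, indicator=True)
    return merged.loc[merged['_merge'] == 'left_only', left.columns]


# Compare whole rows: Alice differs in Age, Bob is missing entirely
full_diff = rows_only_in_left(df1, df2)
print(full_diff)

# Compare on Name only: Alice now counts as present in both
name_diff = rows_only_in_left(df1, df2, on=['Name'])
print(name_diff)
```

Note that merge() returns a fresh 0-based index, so the original row labels of df1 are not preserved in the result.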

Finding the Difference in Values (Element-wise)

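If you want to find the difference between corresponding elements in two DataFrames, you can subtract one DataFrame from another. This works for numerical data. Pandas aligns the two DataFrames on their row index and column names before subtracting, so each value is paired with the value that has the same row label and the same column name.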

Python
# Subtract df2 from df1 (numerical columns only)
df_diff = df1.select_dtypes(include=['number']) - df2.select_dtypes(include=['number'])
print(df_diff)

The select_dtypes(include=['number']) method selects only the numerical columns for subtraction. Subtracting the corresponding values in df1 and df2 produces a new DataFrame with the element-wise differences.
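With the default 0 to 3 index, the subtraction above pairs rows by position, so row 2 compares Bob's age with Charlie's. If the intent is to compare the same person across both datasets, one option is to index by Name first. This is a sketch under that assumption; the variable names are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Eve'],
                    'Age': [25, 30, 22, 35],
                    'Gender': ['Male', 'Female', 'Male', 'Female']})
df2 = pd.DataFrame({'Name': ['John', 'Alice', 'Charlie', 'Eve'],
                    'Age': [25, 32, 28, 35],
                    'Gender': ['Male', 'Female', 'Male', 'Female']})

# Use Name as the row label so subtraction aligns people, not positions
ages1 = df1.set_index('Name')['Age']
ages2 = df2.set_index('Name')['Age']

# Names present in only one DataFrame (Bob, Charlie) produce NaN
age_diff = ages1 - ages2
print(age_diff)

# Keep only the names whose age actually changed
changed = age_diff[age_diff.notna() & (age_diff != 0)]
print(changed)
```

The NaN entries are a useful signal in their own right: they mark names that exist in only one of the two DataFrames.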

Using isin to Find Values Not Shared Between DataFrames

The isin() method is another way to compare DataFrames. It lets you filter for rows in one DataFrame whose values in a chosen column do not appear in the other.

Python
# Find rows in df1 that are not in df2
df_diff = df1[~df1['Name'].isin(df2['Name'])]
print(df_diff)

The isin() method checks if each value in the Name column of df1 is present in the Name column of df2. The tilde (~) negates the result, meaning we filter for rows in df1 whose Name does not exist in df2.
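The example above only looks at the Name column, so Alice is treated as shared even though her Age differs. To apply the same idea to whole rows, one possible approach is to build a MultiIndex from the columns of interest and call isin() on it. This is a sketch of that variation, not the only way to do it; the names cols, keys1, and keys2 are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Eve'],
                    'Age': [25, 30, 22, 35],
                    'Gender': ['Male', 'Female', 'Male', 'Female']})
df2 = pd.DataFrame({'Name': ['John', 'Alice', 'Charlie', 'Eve'],
                    'Age': [25, 32, 28, 35],
                    'Gender': ['Male', 'Female', 'Male', 'Female']})

# Columns that together identify a row for the comparison
cols = ['Name', 'Age', 'Gender']

# Each row becomes one (Name, Age, Gender) tuple in a MultiIndex
keys1 = pd.MultiIndex.from_frame(df1[cols])
keys2 = pd.MultiIndex.from_frame(df2[cols])

# True where the full row of df1 also exists in df2
mask = keys1.isin(keys2)

# Negate the mask to keep rows found only in df1
rows_only_in_df1 = df1[~mask]
print(rows_only_in_df1)
```

Unlike the merge-based approach, this keeps the original row labels of df1 in the result.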

Comparing DataFrame Indexes

You may also want to compare the indexes of two DataFrames to see if they are the same or different. You can use the .index attribute to compare indexes between DataFrames.


Python
# Compare indexes between df1 and df2
index_diff = df1.index.difference(df2.index)
print(index_diff)

The difference() method returns the index labels that are present in df1 but not in df2. Both example DataFrames use the default index 0 to 3, so the result here is empty. This is useful when you want to check whether the row labels (indexes) are the same across DataFrames.
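Index comparison becomes more informative when the index carries meaning. As a sketch, assuming Name is a suitable row label for these datasets, the snippet below sets it as the index and then uses difference(), symmetric_difference(), and equals() on the resulting Index objects.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Eve'],
                    'Age': [25, 30, 22, 35],
                    'Gender': ['Male', 'Female', 'Male', 'Female']})
df2 = pd.DataFrame({'Name': ['John', 'Alice', 'Charlie', 'Eve'],
                    'Age': [25, 32, 28, 35],
                    'Gender': ['Male', 'Female', 'Male', 'Female']})

# With the default 0..3 index on both sides there is nothing to report
positional = df1.index.difference(df2.index)
print(len(positional))

# Use Name as the row label instead
idx1 = df1.set_index('Name').index
idx2 = df2.set_index('Name').index

only_in_df1 = idx1.difference(idx2)            # labels in df1 but not df2
only_in_df2 = idx2.difference(idx1)            # labels in df2 but not df1
in_either = idx1.symmetric_difference(idx2)    # labels in exactly one of them
same_index = idx1.equals(idx2)                 # True only if both indexes match

print(only_in_df1, only_in_df2, in_either, same_index)
```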

Summary:

Pandas provides multiple methods for finding the difference between DataFrames, each suited for specific use cases:

  • merge() with the indicator=True flag is great for finding rows that differ between DataFrames.
  • Subtraction is useful for comparing numerical values element-wise.
  • isin() is helpful for filtering rows that are not shared between DataFrames.
  • difference() can be used to compare DataFrame indexes.

These techniques can be combined and customized to suit a variety of data comparison tasks in your analysis workflow.



abhirajksingh