
Joining two Pandas DataFrames using merge()

Last Updated : 12 Nov, 2024

The merge() function is designed to merge two DataFrames based on one or more columns with matching values. The basic idea is to identify columns that contain common data between the DataFrames and use them to align rows.

This article walks through joining two pandas DataFrames with merge(), covering the key concepts, the main parameters, and practical examples.


If the key column has the same name in both DataFrames, you only need to pass that name to the on parameter. For example:

Python
import pandas as pd

# DataFrames to merge
df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [1, 2, 4], 'Age': [24, 27, 22]})

# Merge DataFrames on the 'ID' column using an inner join
merged_df = pd.merge(df1, df2, on='ID', how='inner')
print(merged_df)


Merged df:

   ID   Name  Age
0   1  Alice   24
1   2    Bob   27

This example performs an inner join, resulting in a DataFrame that includes only the rows with matching ID values.

How the merge() Function Works in Pandas

The core idea behind merge() is simple: you specify how the rows from two DataFrames should be aligned based on one or more keys (columns or indexes). The result is a new DataFrame that contains data from both original DataFrames.

Basic Syntax of merge():

pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None)

Where:

  • left: The first DataFrame.
  • right: The second DataFrame.
  • how: Specifies the type of join (default is ‘inner’).
  • on: Column(s) to join on. If not specified, Pandas will attempt to merge on columns with the same name in both DataFrames.
  • left_on and right_on: Specify different columns from each DataFrame to join on if they don’t share the same column names.
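When the key columns are named differently, left_on and right_on can be used as in the following minimal sketch. The DataFrames and column names (employees, salaries, emp_id, staff_id) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: the key column has a different name in each DataFrame.
employees = pd.DataFrame({"emp_id": [1, 2, 3],
                          "name": ["Alice", "Bob", "Charlie"]})
salaries = pd.DataFrame({"staff_id": [1, 2, 4],
                         "salary": [50000, 60000, 55000]})

# left_on names the key in the left DataFrame, right_on the key in the right one.
merged = pd.merge(employees, salaries,
                  left_on="emp_id", right_on="staff_id", how="inner")

# Both key columns (emp_id and staff_id) are kept in the result.
print(merged)
```

Because the two key columns carry different names, pandas keeps both of them; one can be dropped afterwards with drop(columns=...) if it is redundant.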

The how parameter sets the join type, which determines which rows are kept based on matches between the two DataFrames. The examples below cover four join types: inner, outer, left, and right.

Combining Two Pandas DataFrames with merge(): Examples

1. Inner Join: Keeping Only Matching Rows

An inner join keeps rows from both DataFrames where there is a match in the specified column(s).

Python
import pandas as pd

df1 = pd.DataFrame({"fruit": ["apple", "banana", "avocado"],
                    "market_price": [21, 14, 35]})
print("The first DataFrame")
print(df1)

df2 = pd.DataFrame({"fruit": ["banana", "apple", "avocado"],
                    "wholesaler_price": [65, 68, 75]})
print("The second DataFrame")
print(df2)

# Join the DataFrames on the shared "fruit" column.
# print() is used instead of display() so the code also runs outside a notebook.
print("The merged DataFrame")
print(pd.merge(df1, df2, on="fruit", how="inner"))

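Output (merged DataFrame only):

     fruit  market_price  wholesaler_price
0    apple            21                68
1   banana            14                65
2  avocado            35                75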

2. Outer Join: Including All Rows from Both DataFrames

An outer join includes all rows from both DataFrames. With how="outer" (lowercase), the result contains every key found in df1 or df2; where a row has no match in the other DataFrame, the columns coming from that DataFrame are filled with NaN.

Python
pd.merge(df1, df2, on = "fruit", how = "outer") 

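Output: every fruit appears in both df1 and df2, so the result has the same three rows as the inner join and contains no NaN values.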
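Since every fruit in the example above appears in both DataFrames, that outer join produces no missing values. The following minimal sketch reuses the ID data from the first example, where the keys only partly overlap, and adds indicator=True to show which side each row came from.

```python
import pandas as pd

# Same data as the first example: ID 3 exists only on the left, ID 4 only on the right.
df1 = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Alice", "Bob", "Charlie"]})
df2 = pd.DataFrame({"ID": [1, 2, 4], "Age": [24, 27, 22]})

# indicator=True adds a "_merge" column with the values
# "both", "left_only", or "right_only" for each row.
outer = pd.merge(df1, df2, on="ID", how="outer", indicator=True)
print(outer)

# Count how many rows have a missing Name or Age after the join.
print(outer[["Name", "Age"]].isna().sum())
```

The row for ID 3 has NaN in Age and the row for ID 4 has NaN in Name, which makes the _merge column a convenient way to audit unmatched keys.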

3. Left Join: Keeping All Rows from the Left DataFrame

A left join keeps all rows from the left DataFrame and adds values from the right DataFrame where the key matches; left rows without a match get NaN in the right-hand columns.

Python
pd.merge(df1, df2, on = "fruit", how = "left") 

Output: all three rows of df1 are kept, and because each fruit also exists in df2, no values are missing.
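To see what a left join does with unmatched keys, here is a minimal sketch that again uses the ID data from the first example. It is an illustration under that assumption, not part of the fruit example.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Alice", "Bob", "Charlie"]})
df2 = pd.DataFrame({"ID": [1, 2, 4], "Age": [24, 27, 22]})

# Every row of df1 is kept; ID 4 from df2 is dropped because it has no left match.
left = pd.merge(df1, df2, on="ID", how="left")
print(left)

# ID 3 has no match in df2, so its Age is NaN and the column is stored as float.
print(left["Age"].dtype)
```

The integer Age column turns into a floating-point column because NaN cannot be stored in a plain integer column.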

4. Right Join: Keeping All Rows from the Right DataFrame

A right join keeps all rows from the right DataFrame and adds values from the left DataFrame where the key matches; right rows without a match get NaN in the left-hand columns.

Python
pd.merge(df1, df2, on = "fruit", how = "right") 

Output: all three rows of df2 are kept, and because each fruit also exists in df1, no values are missing.
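A right join can be thought of as a left join with the two DataFrames swapped. The sketch below, which assumes the ID data from the first example, runs both and prints the results for comparison.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Alice", "Bob", "Charlie"]})
df2 = pd.DataFrame({"ID": [1, 2, 4], "Age": [24, 27, 22]})

# Right join: every row of df2 is kept; ID 3 from df1 is dropped.
right = pd.merge(df1, df2, on="ID", how="right")
print(right)

# Left join with the arguments swapped: the same rows, but the
# non-key columns appear in a different order (Age before Name).
mirror = pd.merge(df2, df1, on="ID", how="left")
print(mirror)
```

In both results the row for ID 4 has NaN in Name, since that ID does not occur in df1.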

Key Takeaways

Here are the main points to remember when joining two DataFrames using merge():

  • Common Columns: Ensure that the columns you are joining on are correctly identified and named.
  • Join Types: Choose the appropriate join type (inner, left, right, outer) based on your data and analysis needs.
  • Handling Duplicates: Use the suffixes parameter to distinguish non-key columns that have the same name in both DataFrames (the defaults are _x and _y).
  • Index vs Columns: Decide whether to join on columns or indexes using on, left_on, right_on, left_index, and right_index parameters.
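The last two points can be sketched as follows. The DataFrames (market, wholesale) and the suffix names are hypothetical and only serve to illustrate the suffixes and right_index parameters.

```python
import pandas as pd

# Hypothetical data: both DataFrames have a non-key column called "price".
market = pd.DataFrame({"fruit": ["apple", "banana"], "price": [21, 14]})
wholesale = pd.DataFrame({"fruit": ["apple", "banana"], "price": [68, 65]})

# suffixes renames the overlapping "price" columns so they stay distinguishable.
by_column = pd.merge(market, wholesale, on="fruit",
                     suffixes=("_market", "_wholesale"))
print(by_column)

# Column-to-index join: the left key is the "fruit" column,
# the right key is the index of wholesale_indexed.
wholesale_indexed = wholesale.set_index("fruit")
by_index = pd.merge(market, wholesale_indexed,
                    left_on="fruit", right_index=True,
                    suffixes=("_market", "_wholesale"))
print(by_index)
```

Both calls produce the columns fruit, price_market, and price_wholesale; the second form is useful when one DataFrame already uses the key as its index.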


author
kumar_satyam