
Pandas: Detect Mixed Data Types and Fix it

Last Updated : 06 Oct, 2023

Pandas is a Python library commonly used for analyzing, exploring, and manipulating data sets. When a column of a Pandas data frame does not hold a single type of data (all numeric or all string) but instead holds both numeric and string values, it is called a mixed data type column.

Table of Content

  • What are mixed types in Pandas columns?
  • How to identify mixed types in Pandas columns
  • How to deal with mixed types in Pandas columns

What are mixed types in Pandas columns?

A Pandas data frame can have multiple columns. When a column does not hold values of one consistent data type but instead contains, for example, both numeric and string values, that column is said to have a mixed data type. Pandas stores such a column with the generic object dtype.

For example:

data_frame = pd.DataFrame( [['tom', 10], ['nick', '15'], ['juli', 14.8]], columns=['Name', 'Age'])

Here, the Age column contains both string and numeric values ('15' is a string, while 10 and 14.8 are numbers), so it has a mixed data type.
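One way to see the mix directly is to look at the Python type of each value in the column. The snippet below is a minimal sketch that rebuilds the example data frame; the variable name value_types is only illustrative.

```python
import pandas as pd

# Rebuild the example data frame
data_frame = pd.DataFrame(
    [['tom', 10], ['nick', '15'], ['juli', 14.8]],
    columns=['Name', 'Age'])

# Pandas falls back to the generic object dtype for the mixed column
print(data_frame['Age'].dtype)

# Map every value to the name of its Python type
value_types = data_frame['Age'].map(lambda value: type(value).__name__)
print(value_types.tolist())
```

The list of type names should contain int, str and float, one per row, which confirms that the column holds three different kinds of values.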

Causes of mixed data types

  • Missing Values (NaN)
  • Inconsistent Formatting
  • Data Entry Errors

Missing Values (NaN):

NaN is a floating-point value that represents undefined or unrepresentable data, for example the result of 0/0. Pandas also uses NaN to mark missing values. Because NaN is a float, a column of strings that contains missing values holds both string and float objects, which can lead to incorrect results if it is not handled.
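The effect of NaN can be checked on a small series. The sketch below uses made-up data, and the names `names`, `name_types` and `numbers` are illustrative. It relies on the `skipna` parameter of `pd.api.types.infer_dtype()`, which skips missing values by default.

```python
import numpy as np
import pandas as pd

# A string column with one missing value (object dtype requested explicitly)
names = pd.Series(['tom', np.nan, 'juli'], dtype=object)

# NaN is a float, so the column holds str and float objects side by side
name_types = names.map(lambda value: type(value).__name__).tolist()
print(name_types)

# infer_dtype skips missing values by default; skipna=False counts them
default_kind = pd.api.types.infer_dtype(names)
strict_kind = pd.api.types.infer_dtype(names, skipna=False)
print(default_kind, strict_kind)

# An integer column with NaN is not mixed: pandas upcasts it to float64
numbers = pd.Series([1, np.nan, 3])
print(numbers.dtype)
```

With the default setting the column is reported as string, so passing skipna=False is one way to make missing values visible during detection.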

Inconsistent Formatting:

Inconsistent formatting arises when cells in the same column are stored in different formats, for example a number stored as the string '15' next to the integer 10. It is therefore important to convert every cell of the column to one consistent format.

Data Entry Errors:

Users sometimes make mistakes while entering data into a column of a Pandas data frame, such as typing a string into a numeric column or leaving a value empty. Such errors can also lead to mixed data types and need to be fixed.
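Entry errors in a column that is meant to be numeric can be located by checking which values fail numeric parsing. The following is a minimal sketch on hypothetical data; the series `ages` and the names `parsed` and `entry_errors` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical column with a mistyped entry and an empty cell
ages = pd.Series([10, 'fifteen', 14.8, None], dtype=object)

# Values that cannot be parsed as numbers become NaN
parsed = pd.to_numeric(ages, errors='coerce')

# Keep entries that were present originally but failed to parse
entry_errors = ages[parsed.isna() & ages.notna()]
print(entry_errors)
```

The empty cell is excluded on purpose, since it is a missing value rather than a wrongly typed one.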

How to identify mixed types in Pandas columns

You might have used the info() function to check the data types of a Pandas data frame, but info() reports a mixed column only as the generic object dtype, which does not reveal whether the values are of one type or several. To detect mixed data types, traverse each column of the data frame and get its inferred type using the pd.api.types.infer_dtype() function.

Syntax:

for column in data_frame.columns:
    print(pd.api.types.infer_dtype(data_frame[column]))

Here,

  • data_frame: It is the Pandas data frame for which you want to detect if it has mixed data types or not.

Example:

The data frame used in this example to detect mixed data type is as follows:

Python3
# Python program to detect mixed data types in Pandas data frame

# Import the library Pandas
import pandas as pd

# Create the pandas DataFrame
data_frame = pd.DataFrame(
    [['tom', 10], ['nick', '15'], ['juli', 14.8]],
    columns=['Name', 'Age'])

# Traverse data frame to detect mixed data types
for column in data_frame.columns:
    print(column, ':', pd.api.types.infer_dtype(data_frame[column]))

Output:

Name : string
Age : mixed-integer
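The same check can be wrapped in a small helper that returns only the problem columns. This is a sketch, not part of Pandas: the function name find_mixed_columns is hypothetical, and it assumes that every inferred type of interest starts with the word "mixed" (as "mixed" and "mixed-integer" do).

```python
import pandas as pd


def find_mixed_columns(frame):
    """Return the names of columns whose inferred type starts with 'mixed'."""
    return [
        column for column in frame.columns
        if pd.api.types.infer_dtype(frame[column]).startswith('mixed')
    ]


data_frame = pd.DataFrame(
    [['tom', 10], ['nick', '15'], ['juli', 14.8]],
    columns=['Name', 'Age'])

mixed_columns = find_mixed_columns(data_frame)
print(mixed_columns)
```

For the example data frame, only the Age column is returned.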

How to deal with mixed types in Pandas columns

To fix mixed data types in a Pandas data frame, convert the entire column to one data type. This can be done using the astype() function or the to_numeric() function.

Using astype() function:

The astype() function casts a Pandas object to a specified data type. This section shows how to fix mixed data types using astype().

Syntax:

data_frame[column] = data_frame[column].astype(int)

Here,

  • data_frame: It is the Pandas data frame for which you want to fix mixed data types.
  • column: It is the name of the column that you want to convert.
  • int: It is the data type to which the column is converted. You can also use str, float, etc. here, depending on the data type you need.

Example:

The data frame used in this example to fix mixed data type is as follows:

Python3
# Python program to fix mixed data types using astype() in Pandas data frame

# Import the library Pandas
import pandas as pd

# Create the pandas DataFrame
data_frame = pd.DataFrame(
    [['tom', 10], ['nick', '15'], ['juli', 14.8]],
    columns=['Name', 'Age'])

# Transform the mixed Age column to a single data type
# (casting to int drops the fractional part, so 14.8 becomes 14)
data_frame['Age'] = data_frame['Age'].astype(int)

# Traverse data frame to detect data types after fix
for column in data_frame.columns:
    print(column, ':', pd.api.types.infer_dtype(data_frame[column]))

Output:

Name : string
Age : integer
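Two caveats of astype() are worth checking before relying on it. The sketch below uses the Age values from the example plus one hypothetical variant in which 14.8 is stored as the string '14.8'; the names `as_int`, `as_float` and `cast_failed` are illustrative.

```python
import pandas as pd

ages = pd.Series([10, '15', 14.8], dtype=object)

# astype(int) drops the fractional part: 14.8 becomes 14
as_int = ages.astype(int)
print(as_int.tolist())

# astype(float) keeps the fractional part
as_float = ages.astype(float)
print(as_float.tolist())

# A string such as '14.8' cannot be cast to int directly
try:
    pd.Series([10, '14.8'], dtype=object).astype(int)
    cast_failed = False
except ValueError:
    cast_failed = True
print(cast_failed)
```

When the column may contain decimal values, casting to float is usually the safer target type.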

Using to_numeric() function:

The to_numeric() function converts its argument to a numeric data type. This section shows how to fix mixed data types using to_numeric().

Syntax:

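data_frame[column] = pd.to_numeric(data_frame[column], errors='coerce')

Here,

  • data_frame: It is the Pandas data frame for which you want to fix mixed data types.
  • column: It is the name of the column that you want to convert.
  • errors: It controls what happens when a value cannot be parsed as a number. With 'coerce', such values become NaN; the default, 'raise', raises an error. The 'ignore' option is deprecated in recent Pandas versions.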

Example:

The data frame used in this example to fix mixed data type is as follows:

Python3
# Python program to fix mixed data types using to_numeric() in Pandas data frame

# Import the library Pandas
import pandas as pd

# Create the pandas DataFrame
data_frame = pd.DataFrame(
    [['tom', 10], ['nick', '15'], ['juli', 14.8]],
    columns=['Name', 'Age'])

# Transform the mixed Age column to a single numeric data type
# (values that cannot be parsed would become NaN)
data_frame['Age'] = pd.to_numeric(data_frame['Age'], errors='coerce')

# Traverse data frame to detect data types after fix
for column in data_frame.columns:
    print(column, ':', pd.api.types.infer_dtype(data_frame[column]))

Output:

Name : string
Age : floating
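How to_numeric() reacts to a value it cannot parse depends on its errors parameter. The sketch below uses a hypothetical series with one non-numeric entry and, as an optional extra step that is not part of the example above, converts the result to the nullable Int64 type so that whole numbers stay integers despite the missing value.

```python
import pandas as pd

ages = pd.Series([10, '15', 'unknown'], dtype=object)

# Default errors='raise': an unparseable value raises ValueError
try:
    pd.to_numeric(ages)
    raised = False
except ValueError:
    raised = True
print(raised)

# errors='coerce' replaces unparseable values with NaN (float64 result)
coerced = pd.to_numeric(ages, errors='coerce')
print(coerced)

# Optional: nullable integers keep 10 and 15 as integers next to <NA>
nullable = coerced.astype('Int64')
print(nullable)
```

The Int64 cast only works when every remaining value is a whole number; a value such as 14.8 would need to stay as a float.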

Conclusion

Pandas columns with mixed types can cause problems when analyzing data, but they can be found and resolved using the techniques in this article. By properly cleaning and preparing the data, data scientists and software developers can make their analysis more accurate and dependable.




ishita28rai
