How to Replace Values in Column Based on Condition in Pandas?

Last Updated : 15 Nov, 2024

Let's explore different methods to replace values in a Pandas DataFrame column based on conditions.

Replace Values Using dataframe.loc[] Function

The DataFrame.loc[] indexer selects rows and columns by label or by a boolean condition. Assigning to that selection replaces the values in the matching rows in place.

df.loc[df['column_name'] == 'some_value', 'column_name'] = 'new_value'

Consider a dataset with columns 'Name', 'gender', 'math score', and 'test preparation'. In this example, we will replace all occurrences of 'male' with 1 in the gender column.

Python

import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'male' with 1 in the 'gender' column
df.loc[df["gender"] == "male", "gender"] = 1
print(df)

Output:

     Name  gender  math score test preparation
0    John       1          50             none
1     Jay       1         100        completed
2  sachin       1          70             none
3  Geetha  female          80        completed
4  Amutha  female          75        completed
5  ganesh       1          40             none
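The same pattern extends to conditions that span several columns. The sketch below is a minimal illustration, assuming the same student data; the replacement label 'recommended' is a hypothetical value chosen for the example and does not come from the original dataset.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
})

# Boolean mask: score below 60 AND no test preparation.
# Each comparison needs its own parentheses when combined with & or |.
needs_prep = (df['math score'] < 60) & (df['test preparation'] == 'none')

# Replace 'none' with the hypothetical label 'recommended' in matching rows only
df.loc[needs_prep, 'test preparation'] = 'recommended'
print(df)
```

Only the rows for John and ganesh satisfy both conditions, so the other rows keep their original values.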

Besides loc[], we can replace values in a column based on a condition in Pandas using the following methods:

Replace Values Using np.where()

The np.where() function from the NumPy library takes a condition and two values: it returns the first value wherever the condition is true and the second value everywhere else.

df['column_name'] = np.where(df['column_name'] == 'some_value', 'value_if_true', 'value_if_false')

Here, we will replace 'female' with 0 and 'male' with 1 in the gender column.

Python

import numpy as np
import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'female' with 0 and every other value ('male' here) with 1
df["gender"] = np.where(df["gender"] == "female", 0, 1)
print(df)

Output:

     Name  gender  math score test preparation
0    John       1          50             none
1     Jay       1         100        completed
2  sachin       1          70             none
3  Geetha       0          80        completed
4  Amutha       0          75        completed
5  ganesh       1          40             none
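Because np.where() always fills the non-matching rows with its third argument, passing the column itself as that argument keeps the original values where the condition is false. The following is a small sketch under that assumption; the floor of 50 is an arbitrary value picked for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'math score': [50, 100, 70, 80, 75, 40],
})

# Raise scores below 50 to 50; every other score is taken from the column unchanged
df['math score'] = np.where(df['math score'] < 50, 50, df['math score'])
print(df['math score'].tolist())  # [50, 100, 70, 80, 75, 50]
```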

Replace Values Using Masking

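The Series.mask() method replaces values where a condition is True and leaves the remaining values unchanged. Assign the result back to the column: calling mask() with inplace=True on df['column_name'] operates on an intermediate object and may not update the DataFrame when Copy-on-Write is enabled.

df['column_name'] = df['column_name'].mask(df['column_name'] == 'some_value', 'new_value')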

In this example, we replace 'female' with 0 in the gender column using the mask() function.

Python

import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'female' with 0 in the 'gender' column;
# the result is assigned back instead of relying on inplace=True
df['gender'] = df['gender'].mask(df['gender'] == 'female', 0)
print(df)

Output:

     Name gender  math score test preparation
0    John   male          50             none
1     Jay   male         100        completed
2  sachin   male          70             none
3  Geetha      0          80        completed
4  Amutha      0          75        completed
5  ganesh   male          40             none
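Series.where() is the counterpart of mask(): it keeps values where the condition is True and replaces the rest. The sketch below contrasts the two on the math scores; the threshold of 75 and the fill value 0 are assumptions made only for this illustration.

```python
import pandas as pd

scores = pd.Series([50, 100, 70, 80, 75, 40], name='math score')
high = scores > 75

# mask(): replace values where the condition is True (cap high scores at 75)
capped = scores.mask(high, 75)

# where(): keep values where the condition is True, replace the others with 0
only_high = scores.where(high, 0)

print(capped.tolist())     # [50, 75, 70, 75, 75, 40]
print(only_high.tolist())  # [0, 100, 0, 80, 0, 0]
```

Both methods return a new Series, so the original scores stay untouched until the result is assigned back.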

Replace Values Using apply() and Lambda Functions

The apply() function in combination with a lambda function is a flexible method for applying conditional replacements based on more complex logic.

Here, we will replace 'female' with 0 in the gender column using the apply() function and lambda.

Python

import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'female' with 0 using apply and lambda
df['gender'] = df['gender'].apply(lambda x: 0 if x == 'female' else x)
print(df)

Output:

     Name gender  math score test preparation
0    John   male          50             none
1     Jay   male         100        completed
2  sachin   male          70             none
3  Geetha      0          80        completed
4  Amutha      0          75        completed
5  ganesh   male          40             none
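When the logic has more than two branches, a named function passed to apply() is often easier to read than a nested lambda. The following is a minimal sketch; the helper to_band and its cut-offs (80 and 50) are hypothetical and serve only to show the pattern.

```python
import pandas as pd

def to_band(score):
    """Map a numeric score to a hypothetical band label."""
    if score >= 80:
        return 'high'
    if score >= 50:
        return 'medium'
    return 'low'

df = pd.DataFrame({
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'math score': [50, 100, 70, 80, 75, 40],
})

# Replace each numeric score with its band label
df['math score'] = df['math score'].apply(to_band)
print(df['math score'].tolist())
# ['medium', 'high', 'medium', 'high', 'medium', 'low']
```

apply() calls the function once per element, so on large columns the vectorized approaches shown earlier (loc[], np.where(), mask()) are usually faster.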

In this article, we’ve explored four effective methods to replace values in a Pandas DataFrame column based on conditions: using loc[], np.where(), masking, and apply() with a lambda function.

sanjaysdev0901