How to Replace Values in Column Based on Condition in Pandas?
Last Updated : 15 Nov, 2024
Let's explore different methods to replace values in a Pandas DataFrame column based on conditions.
Replace Values Using DataFrame.loc[]
The DataFrame.loc[] indexer selects rows and columns by label or by a boolean condition. Because it can appear on the left-hand side of an assignment, it lets us replace values in only the rows that match the condition.
df.loc[df['column_name'] == 'some_value', 'column_name'] = 'new_value'
Consider a dataset with the columns 'Name', 'gender', 'math score', and 'test preparation'. In this example, we will replace all occurrences of 'male' with 1 in the gender column.
Python
import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'male' with 1 in the 'gender' column
df.loc[df["gender"] == "male", "gender"] = 1

print(df)
Output:
     Name  gender  math score  test preparation
0    John       1          50              none
1     Jay       1         100         completed
2  sachin       1          70              none
3  Geetha  female          80         completed
4  Amutha  female          75         completed
5  ganesh       1          40              none
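The condition passed to loc[] does not have to test the column being changed, and several conditions can be combined. The sketch below is an illustration only: the 60-point threshold and the 'recommended' label are assumptions made up for this example, not part of the dataset above.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jay", "sachin", "Geetha", "Amutha", "ganesh"],
    "gender": ["male", "male", "male", "female", "female", "male"],
    "math score": [50, 100, 70, 80, 75, 40],
    "test preparation": ["none", "completed", "none", "completed", "completed", "none"],
})

# Combine conditions with & (and) or | (or).
# Each condition needs its own parentheses.
needs_prep = (df["math score"] < 60) & (df["test preparation"] == "none")

# Hypothetical label, chosen only for this illustration.
df.loc[needs_prep, "test preparation"] = "recommended"

print(df)
```

Only the rows where both conditions hold are changed; every other row keeps its original value.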
Besides loc[], we can replace values in a column based on a condition using the following methods:
Replace Values Using np.where()
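The np.where() function from the NumPy library takes a condition and two values. It returns the first value wherever the condition is True and the second wherever it is False, so assigning its result to a column rewrites every value in that column.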
df['column_name'] = np.where(df['column_name'] == 'some_value', 'value_if_true', 'value_if_false')
Here, we will replace 'female' with 0 and 'male' with 1 in the gender column.
Python
import numpy as np
import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'female' with 0 and every other value ('male' here) with 1
df["gender"] = np.where(df["gender"] == "female", 0, 1)

print(df)
Output:
     Name  gender  math score  test preparation
0    John       1          50              none
1     Jay       1         100         completed
2  sachin       1          70              none
3  Geetha       0          80         completed
4  Amutha       0          75         completed
5  ganesh       1          40              none
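To replace only the matching values with np.where() and leave the rest untouched, the column itself can be passed as the "false" argument. The following is a minimal sketch; the floor of 50 is an assumption chosen purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jay", "sachin", "Geetha", "Amutha", "ganesh"],
    "math score": [50, 100, 70, 80, 75, 40],
})

# Where the score is below 50, use 50; otherwise keep the existing score.
# The threshold 50 is a hypothetical value for this example.
df["math score"] = np.where(df["math score"] < 50, 50, df["math score"])

print(df)
```

Here only the last row changes (40 becomes 50), because it is the only score below the threshold.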
Replace Values Using Masking
Pandas' mask() function can be used to replace values where a condition is met.
df['column_name'] = df['column_name'].mask(df['column_name'] == 'some_value', 'new_value')
In this example, we replace 'female' with 0 in the gender column using the mask() function. The result is assigned back to the column: calling mask() with inplace=True on df['gender'] acts on a temporary object and, with Copy-on-Write enabled, leaves the DataFrame unchanged.
Python
import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'female' with 0 in the 'gender' column
df['gender'] = df['gender'].mask(df['gender'] == 'female', 0)

print(df)
Output:
     Name  gender  math score  test preparation
0    John    male          50              none
1     Jay    male         100         completed
2  sachin    male          70              none
3  Geetha       0          80         completed
4  Amutha       0          75         completed
5  ganesh    male          40              none
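Series.where() is the counterpart of mask(): it keeps values where the condition is True and replaces the rest. As a rough sketch, the two calls below should give the same result when the condition is inverted. The replacement text 'not started' is an assumption used only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jay", "sachin", "Geetha", "Amutha", "ganesh"],
    "test preparation": ["none", "completed", "none", "completed", "completed", "none"],
})

prep = df["test preparation"]

# mask(): replace values WHERE the condition is True.
masked = prep.mask(prep == "none", "not started")

# where(): keep values where the condition is True, replace the others.
kept = prep.where(prep != "none", "not started")

print(masked.equals(kept))
```

Which one to use is mostly a matter of which condition reads more naturally: "replace these" (mask) or "keep these" (where).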
Replace Values Using apply() and Lambda Functions
The apply() function in combination with a lambda function is a flexible method for applying conditional replacements based on more complex logic.
Here, we will replace 'female' with 0 in the gender column using the apply() function and lambda.
Python
import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'female' with 0 using apply and lambda
df['gender'] = df['gender'].apply(lambda x: 0 if x == 'female' else x)

print(df)
Output:
     Name  gender  math score  test preparation
0    John    male          50              none
1     Jay    male         100         completed
2  sachin    male          70              none
3  Geetha       0          80         completed
4  Amutha       0          75         completed
5  ganesh    male          40              none
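When the replacement rule has more than two branches, a named function passed to apply() can be easier to read than a lambda. The sketch below replaces each math score with a band label; the function name score_band, the thresholds, and the labels are all hypothetical choices for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jay", "sachin", "Geetha", "Amutha", "ganesh"],
    "math score": [50, 100, 70, 80, 75, 40],
})


def score_band(score):
    """Map a numeric score to a label (hypothetical thresholds)."""
    if score >= 80:
        return "high"
    if score >= 50:
        return "medium"
    return "low"


# apply() calls score_band once per value in the column.
df["math score"] = df["math score"].apply(score_band)

print(df)
```

Because apply() runs a Python function for every element, it is generally slower than loc[], np.where(), or mask() on large columns, so it is best kept for logic that the vectorized methods cannot express cleanly.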
In this article, we’ve explored four effective methods to replace values in a Pandas DataFrame column based on conditions: using loc[], np.where(), masking, and apply() with a lambda function.
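As a quick sanity check, the four approaches can be run side by side on the same column. This is only a sketch under the assumption that 'female' is replaced with the string 'F' (a value picked for this comparison, not used in the examples above); each method should then yield the same list of values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "gender": ["male", "male", "male", "female", "female", "male"],
})
is_female = df["gender"] == "female"

# 1. loc[] on a copy, so the original DataFrame stays unchanged
by_loc = df.copy()
by_loc.loc[is_female, "gender"] = "F"

# 2. np.where(), keeping the original value where the condition is False
by_np_where = np.where(is_female, "F", df["gender"])

# 3. mask()
by_mask = df["gender"].mask(is_female, "F")

# 4. apply() with a lambda
by_apply = df["gender"].apply(lambda x: "F" if x == "female" else x)

results = [
    by_loc["gender"].tolist(),
    by_np_where.tolist(),
    by_mask.tolist(),
    by_apply.tolist(),
]
print(all(r == results[0] for r in results))
```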