The simplest way to select one or more columns in a pandas DataFrame is bracket notation, where you place the column name inside square brackets. Consider the following example:
Python
import pandas as pd

data = {
    'Name': ['John', 'Alice', 'Bob', 'Eve', 'Charlie'],
    'Age': [25, 30, 22, 35, 28],
    'Gender': ['Male', 'Female', 'Male', 'Female', 'Male'],
    'Salary': [50000, 55000, 40000, 70000, 48000]
}
df = pd.DataFrame(data)

# Select the 'Age' column with bracket notation
age_column = df['Age']
print(age_column)
Output
0    25
1    30
2    22
3    35
4    28
Name: Age, dtype: int64
This returns the single column as a Series. To select multiple columns, pass a list of column names inside the brackets, which gives the double-bracket form:
Python
# Select both 'Age' and 'Salary' columns
subset_columns = df[['Age', 'Salary']]
print(subset_columns)
Output
   Age  Salary
0   25   50000
1   30   55000
2   22   40000
3   35   70000
4   28   48000
This approach lets you select and manipulate multiple columns at once; the result is a DataFrame.
In addition to bracket notation, there are several other ways to select columns in a pandas DataFrame:
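The difference between single and double brackets can be checked directly. The sketch below uses a shortened version of the example data (three rows, an assumption made for brevity) and shows that a single name yields a Series, while a list containing one name yields a one-column DataFrame.

```python
import pandas as pd

# Shortened sample data, for illustration only
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Salary': [50000, 55000, 40000],
})

single = df['Age']       # single brackets: a Series
as_frame = df[['Age']]   # a list with one name: a one-column DataFrame

print(type(single).__name__)
print(type(as_frame).__name__)
print(as_frame.shape)
```

Keeping the list form is useful when later code expects a DataFrame even for a single column.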
1. Selecting Columns with loc
The loc[] indexer selects rows and columns by label. To retrieve specific columns, pass : for the rows (meaning all rows) and a list of column labels.
Python
selected_columns = df.loc[:, ['Name', 'Gender']]
print(selected_columns)
Output
      Name  Gender
0     John    Male
1    Alice  Female
2      Bob    Male
3      Eve  Female
4  Charlie    Male
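loc[] also accepts a label slice instead of a list. As a minimal sketch on a shortened copy of the example data (an assumption for brevity), the slice below selects every column from 'Name' through 'Gender'. Unlike ordinary Python slices, label slices include both endpoints.

```python
import pandas as pd

# Shortened sample data, for illustration only
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male'],
    'Salary': [50000, 55000, 40000],
})

# Label slice: both 'Name' and 'Gender' are included in the result
name_to_gender = df.loc[:, 'Name':'Gender']
print(list(name_to_gender.columns))
```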
2. Selecting Columns Using iloc
The iloc[] indexer selects rows and columns by integer position. This is helpful when you know the positions of the columns rather than their names.
Python
selected_with_iloc = df.iloc[:, [0, 1]]
print(selected_with_iloc)
Output
      Name  Age
0     John   25
1    Alice   30
2      Bob   22
3      Eve   35
4  Charlie   28
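Positional selection also works with slices and negative positions. The sketch below, again on a shortened copy of the example data (an assumption for brevity), shows that iloc[] slices follow the usual Python convention of excluding the end position.

```python
import pandas as pd

# Shortened sample data, for illustration only
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male'],
    'Salary': [50000, 55000, 40000],
})

# Positions 1 and 2 only; position 3 is excluded
middle = df.iloc[:, 1:3]
# A negative position counts from the end; the list keeps the result a DataFrame
last = df.iloc[:, [-1]]

print(list(middle.columns))
print(list(last.columns))
```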
3. Selecting Columns Using filter
The filter() method is useful when you want to select columns based on certain conditions, such as column names that match a specific pattern. This method can be used to select columns with a substring match or regex pattern.
Python
# Select columns whose name contains 'Age'
filtered_columns = df.filter(like='Age')
print(filtered_columns)
Output
   Age
0   25
1   30
2   22
3   35
4   28
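The like argument matches a single substring, so it cannot select 'Age' or 'Salary' in one call. One way to do that is the regex argument; the items argument selects by exact name. This is a minimal sketch on a shortened copy of the example data (an assumption for brevity).

```python
import pandas as pd

# Shortened sample data, for illustration only
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male'],
    'Salary': [50000, 55000, 40000],
})

# Regex alternation: columns whose name contains 'Age' or 'Salary'
age_or_salary = df.filter(regex='Age|Salary')
# Exact names via items
by_items = df.filter(items=['Name', 'Salary'])

print(list(age_or_salary.columns))
print(list(by_items.columns))
```

Note that like, regex, and items are mutually exclusive, so only one of them can be passed per call.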
4. Selecting Columns by Data Type
If you want to select columns based on their data types (e.g., selecting only numeric columns), use the select_dtypes() method.
Python
numeric_columns = df.select_dtypes(include=['number'])
print(numeric_columns)
Output
   Age  Salary
0   25   50000
1   30   55000
2   22   40000
3   35   70000
4   28   48000
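select_dtypes() also takes an exclude argument, which is handy for getting everything except the numeric columns. The sketch below uses a shortened copy of the example data (an assumption for brevity).

```python
import pandas as pd

# Shortened sample data, for illustration only
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male'],
    'Salary': [50000, 55000, 40000],
})

# Drop every numeric column and keep the rest
non_numeric = df.select_dtypes(exclude=['number'])
print(list(non_numeric.columns))
```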
Here are some key takeaways:
- Use bracket notation (df['column_name']) for selecting a single column.
- Use double square brackets (df[['column1', 'column2']]) for selecting multiple columns.
- Explore loc[], iloc[], filter(), and select_dtypes() for more advanced selection techniques based on labels, positions, or conditions.