[파이썬] pandas 여러 조건에 따른 그룹화

Pandas is a powerful library in Python that provides various functionalities for data manipulation and analysis. One of its core features is the ability to group data based on certain conditions. In this blog post, we will explore how to use Pandas to group data based on multiple conditions.

The Dataset

Let’s assume we have a dataset containing information about students in a school. The dataset consists of the following columns:

Our goal is to group the data based on multiple conditions such as grade and gender, and then perform some operations on the grouped data.

Grouping Data with Multiple Conditions

To group the data based on multiple conditions, we can use the groupby() function in Pandas. This function allows us to group data based on one or more columns.

Here’s an example code snippet that demonstrates how to group the data based on grade and gender:

import pandas as pd

# Read the dataset into a pandas DataFrame
df = pd.read_csv('students.csv')

# Group the data based on grade and gender
grouped_data = df.groupby(['Grade', 'Gender'])

# Perform some operations on the grouped data
for (grade, gender), data in grouped_data:
    # Calculate the average score for each group
    average_score = data['Score'].mean()
    
    # Print the average score for each group
    print(f"Grade {grade}, {gender}: {average_score}")

In the above code, we first import the pandas library and read the dataset into a DataFrame using the read_csv() function. We then use the groupby() function to group the data based on the ‘Grade’ and ‘Gender’ columns. The resulting grouped_data object is a DataFrameGroupBy object.

We can then iterate over the groups using a for loop and perform some operations on each group. In this example, we calculate the average score for each group using the mean() function.

Conclusion

Grouping data based on multiple conditions is a powerful feature of Pandas that allows us to perform complex operations on subsets of our dataset. By using the groupby() function, we can easily group our data and apply various operations such as aggregations, filtering, and transformations. This capability makes Pandas a valuable tool for data analysis and manipulation tasks.

Remember to use this feature wisely and adjust it to your specific use case. Happy coding with Pandas!