[파이썬] pandas 데이터 슬라이딩

Pandas is a powerful data manipulation and analysis library in Python. It provides various methods to access, filter, and manipulate data. One common operation is data slicing or data sliding, which allows us to extract a subset of data based on specific conditions or criteria.

In this blog post, we will explore the different techniques for data sliding in pandas using Python.

DataFrame Slicing

DataFrame slicing allows us to extract rows or columns from a pandas DataFrame based on specific conditions or criteria. Here are some examples:

Selecting Rows with Condition

To select rows based on a specific condition, we can use boolean indexing. For example, let’s say we have a DataFrame called data with a column called ‘age’ and we want to select all rows where the age is greater than 30.

import pandas as pd

data = pd.DataFrame({'name': ['John', 'Alice', 'Bob'],
                     'age': [32, 28, 45]})

selected_data = data[data['age'] > 30]
print(selected_data)

Output:

  name  age
0 John   32
2  Bob   45

Selecting Columns

To select specific columns from a DataFrame, we can use indexing. For example, let’s say we have a DataFrame called data with columns ‘name’, ‘age’, and ‘salary’ and we want to select only the ‘name’ and ‘age’ columns.

import pandas as pd

data = pd.DataFrame({'name': ['John', 'Alice', 'Bob'],
                     'age': [32, 28, 45],
                     'salary': [50000, 60000, 70000]})

selected_data = data[['name', 'age']]
print(selected_data)

Output:

   name  age
0  John   32
1 Alice   28
2   Bob   45

Series Slicing

Similar to DataFrame slicing, we can also apply slicing on a pandas Series. Here are some examples:

Selecting Elements with Condition

To select elements from a Series based on a specific condition, we can use boolean indexing. For example, let’s say we have a Series called ages with ages of a group of people, and we want to select only the ages greater than 30.

import pandas as pd

ages = pd.Series([32, 28, 45, 40, 35])

selected_ages = ages[ages > 30]
print(selected_ages)

Output:

0    32
2    45
3    40
4    35
dtype: int64

Selecting Elements by Index

We can select specific elements from a Series based on their index using slicing. For example, let’s say we have a Series called grades with student grades, and we want to select grades for the first three students.

import pandas as pd

grades = pd.Series([90, 85, 92, 88, 95])

selected_grades = grades[:3]
print(selected_grades)

Output:

0    90
1    85
2    92
dtype: int64

Conclusion

Pandas provides powerful slicing capabilities to extract subsets of data based on specific conditions or criteria. In this blog post, we explored how to use DataFrame and Series slicing in pandas using Python.

By leveraging these techniques, you can easily manipulate and analyze large datasets, perform complex calculations, and extract valuable insights from your data.