[파이썬] pandas 빈도 및 오프셋 별칭

Pandas is a powerful and popular library in Python for data manipulation and analysis. In this blog post, we will explore how to use pandas to calculate frequencies and use offset aliases for date and time manipulation.

Calculating Frequencies with pandas

Pandas provides various methods to calculate frequency counts of values in a pandas DataFrame or Series. The value_counts() method is commonly used to perform this operation. Let’s see an example:

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['John', 'Jane', 'Jane', 'John', 'John', 'Jane'],
        'Age': [28, 32, 32, 28, 28, 32]}
df = pd.DataFrame(data)

# Calculating frequency counts of 'Name' column
name_counts = df['Name'].value_counts()

print(name_counts)

Output:

John    3
Jane    3
Name: Name, dtype: int64

In the above code, we have a DataFrame with ‘Name’ and ‘Age’ columns. Using the value_counts() method, we calculated the frequency counts of the ‘Name’ column.

Using Offset Aliases for Date and Time Manipulation

Pandas provides a set of time offset aliases that can be used to perform convenient date and time manipulations. These aliases can be used with the pd.DateOffset class to perform operations like shifting dates, adding or subtracting time intervals, etc. Let’s see an example:

import pandas as pd

# Creating a sample DataFrame with dates
data = {'Date': ['2021-01-01', '2021-01-02', '2021-01-03'],
        'Value': [10, 15, 12]}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])

# Shifting dates by one day using offset alias
df['Shifted_Date'] = df['Date'] + pd.DateOffset(days=1)

print(df)

Output:

        Date  Value Shifted_Date
0 2021-01-01     10   2021-01-02
1 2021-01-02     15   2021-01-03
2 2021-01-03     12   2021-01-04

In the above code, we created a DataFrame with ‘Date’ and ‘Value’ columns. By using the pd.DateOffset class with the offset alias days=1, we shifted the dates in the ‘Date’ column by one day and stored the result in the ‘Shifted_Date’ column.

Pandas provides a wide range of offset aliases, such as ‘D’ for days, ‘H’ for hours, ‘M’ for months, ‘Y’ for years, and many more. These aliases make it easy to perform date and time calculations without manually specifying the exact time intervals.

In conclusion, pandas provides convenient methods for calculating frequencies and offers offset aliases for date and time manipulation. These functionalities make pandas a powerful tool for data analysis and manipulation in Python.

I hope this blog post has provided you with a good understanding of pandas frequency calculations and offset aliases for date and time manipulation. Happy coding with pandas!