[파이썬] pandas 퍼센트 변화 계산

Pandas is a powerful data manipulation library in Python that provides various functionalities to analyze and transform data. One common task in data analysis is calculating the percentage change between values. In this blog post, we will explore how to use Pandas to calculate percentage changes in Python.

Before we start, make sure you have Pandas installed. You can install Pandas using the following command:

pip install pandas

Now, let’s dive into the code!

Importing the required libraries

To get started, we need to import the necessary libraries. In this case, we will import Pandas using the standard import statement:

import pandas as pd

Creating a sample DataFrame

Let’s create a sample DataFrame to illustrate the percentage change calculation. We will use the DataFrame constructor provided by Pandas:

data = {'A': [10, 20, 30, 40, 50],
        'B': [15, 25, 35, 45, 55]}
df = pd.DataFrame(data)

Our DataFrame df looks like this:

    A   B
0  10  15
1  20  25
2  30  35
3  40  45
4  50  55

Calculating the percentage change

Now that we have our DataFrame set up, let’s calculate the percentage change. Pandas provides the pct_change() method, which calculates the percentage change between successive row values. Here’s how you can use it:

df_pct_change = df.pct_change()

The resulting DataFrame df_pct_change will contain the percentage change values:

          A         B
0       NaN       NaN
1  1.000000  0.666667
2  0.500000  0.400000
3  0.333333  0.285714
4  0.250000  0.222222

Handling missing values

You might have noticed that the first row of the df_pct_change DataFrame contains NaN values. This is because there’s no previous row to calculate the percentage change. You can handle missing values by using the fillna() method:

df_pct_change = df_pct_change.fillna(0)

Now, the modified df_pct_change DataFrame will look like this:

          A         B
0  0.000000  0.000000
1  1.000000  0.666667
2  0.500000  0.400000
3  0.333333  0.285714
4  0.250000  0.222222

Conclusion

In this blog post, we learned how to calculate the percentage change in Pandas using the pct_change() method. We also explored how to handle missing values with the fillna() method. This functionality is useful for analyzing time series data or comparing the performance of different variables.

Pandas makes it easy to perform complex calculations and transformations on data, making it a popular choice among data analysts and scientists. Keep exploring the documentation and experiment with different functions to unleash the full power of Pandas in your data analysis projects.

Happy coding!