[파이썬] seaborn 시계열 데이터 시각화

Seaborn is a popular Python data visualization library that provides a high-level interface for creating beautiful and informative statistical graphics. It is built on top of Matplotlib and integrates seamlessly with Pandas dataframes.

In this blog post, we will explore how to use Seaborn to visualize time series data. Time series data is a sequence of data points collected at regular time intervals. It is commonly encountered in various domains such as finance, economics, weather forecasting, and more.

Getting Started

Before we begin, make sure you have Seaborn installed. You can install it using pip:

pip install seaborn

Next, let’s import the necessary libraries:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

Loading Time Series Data

To demonstrate time series visualization, let’s use the famous “Air Passengers” dataset. It records the number of international airline passengers for each month from 1949 to 1960.

# Load the Air Passengers dataset
df = pd.read_csv('air_passengers.csv', parse_dates=['Month'], index_col='Month')

Line Plot

One of the most common ways to visualize time series data is through a line plot. It shows how the variable (in our case, the number of airline passengers) changes over time.

# Line plot
sns.lineplot(data=df)
plt.ylabel('Number of Passengers')
plt.title('Air Passengers Over Time')
plt.show()

Line Plot

Area Plot

An area plot is similar to a line plot, but with the area under the line filled. It can be used to represent stacked quantitative data over time.

# Area plot
sns.lineplot(data=df, ci=None)
plt.fill_between(df.index, df['Passengers'], alpha=0.4)
plt.ylabel('Number of Passengers')
plt.title('Air Passengers Over Time')
plt.show()

Area Plot

Bar Plot

A bar plot can be used to compare categorical data over time. It shows the value of a variable for different categories at each time point.

# Bar plot
sns.barplot(data=df, x=df.index.year, y='Passengers', ci=None)
plt.xlabel('Year')
plt.ylabel('Number of Passengers')
plt.title('Air Passengers by Year')
plt.show()

Bar Plot

Heatmap

A heatmap is useful for visualizing the correlation between different variables over time. It uses color intensity to represent the strength of the relationship.

# Compute correlation matrix
corr = df.corr()

# Heatmap
sns.heatmap(data=corr, annot=True, cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()

Heatmap

Conclusion

Seaborn provides a wide range of options for visualizing time series data in Python. In this blog post, we explored some of the commonly used plots such as line plots, area plots, bar plots, and heatmaps. These visualizations can help uncover patterns, trends, and relationships in time series data, enabling us to gain valuable insights.

Remember to explore the Seaborn documentation for more advanced techniques and customizations. Happy plotting!