[Python] Visualizing Categorical Data with Seaborn

Seaborn is a Python data visualization library built on Matplotlib that provides a high-level interface for drawing statistical graphics. One of its strengths is how easily it handles categorical data, that is, variables that take a limited set of discrete values such as the day of the week. In this blog post, we will explore some of the commonly used Seaborn functions for categorical data visualization.

Bar Plots

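A bar plot is one of the most common ways to visualize categorical data. It shows the relationship between a categorical variable and a numerical variable. Seaborn provides the barplot() function to create bar plots. By default, the height of each bar is the mean of the numerical variable within that category, and the error bar on top shows a 95% confidence interval around that mean.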

import seaborn as sns

# Load a dataset
tips = sns.load_dataset('tips')

# Create a basic bar plot
sns.barplot(x='day', y='total_bill', data=tips)

In the example above, we load the 'tips' example dataset with load_dataset(). We then use the barplot() function to create a bar plot, where the x-axis represents the days of the week (a categorical variable) and the height of each bar is the mean total bill for that day (a summary of a numerical variable).
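To make the "bar height is the mean" point concrete, here is a minimal sketch that compares the bars drawn by barplot() with a plain pandas groupby. The small DataFrame is a hypothetical stand-in for 'tips' with made-up values, so that the example runs without downloading anything; the names df and day_order are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the 'tips' dataset (values are made up)
df = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat"],
    "total_bill": [10.0, 20.0, 30.0, 50.0, 15.0, 25.0],
})
day_order = ["Thur", "Fri", "Sat"]

fig, ax = plt.subplots()
sns.barplot(x="day", y="total_bill", data=df, order=day_order, ax=ax)

# Height of each bar as drawn on the axes, left to right
bar_heights = [patch.get_height() for patch in ax.patches]

# The same summary computed directly with pandas
group_means = df.groupby("day")["total_bill"].mean().reindex(day_order).tolist()

print(bar_heights)
print(group_means)
```

If a different summary is more appropriate, barplot() accepts an estimator argument, for example estimator="median" in recent Seaborn releases.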

Categorical Scatter Plots

Categorical scatter plots are useful when we want to see every individual observation of a numerical variable, grouped by a categorical variable. Seaborn provides the stripplot() function to create categorical scatter plots.

# Create a categorical scatter plot
sns.stripplot(x='day', y='total_bill', data=tips)

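The code above creates a scatter plot where the x-axis represents the days of the week and the y-axis represents the total bill amount. Each point in the scatter plot represents one row of the dataset. By default, stripplot() adds a small random horizontal offset (jitter) within each category so that points with similar values overlap less.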
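As a rough illustration of the "one point per row" idea, the sketch below draws the same data twice, once with jitter and once without, and counts the markers on the axes. The DataFrame is again a hypothetical stand-in for 'tips' with made-up values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the 'tips' dataset (values are made up)
df = pd.DataFrame({
    "day": ["Thur", "Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sat"],
    "total_bill": [10.0, 10.5, 11.0, 30.0, 31.0, 15.0, 15.2, 25.0],
})

fig, (ax_jitter, ax_plain) = plt.subplots(1, 2, sharey=True)

# Left: points are spread horizontally at random within each day
sns.stripplot(x="day", y="total_bill", data=df, jitter=True, ax=ax_jitter)
# Right: no jitter, so points for a day sit on a single vertical line
sns.stripplot(x="day", y="total_bill", data=df, jitter=False, ax=ax_plain)

# Count the markers drawn on each axes
n_jitter = sum(len(c.get_offsets()) for c in ax_jitter.collections)
n_plain = sum(len(c.get_offsets()) for c in ax_plain.collections)
print(n_jitter, n_plain, len(df))
```

When jitter alone is not enough to separate dense clusters, swarmplot() takes the same x, y, and data arguments and positions the points so that they do not overlap.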

Count Plots

Count plots show how many observations fall into each category of a single categorical variable. Seaborn provides the countplot() function to create count plots.

# Create a count plot
sns.countplot(x='day', data=tips)

The code above creates a count plot where the x-axis represents the days of the week, and the y-axis represents the count of occurrences of each day in the dataset.
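A count plot is essentially a bar chart of value_counts(). The following sketch checks that on a hypothetical stand-in for 'tips' (made-up rows; df and day_order are illustrative names).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the 'tips' dataset: one row per bill
df = pd.DataFrame({"day": ["Thur", "Fri", "Fri", "Sat", "Sat", "Sat"]})
day_order = ["Thur", "Fri", "Sat"]

fig, ax = plt.subplots()
sns.countplot(x="day", data=df, order=day_order, ax=ax)

# Bar heights as drawn, left to right
bar_counts = [int(patch.get_height()) for patch in ax.patches]

# The same counts computed directly with pandas
expected_counts = df["day"].value_counts().reindex(day_order).tolist()

print(bar_counts)
print(expected_counts)
```

Passing order explicitly keeps the bars in a meaningful sequence instead of the order in which the categories happen to appear in the data.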

Box Plots

Box plots are used to visualize the distribution and statistical summary of numerical variables for different categories. Seaborn provides the boxplot() function to create box plots.

# Create a box plot
sns.boxplot(x='day', y='total_bill', data=tips)

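In the code snippet above, we create a box plot where the x-axis represents the days of the week and the y-axis represents the total bill amount. For each day, the box spans the first to third quartile of the total bill with a line at the median. With the default settings, the whiskers extend to the most extreme values within 1.5 times the interquartile range of the box, and anything beyond that is drawn as an individual outlier point.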
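To see the numbers behind the boxes, the quartiles can be computed per category with pandas. This is a sketch on a hypothetical stand-in for 'tips' with made-up values. It assumes that the plot uses linearly interpolated percentiles, which is also the pandas default for quantile(), so the table should correspond to the box edges and median lines.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the 'tips' dataset (values are made up)
df = pd.DataFrame({
    "day": ["Thur"] * 5 + ["Sat"] * 5,
    "total_bill": [10.0, 12.0, 14.0, 16.0, 18.0, 10.0, 20.0, 30.0, 40.0, 50.0],
})
day_order = ["Thur", "Sat"]

fig, ax = plt.subplots()
sns.boxplot(x="day", y="total_bill", data=df, order=day_order, ax=ax)

# Quartiles per day: one row per day, one column per quantile
summary = (
    df.groupby("day")["total_bill"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
    .reindex(day_order)
)
summary.columns = ["q1", "median", "q3"]
summary["iqr"] = summary["q3"] - summary["q1"]  # height of each box

print(summary)
```

A taller box (larger iqr) means the middle half of the bills for that day is more spread out.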

Conclusion

Seaborn provides a variety of functions to visualize categorical data effectively. In this blog post, we covered four of them: barplot() for per-category means, stripplot() for individual observations, countplot() for category frequencies, and boxplot() for distribution summaries. The Seaborn documentation covers further customization options and more advanced techniques for categorical data visualization.