[Python] Seaborn Box Plot

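In data analysis and visualization, box plots provide a concise summary of the distribution of a dataset. The box spans the first to the third quartile with a line at the median, and the whiskers by default extend to the most extreme values within 1.5 times the interquartile range of the box. Points beyond the whiskers are drawn individually as potential outliers; when there are none, the whiskers end at the minimum and maximum values.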
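To make the summary concrete, here is a minimal sketch that computes the numbers behind a box plot by hand. The sample values are hypothetical, and the 1.5 × IQR rule for the whiskers is assumed because it is the usual default in Matplotlib and Seaborn.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration
values = np.array([2.0, 2.5, 2.8, 3.0, 3.1, 3.2, 3.4, 3.6, 4.4])

# Quartiles and interquartile range (the height of the box)
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Fences of the 1.5 * IQR rule
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme observations inside the fences
inside = values[(values >= lower_fence) & (values <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Anything outside the fences would be drawn as an individual point
outliers = values[(values < lower_fence) | (values > upper_fence)]

print("box:", q1, median, q3)
print("whiskers:", whisker_low, whisker_high)
print("outliers:", outliers)
```

In this sample the largest value falls above the upper fence, so the upper whisker stops at the next largest value rather than at the maximum.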

In this blog post, we will explore how to create box plots using the Python data visualization library, Seaborn.

Installation

Before we begin, let’s make sure that Seaborn is installed. Open your terminal and run the following command:

pip install seaborn

Importing Seaborn

Once Seaborn is installed, we can import it into our Python script or notebook using the import statement:

import seaborn as sns

Loading the Data

For this example, we will use the famous Iris dataset. The load_dataset function downloads it from Seaborn's online example-data repository (so the first call needs an internet connection) and returns it as a Pandas DataFrame, so no separate Pandas import is needed:

import seaborn as sns

# Load the Iris dataset from Seaborn; the result is a pandas DataFrame
iris = sns.load_dataset("iris")

Creating a Box Plot

To create a box plot using Seaborn, we can use the boxplot function.

import seaborn as sns

# Create a box plot using Seaborn
sns.boxplot(x="species", y="sepal_width", data=iris)

In the above example, we are creating a box plot of the sepal width variable in the Iris dataset, grouped by the species.
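As a self-contained variation that runs without downloading anything, the sketch below builds a small DataFrame with the same two column names and inspects what boxplot returns. The rows are hypothetical placeholders, not real Iris measurements.

```python
import pandas as pd
import seaborn as sns

# Hypothetical rows standing in for the Iris data (not the real measurements)
df = pd.DataFrame({
    "species": ["setosa"] * 4 + ["versicolor"] * 4 + ["virginica"] * 4,
    "sepal_width": [3.5, 3.0, 3.2, 3.6, 3.2, 2.8, 2.3, 3.1, 3.3, 2.7, 3.0, 2.9],
})

# boxplot draws one box per category on the x axis and returns the Matplotlib Axes
ax = sns.boxplot(x="species", y="sepal_width", data=df)

# Seaborn labels the axes with the column names by default
print(ax.get_xlabel(), ax.get_ylabel())
```

Because the return value is an ordinary Matplotlib Axes, it can be adjusted afterwards with any Matplotlib method. In a script, call matplotlib.pyplot.show() to display the figure; notebooks usually render it automatically.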

Customizing the Box Plot

Seaborn provides a variety of options to customize the appearance of box plots. Here are a few examples:

Changing the Color Palette

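You can change the color palette of the box plot using the palette parameter. Seaborn provides several built-in color palettes, such as “deep”, “muted”, “bright”, and “pastel”. Note that Seaborn 0.13 and later emit a deprecation warning when palette is passed without hue; in those versions, also pass hue="species" and legend=False to get the same result.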

import seaborn as sns

# Create a box plot with a custom color palette
sns.boxplot(x="species", y="sepal_width", data=iris, palette="pastel")
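If you want to see which colors a named palette contains before plotting, the following sketch may help. It uses color_palette, the Seaborn function that resolves a palette name into a list of colors; asking for three colors is an assumption made here to match the three Iris species.

```python
import seaborn as sns

# Ask for three colors, one per species in the Iris example
colors = sns.color_palette("pastel", 3)

# Each entry is an (r, g, b) tuple of floats between 0 and 1
for rgb in colors:
    print(rgb)

# The same colors as hex strings, handy for reuse in other tools
hex_codes = colors.as_hex()
print(hex_codes)
```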

Hiding Outliers

By default, box plots draw the data points beyond the whiskers individually as outliers. You can hide these points by setting the showfliers parameter to False. This only changes what is drawn; the underlying data and the quartiles are unaffected.

import seaborn as sns

# Create a box plot without drawing the outlier points
sns.boxplot(x="species", y="sepal_width", data=iris, showfliers=False)
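Since showfliers only affects the drawing, you may still want to know which rows the plot would have marked. The sketch below flags them per group with the 1.5 × IQR rule, assumed here because it is the default whisker setting. The DataFrame and the is_outlier helper are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical measurements standing in for the Iris columns
df = pd.DataFrame({
    "species": ["setosa"] * 6 + ["virginica"] * 6,
    "sepal_width": [3.0, 3.2, 3.4, 3.5, 3.6, 5.0,
                    2.0, 2.8, 2.9, 3.0, 3.1, 3.2],
})


def is_outlier(group: pd.Series) -> pd.Series:
    """Flag values outside the 1.5 * IQR fences of one group."""
    q1, q3 = group.quantile(0.25), group.quantile(0.75)
    iqr = q3 - q1
    return (group < q1 - 1.5 * iqr) | (group > q3 + 1.5 * iqr)


# Apply the rule within each species, then keep only the flagged rows
flags = df.groupby("species")["sepal_width"].transform(is_outlier).astype(bool)
outliers = df[flags]
print(outliers)
```

The original DataFrame is left untouched, so you can decide separately whether to drop, keep, or investigate the flagged rows.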

Adding a Title and Labels

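You can add a title and axis labels with the title, xlabel, and ylabel functions from matplotlib.pyplot, which act on the plot Seaborn has just drawn. These are Matplotlib functions, not parameters of boxplot.

import matplotlib.pyplot as plt
import seaborn as sns

# Create a box plot with a title and labels
sns.boxplot(x="species", y="sepal_width", data=iris)
plt.title("Box Plot of Sepal Width by Species")
plt.xlabel("Species")
plt.ylabel("Sepal Width")
plt.show()  # display the figure when running as a script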
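An alternative is to set the title and labels on the Axes object itself, which can be easier to manage when a figure holds several plots. This is a minimal sketch using a hypothetical stand-in DataFrame so that it runs without a download; the figure size is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical rows standing in for the Iris data (not the real measurements)
df = pd.DataFrame({
    "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 2.8, 2.3, 3.3, 2.7, 3.0],
})

# Create the figure and axes explicitly, then tell Seaborn where to draw
fig, ax = plt.subplots(figsize=(6, 4))
sns.boxplot(x="species", y="sepal_width", data=df, ax=ax)

# Set the title and both axis labels in one call on the Axes
ax.set(
    title="Box Plot of Sepal Width by Species",
    xlabel="Species",
    ylabel="Sepal Width",
)
fig.tight_layout()
```

Passing ax=ax makes it explicit which subplot receives the box plot, so the labels cannot end up on the wrong one.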

Conclusion

Box plots are powerful tools for visualizing the distribution of a dataset. With Seaborn, creating box plots in Python is straightforward and customizable. By utilizing Seaborn’s variety of options, you can create informative and visually appealing box plots for your data analysis tasks.

Please refer to the Seaborn documentation for more details on customizations and advanced usage.

Happy coding with Seaborn box plots!