[파이썬] seaborn 스트립 플롯(Strip plot)

Seaborn is a popular data visualization library in Python that is built on top of Matplotlib. It provides a high-level interface for creating beautiful and informative statistical graphics. One of the useful plots available in Seaborn is the strip plot, which allows us to visualize the distribution of a continuous variable grouped by a categorical variable.

What is a Strip Plot?

A strip plot is a type of scatter plot that displays the relationship between two variables. It is called a “strip” plot because each data point is represented by a horizontal strip or swarm, with one variable along the x-axis and the other variable along the y-axis. This plot is especially useful when you want to visualize the distribution of a continuous variable across different categories.

How to Create a Strip Plot with Seaborn?

To create a strip plot in Python using Seaborn, you first need to import the necessary libraries:

import seaborn as sns
import matplotlib.pyplot as plt

Next, load your data or create a DataFrame. Let’s assume we have a DataFrame df with two columns: category (categorical variable) and value (continuous variable).

import pandas as pd

data = {
    'category': ['A', 'A', 'B', 'B', 'C', 'C'],
    'value': [10, 15, 12, 18, 8, 6]
}

df = pd.DataFrame(data)

Once you have the data, you can create a strip plot using the stripplot function from Seaborn:

sns.stripplot(x='category', y='value', data=df)
plt.show()

This will produce a simple strip plot with category on the x-axis and value on the y-axis. Each data point will be represented by a strip at a particular location on the x-axis, according to its category.

Customizing the Strip Plot

Seaborn provides various options to customize the appearance of the strip plot. You can modify the color, size, orientation, and other aesthetics to make the plot more informative and visually appealing. Here are a few examples:

sns.stripplot(x='category', y='value', data=df, color='blue')
sns.stripplot(x='category', y='value', data=df, jitter=0.2)
sns.stripplot(x='category', y='value', data=df, marker='D')

By experimenting with these options and exploring the Seaborn documentation, you can create custom strip plots that suit your specific needs.

Conclusion

Seaborn’s strip plot is a useful visualization tool to explore the distribution of a continuous variable across different categories. With just a few lines of code, you can create informative and visually appealing strip plots in Python.