[Python] Bokeh parallel coordinate plot

In this blog post, we will explore how to create a parallel coordinate plot using Bokeh in Python. A parallel coordinate plot draws one vertical axis per variable and represents each observation as a line that crosses every axis at that variable’s value. This makes it a compact way to compare many variables of a high-dimensional dataset at once.

Installing Bokeh

Before we begin, make sure you have Bokeh installed, along with NumPy and pandas, which the examples below also use. You can install them using pip:

pip install bokeh numpy pandas

Importing the necessary libraries

To get started, let’s import the necessary libraries:

import numpy as np
import pandas as pd
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, Range1d, LinearAxis

Generating a random dataset

For demonstration purposes, let’s generate a random dataset of 100 observations with 5 variables, each drawn uniformly from the interval [0, 1):

np.random.seed(0)
data = np.random.random(size=(100, 5))
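
The random data above already lies in [0, 1), so every variable can share one vertical scale. Real datasets rarely do, so a common preprocessing step is to rescale each column before plotting. The following is a minimal sketch of min-max scaling with NumPy, not part of the original walkthrough; the helper name min_max_scale and the sample values are hypothetical, and mapping constant columns to 0 is an assumption made here.

```python
import numpy as np

def min_max_scale(values):
    """Rescale each column of a 2-D array to the range [0, 1]."""
    values = np.asarray(values, dtype=float)
    col_min = values.min(axis=0)
    col_range = values.max(axis=0) - col_min
    # Assumption: a constant column has zero range, so divide by 1 and map it to 0.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (values - col_min) / safe_range

# Hypothetical sample: three observations with columns on very different scales
sample = np.array([[1.0, 100.0, 5.0],
                   [2.0, 300.0, 5.0],
                   [3.0, 200.0, 5.0]])
scaled = min_max_scale(sample)
```

After scaling, each column spans 0 to 1, so the same axis range used for the random data would also fit the rescaled values.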

Creating the parallel coordinate plot

To create the parallel coordinate plot, we place one vertical axis per variable at evenly spaced x positions, set the ranges of the axes, and draw each row of the dataset as a line across those axes. We will use the ColumnDataSource class provided by Bokeh to store our data.

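# Create a DataFrame from the data
df = pd.DataFrame(data, columns=['Var1', 'Var2', 'Var3', 'Var4', 'Var5'])
n_vars = len(df.columns)

# Create a ColumnDataSource holding the original columns (used by tooltips)
# plus one polyline per row: xs are the axis positions 0..n_vars-1,
# ys are the row's values on those axes
source = ColumnDataSource(data={
    **df.to_dict(orient='list'),
    'xs': [list(range(n_vars)) for _ in range(len(df))],
    'ys': df.values.tolist(),
})

# Create a figure object
p = figure(width=800, height=400)

# Define the ranges: one x position per variable, values between 0 and 1
p.x_range = Range1d(-0.5, n_vars - 0.5)
p.y_range = Range1d(0, 1)

# Label each x position with its variable name
p.xaxis.ticker = list(range(n_vars))
p.xaxis.major_label_overrides = {i: var for i, var in enumerate(df.columns)}

# Hide the default y-axis first, then add one vertical axis per variable
p.yaxis.visible = False
for i in range(n_vars):
    p.add_layout(LinearAxis(fixed_location=i, bounds=(0, 1)), 'right')

# Plot one line per observation, crossing every axis
p.multi_line(xs='xs', ys='ys', source=source, line_color='blue', line_alpha=0.3)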
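
The core of a parallel coordinate plot is a reshaping step: every row of the table becomes one polyline whose x values are the axis positions and whose y values are the row’s entries. To make that concrete on a tiny input, here is a minimal sketch; the helper name to_polylines and the two-row sample are hypothetical and only illustrate the idea.

```python
import numpy as np

def to_polylines(values):
    """Return (xs, ys): per row, the list of axis positions and the list of values."""
    values = np.asarray(values, dtype=float)
    n_rows, n_vars = values.shape
    # Every row uses the same axis positions 0, 1, ..., n_vars - 1
    xs = [list(range(n_vars)) for _ in range(n_rows)]
    # Each row's values become the heights at which its line crosses the axes
    ys = values.tolist()
    return xs, ys

# Hypothetical sample: two observations of three variables
tiny = [[0.1, 0.9, 0.5],
        [0.7, 0.2, 0.4]]
xs, ys = to_polylines(tiny)
```

Lists shaped like this can be stored as two columns of a ColumnDataSource and drawn with Bokeh’s multi_line glyph, which takes one list of x values and one list of y values per line.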

Adding interactivity

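To make the plot more interactive, we can add a hover tool whose tooltips display the values of each variable when the user hovers over a line. We can also add a box select tool for brushing subsets of the data, although support for box selection on line glyphs varies between Bokeh versions, so check that it behaves as expected in yours.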

from bokeh.models import HoverTool, BoxSelectTool

# Add tooltips to display additional information
hover = HoverTool(tooltips=[(var, f'@{var}') for var in df.columns])

# Add hover and brushing tools
p.add_tools(hover, BoxSelectTool())

# Show the plot
show(p)
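
The tooltips above show each value at full floating-point precision. Bokeh tooltip fields accept an optional number format in a second pair of braces, for example @{Var1}{0.00}. A small sketch of building such a list follows; the helper name make_tooltips and its fmt default are hypothetical choices for illustration.

```python
def make_tooltips(columns, fmt="0.00"):
    """Build (label, field) pairs for a Bokeh HoverTool, one per column."""
    # Braces around the column name keep names with spaces working, e.g. @{Var 1}.
    # The second pair of braces carries the number format.
    return [(col, f"@{{{col}}}{{{fmt}}}") for col in columns]

tooltips = make_tooltips(["Var1", "Var2"])
```

The resulting list can be passed as HoverTool(tooltips=make_tooltips(df.columns)) in place of the list comprehension used earlier.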

Conclusion

In this blog post, we learned how to create a parallel coordinate plot using Bokeh in Python. This visualization technique is particularly useful for exploring high-dimensional datasets, allowing us to compare multiple variables simultaneously. With Bokeh’s interactivity features, we can enhance the plot by adding tooltips, hover effects, and brushing for data selection.