[Python] Using LightGBM's Interactive Visualization Tools

LightGBM is a popular gradient boosting framework that is widely used for solving machine learning problems. It is known for its speed, memory efficiency, and scalability, and one of its key strengths is its ability to handle large, high-dimensional datasets.

In addition to its powerful modeling capabilities, LightGBM provides tools for visualizing and understanding trained models. These visualization tools help you gain insight into a model's behavior, feature importance, and decision-making process.

In this blog post, we will explore how to utilize the interactive visualization tools provided by LightGBM in Python. We will demonstrate the usage of two main tools: plot_importance and plot_tree.

Plotting Feature Importance

Feature importance analysis helps you understand how much each feature contributes to a model's predictions. LightGBM provides a built-in function, plot_importance, for visualizing these scores.

import lightgbm as lgb
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Create a small synthetic dataset so the example is self-contained
X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Train the LightGBM model
model = lgb.LGBMClassifier()
model.fit(X_train, y_train)

# Plot the top 10 features by importance
lgb.plot_importance(model, max_num_features=10)
plt.show()

In the above code, we first build a small synthetic dataset so the example runs as-is, then train a LightGBM classifier using the LGBMClassifier class. We pass the trained model to the plot_importance function along with the max_num_features parameter, which limits the plot to the top features. Finally, we use plt.show() to display the feature importance plot.

This plot shows the importance score of each feature in descending order. By default, a feature's score is the number of times it is used to split the data across all trees, so higher scores indicate features the model relies on more heavily.
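If you would rather rank features by how much they reduce the training loss, plot_importance also accepts an importance_type argument. The snippet below is a minimal sketch that reuses the model trained above.

# Rank features by total gain instead of split count
lgb.plot_importance(model, max_num_features=10, importance_type='gain')
plt.show()

# The raw scores are also available programmatically
print(model.booster_.feature_importance(importance_type='gain'))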

Plotting Decision Trees

Another useful tool provided by LightGBM is the ability to plot individual decision trees with the plot_tree function. Note that plot_tree depends on the graphviz package under the hood, so graphviz must be installed for the plot to render.

# Plot the first tree of the model; a larger figure keeps the nodes legible
lgb.plot_tree(model, tree_index=0, figsize=(15, 8))
plt.show()

In the above code, we use the plot_tree function to plot the first tree of the trained model. The tree_index parameter specifies the index of the tree to visualize, and the optional figsize argument enlarges the figure so the node labels stay readable. Again, we use plt.show() to display the tree plot.

This plot shows the structure of the decision tree, including the split condition at each internal node and the values at the leaves. It can be very helpful in understanding how the model makes decisions based on the input features.
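Since the goal here is interactive visualization, it is also worth knowing about create_tree_digraph, which returns a Graphviz object that can be displayed inline in a Jupyter notebook or saved to a file. The snippet below is a minimal sketch and, like plot_tree, assumes the graphviz package (and its system binaries) is installed.

# Build a Graphviz digraph of the first tree
graph = lgb.create_tree_digraph(model, tree_index=0)

# In a Jupyter notebook, evaluating `graph` displays it inline;
# elsewhere, render it to an image file instead
graph.render('tree_0', format='png', cleanup=True)  # writes tree_0.png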

Conclusion

LightGBM provides powerful visualization tools that enable us to gain insights into the trained models. The plot_importance function allows us to visualize feature importance, while the plot_tree function helps us understand the decision-making process of individual trees.

By utilizing these visualization tools, we can better understand the behavior of our LightGBM models and make more informed decisions when it comes to feature selection and model interpretation.

Give these tools a try in your next LightGBM project and let us know how they help you in the comments below!