[Python] Ljung-Box Test with statsmodels

In statistical modeling and econometrics, the Ljung-Box test is used to check whether a time series is autocorrelated. Autocorrelation is the correlation between a series and lagged copies of itself, that is, between each observation and the observations that precede it. The Ljung-Box test is a joint test: its null hypothesis is that the autocorrelations at all lags up to a chosen maximum are zero, and it rejects that hypothesis when the sample autocorrelations are jointly too large to be explained by chance.
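For reference, the statistic behind the test is usually written as below. The symbols are notation introduced here rather than taken from the article: n is the number of observations, h is the largest lag tested, and \hat{\rho}_k is the sample autocorrelation at lag k.

```latex
Q(h) = n\,(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^{2}}{n-k},
\qquad
H_0:\ \rho_1 = \rho_2 = \dots = \rho_h = 0,
\qquad
Q(h) \ \overset{\text{approx.}}{\sim}\ \chi^2_h \ \text{under } H_0 .
```

Large values of Q(h) correspond to small p-values and therefore to evidence against the null hypothesis of no autocorrelation.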

The statsmodels library in Python provides a convenient way to perform the Ljung-Box test. Let’s explore how to use it.

Installing statsmodels

If you haven’t installed statsmodels yet, you can do so using the following command:

pip install statsmodels
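The return type of some statsmodels functions, acorr_ljungbox among them, has changed between releases, so it can help to confirm which version is installed. A minimal check:

```python
import statsmodels

# Print the installed statsmodels version string.
print(statsmodels.__version__)
```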

Performing the Ljung-Box test

To perform the Ljung-Box test in Python, we first import the function that implements it:

# acorr_ljungbox implements the Ljung-Box test
from statsmodels.stats.diagnostic import acorr_ljungbox

Next, we need to create a time series dataset or load an existing one. For this example, let’s assume we have a short time series stored in a Python list called data (a NumPy array or pandas Series works as well):

# Generate or load the time series data
data = [4, 5, 8, 3, 9, 6, 5, 2, 7, 8, 9, 4, 5, 8, 3, 9, 6, 5, 2, 7]
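If no real dataset is at hand, synthetic series make it easier to see how the test behaves. The sketch below is only an illustration: the seed, the series length n, and the AR(1) coefficient phi are assumed values, not part of the original example. It builds one white-noise series and one autocorrelated AR(1) series, either of which could be used in place of data.

```python
import numpy as np

# Fixed seed so the illustration is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
n = 500  # assumed series length

# White noise: independent draws, so no autocorrelation is expected.
white_noise = rng.normal(size=n)

# AR(1): each value depends on the previous one.
# phi = 0.8 is an assumed coefficient chosen to give clear autocorrelation.
phi = 0.8
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = phi * ar1[t - 1] + white_noise[t]
```

Running the Ljung-Box test on ar1 should produce very small p-values, while white_noise should typically produce large ones.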

We can now calculate the Ljung-Box test statistic and its associated p-value using the acorr_ljungbox function:

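# Perform the Ljung-Box test for lags 1 through 5
result = acorr_ljungbox(data, lags=5)

print("Ljung-Box test statistics:", result["lb_stat"].to_numpy())
print("p-values:", result["lb_pvalue"].to_numpy())

In this example, we specify lags=5, which means we test for autocorrelation up to a lag of 5. In recent statsmodels releases, acorr_ljungbox returns a pandas DataFrame indexed by lag, with the test statistics in the lb_stat column and the corresponding p-values in the lb_pvalue column. Older releases returned a tuple of two arrays instead, so code written as lb_stat, p_value = acorr_ljungbox(data, lags=5) needs updating on current versions.

The code prints two arrays of five values each, one entry per lag from 1 to 5. Because the statistic at lag h accumulates the squared autocorrelations of lags 1 through h, the statistics never decrease as the lag grows.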

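Under the null hypothesis of no autocorrelation, the Ljung-Box statistic at lag h approximately follows a chi-squared distribution with h degrees of freedom, and each p-value is computed from that distribution. Lower p-values indicate stronger evidence of autocorrelation.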

Interpreting the results

To interpret the results of the Ljung-Box test, we compare each p-value with a chosen significance level (e.g., 0.05). If the p-value at lag h is below that level, we reject the null hypothesis and conclude that at least one of the autocorrelations at lags 1 through h is non-zero. In our example, none of the p-values fall below 0.05, so we fail to reject the null hypothesis: the test finds no significant autocorrelation up to lag 5. Failing to reject does not prove the series is uncorrelated, particularly with only 20 observations.

Conclusion

Performing the Ljung-Box test in Python using statsmodels is a straightforward process. By examining the autocorrelation patterns in a time series, we can gain insights into its characteristics and make informed decisions in statistical modeling and forecasting.