[파이썬] fastai TTA (Test Time Augmentation)

In the field of computer vision, Test Time Augmentation (TTA) is a technique used to improve the accuracy of models by generating multiple augmented versions of test images and making predictions on each of them. TTA is especially effective when the model is trained on augmented data.

The fastai library provides a simple and convenient way to implement TTA in Python. In this blog post, we will explore how to use TTA with fastai to enhance the performance of our computer vision models.

What is Test Time Augmentation (TTA)?

TTA is a strategy that aims to improve the accuracy and robustness of machine learning models during the inference phase. It involves applying different image transformations to the test images and making predictions on each transformed image. By averaging or taking the maximum/minimum of the predictions, we can obtain a final prediction that is more robust and less sensitive to specific image variations.

The idea behind TTA is that by generating multiple versions of the same image, we can simulate the model’s behavior on a larger and more diverse test set. This can help compensate for variations in lighting conditions, viewpoints, and other factors that may affect the performance of the model.

Using TTA with fastai

The fastai library provides a built-in tta() method that enables us to perform TTA on our trained models. This method automates the process of generating augmented versions of test images and making predictions on each of them.

To apply TTA with fastai, we follow these steps:

  1. Load the trained model.
  2. Load the test dataset.
  3. Perform TTA on the test dataset using tta().
  4. Get the predictions on the augmented test dataset.
  5. Combine the predictions using an aggregation function (e.g., average, maximum, minimum).
  6. Obtain the final predictions.

Here’s an example code snippet that demonstrates how to use TTA with fastai:

from fastai.vision.all import *

# Load the trained model
learn = load_learner("path/to/model.pkl")

# Load the test dataset
test_data = ImageDataLoaders.from_folder("path/to/test/data")

# Perform TTA on the test dataset
tta_predictions = learn.tta(dl=test_data.valid)

# Combine the predictions using average
final_predictions = tta_predictions[0].mean(dim=0)

# Get the class labels
class_labels = learn.dls.vocab

# Print the final predictions
for idx, pred in enumerate(final_predictions):
    print(f"Image {idx}: {class_labels[pred.argmax()]}")

In this example, we first load the trained model using load_learner(). We then load the test dataset using ImageDataLoaders.from_folder(). Next, we apply TTA on the test dataset by calling learn.tta() and passing the validation dataloader.

The tta() method returns a list of augmented predictions, where each prediction corresponds to a different transformation of the test image. In this example, we use an average aggregation function (mean(dim=0)) to get the final predictions.

Finally, we obtain the class labels from the learn.dls.vocab attribute and print the final predictions for each image in the test dataset.

Conclusion

TTA is a powerful technique for improving the accuracy and robustness of computer vision models during the inference phase. By leveraging fastai’s built-in tta() method, we can easily apply TTA to our trained models and obtain more reliable predictions.

Using TTA with fastai can be a valuable addition to our computer vision workflow, enabling us to extract better performance from our models and make more accurate predictions. So, be sure to give TTA a try in your next computer vision project!