[TensorFlow] Attention 레이어 예제

17 Aug 2023

tensorflow

tf.keras.layers.Attention 레이어는 주로 자연어 처리(NLP)에서 사용되며, 입력 시퀀스의 각 단어에 대한 가중치를 계산하여 중요한 정보에 더 많은 관심을 두는 메커니즘을 제공합니다. 아래는 tf.keras.layers.Attention 레이어를 사용한 간단한 예제입니다.

import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, LSTM, Attention, Concatenate, Dense
from tensorflow.keras.models import Model

## 더미 시퀀스 데이터 생성
dummy_sequence = tf.random.normal(shape=(4, 10, 32))  # 4개의 시퀀스, 각 시퀀스 길이: 10, 특성 개수: 32

## 입력 레이어 정의
input_layer = Input(shape=(10, 32))

## LSTM 레이어 정의
lstm_layer = LSTM(64, return_sequences=True)(input_layer)

## Attention 레이어 정의
attention = Attention()([lstm_layer, lstm_layer])

## Concatenate 레이어 정의
concat_layer = Concatenate(axis=-1)([lstm_layer, attention])

## 밀집 레이어 정의
output_layer = Dense(10, activation='softmax')(concat_layer)

## 모델 생성
model = Model(inputs=input_layer, outputs=output_layer)

## 모델 컴파일
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

## 모델 요약
model.summary()

## 모델 훈련
model.fit(dummy_sequence, [0, 1, 2, 3], epochs=5)` 

위 예제에서는 입력 레이어와 LSTM 레이어를 정의하고, 그 뒤에 Attention 레이어를 사용하여 입력 시퀀스에 대한 가중치를 계산합니다. Attention 레이어의 출력은 원래 시퀀스와 같은 크기의 시퀀스입니다. 이후 Concatenate 레이어를 사용하여 LSTM 레이어의 출력과 Attention 레이어의 출력을 연결합니다. 마지막으로 Dense 레이어를 추가하여 출력을 생성합니다.

tf.keras.layers.Attention 레이어는 입력 시퀀스의 단어 간의 관계를 감지하는 데 사용될 수 있으며, 자연어 처리 작업에서 중요한 역할을 합니다.