
What's the meaning of "Shifted right" in Transformer?

https://datascience.stackexchange.com/a/88983

The linked answer responds to the question "What are the inputs to the first decoder layer in a Transformer model during the training phase?"

The point is that the decoder input is the target sequence with an <sos> token prepended at the front, so every token in the original sequence is shifted one position to the right. This is what "shifted right" means in the Transformer architecture diagram: during training (teacher forcing), the decoder at position t only sees the target tokens up to position t-1 and learns to predict the token at position t.
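A minimal sketch in PyTorch of how this shifted decoder input can be built during training; the token ids for <sos>, <eos>, and the example sentence are hypothetical:

```python
import torch

SOS_ID = 1  # hypothetical id for the <sos> token
EOS_ID = 2  # hypothetical id for the <eos> token


def shift_right(target_ids: torch.Tensor, sos_id: int = SOS_ID) -> torch.Tensor:
    """Prepend <sos> and drop the last token, shifting the sequence right by one."""
    batch_size = target_ids.size(0)
    sos = torch.full((batch_size, 1), sos_id, dtype=target_ids.dtype)
    return torch.cat([sos, target_ids[:, :-1]], dim=1)


# Hypothetical target sentence "I am a student <eos>" as token ids
target = torch.tensor([[11, 12, 13, 14, EOS_ID]])

decoder_input = shift_right(target)
print(decoder_input)  # tensor([[ 1, 11, 12, 13, 14]])  i.e. "<sos> I am a student"
print(target)         # tensor([[11, 12, 13, 14,  2]])  i.e. "I am a student <eos>"
```

With this alignment the two tensors line up position by position: the decoder's prediction at position t, computed from <sos> plus the first t target tokens, is scored against the t-th token of the unshifted target.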
