Why does a non-zero-centered activation function cause a zig-zag optimization path?

https://rohanvarma.me/inputnormalization/

 

Downsides of the sigmoid activation and why you should center your inputs

Software Engineer @ Facebook

rohanvarma.me
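
The question in the title can be illustrated with a small sketch. It assumes a single linear neuron f = w·x + b whose inputs x come from the previous layer's activation. Because ∂L/∂w = (∂L/∂f)·x, inputs that are all positive (as sigmoid outputs are) force every component of the weight gradient to share the sign of ∂L/∂f. One step can then only raise all weights or lower all weights together, so reaching a point that needs some weights raised and others lowered takes alternating steps, which is the zig-zag. The helper names `weight_gradient` and `same_sign` and the sample values are hypothetical, chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid; outputs lie strictly in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def weight_gradient(x, upstream):
    """dL/dw for f = w @ x + b, where upstream is the scalar dL/df."""
    return upstream * x

def same_sign(g):
    """True if every component of g is positive or every one is negative."""
    return bool(np.all(g > 0) or np.all(g < 0))

# Hypothetical pre-activations of the previous layer (illustrative values).
pre_activations = np.array([-2.0, -0.5, 0.1, 1.5, 3.0])

sigmoid_inputs = sigmoid(pre_activations)  # not zero-centered: all positive
tanh_inputs = np.tanh(pre_activations)     # zero-centered: mixed signs

for upstream in (-0.7, 0.3):
    g_sig = weight_gradient(sigmoid_inputs, upstream)
    g_tanh = weight_gradient(tanh_inputs, upstream)
    print(f"dL/df = {upstream:+.1f}")
    print("  sigmoid inputs, gradient signs:", np.sign(g_sig), "same sign:", same_sign(g_sig))
    print("  tanh inputs,    gradient signs:", np.sign(g_tanh), "same sign:", same_sign(g_tanh))
```

With the sigmoid inputs, the gradient components always agree in sign. With the tanh inputs they do not, so a single step can move different weights in different directions.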

 
