Both models are aligned using reinforcement learning from human feedback (RLHF) and are trained with instruction fine-tuning.
The differences between them are listed below.
References
https://www.theinsaneapp.com/2023/05/instructgpt-vs-chatgpt.html