🪴 My Second Brain

Mar 13, 2024, 1 min read

Explaining How Transformers Use Context to Build Predictions

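The paper proposes ALTI-Logit, an explanation method for analyzing how a language model uses the preceding context when it generates text. When the evidence for a linguistic phenomenon is far from the word being predicted, ALTI-Logit performs better than the previously used Logit method.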

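My take: rather than explaining how the model learns language, it seems more accurate to say the paper shows that training ended up going in this direction.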

Ferrando, Javier, et al. “Explaining How Transformers Use Context to Build Predictions.” arXiv preprint arXiv:2305.12535 (2023).
