Attention Is All You Need (2017.06)
#NLP Transformer
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data (2019.11)
Scaling Laws for Neural Language Models (2020.01)
Language Models are Few-Shot Learners (2020.05)
The Pile: An 800GB Dataset of Diverse Text for Language Modeling (2020.12)
Deep Learning on a Data Diet: Finding Important Examples Early in Training (2021.07)
Beyond neural scaling laws: beating power law scaling via data pruning (2022.06)
#CV SSL prototype
SemDeDup: Data-efficient learning at web-scale through semantic deduplication (2023.04)
Textbooks Are All You Need (2023.06)
#NLP phi-1
D4: Improving LLM Pretraining via Document De-Duplication and Diversification (2023.08)
#NLP SemDeDup + SSL prototype