🪴 My Second Brain
Explorer
Paper Summaries
Computer Vision
Dataset
Beyond neural scaling laws beating power law scaling via data pruning
Deep Learning on a Data Diet Finding Important Examples Early in Training
SemDeDup Data-efficient learning at web-scale through semantic deduplication
Language Model
Dataset
Deduplication
LSH-MinHash
Memorization Without Overfitting Analyzing the Training Dynamics of Large Language Models
Pruning
When Less is More Investigating Data Pruning for Pretraining LLMs at Scale
Quality
Language Models are Few-Shot Learners
The pile An 800gb dataset of diverse text for language modeling
D4 Improving LLM Pretraining via Document De-Duplication and Diversification
Scaling Laws for Neural Language Models
Textbooks are All You Need
Embedding
Inference
Language Models are Few-Shot Learners
Learning
Tokenization
Notes
Timeline
Workshop
Translation in Progress
AI
Deep contextualized word representations
Explaining How Transformers Use Context to Build Predictions
Rethinking the Inception Architecture for Computer Vision
Writing Guidelines
Table
tools
Words
Evaluating Text Datasets for Pretraining
Table