🪴 My Second Brain
Folder: 논문-정리/Computer-Vision/Dataset
3 items under this folder.
- Beyond neural scaling laws beating power law scaling via data pruning (Mar 13, 2024)
- Deep Learning on a Data Diet Finding Important Examples Early in Training (Mar 13, 2024) #in-progress
- SemDeDup Data-efficient learning at web-scale through semantic deduplication (Mar 13, 2024)