🪴 My Second Brain


    Folder: 논문-정리/Language-Model

    15 items under this folder.

    • Mar 13, 2024

      Scaling Laws for Neural Language Models

    • Mar 13, 2024

      Textbooks Are All You Need

    • Mar 13, 2024

      Embedding

    • Mar 13, 2024

      Inference

    • Mar 13, 2024

      Language Models are Few-Shot Learners

    • Mar 13, 2024

      Learning

    • Mar 13, 2024

      Tokenization

    • Mar 13, 2024

      Memo

    • Mar 13, 2024

      Timeline

      • #NLP
      • #CV

    • Mar 13, 2024

      D4: Improving LLM Pretraining via Document De-Duplication and Diversification

    • Mar 13, 2024

      LSH-MinHash

    • Mar 13, 2024

      Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models

    • Mar 13, 2024

      When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale

    • Mar 13, 2024

      Language Models are Few-Shot Learners

    • Mar 13, 2024

      The Pile: An 800GB Dataset of Diverse Text for Language Modeling


    Created with Quartz v4.2.2 © 2024
