NLP Papers

  • CLLMs: Consistency Large Language Models

    In this post we dive into Consistency Large Language Models, or CLLMs for short, which were introduced in a recent research paper of the same name. Before diving in, if you prefer a video format, check out the following video. Motivation: Top LLMs such as GPT-4, LLaMA 3 and more are pushing the…

  • ReFT: Representation Finetuning for Language Models

    In this post we dive into a recent research paper which presents a promising novel direction for fine-tuning LLMs, achieving remarkable results when considering both parameter count and performance. Before diving in, if you prefer a video format, check out the following video. Motivation: Finetuning a Pre-trained Transformer is Expensive. A common method…

  • Stealing Part of a Production Language Model

    Many of the top large language models today, such as GPT-4, Claude 3 and Gemini, are closed source, so much about the inner workings of these models is not known to the public. One justification for this is usually the competitive landscape, since companies are investing a lot of money and effort to create…

  • The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

    In this post, we dive into a new and exciting research paper by Microsoft, titled “The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits”. In recent years, we’ve seen tremendous success with large language models such as GPT, LLaMA and more. As we move forward, we see that the…

  • Self-Rewarding Language Models by Meta AI

    On January 18, Mark Zuckerberg announced that the long-term goal of Meta AI is to build general intelligence and open-source it responsibly. So Meta AI is officially working on building an open-source AGI. On the same day, Meta AI released a new research paper titled “Self-Rewarding Language Models”, which can be a step that…

  • Fast Inference of Mixture-of-Experts Language Models with Offloading

    In this post, we dive into a new research paper titled “Fast Inference of Mixture-of-Experts Language Models with Offloading”. Motivation: LLMs Are Getting Larger. In recent years, large language models have driven remarkable advances in AI, with closed-source models such as GPT-3 and GPT-4, and with open source models such…

  • LLM in a flash: Efficient Large Language Model Inference with Limited Memory

    In this post we dive into a new research paper from Apple titled “LLM in a flash: Efficient Large Language Model Inference with Limited Memory”. Before diving in, if you prefer a video format, check out our video review of this paper. Motivation: In recent years, we’ve seen tremendous success with large language…

  • Orca 2: Teaching Small Language Models How to Reason

    Several months ago, Microsoft released the first version of Orca, which achieved remarkable results, even surpassing ChatGPT on data from the BigBench-Hard dataset, and the ideas from Orca 1 helped create better language models released since then. The Orca 2 model, presented in the paper we review in this post, achieves significantly better…

  • CODEFUSION: A Pre-trained Diffusion Model for Code Generation

    CODEFUSION is a new code generation model introduced in a research paper from Microsoft, titled “CODEFUSION: A Pre-trained Diffusion Model for Code Generation”. Recently, we’ve observed significant progress in code generation using AI, which is mostly based on large language models (LLMs), so we refer to them as code LLMs. With a…

  • Table-GPT: Empower LLMs To Understand Tables

    Nowadays, we are witnessing tremendous progress with large language models (LLMs) such as ChatGPT, Llama and more, where we can feed an LLM with a text instruction or question, and most of the time get an accurate response from the model. However, if we try to feed the model with table data, in…
