NLP

Mixture-of-Agents Enhances Large Language Model Capabilities

Motivation In recent years we have witnessed remarkable advancements in AI, and specifically in natural language understanding, driven by large language models. Today there are many different LLMs out there, such as GPT-4, Llama 3, Qwen, Mixtral, and more. In this post we review a recent paper, titled: “Mixture-of-Agents Enhances Large Language Model …

Mixture-of-Agents Enhances Large Language Model Capabilities Read More »

Arithmetic Transformers with Abacus Positional Embeddings

Introduction In recent years, we have witnessed remarkable success driven by large language models (LLMs). While LLMs perform well in various domains, such as natural language problems and code generation, there is still much room for improvement in complex multi-step and algorithmic reasoning. To research algorithmic reasoning capabilities without pouring significant …

Arithmetic Transformers with Abacus Positional Embeddings Read More »

ReFT: Representation Finetuning for Language Models

In this post we dive into a recent research paper that presents a promising new direction for fine-tuning LLMs, achieving remarkable results in terms of both parameter count and performance. Before diving in, if you prefer a video format, then check out the following video: Motivation – Finetuning a Pre-trained Transformer is Expensive A common method …

ReFT: Representation Finetuning for Language Models Read More »

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

In this post, we dive into a new and exciting research paper by Microsoft, titled: “The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits”. In recent years, we’ve seen tremendous success from large language models such as GPT, LLaMA, and more. As we move forward, we see that the …

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Read More »

Fast Inference of Mixture-of-Experts Language Models with Offloading

In this post, we dive into a new research paper, titled: “Fast Inference of Mixture-of-Experts Language Models with Offloading”. Motivation LLMs Are Getting Larger In recent years, large language models have driven remarkable advances in AI, with closed-source models such as GPT-3 and GPT-4, and open-source models such …

Fast Inference of Mixture-of-Experts Language Models with Offloading Read More »

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

In this post we dive into a new research paper from Apple, titled: “LLM in a flash: Efficient Large Language Model Inference with Limited Memory”. Before diving in, if you prefer a video format, then check out our video review for this paper: Motivation In recent years, we’ve seen tremendous success from large language …

LLM in a flash: Efficient Large Language Model Inference with Limited Memory Read More »

Orca 2: Teaching Small Language Models How to Reason

Several months ago, Microsoft released the first version of Orca, which achieved remarkable results, even surpassing ChatGPT on data from the BigBench-Hard dataset, and the ideas from Orca 1 helped create the better language models released more recently. The Orca 2 model, presented in the paper we review in this post, achieves significantly better …

Orca 2: Teaching Small Language Models How to Reason Read More »