NLP

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

In this post, we dive into a new and exciting research paper by Microsoft, titled: “The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits”. In recent years, we’ve seen tremendous success with large language models such as GPT, LLaMA and more. As we move forward, we see that the …
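
Where does the 1.58 come from? In the BitNet b1.58 model the paper proposes, each weight takes one of the three values {-1, 0, 1}, which carries log2(3) ≈ 1.58 bits of information. Below is a minimal sketch of the absmean quantization the paper describes; the function name is ours:

```python
import math

import torch

# Each weight takes one of three values {-1, 0, 1}, so the information
# content per weight is log2(3) ≈ 1.58 bits -- hence "1.58-bit" LLMs.
print(math.log2(3))  # 1.5849625007211562

def absmean_ternary_quantize(w: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Scale the matrix by its mean absolute value, then round every
    # entry to the nearest value in {-1, 0, 1}.
    scale = w.abs().mean() + eps
    return (w / scale).round().clamp(-1, 1)

print(absmean_ternary_quantize(torch.randn(4, 4)))
```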

Fast Inference of Mixture-of-Experts Language Models with Offloading

In this post, we dive into a new research paper, titled: “Fast Inference of Mixture-of-Experts Language Models with Offloading”. In recent years, large language models have driven remarkable advances in AI, with closed-source models such as GPT-3 and GPT-4, and with open-source models such …
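
The core idea behind offloading is to keep only the experts that are needed right now in GPU memory and fetch the rest from CPU RAM on demand; the paper combines an LRU cache of recently used experts with speculative loading. Below is a minimal sketch of just the caching part, with hypothetical names, assuming each expert is a self-contained set of weights:

```python
from collections import OrderedDict

class ExpertCache:
    """Keep up to `capacity` experts in GPU memory, evicting the
    least recently used expert when a new one must be loaded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = OrderedDict()  # expert_id -> weights resident on GPU

    def get(self, expert_id, load_from_cpu):
        if expert_id in self.cache:
            self.cache.move_to_end(expert_id)  # mark as recently used
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)  # evict the LRU expert
            self.cache[expert_id] = load_from_cpu(expert_id)  # slow path
        return self.cache[expert_id]

# Usage: the router picks experts per token; only cache misses pay the
# CPU-to-GPU transfer cost.
cache = ExpertCache(capacity=2)
weights = cache.get(3, load_from_cpu=lambda i: f"weights-of-expert-{i}")
```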

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

In this post, we dive into a new research paper from Apple titled: “LLM in a flash: Efficient Large Language Model Inference with Limited Memory”. Before diving in, if you prefer a video format, check out our video review of this paper. In recent years, we’ve seen tremendous success with large language …

Orca 2: Teaching Small Language Models How to Reason

Several months ago, Microsoft released the first version of Orca, which achieved remarkable results, even surpassing ChatGPT on data from the BigBench-Hard dataset, and the ideas from Orca 1 helped create the better language models released since then. The Orca 2 model, presented in the paper we review in this post, achieves significantly better …

CODEFUSION: A Pre-trained Diffusion Model for Code Generation

CODEFUSION is a new code generation model which was introduced in a research paper from Microsoft, titled: “CODEFUSION: A Pre-trained Diffusion Model for Code Generation”. Recently, we’ve observed significant progress in code generation using AI, mostly based on large language models (LLMs), which are therefore referred to as code LLMs. With a …

Large Language Models As Optimizers – OPRO by Google DeepMind

OPRO (Optimization by PROmpting) is a new approach that leverages large language models as optimizers, introduced by Google DeepMind in a research paper titled “Large Language Models As Optimizers”. Large language models are very good at taking a prompt, such as an instruction or a question, and yielding a useful response that matches …
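
This strength is exactly what OPRO exploits: a meta-prompt lists the candidate solutions tried so far together with their scores, and the LLM is asked to propose a better one. Below is a toy sketch of such a loop; call_llm and score are hypothetical stand-ins for a real model API and a task-specific evaluator:

```python
import random

def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM call; here it just guesses a number.
    return str(random.randint(0, 100))

def score(solution: str) -> float:
    # Toy objective: the closer to 42, the higher the score.
    return -abs(int(solution) - 42)

trajectory = []  # (solution, score) pairs, kept sorted by score
for step in range(10):
    history = "\n".join(f"solution: {s}, score: {v}" for s, v in trajectory)
    meta_prompt = (
        "Here are previous solutions and their scores, in ascending order:\n"
        f"{history}\n"
        "Propose a new solution that achieves a higher score."
    )
    candidate = call_llm(meta_prompt)
    trajectory.append((candidate, score(candidate)))
    trajectory.sort(key=lambda pair: pair[1])

print(trajectory[-1])  # best solution found
```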

WizardMath – Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

Welcome WizardMath, a new open-source large language model contributed by Microsoft. While top large language models such as GPT-4 have demonstrated remarkable capabilities in various tasks, including mathematical reasoning, they are not open-source. For open-source large language models such as LLaMA-2, the situation is different: until now, they have not demonstrated strong math …
