NLP Papers


Code Llama Paper Explained

Discover an in-depth review of the Code Llama paper, which presents a specialized version of the Llama 2 model designed for coding tasks…

WizardMath – Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

Diving into WizardMath, an LLM for mathematical reasoning contributed by Microsoft, surpassing models such as WizardLM and LLaMA-2…

Orca Research Paper Explained

In this post we dive into the Orca paper, which shows how to do imitation tuning effectively and reports outperforming ChatGPT at about 7% of its size…

LongNet: Scaling Transformers to 1B Tokens with Dilated Attention

In this post we dive into the LongNet research paper, which introduced the Dilated Attention mechanism, and explain how it works…

LIMA from Meta AI – Less Is More for Alignment of LLMs

In this post we explain LIMA, an LLM by Meta AI that was fine-tuned on only 1,000 samples yet achieves results competitive with top LLMs…

Shepherd: A Critic for Language Model Generation

Dive into Shepherd, an LLM from Meta AI built to critique responses from other LLMs, a step toward resolving LLM hallucinations…

Universal and Transferable Adversarial LLM Attacks

LLMs are aligned for safety to avoid generating harmful content. In this post we review a paper that shows how to successfully attack LLMs…

From Sparse to Soft Mixture of Experts

In this post we review Google DeepMind’s paper that introduces Soft Mixture of Experts, a fully-differentiable sparse Transformer…