NLP Papers
WizardMath – Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Diving into WizardMath, an LLM for mathematical reasoning contributed by Microsoft, surpassing models such as WizardLM and LLaMA-2…
Orca Research Paper Explained
In this post we dive into the Orca paper, which shows how to do imitation tuning effectively, yielding a model that outperforms ChatGPT at about 7% of its size!…
LongNet: Scaling Transformers to 1B Tokens with Dilated Attention
In this post we dive into the LongNet research paper, which introduced the Dilated Attention mechanism, and explain how it works…
LIMA from Meta AI – Less Is More for Alignment of LLMs
In this post we explain LIMA, an LLM by Meta AI that was fine-tuned on only 1000 samples yet achieves competitive results with top LLMs…
Shepherd: A Critic for Language Model Generation
Dive into Shepherd, an LLM from Meta AI designed to critique responses from other LLMs, a step toward resolving LLM hallucinations…
Universal and Transferable Adversarial LLM Attacks
LLMs are aligned for safety to avoid generating harmful content. In this post we review a paper that presents a successful attack on LLMs…
From Sparse to Soft Mixture of Experts
In this post we review Google DeepMind’s paper that introduces Soft Mixture of Experts, a fully differentiable sparse Transformer…