Code Llama Paper Explained
Discover an in-depth review of Code Llama paper, a specialized version of the Llama 2 model designed for coding tasks
Code Llama Paper Explained Read More »
Diving into WizardMath, an LLM for mathematical reasoning contributed by Microsoft, which surpasses models such as WizardLM and LLaMA-2.
In this post we dive into Orca's paper, which shows how to do imitation tuning effectively, outperforming ChatGPT with a model about 7% of its size!
Orca Research Paper Explained Read More »
In this post we dive into the LongNet research paper, which introduced the Dilated Attention mechanism, and explain how it works.
LongNet: Scaling Transformers to 1B Tokens with Dilated Attention Read More »
In this post we explain LIMA, an LLM by Meta AI which was fine-tuned on only 1,000 samples, yet achieves results competitive with top LLMs.
LIMA from Meta AI – Less Is More for Alignment of LLMs Read More »
Dive into Shepherd, an LLM from Meta AI designed to critique responses from other LLMs, a step toward resolving LLM hallucinations.
Shepherd: A Critic for Language Model Generation Read More »
LLMs are aligned for safety to avoid generating harmful content. In this post we review a paper that successfully attacks aligned LLMs.
Universal and Transferable Adversarial LLM Attacks Read More »
In this post we review Google DeepMind's paper introducing Soft Mixture of Experts, a fully differentiable sparse Transformer.
From Sparse to Soft Mixture of Experts Read More »