WizardMath – Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Diving into WizardMath, a LLM for mathematical reasoning contributed by Microsoft, surpassing models such as WizardLM and LLaMA-2.
Diving into WizardMath, a LLM for mathematical reasoning contributed by Microsoft, surpassing models such as WizardLM and LLaMA-2.
In this post we dive into Orca’s paper which shows how to do imitation tuning effectively, outperforms ChatGPT with about 7% of its size!
In this post we dive into the LongNet research paper which introduced the Dilated Attention mechanism and explain how it works
LongNet: Scaling Transformers to 1B Tokens with Dilated Attention Read More »
DINOv2 by Meta AI finally gives us a foundational model for computer vision. We’ll explain what it means and why DINOv2 can count as such
DINOv2 from Meta AI – A Foundational Model in Computer Vision Read More »
Dive into I-JEPA, Image-based Joint-Embedding Predictive Architecture, the first model based on Yann LeCun’s vision for a more human-like AI.
I-JEPA: The First Human-Like Computer Vision Model Read More »
ImageBind is a multimodality model by Meta AI. In this post, we dive into ImageBind research paper to understand what it is and how it works.
Consistency models are a new type of generative models which were introduced by Open AI, and in this post we will dive into how they work
Consistency Models – Optimizing Diffusion Models Inference Read More »
In this post we explain LIMA, a LLM by Meta AI which was fine-tuned on only 1000 samples, yet it achieves competitive results with top LLMs
LIMA from Meta AI – Less Is More for Alignment of LLMs Read More »
Dive into Shepherd, a LLM from Meta AI which is purposed to critique responses from other LLMs, a step in resolving LLMs hallucinations.
Shepherd: A Critic for Language Model Generation Read More »
LLMs are aligned for safety to avoid generation of harmful content. In this post we review a paper that is able to successfully attack LLMs.
Universal and Transferable Adversarial LLM Attacks Read More »