NExT-GPT: Any-to-Any Multimodal LLM
In this post we dive into NExT-GPT, a multimodal large language model (MM-LLM) that can both understand and respond with multiple modalities.
NExT-GPT: Any-to-Any Multimodal LLM Read More »
In this post we dive into the Large Language Models As Optimizers paper by Google DeepMind, which introduces OPRO (Optimization by PROmpting).
Large Language Models As Optimizers – OPRO by Google DeepMind Read More »
In this post we cover FACET, a new dataset created by Meta AI as a benchmark for evaluating the fairness of computer vision models.
FACET: Fairness in Computer Vision Evaluation Benchmark Read More »
Discover an in-depth review of the Code Llama paper, a specialized version of the Llama 2 model designed for coding tasks.
Code Llama Paper Explained Read More »
Diving into WizardMath, an LLM for mathematical reasoning contributed by Microsoft, surpassing models such as WizardLM and LLaMA-2.
In this post we dive into the Orca paper, which shows how to do imitation tuning effectively, yielding a model that outperforms ChatGPT at about 7% of its size!
Orca Research Paper Explained Read More »
In this post we dive into the LongNet research paper, which introduced the Dilated Attention mechanism, and explain how it works.
LongNet: Scaling Transformers to 1B Tokens with Dilated Attention Read More »
DINOv2 by Meta AI finally gives us a foundational model for computer vision. We’ll explain what that means and why DINOv2 can count as one.
DINOv2 from Meta AI – Finally a Foundational Model in Computer Vision Read More »
Dive into I-JEPA, Image-based Joint-Embedding Predictive Architecture, the first model based on Yann LeCun’s vision for a more human-like AI.
I-JEPA: The First Human-Like Computer Vision Model Read More »
ImageBind is a multimodal model by Meta AI. In this post, we dive into the ImageBind research paper to understand what it is and how it works.
ImageBind: One Embedding Space To Bind Them All Read More »