From Sparse to Soft Mixture of Experts
In this post we will dive into a research paper by Google DeepMind titled “From Sparse to Soft Mixtures of Experts”. In recent years, transformer-based models have grown larger and larger in pursuit of better performance. An undesirable consequence is that their computational cost has grown as well. And here comes […]