Computer Vision Archives - AI Papers Academy

DINOv3 Paper Explained: The Computer Vision Foundation Model

In this post we break down Meta AI’s DINOv3 research paper, which introduces a family of state-of-the-art computer vision foundation models

Dive into Continuous Thought Machines, a novel architecture that strives to push AI closer to how the human brain works

Dive into Perception Language Models by Meta, a family of fully open SOTA vision-language models with detailed visual understanding

Dive into DeepSeek Janus Pro, another magnificent open-source release, this time a multimodal AI model that rivals top multimodal models!

In this post we dive into Sapiens, a new family of computer vision models by Meta AI that show remarkable advancement in human-centric tasks!

In this post we dive into Mixture of Nested Experts, a new method presented by Google that can dramatically reduce AI computational cost

Explore V-JEPA, which stands for Video Joint Embedding Predictive Architecture. Another step in Meta AI’s journey toward human-like AI

In this post we go back to the important vision transformers paper, to understand how ViT adapted transformers to computer vision

Following the LCM-LoRA release, in this post we explore the evolution of diffusion models up to latent consistency models with LoRA

In this post we explain the paper “Vision Transformers Need Registers” by Meta AI, which accounts for an interesting behavior in DINOv2 features