Computer Vision Papers
Sapiens: Foundation for Human Vision Models
In this post we dive into Sapiens, a new family of computer vision models by Meta AI that shows remarkable advances on human-centric tasks!…
Mixture of Nested Experts: Adaptive Processing of Visual Tokens
In this post we dive into Mixture of Nested Experts, a new method presented by Google that can dramatically reduce AI computational cost…
How Meta AI’s Human-Like V-JEPA Works?
Explore V-JEPA, which stands for Video Joint-Embedding Predictive Architecture. Another step in Meta AI’s journey toward human-like AI…
Introduction to Vision Transformers | Original ViT Paper Explained
In this post we go back to the important vision transformers paper, to understand how ViT adapted transformers to computer vision…
From Diffusion Models to LCM-LoRA
Following the LCM-LoRA release, in this post we explore the evolution of diffusion models up to latent consistency models with LoRA…
Vision Transformers Need Registers – Fixing a Bug in DINOv2?
In this post we explain the paper “Vision Transformers Need Registers” by Meta AI, which explains an interesting behavior in DINOv2 features…
Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack
In this post we dive into Emu, a text-to-image generation model by Meta AI, which is quality-tuned to generate highly aesthetic images…
FACET: Fairness in Computer Vision Evaluation Benchmark
In this post we cover FACET, a new dataset created by Meta AI as a benchmark for evaluating the fairness of computer vision models…