Sapiens: Foundation for Human Vision Models
In this post we dive into Sapiens, a new family of computer vision models by Meta AI that shows remarkable advances in human-centric tasks!
Sapiens: Foundation for Human Vision Models Read More »
In this post we dive into Mixture of Nested Experts, a new method presented by Google that can dramatically reduce AI computational cost
Mixture of Nested Experts: Adaptive Processing of Visual Tokens Read More »
Explore V-JEPA, which stands for Video Joint-Embedding Predictive Architecture, another step in Meta AI’s journey toward human-like AI
How Meta AI’s Human-Like V-JEPA Works? Read More »
In this post we go back to the important vision transformers paper, to understand how ViT adapted transformers to computer vision
Introduction to Vision Transformers | Original ViT Paper Explained Read More »
Following the LCM-LoRA release, in this post we explore the evolution of diffusion models up to latent consistency models with LoRA
From Diffusion Models to LCM-LoRA Read More »
In this post we explain the paper “Vision Transformers Need Registers” by Meta AI, which explains an interesting behavior in DINOv2 features
Vision Transformers Need Registers – Fixing a Bug in DINOv2? Read More »
In this post we dive into Emu, a text-to-image generation model by Meta AI, which is quality-tuned to generate highly aesthetic images.
Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack Read More »
In this post we cover FACET, a new dataset created by Meta AI as a benchmark for evaluating the fairness of computer vision models
FACET: Fairness in Computer Vision Evaluation Benchmark Read More »
DINOv2 by Meta AI finally gives us a foundational model for computer vision. We’ll explain what it means and why DINOv2 can count as such
DINOv2 from Meta AI – Finally a Foundational Model in Computer Vision Read More »
Dive into I-JEPA, Image-based Joint-Embedding Predictive Architecture, the first model based on Yann LeCun’s vision for a more human-like AI.
I-JEPA: The First Human-Like Computer Vision Model Read More »