Multimodality Papers
Looking for a specific paper or subject?
-
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
In this post we dive into TinyGPT-V, a new multimodal large language model which was introduced in a research paper titled “TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones”. Before divining in, if you prefer a video format then check out our video review for this paper: Motivation In recent years we’ve seen a…
-
Meta-Transformer: A Unified Framework for Multimodal Learning
In this post we dive into Meta-Transformer, a multimodal learning method, which was presented in a research paper titled Meta-Transformer: A Unified Framework for Multimodal Learning. In the paper, the researchers show they were able to process information from 12(!) different modalities that we see in the picture above, which includes image, text, audio, infrared,…