All Papers - AI Papers Academy

Looking for a specific paper or subject?

14 December 2024NLP

Coconut by Meta AI – Better LLM Reasoning With Chain of CONTINUOUS Thought?

Discover how Meta AI’s Chain of Continuous Thought (Coconut) empowers large language models (LLMs) to reason in their own language…

24 November 2024NLP

Hymba by NVIDIA: A Hybrid Mamba-Transformer Language Model

Discover NVIDIA’s Hymba model that combines Transformers and State Space Models for state-of-the-art performance in small language models…

16 November 2024Multimodality,NLP

LLaMA-Mesh by Nvidia: LLM for 3D Mesh Generation

Dive into Nvidia’s LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models, a LLM which was adapted to understand 3D objects…

5 November 2024NLP

Tokenformer: Rethinking Transformer Scaling with Tokenized Model Parameters

Dive into Tokenformer, a novel architecture that improves Transformers to support incremental model growth without training from scratch…

29 October 2024NLP

Generative Reward Models: Merging the Power of RLHF and RLAIF for Smarter AI

In this post we dive into a Stanford research presenting Generative Reward Models, a hybrid Human and AI RL to improve LLMs…

25 August 2024Computer Vision

Sapiens by Meta AI: Foundation for Human Vision Models

In this post we dive into Sapiens, a new family of computer vision models by Meta AI that show remarkable advancement in human-centric tasks!…

12 August 2024Computer Vision

Mixture of Nested Experts: Adaptive Processing of Visual Tokens

In this post we dive into Mixture of Nested Experts, a new method presented by Google that can dramatically reduce AI computational cost…

17 July 2024NLP

Introduction to Mixture-of-Experts | Original MoE Paper Explained

Diving into the original Google paper which introduced the Mixture-of-Experts (MoE) method, which was critical to AI progress…

12 June 2024NLP

Mixture-of-Agents Enhances Large Language Model Capabilities

In this post we explain the Mixture-of-Agents method, which shows a way to unite open-source LLMs to win GPT-4o on AlpacaEval 2.0…

1 June 2024NLP

Arithmetic Transformers with Abacus Positional Embeddings

In this post we dive into Abacus Embeddings, which dramatically enhance Transformers arithmetic capabilities with strong logical extrapolation…

23 May 2024NLP

CLLMs: Consistency Large Language Models

In this post we dive into Consistency Large Language Models (CLLMs), a new family of models which can dramatically speedup LLMs inference!…

14 April 2024NLP

ReFT: Representation Finetuning for Language Models

Learn about Representation Finetuning (ReFT) by Stanford University, a method to fine-tune large language models (LLMs) efficiently…

3 April 2024NLP

Stealing Part of a Production Language Model

What if we could discover OpenAI models internal weights? In this post we dive into a paper which presents an attack that steals LLMs data…

10 March 2024Computer Vision

How Meta AI ‘s Human-Like V-JEPA Works?

Explore V-JEPA, which stands for Video Joint-Embedding Predicting Architecture. Another step in Meta AI’s journey for human-like AI…

2 March 2024NLP

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

In this post we dive into the era of 1-bit LLMs paper by Microsoft, which shows a promising direction for low cost large language models…

20 January 2024NLP

Self-Rewarding Language Models by Meta AI

In this post we dive into the Self-Rewarding Language Models paper by Meta AI, which can possibly be a step towards open-source AGI…

13 January 2024NLP

Fast Inference of Mixture-of-Experts Language Models with Offloading

Diving into a research paper introducing an innovative method to enhance LLM inference efficiency using memory offloading…

30 December 2023Multimodality

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

In this post we dive into TinyGPT-V, a small but mighty Multimodal LLM which brings Phi-2 success to vision-language tasks…