
rStar-Math by Microsoft: Can SLMs Beat OpenAI o1 in Math?

Discover how System 2 thinking via Monte Carlo Tree Search enables rStar-Math's Small Language Models to rival OpenAI's o1 in math…

Large Concept Models (LCMs) by Meta: The Era of AI After LLMs?

Explore Meta’s Large Concept Models (LCMs), an AI architecture that processes concepts instead of tokens. Can it become the next LLM architecture?…

Byte Latent Transformer (BLT) by Meta AI: A Tokenizer-free LLM Revolution

Explore Byte Latent Transformer (BLT) by Meta AI: a tokenizer-free LLM that scales better than tokenization-based models…

Coconut by Meta AI – Better LLM Reasoning With Chain of CONTINUOUS Thought?

Discover how Meta AI’s Chain of Continuous Thought (Coconut) empowers large language models (LLMs) to reason in their own language…

Hymba by NVIDIA: A Hybrid Mamba-Transformer Language Model

Discover NVIDIA’s Hymba, a hybrid model that combines Transformer attention and State Space Models for state-of-the-art performance among small language models…

LLaMA-Mesh by Nvidia: LLM for 3D Mesh Generation

Dive into Nvidia’s LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models, an LLM adapted to understand 3D objects…

Tokenformer: Rethinking Transformer Scaling with Tokenized Model Parameters

Dive into Tokenformer, a novel architecture that extends the Transformer to support incremental model growth without retraining from scratch…

Generative Reward Models: Merging the Power of RLHF and RLAIF for Smarter AI

In this post we dive into Stanford research presenting Generative Reward Models, a hybrid of RLHF and RLAIF for improving LLMs…