Recent Posts

  • The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
    In this post, we dive into a new and exciting research paper by Microsoft, titled “The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits” (a short, illustrative sketch of the 1.58-bit idea follows this list). In recent years, we’ve seen tremendous success with large language models such as GPT, LLaMA and more. As we move forward, models keep getting larger and larger, and running large language models efficiently requires a substantial amount of compute and memory, which is not accessible to most people. Additionally, there are also environmental concerns due to high energy consumption. And …

  • Self-Rewarding Language Models by Meta AI
    On January 18, Mark Zuckerberg announced that the long-term goal of Meta AI is to build general intelligence and open-source it responsibly. So Meta AI is officially working on building an open-source AGI. On the same day, Meta AI released a new research paper titled “Self-Rewarding Language Models”, which can be a step that takes Meta AI forward in this direction. Before diving in, if you prefer a video format then check out the following video: Motivation Pre-trained large language models are improved by collecting human feedback on the model outputs and then training on that feedback. …

  • Fast Inference of Mixture-of-Experts Language Models with Offloading
    In this post, we dive into a new research paper, titled: “Fast Inference of Mixture-of-Experts Language Models with Offloading”. Motivation LLMs Are Getting Larger In recent years, large language models have driven remarkable advances in AI, with closed-source models such as GPT-3 and GPT-4, open-source models such as LLaMA 1 and 2, and more. However, as we move forward, these models are getting larger and larger, and it becomes important to find ways to run large language models more efficiently. Mixture of Experts Improves LLMs Efficiency One development that has …

  • TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
    In this post we dive into TinyGPT-V, a new multimodal large language model which was introduced in a research paper titled “TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones”. Before diving in, if you prefer a video format then check out our video review for this paper: Motivation In recent years we’ve seen tremendous progress with large language models (LLMs) such as GPT-4, LLaMA-2 and more, where we can feed an LLM a prompt and get a meaningful answer in response. More recently, LLMs have taken another step forward by also being able to understand images. …
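
For readers curious what “1.58 bits” means in the first post above: in BitNet b1.58 each weight takes one of three values, -1, 0 or +1, which carries log2(3) ≈ 1.58 bits of information. Below is a minimal, illustrative sketch of absmean-style ternary quantization in the spirit of that paper; the function names and the use of NumPy are our own assumptions, not code from the paper.

```python
# Illustrative sketch only: ternary ("1.58-bit") weight quantization in the
# spirit of BitNet b1.58. Function names and NumPy usage are assumptions.
import numpy as np

def quantize_ternary(weights: np.ndarray, eps: float = 1e-8):
    """Map full-precision weights to {-1, 0, +1} using a per-tensor absmean scale."""
    scale = np.mean(np.abs(weights)) + eps               # absmean scaling factor
    ternary = np.clip(np.round(weights / scale), -1, 1)  # round, then clip to {-1, 0, +1}
    return ternary.astype(np.int8), scale

def dequantize(ternary: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original full-precision weights."""
    return ternary.astype(np.float32) * scale

# Example: quantize a small random weight matrix and check the reconstruction error.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_ternary(w)
print(q)                                    # entries are only -1, 0 or +1
print(np.abs(w - dequantize(q, s)).mean())  # mean absolute reconstruction error
```

In the paper this style of quantization is applied to the weights of the linear layers (activations are handled separately at higher precision); the sketch is only meant to show why three possible values correspond to roughly 1.58 bits per weight.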


Top Posts

  • Code Llama Paper Explained

    Code Llama is a new family of open-source large language models for code by Meta AI that includes three types of models, each released with 7B, 13B and 34B parameters. In this post we’ll explain the research paper behind them, titled “Code Llama: Open Foundation Models for Code”, to understand how these models…

  • DINOv2 from Meta AI – Finally a Foundational Model in Computer Vision

    DINOv2 is a computer vision model from Meta AI that claims to finally provide a foundational model in computer vision, closing some of the gap with natural language processing, where foundational models have been common for a while now. In this post, we’ll explain what it means to be a foundational model in computer vision…

  • I-JEPA – A Human-Like Computer Vision Model

    I-JEPA, Image-based Joint-Embedding Predictive Architecture, is an open-source computer vision model from Meta AI, and the first AI model based on Yann LeCun’s vision for a more human-like AI, which he presented last year in a 62-page paper titled “A Path Towards Autonomous Machine Intelligence”. In this post we’ll dive into the research paper that…

  • What is YOLO-NAS and How it Was Created

    In this post we dive into YOLO-NAS, an improved version in the YOLO family of models for object detection, which was presented earlier this year by Deci. YOLO models have been around for a while now, first presented in 2015 with the paper You Only Look Once, which is what the acronym YOLO stands for, and over…
