Table-GPT: Empower LLMs To Understand Tables
In this post we dive into Table-GPT, novel research by Microsoft that empowers LLMs to understand tabular data.
In this post we dive into the Large Language Models As Optimizers paper by Google DeepMind, which introduces OPRO (Optimization by PROmpting).
Large Language Models As Optimizers – OPRO by Google DeepMind
Discover an in-depth review of the Code Llama paper, a specialized version of the Llama 2 model designed for coding tasks.
Diving into WizardMath, an LLM for mathematical reasoning contributed by Microsoft, surpassing models such as WizardLM and LLaMA-2.
In this post we dive into the Orca paper, which shows how to do imitation tuning effectively, outperforming ChatGPT at about 7% of its size!
In this post we dive into the LongNet research paper, which introduces the dilated attention mechanism, and explain how it works.
LongNet: Scaling Transformers to 1B Tokens with Dilated Attention
In this post we explain LIMA, an LLM by Meta AI fine-tuned on only 1,000 samples, yet achieving results competitive with top LLMs.
LIMA from Meta AI – Less Is More for Alignment of LLMs
Dive into Shepherd, an LLM from Meta AI designed to critique responses from other LLMs, a step toward resolving LLM hallucinations.
Shepherd: A Critic for Language Model Generation
LLMs are aligned for safety to avoid generating harmful content. In this post we review a paper that successfully attacks aligned LLMs.
Universal and Transferable Adversarial LLM Attacks
In this post we review Google DeepMind’s paper introducing Soft Mixture of Experts, a fully differentiable sparse Transformer.