LLM Evaluation in 2026: Why Your Benchmark Scores Don't Matter Anymore

The Uncomfortable Truth About LLM Benchmarks Here's a fact that might surprise you: GPT-4, Claude 3.5, and Gemini all score nearly identically on traditional fluency metrics. Yet anyone who's used these models in production knows they behave very differently. Your chatbot hallucinates less with Claude. Your summarization pipeline produces more accurate outputs with GPT-4. Your document extraction works better with a fine-tuned smaller model than any of the giants. So what's going on? Why are we still chasing BLEU scores and perplexity when they clearly don't predict real-world performance? ...

February 15, 2026 · 20 min · Akshay Uppal

DiT Annotated Paper

DIT: SELF-SUPERVISED PRE-TRAINING FOR DOCUMENT IMAGE TRANSFORMER DocumentAI with images has a new leader in town, and it's DiT! Yet another stellar paper from the folks at Microsoft advancing the field of DocumentAI. This paper draws inspiration from several prior works to arrive at a clean, end-to-end pre-trained network for image tasks such as document image classification, document layout analysis, and table detection. It also lays a foundation for the upcoming multimodal networks for document understanding and plays an important role in the upcoming LayoutLMv3. Read along to explore this easy-to-read paper, which could generate a lot of impact in the field. ...

April 21, 2022 · 2 min · Akshay Uppal

WebFormer Annotated Paper

WebFormer: The Web-page Transformer for Structure Information Extraction Understanding tokens from unstructured web pages is challenging in practice due to the variety of web layout patterns; this is where WebFormer comes into play. In this paper, the authors propose a novel architecture, WebFormer, a Web-page transFormer model for structure information extraction from web documents. The paper also introduces rich attention patterns between HTML tokens and text tokens, which leverage the web layout for effective attention weight computation. This could prove to be a big leap in web page understanding, as it delivers strong incremental results and a way forward for the domain. ...

March 7, 2022 · 2 min · Akshay Uppal

LayoutLMv2 Annotated Paper

LayoutLMv2: Multi-Modal Pre-Training For Visually-Rich Document Understanding Microsoft delivers again with LayoutLMv2, further maturing the field of document understanding. The new pre-training tasks, the spatial-aware self-attention, and the fact that image information is integrated into the pre-training stage itself distinguish this paper from its predecessor, LayoutLM, and establish new state-of-the-art performance on six widely used datasets across different tasks. It takes a step further in understanding documents through visual cues alongside textual content and layout information, carefully integrating image, text, and layout in the new self-attention mechanism. ...

December 16, 2021 · 2 min · Akshay Uppal

Fastformer Annotated Paper

Fastformer: Additive Attention Can Be All You Need Of late, this paper is all the rage with its claim to introduce an attention mechanism with linear time complexity in the sequence length. Why is this such a big deal, you ask? Well, if you are familiar with transformers, one of their biggest downsides is the quadratic complexity of self-attention, which creates a huge bottleneck for longer sequences. So if additive attention works out, we will no longer have a strict cap of 512 tokens as introduced in the original and subsequent transformer-based architectures. The paper compares itself against other well-known efficient transformer techniques and runs experiments on five well-known datasets. ...

October 4, 2021 · 2 min · Akshay Uppal
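
The linear-time trick behind additive attention can be illustrated with a minimal sketch: instead of computing pairwise token-to-token scores (quadratic), each token gets one learned scalar score, and a softmax over those scores pools the sequence into a single global vector. This is a simplified illustration of the core pooling primitive, not Fastformer's full query-key-value pipeline; the names `additive_attention_pool` and `w` are mine, not from the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention_pool(x, w):
    """Pool a sequence x of shape (n, d) into one global vector in O(n * d).

    Each token x_i gets a scalar score w . x_i / sqrt(d); the softmax of
    those scores weights a sum over the tokens. No n x n score matrix is
    ever formed, hence the linear cost in sequence length.
    """
    d = x.shape[-1]
    scores = x @ w / np.sqrt(d)   # shape (n,): one scalar per token
    alpha = softmax(scores)       # attention weights, sum to 1
    return alpha @ x              # shape (d,): global summary vector

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))   # toy sequence: 6 tokens, dim 4
w = rng.standard_normal(4)        # learned scoring vector (random here)
q = additive_attention_pool(x, w)
print(q.shape)  # (4,)
```

Fastformer applies this pooling twice (to build a global query and then a global key) and mixes the results back into per-token representations with element-wise products, keeping the whole layer linear in sequence length.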