RoBERTa Annotated Paper
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Soon after BERT was released in late 2018, a floodgate of transformer-based networks opened. Still, the full capabilities of BERT went unnoticed until RoBERTa. In this paper, the authors question and improve BERT's hyperparameters and training paradigm through carefully crafted experiments, arriving at a robust, better-performing network without changing BERT's core architecture. Please feel free to read along with the paper using my notes and highlights.
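Because RoBERTa keeps BERT's architecture and changes only the pretraining recipe, a pretrained checkpoint can be used as a drop-in encoder. Here is a minimal sketch, assuming the Hugging Face `transformers` library and the public `roberta-base` checkpoint (neither is part of the paper itself):

```python
# Minimal sketch: load a pretrained RoBERTa checkpoint and extract
# contextual embeddings. Assumes `torch` and `transformers` are installed
# and the public `roberta-base` checkpoint (not part of the paper itself).
import torch
from transformers import RobertaTokenizer, RobertaModel

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")

# RoBERTa uses the same Transformer encoder as BERT; only the pretraining
# recipe (data, masking strategy, hyperparameters) differs.
inputs = tokenizer("RoBERTa keeps BERT's architecture.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Per-token contextual embeddings: (batch, seq_len, hidden_size=768)
print(outputs.last_hidden_state.shape)
```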