RoBERTa Annotated Paper

RoBERTa: A Robustly Optimized BERT Pretraining Approach Soon after BERT was released in late 2018, a floodgate of transformer-based networks opened. The full capabilities of BERT went largely unnoticed until RoBERTa. In this paper, the authors question and improve the hyperparameters and training paradigm of BERT with carefully crafted experiments, arriving at a more robust and better-performing network without changing BERT's core architecture (a sketch of one key change follows below). Please feel free to read along with the paper using my notes and highlights. ...
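One of RoBERTa's best-known training changes is dynamic masking: instead of masking each sequence once during preprocessing as BERT did, a fresh mask is sampled every time a sequence is fed to the model. Here is a minimal sketch of that idea; the function name and the `-100` ignore-label are illustrative conventions, not the paper's code:

```python
import random

def dynamic_mask(token_ids, mask_token_id, vocab_size, mask_prob=0.15):
    """Sample a fresh BERT-style mask on every pass (RoBERTa's dynamic masking).

    Of the selected positions: 80% become the mask token, 10% become a
    random token, 10% are left unchanged -- the standard BERT corruption scheme.
    """
    masked = list(token_ids)
    labels = [-100] * len(token_ids)  # -100 = position ignored by the loss
    for i, tok in enumerate(token_ids):
        if random.random() < mask_prob:
            labels[i] = tok  # predict the original token here
            r = random.random()
            if r < 0.8:
                masked[i] = mask_token_id
            elif r < 0.9:
                masked[i] = random.randrange(vocab_size)
            # else: keep the original token
    return masked, labels
```

Because the mask is re-sampled each epoch, the model sees many different corruptions of the same sentence over the course of training.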

August 10, 2021 · 1 min · Akshay Uppal

Few Shot NER Annotated Paper

Few-Shot Named Entity Recognition: A Comprehensive Study A lesser-known albeit, in my opinion, important paper. It highlights a key industry problem that does not always surface in research, which makes it all the more impressive. The authors address the scarcity of labeled NER data in industry and experimentally evaluate three key approaches to few-shot NER: (1) meta-learning, constructing prototypes for different entity types; (2) supervised pre-training on huge noisy data; and (3) self-training (a sketch of the prototype idea follows below). Please feel free to read along with the paper using my notes and highlights. ...
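To make the meta-learning approach concrete, here is a minimal sketch of nearest-prototype classification: each entity type's prototype is the mean of its support-token embeddings, and a query token is assigned to the closest prototype. The shapes, names, and toy data are illustrative assumptions, not the paper's code:

```python
import numpy as np

def build_prototypes(support_embs, support_labels):
    """Prototype per entity type = mean embedding of its support tokens."""
    return {label: support_embs[support_labels == label].mean(axis=0)
            for label in np.unique(support_labels)}

def classify(query_emb, prototypes):
    """Assign a query token to the entity type with the nearest prototype."""
    return min(prototypes,
               key=lambda lbl: np.linalg.norm(query_emb - prototypes[lbl]))

# Toy usage: 5 support tokens with 4-dim embeddings, labels 0 (O) and 1 (PER)
embs = np.random.randn(5, 4)
labels = np.array([0, 0, 1, 1, 1])
protos = build_prototypes(embs, labels)
print(classify(np.random.randn(4), protos))
```

The appeal for few-shot NER is that prototypes need only a handful of support examples per entity type, so no classifier head has to be retrained for new types.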

August 9, 2021 · 1 min · Akshay Uppal

EfficientNet Annotated Paper

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks This paper left a huge mark on the field of model scaling and architectural optimization. It introduces a new scaling method called compound scaling, which scales a convolutional network along all three dimensions: width (channels), depth, and input resolution (a sketch follows below). Alongside this novel scaling scheme, it also introduces a new family of architectures created with Neural Architecture Search: the EfficientNet family. ...
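For reference, compound scaling uses a single coefficient φ to grow all three dimensions together: depth scales by α^φ, width by β^φ, and resolution by γ^φ, with α·β²·γ² ≈ 2 so that FLOPs grow roughly 2^φ. A minimal sketch using the constants reported in the paper (α = 1.2, β = 1.1, γ = 1.15); the baseline numbers in the usage line are made up:

```python
def compound_scale(base_depth, base_width, base_resolution, phi,
                   alpha=1.2, beta=1.1, gamma=1.15):
    """EfficientNet compound scaling: one coefficient phi scales depth,
    width (channels), and input resolution together.
    alpha * beta**2 * gamma**2 ~= 2, so FLOPs grow roughly 2**phi.
    """
    return (round(base_depth * alpha ** phi),
            round(base_width * beta ** phi),
            round(base_resolution * gamma ** phi))

# Toy usage: scale a small baseline up by phi=1
print(compound_scale(base_depth=18, base_width=64, base_resolution=224, phi=1))
```

The insight is that scaling only one dimension (say, depth) hits diminishing returns quickly, while balanced scaling of all three keeps accuracy improving for the same compute budget.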

July 7, 2021 · 1 min · Akshay Uppal

EfficientNet-V2 Annotated Paper

EfficientNetV2: Smaller Models and Faster Training This very recent paper (one month old at the time of writing) introduces EfficientNetV2, a new family of convolutional networks with faster training speed and better parameter efficiency. Building on EfficientNet, it pushes the boundary of model scaling and architecture search by optimizing the network with training-aware Neural Architecture Search (NAS) and scaling, jointly optimizing training speed and parameter efficiency to produce the lightest, best-performing models (a sketch of the search reward follows below). ...
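To illustrate what "jointly optimizes" means here: training-aware NAS scores each candidate architecture with a weighted product of accuracy A, normalized training step time S, and parameter count P, roughly A·S^w·P^v, where the negative exponents penalize slow or large models. A minimal sketch; the exponent values follow the paper, but the candidate dictionaries are illustrative assumptions:

```python
def nas_reward(accuracy, step_time, params, w=-0.07, v=-0.05):
    """Training-aware NAS reward: A * S**w * P**v.
    Negative exponents penalize candidates that train slowly (S)
    or carry many parameters (P).
    """
    return accuracy * (step_time ** w) * (params ** v)

# Toy usage: a slightly less accurate but much faster, smaller candidate wins
candidates = [
    {"accuracy": 0.84, "step_time": 1.00, "params": 50e6},
    {"accuracy": 0.83, "step_time": 0.60, "params": 24e6},
]
best = max(candidates, key=lambda c: nas_reward(**c))
print(best)
```

This is what separates it from the original EfficientNet search, which optimized for accuracy and FLOPs rather than wall-clock training speed.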

July 7, 2021 · 1 min · Akshay Uppal

Text Classification with BERT

Fine-Tune BERT for Text Classification with TensorFlow Figure 1: BERT Classification Model. We will be using a GPU-accelerated kernel for this tutorial, as fine-tuning BERT requires a GPU. Prerequisites: willingness to learn (a growth mindset is all you need), some basic familiarity with TensorFlow/Keras, and some Python to follow along with the code. Initial set-up: install TensorFlow and the TensorFlow Model Garden, verify the install with `import tensorflow as tf` and `print(tf.version.VERSION)` (see the sketch below), then clone the GitHub repo for tensorflow models ...
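A minimal sketch of that initial set-up, assuming a pip environment; `tf-models-official` is a common way to install the TensorFlow Model Garden, but treat the install commands as an assumption rather than the tutorial's exact steps:

```python
# Install TensorFlow and the TensorFlow Model Garden (shell commands):
#   pip install tensorflow
#   pip install tf-models-official

import tensorflow as tf

# Verify the installed TensorFlow version
print(tf.version.VERSION)

# Confirm a GPU is visible, since fine-tuning BERT is impractical on CPU
print(tf.config.list_physical_devices("GPU"))
```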

July 1, 2021 · 18 min · Akshay Uppal