Attention Is All You Need

One of the most game changing paper of recent times. This paper introduces the transformer architecture and talks about the self attention mechanism.

Please feel free to read along the paper with my notes and highlights.

ColorMeaning
GreenTopics about the current paper
YellowTopics about other relevant references
BlueImplementation details/ maths
RedText including my thoughts, questions, and understandings

Citation

@misc{vaswani2017attention,
      title={Attention Is All You Need}, 
      author={Ashish Vaswani and Noam Shazeer and Niki Parmar and Jakob Uszkoreit and Llion Jones and Aidan N. Gomez and Lukasz Kaiser and Illia Polosukhin},
      year={2017},
      eprint={1706.03762},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}