Important Papers
- The Annotated Transformer
- The First Law of Complexodynamics
- The Unreasonable Effectiveness of Recurrent Neural Networks
- Understanding LSTM Networks
- Recurrent Neural Network Regularization
- Keeping Neural Networks Simple by Minimizing the Description Length of the Weights
- Pointer Networks
- Order Matters: Sequence to Sequence for Sets
- ImageNet Classification with Deep Convolutional Neural Networks
- GPipe: Efficient Training of Giant Neural Networks Using Pipeline Parallelism
- Deep Residual Learning for Image Recognition
- Multi-Scale Context Aggregation by Dilated Convolutions
- Neural Message Passing for Quantum Chemistry
- Attention Is All You Need
- Neural Machine Translation by Jointly Learning to Align and Translate
- Identity Mappings in Deep Residual Networks
- A Simple Neural Network Module for Relational Reasoning
- Variational Lossy Autoencoder
- Relational Recurrent Neural Networks
- Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton
- Neural Turing Machines
- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
- Scaling Laws for Neural Language Models
- A Tutorial Introduction to the Minimum Description Length Principle
- Machine Super Intelligence