Important Papers
- A Simple NN Module for Relational Reasoning: A simple neural network module for relational reasoning
- A Tutorial Introduction to the Minimum Description Length Principle: A tutorial introduction to the minimum description length principle
- Attention Is All You Need: Attention Is All You Need
- Deep Residual Learning for Image Recognition: Deep Residual Learning for Image Recognition
- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin: Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
- GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism: GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
- Identity Mappings in Deep Residual Networks: Identity Mappings in Deep Residual Networks
- ImageNet classification with deep convolutional neural networks: ImageNet classification with deep convolutional neural networks
- Keeping Neural Networks Simple by Minimizing the Description Length of the Weights:
- Machine Super Intelligence:
- Multi-Scale Context Aggregation by Dilated Convolutions: Multi-Scale Context Aggregation by Dilated Convolutions
- Neural Machine Translation by Jointly Learning to Align and Translate: Neural Machine Translation by Jointly Learning to Align and Translate
- Neural Message Passing for Quantum Chemistry: Neural Message Passing for Quantum Chemistry
- Neural Turing Machines: Neural Turing Machines
- Order Matters: Sequence to sequence for sets: Order Matters: Sequence to sequence for sets
- Pointer Networks: Pointer Networks
- Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton: Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton
- Recurrent Neural Network Regularization: Recurrent Neural Network Regularization
- Relational recurrent neural networks: Relational recurrent neural networks
- Scaling Laws for Neural Language Models: Scaling Laws for Neural Language Models
- The Annotated Transformer:
- The First Law of Complexodynamics:
- The Unreasonable Effectiveness of Recurrent Neural Networks:
- Understanding LSTM Networks:
- Variational Lossy Autoencoder: Variational Lossy Autoencoder