Important Papers

  • A simple neural network module for relational reasoning
    Publication
  • A Tutorial Introduction to the Minimum Description Length Principle
    Publication
  • Attention Is All You Need
    Publication
  • Deep Residual Learning for Image Recognition
    Publication
  • Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
    Publication
  • GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
    Publication
  • Identity Mappings in Deep Residual Networks
    Publication
  • ImageNet classification with deep convolutional neural networks
    Publication
  • Keeping Neural Networks Simple by Minimizing the Description Length of the Weights
    Publication
  • Machine Super Intelligence
    Publication
  • Multi-Scale Context Aggregation by Dilated Convolutions
    Publication
  • Neural Machine Translation by Jointly Learning to Align and Translate
    Publication
  • Neural Message Passing for Quantum Chemistry
    Publication
  • Neural Turing Machines
    Publication
  • Order Matters: Sequence to sequence for sets
    Publication
  • Pointer Networks
    Publication
  • Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton
    Publication
  • Recurrent Neural Network Regularization
    Publication
  • Relational recurrent neural networks
    Publication
  • Scaling Laws for Neural Language Models
    Publication
  • The Annotated Transformer

  • The First Law of Complexodynamics

  • The Unreasonable Effectiveness of Recurrent Neural Networks

  • Understanding LSTM Networks

  • Variational Lossy Autoencoder
    Publication