Important Papers

  • The Annotated Transformer:
    Publication
  • The First Law of Complexodynamics:
    Publication
  • The Unreasonable Effectiveness of Recurrent Neural Networks:
    Publication
  • Understanding LSTM Networks:
    Publication
  • Recurrent Neural Network Regularization:
    Publication
  • Keeping Neural Networks Simple by Minimizing the Description Length of the Weights:
    Publication
  • Pointer Networks:
    Publication
  • Order Matters: Sequence to Sequence for Sets:
    Publication
  • ImageNet Classification with Deep Convolutional Neural Networks:
    Publication
  • GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism:
    Publication
  • Deep Residual Learning for Image Recognition:
    Publication
  • Multi-Scale Context Aggregation by Dilated Convolutions:
    Publication
  • Neural Message Passing for Quantum Chemistry:
    Publication
  • Attention Is All You Need:
    Publication
  • Neural Machine Translation by Jointly Learning to Align and Translate:
    Publication
  • Identity Mappings in Deep Residual Networks:
    Publication
  • A Simple Neural Network Module for Relational Reasoning:
    Publication
  • Variational Lossy Autoencoder:
    Publication
  • Relational Recurrent Neural Networks:
    Publication
  • Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton:
    Publication
  • Neural Turing Machines:
    Publication
  • Deep Speech 2: End-to-End Speech Recognition in English and Mandarin:
    Publication
  • Scaling Laws for Neural Language Models:
    Publication
  • A Tutorial Introduction to the Minimum Description Length Principle:
    Publication
  • Machine Super Intelligence:
    Publication