Important Papers

  • A simple neural network module for relational reasoning

  • A Tutorial Introduction to the Minimum Description Length Principle

  • Attention Is All You Need

  • Deep Residual Learning for Image Recognition

  • Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

  • GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

  • Identity Mappings in Deep Residual Networks

  • ImageNet classification with deep convolutional neural networks

  • Keeping Neural Networks Simple by Minimizing the Description Length of the Weights

  • Machine Super Intelligence

  • Multi-Scale Context Aggregation by Dilated Convolutions

  • Neural Machine Translation by Jointly Learning to Align and Translate

  • Neural Message Passing for Quantum Chemistry

  • Neural Turing Machines

  • Order Matters: Sequence to sequence for sets

  • Pointer Networks

  • Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton

  • Recurrent Neural Network Regularization

  • Relational recurrent neural networks

  • Scaling Laws for Neural Language Models

  • The Annotated Transformer

  • The First Law of Complexodynamics

  • The Unreasonable Effectiveness of Recurrent Neural Networks

  • Understanding LSTM Networks

  • Variational Lossy Autoencoder
