Former OpenAI Chief Scientist Ilya Sutskever: Learn These 30 and You'll Have Mastered 90% of AI

Original article by 尹小军 (Yin Xiaojun), AGI Hunt, May 19, 2024, 15:27, Beijing

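In artificial intelligence, knowing the key literature is essential to understanding the technology and applying it well. Below are 30 important works recommended by Ilya Sutskever: papers, blog posts, course notes, and book chapters that together give a deep grounding in the field.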

1. The Annotated Transformer

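Author: Harvard NLP
Summary: An annotated, line-by-line implementation of the Transformer from "Attention Is All You Need", the attention-based model that has been highly successful in natural language processing.
Link: https://nlp.seas.harvard.edu/annotated-transformer/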

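2. The First Law of Complexodynamics

Author: Scott Aaronson
Summary: A blog post asking why the apparent complexity of physical systems rises and then falls over time while entropy increases monotonically, and whether a "first law of complexodynamics" can capture this.
Link: https://scottaaronson.blog/?p=762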

3. The Unreasonable Effectiveness of RNNs

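Author: Andrej Karpathy
Summary: A blog post on the unreasonable effectiveness of recurrent neural networks, illustrated with character-level language models.
Link: https://karpathy.github.io/2015/05/21/rnn-effectiveness/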

4. Understanding LSTM Networks

Author: Christopher Olah
Summary: Explains how LSTM (long short-term memory) networks work.
Link: https://colah.github.io/posts/2015-08-Understanding-LSTMs/

5. Recurrent Neural Network Regularization

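Authors: Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals
Summary: A regularization method for recurrent neural networks that shows how to apply dropout to LSTMs.
Link: https://arxiv.org/pdf/1409.2329.pdf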

6. Keeping Neural Networks Simple by Minimizing the Description Length of the Weights

Authors: Geoffrey Hinton, Drew van Camp
Summary: Keeps neural networks simple by minimizing the description length of their weights.
Link: https://www.cs.toronto.edu/~hinton/absps/colt93.pdf

7. Pointer Networks

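Authors: Oriol Vinyals, Meire Fortunato, Navdeep Jaitly
Summary: Introduces a sequence-to-sequence architecture whose outputs are pointers to positions in the input sequence, with attention used as the pointing mechanism.
Link: https://arxiv.org/pdf/1506.03134.pdf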

8. ImageNet Classification with Deep CNNs

Authors: Alex Krizhevsky, Ilya Sutskever, Geoffrey Hinton
Summary: Image classification on ImageNet with a deep convolutional neural network.
Link: https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf

9. Order Matters: Sequence to sequence for sets

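Authors: Oriol Vinyals, Samy Bengio, Manjunath Kudlur
Summary: Examines how the ordering of inputs and outputs affects sequence-to-sequence models, and extends them to handle sets.
Link: https://arxiv.org/pdf/1511.06391.pdf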

10. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

Authors: Yanping Huang, Youlong Cheng, Ankur Bapna, et al.
Summary: A method for training very large neural networks efficiently with pipeline parallelism.
Link: https://arxiv.org/pdf/1811.06965.pdf

11. Deep Residual Learning for Image Recognition

Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Summary: Introduces deep residual learning (ResNet) for image recognition.
Link: https://arxiv.org/pdf/1512.03385.pdf

12. Multi-Scale Context Aggregation by Dilated Convolutions

Authors: Fisher Yu, Vladlen Koltun
Summary: Introduces a multi-scale context aggregation method built on dilated convolutions.
Link: https://arxiv.org/pdf/1511.07122.pdf

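13. Neural Message Passing for Quantum Chemistry

Authors: Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, George E. Dahl
Summary: Applies neural networks to quantum chemistry by framing learning on molecular graphs as message passing and predicting molecular properties.
Link: https://arxiv.org/pdf/1704.01212.pdf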

14. Attention Is All You Need

Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin
Summary: Introduces the Transformer, a neural network model based entirely on attention, for sequence-to-sequence tasks.
Link: https://arxiv.org/pdf/1706.03762.pdf

15. Neural Machine Translation by Jointly Learning to Align and Translate

Authors: Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio
Summary: Introduces a neural machine translation model that learns to align and translate jointly.
Link: https://arxiv.org/pdf/1409.0473.pdf

16. Identity Mappings in Deep Residual Networks

Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Summary: Studies the use of identity mappings in deep residual networks.
Link: https://arxiv.org/pdf/1603.05027.pdf

17. A Simple NN Module for Relational Reasoning

Authors: Adam Santoro, David Raposo, David G.T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, Timothy Lillicrap
Summary: Introduces a simple neural network module for relational reasoning.
Link: https://arxiv.org/pdf/1706.01427.pdf

18. Variational Lossy Autoencoder

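Authors: Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskever, Pieter Abbeel
Summary: Introduces the variational lossy autoencoder, which combines a variational autoencoder with autoregressive models to learn lossy representations.
Link: https://arxiv.org/pdf/1611.02731.pdf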

19. Relational RNNs

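Authors: Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, Timothy Lillicrap
Summary: Explores relational recurrent neural networks, in which attention lets stored memories interact with one another.
Link: https://arxiv.org/pdf/1806.01822.pdf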

20. Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton

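Authors: Scott Aaronson, Sean M. Carroll, Lauren Ouellette
Summary: Studies how complexity rises and falls in closed systems, using the "coffee automaton", a cellular automaton model of coffee and cream mixing, as the example.
Link: https://arxiv.org/pdf/1405.6903.pdf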

21. Neural Turing Machines

Authors: Alex Graves, Greg Wayne, Ivo Danihelka
Summary: Introduces the Neural Turing Machine, a neural network coupled to an external memory.
Link: https://arxiv.org/pdf/1410.5401.pdf

22. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

Authors: Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, et al.
Summary: Introduces an end-to-end speech recognition system for English and Mandarin.
Link: https://arxiv.org/pdf/1512.02595.pdf

23. Scaling Laws for Neural LMs

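Authors: Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei
Summary: Studies the scaling laws of neural language models: how loss varies with model size, dataset size, and training compute.
Link: https://arxiv.org/pdf/2001.08361.pdf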

24. A Tutorial Introduction to the Minimum Description Length Principle

Author: Peter Grünwald
Summary: A tutorial introduction to the minimum description length (MDL) principle.
Link: https://arxiv.org/pdf/math/0406077.pdf

25. Machine Super Intelligence Dissertation

Author: Shane Legg
Summary: A doctoral dissertation on machine super intelligence.
Link: https://www.vetta.org/documents/Machine_Super_Intelligence.pdf

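26. Kolmogorov Complexity (page 434 onwards)

Authors: Alexander Shen, Vladimir A. Uspensky, Nikolay Vereshchagin
Summary: A book on Kolmogorov complexity theory; the recommendation covers page 434 onwards.
Link: https://www.lirmm.fr/~ashen/kolmbook-eng-scan.pdf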

27. CS231n Convolutional Neural Networks for Visual Recognition

Authors: Andrej Karpathy, Justin Johnson, Fei-Fei Li
Summary: Stanford course notes on convolutional neural networks for visual recognition.
Link: https://cs231n.github.io/

28. The Unreasonable Effectiveness of RNNs

Author: Andrej Karpathy
Summary: The same blog post as item 3, listed a second time in the source.
Link: https://karpathy.github.io/2015/05/21/rnn-effectiveness/

29. Sequence to Sequence Learning with Neural Networks

Authors: Ilya Sutskever, Oriol Vinyals, Quoc V. Le
Summary: Introduces a neural network model for sequence-to-sequence learning.
Link: https://arxiv.org/pdf/1409.3215.pdf

30. Neural Architectures for Named Entity Recognition

Authors: Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer
Summary: Explores neural network architectures for named entity recognition.
Link: https://arxiv.org/pdf/1603.01360.pdf

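Working through the 30 items above will give you a deeper understanding of the key theories, models, and techniques in artificial intelligence, and a solid foundation for becoming a strong AI researcher or practitioner.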
