The Lost Reading Items
11 Nov 2024, Taro Langner

I recently shared a summary of a viral AI reading list attributed to Ilya Sutskever, which laid claim to covering ‘90% of what matters’ back in 2020. It boils down the reading items to barely one percent of the original word count to form the TL;DR I would have wished for before reading.
The viral version of the list as shared online is known to be incomplete, however, and includes only 27 of about 40 original reading items. The rest allegedly fell victim to the email deletion policy at Meta¹. These missing reading items have inspired some good discussions in the past, with many different ideas as to which papers would have been important enough to include.
This post is an attempt to identify these lost reading items. It builds on clues gathered from the viral list, contemporary presentations given by Ilya Sutskever, resources shared by OpenAI and more.
¹Correction: An earlier version mistakenly referred to OpenAI here instead of Meta
Filling the Gaps
The main piece of evidence is a claim, shared along with the list, according to which an entire selection of meta-learning papers was lost.
Meta-learning is often said to pursue ‘learning to learn’: neural networks are trained for a general ability to adapt more easily to new tasks for which only a few training samples are available. A network should thus be able to benefit from its existing weights without requiring entirely new training from scratch on the new data. One-shot learning provides just a single training sample from which a model is expected to learn a new downstream task, whereas zero-shot settings provide no annotated training samples at all.
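As a minimal illustration of that inner/outer-loop idea, here is a first-order, MAML-style sketch on toy linear-regression tasks (my own example, not code from any paper on the list):

```python
# First-order MAML-style meta-learning on toy 5-shot linear regression.
# theta is meta-trained so that a single inner gradient step on a few
# samples of a new task already yields a good task-specific fit.
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)                    # meta-parameters: [slope, intercept]
inner_lr, outer_lr = 0.1, 0.01

def mse_grad(params, x, y):
    """Gradient of mean squared error for predictions params[0]*x + params[1]."""
    err = params[0] * x + params[1] - y
    return np.array([np.mean(2 * err * x), np.mean(2 * err)])

for step in range(2000):
    # Sample a new task: a random line, observed through only 5 samples.
    slope, intercept = rng.uniform(-1, 1, size=2)
    x_s = rng.uniform(-1, 1, size=5)                 # support set (5-shot)
    y_s = slope * x_s + intercept
    # Inner loop: one gradient step adapts theta to the sampled task.
    adapted = theta - inner_lr * mse_grad(theta, x_s, y_s)
    # Outer loop (first-order approximation): move theta so that the
    # adapted parameters do well on fresh query samples of the same task.
    x_q = rng.uniform(-1, 1, size=5)                 # query set
    y_q = slope * x_q + intercept
    theta -= outer_lr * mse_grad(adapted, x_q, y_q)
```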
For some of the candidate papers listed below, the case can be strengthened further by an endorsement straight from OpenAI itself: Ilya Sutskever was chief scientist at the time when OpenAI published the educational resource ‘Spinning Up in Deep RL’, which includes several of these candidates in an entirely separate reading list of 105 ‘Key Papers in Deep RL’. Below, the papers that also appear in that list are marked with the symbol ⚛.
Clues from the Preserved Reading Items
Some meta-learning concepts can be found even in the known parts of the list. The preserved reading items can be arranged into a narrative arc around a related branch of research on Memory-Augmented Neural Networks (MANNs). Following the ‘Neural Turing Machine’ (NTM) paper, ‘Set2Set’ and ‘Relational RNNs’ experimented with external memory banks that an RNN could read information from and write information to. They directly cite or closely relate to several papers which may well have been part of the original list (a toy sketch of the underlying memory read follows the list below):
Potential Reading Items (Part 1):
- ‘Meta-learning with memory-augmented neural networks’ from 2016
- ‘Prototypical networks for few-shot learning’ from 2017
- ‘Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks’⚛ from 2017
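To make the memory-bank idea concrete, here is a minimal sketch of the content-based read operation used by NTM-style architectures (my own illustration, not code from these papers): the controller emits a key, attention weights are obtained from cosine similarity against every memory row, and the read vector is their weighted sum.

```python
# Content-based addressing: read from external memory by key similarity.
import numpy as np

def content_read(memory: np.ndarray, key: np.ndarray, beta: float = 5.0):
    """Read from memory (N slots x M features) by content similarity to key.

    beta sharpens the attention distribution, as in the NTM paper.
    """
    # Cosine similarity between the key and each memory row.
    sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    # Softmax over sharpened similarities yields the attention weights.
    w = np.exp(beta * sim)
    w /= w.sum()
    return w @ memory                  # weighted sum of rows: the read vector

memory = np.random.default_rng(1).normal(size=(8, 4))   # 8 slots, width 4
read_vec = content_read(memory, key=memory[3])          # attends mostly to row 3
```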
Clues from Contemporary Presentations
Certain papers about meta-learning and competitive self-play also feature repeatedly in a series of presentations given by Ilya Sutskever around this time and may well have eventually been included in the reading list too.
These presentations largely overlap and repeatedly reference known contents of the reading list. They open with a fundamental motivation for why deep learning works, framing backpropagation on neural networks as a search for small circuits, an idea related to the Minimum Description Length (MDL) principle, according to which the shortest program that can explain the given data will generalize best.
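Stated compactly, a common two-part formulation of MDL prefers the hypothesis that minimizes the combined description length of the hypothesis itself and of the data encoded with its help (a standard textbook form, not a formula taken from the presentations):

```latex
% Two-part MDL: choose the hypothesis H that compresses the data D best.
% L(H): code length of the hypothesis; L(D | H): code length of the data
% given H. The shortest total description is expected to generalize best.
H^{*} \;=\; \arg\min_{H} \;\bigl[\, L(H) + L(D \mid H) \,\bigr]
```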
Next, all three presentations reference the following meta-learning papers:
Potential Reading Items (Part 2):
- ‘Human-level concept learning through probabilistic program induction’ as Lake et al., 2016
- ‘Neural Architecture Search with Reinforcement Learning’ as Zoph and Le, 2017
- ‘A Simple Neural Attentive Meta-Learner’⚛ as Mishra et al., 2017
Reinforcement Learning (RL) also features heavily in all three presentations, with close links to meta-learning. One key concept is competitive self-play, in which agents interact in a simulated environment to reach specific, typically adversarial objectives. As a way to ‘turn compute into data’, this approach enabled simulated agents to outperform human champions and invent new moves in rule-based games. Ilya Sutskever presents an evolutionary biology perspective that relates competitive self-play to the impact of social interaction on brain size (pay-walled link). He goes on to suggest that rapid competence gain in a simulated ‘agent society’ may, in his judgement, provide a plausible path towards a form of AGI.
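As a toy illustration of how self-play ‘turns compute into data’ (my own sketch, not code from any of the cited papers), the two policies below repeatedly play matching pennies and learn only from the games they generate against each other:

```python
# Toy competitive self-play: two policies play matching pennies and each
# nudges its mixed strategy toward a best response to the opponent's play.
# The games themselves are the training data; no external dataset exists.
import numpy as np

rng = np.random.default_rng(0)
p_a, p_b = 0.9, 0.1          # P(heads) for agents A and B
lr = 0.01

for episode in range(5000):
    a = rng.random() < p_a   # A scores when both choices match...
    b = rng.random() < p_b   # ...B scores when they differ.
    # Each agent shifts toward the action that would have beaten the
    # opponent's actual move (a crude, sampled best-response update).
    p_a = np.clip(p_a + lr * (1.0 if b else -1.0), 0.01, 0.99)
    p_b = np.clip(p_b + lr * (-1.0 if a else 1.0), 0.01, 0.99)

# Both strategies end up cycling around the mixed equilibrium of 0.5.
print(f"p_a={p_a:.2f}, p_b={p_b:.2f}")
```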
Given the significance he ascribes to these concepts, it seems plausible that some of the cited papers on self-play may have later also been included in the reading list. They may form a sizeable chunk of the missing items, especially as RL otherwise features in only one of the preserved reading items.
Potential Reading Items (Part 3):
- ‘Hindsight Experience Replay’⚛ as Andrychowicz et al., 2017
- ‘Continuous control with deep reinforcement learning’⚛ as DDPG: Deep Deterministic Policy Gradients, 2015
- ‘Sim-to-Real Transfer of Robotic Control with Dynamics Randomization’ as Peng et al., 2017
- ‘Meta Learning Shared Hierarchies’ as Frans et al., 2017
- ‘Temporal Difference Learning and TD-Gammon’ as Tesauro, 1995
- ‘Karl Sims - Evolved Virtual Creatures, Evolution Simulation, 1994’ as Karl Sims, 1994 (YouTube video [4:09])
- ‘Emergent Complexity via Multi-Agent Competition’ as Bansal et al., 2017
- ‘Deep reinforcement learning from human preferences’⚛ as Christiano et al., 2017 (Note: introduces RLHF; a toy sketch of its preference loss follows below)
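To make the RLHF note concrete, here is a minimal sketch of the preference-based reward learning at the heart of that paper (my own toy illustration with made-up data, not code from Christiano et al.): a reward model is fit so that the segment a human preferred receives the higher score, using a Bradley-Terry style logistic likelihood.

```python
# Toy preference-based reward learning: fit a linear reward model so the
# human-preferred trajectory in each pair scores higher, by maximizing the
# Bradley-Terry log-likelihood with plain gradient ascent.
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(3)                          # reward weights over 3 toy features

# Made-up comparison data: feature vectors of preferred vs. rejected segments.
preferred = rng.normal(0.5, 1.0, size=(100, 3))
rejected = rng.normal(-0.5, 1.0, size=(100, 3))

for step in range(500):
    margin = (preferred - rejected) @ w  # reward difference per pair
    p = 1.0 / (1.0 + np.exp(-margin))    # P(preferred wins) per pair
    # Gradient of the mean log-likelihood of the human choices.
    grad = ((1.0 - p)[:, None] * (preferred - rejected)).mean(axis=0)
    w += 0.1 * grad

# The learned w now scores preferred segments above rejected ones.
```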
Even today, these presentations from around 2018 are still worth watching. Alongside fascinating bits of knowledge, they also include gems such as the statement:
‘Just like in the human world: The reason humans find life difficult is because of other humans’
-Ilya Sutskever
While some concepts in computer science accordingly appear timeless, other moments may seem surprising today, such as this casual remark by an audience member in the Q&A session:
-Audience member
To which Ilya Sutskever responds:
-Ilya Sutskever (in 2018)
This response was later confirmed by experimental results in the reading item ‘Scaling Laws for Neural Language Models’ (which echoes the ‘Bitter Lesson’ by Rich Sutton). It ultimately proved true: he would go on to oversee Transformer architectures scaled up to an estimated 1.8 trillion parameters, costing over $60 million to train across thousands of GPUs, forming Large Language Models (LLMs) that are today capable of generating text increasingly difficult to distinguish from human writing.
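For reference, the paper's headline result takes a compact power-law form (constants as reported by Kaplan et al., 2020):

```latex
% Test loss of a language model falls as a power law in the number of
% non-embedding parameters N (with data and compute not bottlenecked);
% analogous laws hold for dataset size D and training compute C.
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
\qquad \alpha_N \approx 0.076, \quad N_c \approx 8.8 \times 10^{13}
```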
Honorable Mentions
Many other works and authors may have featured on the original list, but the evidence wears increasingly thin from here on.
Overall, the preserved reading items strike an impressive balance between covering different model classes, applications and theory while also including many famous authors in the field. Perhaps the exceptions to this rule are worth noting, even if they may have slipped into the ‘10% of what matters’ that didn't make the original list.
As such, it would have seemed plausible to include:
- Yann LeCun with pioneering work on CNNs for real-world use
- Ian Goodfellow with Generative Adversarial Networks (GANs) that dominated image generation at the time
- Demis Hassabis for RL research towards AlphaFold that earned a Nobel Prize
Conclusion
This post will remain largely speculative until more becomes known. After all, even the viral list itself was never officially confirmed to be authentic. Nonetheless, the potential candidates for the lost reading items listed above seemed worth sharing. Taken together, they may well fill a gap in the viral version of the list that would, in the words of the author, correspond roughly to a missing ‘30% of what matters’ at its time.