What the brain can teach artificial neural networks
The brain offers valuable lessons to artificial neural networks to boost their data and energy efficiency, flexibility and more.
The field of NeuroAI encompasses two intertwined research programs: the use of artificial intelligence (AI) to model intelligent behavior, and the application of neuroscience insights to improve AI systems.
The motivation for using neuroscience to improve AI is clear: If the ultimate goal is, in the words of AI pioneer Marvin Minsky, “to build machines that can perform any […] task that a human can do,” then the most natural strategy is to reverse-engineer the brain. The motivation for using AI—in particular, artificial neural networks (ANNs)—as models in neuroscience is that they represent our best account of distributed, brain-like computation; indeed, they are the only models we have that can actually solve hard computational problems.
In spite of remarkable progress over the past decade, though, modern AI still lags far behind people and other animals on some tasks. AI systems can now write essays, pass the bar exam, ace advanced physics tests, prove mathematical theorems, write complex computer programs and flawlessly recognize speech. In many other domains, however—including navigating the physical world, planning over multiple time scales and performing perceptual reasoning—AI is mediocre at best.
AI systems struggle with activities that a human child or a squirrel can manage easily, a discrepancy known as Moravec’s paradox: What we consider difficult—high-level cognitive tasks, such as reasoning and solving math problems—turns out to be surprisingly easy for AI, and what we take for granted—our astounding ability to interact with the environment—remains out of reach for AI. No AI system can stalk an antelope or spin a web.
Animals perform these tasks so easily thanks to 500 million years of evolution. This extensive process of trial and error offers many potential insights for improving AI, three of which I outline below. This idiosyncratic list is not exhaustive; it merely samples the opportunities for neuroscience to advance AI and, in turn, to develop better AI tools for neuroscience.
In the first column of our new series on NeuroAI, I discussed the history of the synergy between neuroscience and AI. Here I describe in detail how gains in data efficiency, energy efficiency and flexibility can lead to future progress.

When it comes to data efficiency, AI systems require enormous amounts of training data. Estimates suggest that state-of-the-art language models, such as GPT-4, have consumed dozens of terabytes of text data, and the data appetite of state-of-the-art models in other domains, including vision, is comparable. This volume is many orders of magnitude more data than a human toddler needs to learn to speak or a squirrel needs to learn to navigate the visual world. Why the gap?
Animals inherit evolutionary solutions efficiently encoded through a “genomic bottleneck.” The genome instructs the development of neural circuits that can perform complex tasks instinctively, without vast amounts of experience or data. Adapting this genomic bottleneck as an algorithm for ANNs could let these networks learn from far less training data and could help uncover fundamental constraints on neuronal development, such as the rules governing how neuronal circuits are specified in the genome.
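To make the compression at the heart of this idea concrete, here is a toy sketch in Python. It is my own illustration, not the published genomic-bottleneck algorithm: a network’s full weight matrix is never stored directly but is generated from a far smaller “genome” by a fixed decoder that stands in for developmental rules.

```python
# Toy "genomic bottleneck": generate many synaptic weights from a small
# genome. The sizes and the random decoder are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 256, 256        # one layer with 65,536 synapses
genome_size = 64              # the bottleneck: 1,024x fewer parameters

genome = rng.normal(size=genome_size)

# A fixed random decoder standing in for the developmental rules that
# expand the genome into a full connectivity pattern.
decoder = rng.normal(size=(genome_size, n_in * n_out)) / np.sqrt(genome_size)

weights = (genome @ decoder).reshape(n_in, n_out)
print(f"compression: {n_in * n_out // genome_size}x")
```

In a learning setting, one would optimize the 64 genome parameters rather than the 65,536 weights, forcing the network to discover compact, reusable wiring rules instead of memorizing raw experience.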
Similarly, ANNs might borrow ideas from the human brain to become more energy efficient. A real-time conversation with a model such as ChatGPT requires at least 100 times more power than the entire human brain consumes. And even this comparison significantly understates the brain’s energy advantage, because it sets the consumption of an array of Nvidia GPUs against that of the whole brain, only a small fraction of which goes into holding a conversation.
Two key factors likely account for the brain’s vastly greater efficiency. First, most of the energy neurons consume goes into generating action potentials, with usage roughly proportional to spike rate. As a result, neurons operate in a sparse regime, firing at about 0.1 spikes per second in the cortex. Current artificial networks, by contrast, effectively operate in an energy-inefficient, high-firing regime. Although there has been progress in making artificial networks more efficient, we do not yet understand how to build networks that compute in the brain’s sparse-spiking, energy-efficient regime.
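The difference between the two regimes is easy to see in a minimal leaky integrate-and-fire simulation. This is a hedged sketch with made-up parameters, not a model of cortex: a weakly driven neuron spikes only occasionally, while a strongly driven one fires continuously, and energy cost scales with that rate.

```python
# Minimal leaky integrate-and-fire neuron; all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def firing_rate(drive_mean, T=10.0, dt=1e-3, tau=0.02, v_thresh=1.0):
    """Simulate T seconds of membrane dynamics; return spikes per second."""
    drive = rng.normal(drive_mean, 0.3, int(T / dt))  # noisy input current
    v, spikes = 0.0, 0
    for i in drive:
        v += dt / tau * (-v + i)   # leaky integration toward the input
        if v >= v_thresh:          # threshold crossing emits a spike
            spikes += 1
            v = 0.0                # reset after the spike
    return spikes / T

print("sparse regime:", firing_rate(0.9), "spikes/s")  # sub-threshold drive
print("dense regime: ", firing_rate(2.0), "spikes/s")  # supra-threshold drive
```

In the sparse regime, only occasional fluctuations cross threshold, so spikes are rare; the strongly driven neuron pays for a spike every few milliseconds.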
Second, brains can tolerate significant noise, particularly in synaptic transmission, in which up to 90 percent of spikes can fail to trigger neurotransmitter release. Digital computation on modern computers—which relies on 0s and 1s—is, by contrast, extremely sensitive to noise. An error in even a single bit—flipping a 0 to a 1 or vice versa—can lead to catastrophic failure. As a result, digital systems expend a great deal of energy to ensure that 1s and 0s are never conflated. Brain-like algorithms that operate in a noisy regime have the potential to yield huge energy savings.
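The contrast can be made concrete with a small experiment (my illustration, with assumed parameters such as a 10 percent release probability): an analog-style summation over many unreliable synapses degrades gracefully, whereas flipping a single bit in a floating-point number can change its value astronomically.

```python
# Graceful analog degradation vs. catastrophic digital bit flips.
# The 10% release probability and layer size are illustrative assumptions.
import struct
import numpy as np

rng = np.random.default_rng(0)

# Analog side: a readout summing 10,000 synapses, 90% of which fail.
n = 10_000
x = rng.random(n)                      # presynaptic activity
w = rng.random(n) / n                  # non-negative synaptic weights
y_clean = x @ w

p_release = 0.1                        # 90% of spikes fail to transmit
mask = rng.random(n) < p_release       # which synapses transmit this time
y_noisy = x @ (w * mask) / p_release   # rescale to preserve the mean
print(f"analog error after 90% synaptic failures: "
      f"{abs(y_noisy - y_clean) / y_clean:.1%}")

# Digital side: flip one high-order bit of the same quantity.
bits = struct.unpack("<Q", struct.pack("<d", y_clean))[0]
flipped = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << 62)))[0]
print(f"one flipped bit: {y_clean:.4f} -> {flipped:.3g}")
```

Averaged over thousands of synapses, the failures wash out to a few percent of error; the single bit flip, landing in the exponent, inflates the value by hundreds of orders of magnitude.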
The energy used by AI is already enormous and growing rapidly, so understanding and replicating the brain’s energy-efficient computational strategies would have important implications for technology. Conversely, understanding how to compute with spikes could help resolve decades-old controversies about rate versus temporal coding.
Lastly, when it comes to flexibility, AI systems typically optimize for a single goal, and that single-mindedness can lead to alignment problems. The most extreme form is illustrated by the “paper clip maximizer” thought experiment, in which an AI tasked with maximizing paper-clip production could end up converting all available resources, including all humans, into paper clips.
Biological systems, by contrast, excel at balancing multiple objectives—the most basic of these are sometimes summarized as the four Fs: feeding, fleeing, fighting and, to put it more politely, romance. We do not fully understand how animals balance multiple objectives and sub-objectives, in part because the computational problem has not yet been clearly articulated. This challenge exemplifies the virtuous circle of NeuroAI: As we develop AI capable of juggling competing objectives, insights from neurobiology can guide our approach, and AI models can provide a rigorous testing ground for theories about how brains manage multiple goals, accelerating progress in both fields.
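Part of what makes the problem hard is that we lack even an agreed-upon formalization. One naive starting point, offered purely as my own toy illustration, scores candidate actions against several innate drives and lets internal state reweight them from moment to moment:

```python
# Toy arbitration among the "four Fs"; payoffs and drives are invented.
import numpy as np

actions = ["forage", "flee", "fight", "court"]

# Hypothetical payoff of each action (rows) under each drive (columns:
# hunger, threat, rivalry, mating).
payoff = np.array([
    [0.9, 0.0, 0.0, 0.1],   # forage
    [0.0, 1.0, 0.0, 0.0],   # flee
    [0.1, 0.2, 0.8, 0.0],   # fight
    [0.0, 0.0, 0.1, 0.9],   # court
])

def choose(hunger, threat, rivalry, mating):
    """Pick the action with the highest drive-weighted payoff."""
    drives = np.array([hunger, threat, rivalry, mating])
    return actions[int(np.argmax(payoff @ drives))]

print(choose(hunger=0.8, threat=0.1, rivalry=0.0, mating=0.1))  # forage
print(choose(hunger=0.3, threat=0.9, rivalry=0.0, mating=0.0))  # flee
```

A fixed weighted sum like this is almost certainly too crude (real animals trade off objectives across multiple time scales and contexts), but it makes the open question concrete: what computation should replace the weighted sum?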
Herbert Simon predicted in 1965 that “machines will be capable, within 20 years, of doing any work a [hu]man can do.” Despite impressive progress, and wildly optimistic predictions that we are on the brink of the AI singularity—a point at which AI surpasses human intelligence and evolves on its own—we are still far from that goal. But to paraphrase Winston Churchill, this is not the end, nor even the beginning of the end; it is, perhaps, the end of the beginning of our attempts to formulate a unified model that leads to a deeper understanding of brain computations and to artificial systems capable of mimicking intelligent behavior.