Brain-Like AI Emerges Without Training Data in New Study

Before these systems ever see a single cat photo or traffic sign, some AI models are already humming in tune with the visual cortex.

In new work from Johns Hopkins University, scientists showed that carefully designed, biologically inspired architectures can mimic activity in human and monkey visual brain areas even when the networks are untrained. Comparing brain recordings from macaques and humans with the responses of dozens of wide, randomly initialized neural networks, the team reports in Nature Machine Intelligence that convolutional architectures with specific dimensionality changes come strikingly close to the performance of classic pretrained models.

The study tackles a puzzle that has hung over computational neuroscience for years. Deep neural networks trained on image recognition have been very successful at predicting brain responses in the ventral visual stream, but it has not been clear whether their brain-like behavior comes mostly from the data and training objective or from the architectural blueprint itself. Atlas Kazemian, Eric Elmoznino, and Michael Bonner set out to strip learning out of the picture and ask what the raw wiring can do on its own.

They built families of untrained networks based on three blueprints familiar across modern AI: fully connected networks, vision transformers, and convolutional neural networks. In each case, they dramatically expanded the dimensionality of the last layer, adding orders of magnitude more random features. The models' responses to images were then linearly mapped to neural data: single-unit activity from macaque areas V4 and IT, as well as large-scale human fMRI responses from early, midventral, and high-level ventral visual cortex.
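For readers who want the recipe in concrete form, here is a minimal sketch in Python (PyTorch plus scikit-learn) of that two-step pipeline: an untrained convolutional feature extractor with a widened final layer, followed by a ridge regression mapping from model features to recorded responses. The layer widths, ridge penalties, and the stand-in `images` and `neural_responses` arrays are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import RidgeCV

class RandomConvNet(nn.Module):
    """Untrained CNN: spatial pooling, growing channel counts, ReLUs.
    Widths here are illustrative; the study expands the final layer far more."""
    def __init__(self, final_channels=4096):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(256, final_channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global spatial pooling
        )

    def forward(self, x):
        return self.features(x).flatten(1)  # (batch, final_channels)

# Stand-in stimuli and recordings so the sketch runs end to end.
images = torch.randn(200, 3, 224, 224)
neural_responses = np.random.randn(200, 150)  # (n_stimuli, n_units)

model = RandomConvNet().eval()  # random initialization, never trained
with torch.no_grad():
    feats = model(images).numpy()

# Linear readout: ridge regression from model features to each unit/voxel.
encoder = RidgeCV(alphas=[1e1, 1e2, 1e3, 1e4])
encoder.fit(feats, neural_responses)
```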

Untrained Convolutional Networks Show Surprising Brain Alignment

The heart of the story is that not all architectures benefit equally from simply getting wider. When the team expanded the number of random features in the final layer, all three architectures improved a bit at predicting cortical responses, but convolutional networks pulled far ahead. Wide convolutional models that combined spatial pooling, increasing channel counts, and nonlinear activations showed “striking performance gains” as dimensionality grew, in some cases rivaling AlexNet pretrained on ImageNet when tested on monkey IT responses.
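To see how that widening plays out, one can sweep the final-layer width and track cross-validated encoding accuracy, reusing `RandomConvNet` and the stand-in data from the sketch above. The specific widths and the five-fold scheme below are assumptions for illustration.

```python
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

# Widen the random final layer and measure how well a cross-validated
# ridge readout predicts the (stand-in) neural responses.
for width in [256, 1024, 4096]:
    wide = RandomConvNet(final_channels=width).eval()
    with torch.no_grad():
        feats = wide(images).numpy()
    r2 = cross_val_score(RidgeCV(alphas=[1e2, 1e3, 1e4]),
                         feats, neural_responses, cv=5).mean()
    print(f"width={width:5d}  encoding R^2={r2:.3f}")
```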

In human fMRI data from the Natural Scenes Dataset, the pattern held but with nuance. The best untrained convolutional models approached pretrained AlexNet performance in early visual cortex and reached about 70 percent of its performance in higher ventral regions. Fully connected and transformer architectures, even when matched for dimensionality, remained substantially weaker across the board. That gap, the authors argue, reflects the inductive biases baked into the convolutional design.

The Johns Hopkins press release puts the stakes in blunt terms.

“The way that the AI field is moving right now is to throw a bunch of data at the models and build compute resources the size of small cities. That requires spending hundreds of billions of dollars. Meanwhile, humans learn to see using very little data,” said senior author Mick Bonner, assistant professor of cognitive science at Johns Hopkins University.

Inside the convolutional networks, two ingredients mattered most: spatial locality and nonlinearity. When the researchers removed the rectified linear unit (ReLU) activations, performance dropped sharply and widening the model no longer helped. When they destroyed spatial locality by permuting image pixels, so that each convolutional filter sampled scattered points rather than contiguous patches, encoding performance again fell, and the benefits of expansion were muted. Those manipulations left the overall parameter count similar, but they broke the architectural motifs that seem to echo the known organization of the visual cortex.
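A rough sketch of those two ablations, under assumed details and reusing the model and images from above: swapping every ReLU for an identity removes the nonlinearity without touching the weights, and a fixed random pixel permutation destroys spatial locality while keeping the parameter count intact.

```python
import torch
import torch.nn as nn

def remove_relu(module):
    """Replace every ReLU with an identity, leaving all weights untouched."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.Identity())
        else:
            remove_relu(child)
    return module

linearized = remove_relu(RandomConvNet())  # nonlinearity ablation

# Spatial-locality ablation: one fixed random permutation of pixel positions,
# applied identically to every image, so each filter sees scattered points.
n_pix = images.shape[-2] * images.shape[-1]
perm = torch.randperm(n_pix, generator=torch.Generator().manual_seed(0))

def permute_pixels(x):
    flat = x.flatten(2)                # (batch, channels, H*W)
    return flat[:, :, perm].view_as(x)

scrambled = permute_pixels(images)
```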

The team also showed that these effects are not just about having many neurons. Principal component analyses revealed that much of the useful variance in the high dimensional convolutional representations could be captured by a smaller number of principal components without sacrificing much encoding performance. So it is not the sheer width alone, but how that width reshapes the effective dimensionality of natural image representations under convolutional constraints.
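The flavor of that check, in sketch form: project the wide random features onto a modest number of principal components and refit the ridge encoder, reusing the imports and stand-in data from the sketches above. The component counts are assumptions, and fitting PCA on all stimuli at once is a simplification.

```python
from sklearn.decomposition import PCA

# If a few principal components preserve encoding accuracy, the useful
# variance lives in a much lower-dimensional subspace than the raw width.
for n_components in [25, 50, 100]:
    reduced = PCA(n_components=n_components).fit_transform(feats)
    r2 = cross_val_score(RidgeCV(alphas=[1e2, 1e3, 1e4]),
                         reduced, neural_responses, cv=5).mean()
    print(f"{n_components:4d} PCs  encoding R^2={r2:.3f}")
```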

Architecture as a Shortcut to More Efficient AI Learning

One of the more provocative findings is that strong alignment with brain data does not require good classification performance. When the researchers trained a simple linear classifier on features from their best untrained convolutional model and evaluated it on scene categories from the Places365 dataset, the model performed far worse than pretrained AlexNet. In other words, a network can look brain-like in its internal responses while still being quite poor at labeling images, a reminder that brain similarity and task accuracy are related but distinct targets.
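That classification test amounts to a standard linear probe on frozen features. A hedged sketch, with stand-in arrays in place of features and labels actually extracted from Places365 images:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-ins for untrained-model features and Places365 scene labels.
scene_feats = np.random.randn(1000, 4096)
scene_labels = np.random.randint(0, 365, size=1000)

# Fit a linear classifier on frozen features; the network itself never trains.
probe = LogisticRegression(max_iter=1000)
probe.fit(scene_feats[:800], scene_labels[:800])
print("probe accuracy:", probe.score(scene_feats[800:], scene_labels[800:]))
```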

From the perspective of AI design, that distinction may be an opportunity. If the right architecture already sits close to a brain-aligned solution before learning, then subsequent training might be faster, more data-efficient, or more robust. Bonner leans into that idea.

“If training on massive data is really the crucial factor, then there should be no way of getting to brain-like AI systems through architectural modifications alone,” Bonner said. “This means that by starting with the right blueprint, and perhaps incorporating other insights from biology, we may be able to dramatically accelerate learning in AI systems.”

The work also helps reconcile two seemingly conflicting stories in the literature. On one hand, heavily pretrained networks with very different architectures often converge on similar levels of brain alignment, suggesting a kind of degeneracy in model design. On the other, decades of vision science have emphasized how anatomical constraints and ethological objectives sculpt cortical representations. By focusing on untrained or minimally trained networks, this study shows that convolutional architectures occupy a privileged zone in that design space, where random high dimensional feature expansions already carve out a representational geometry that the brain can use.

There are still open questions. The untrained convolutional models matched monkey data much more closely than human high-level ventral responses, where semantic and feedback processes are likely more important. And the authors are careful not to claim that the cortex itself operates as a random feature bank. Instead, they suggest that an architecture-first, learning-second approach may be a fruitful modeling philosophy, and that development in biological vision might resemble a pruning and refinement process acting on a richly overcomplete starting point.

For now, the takeaway is simple and a bit humbling for engineers: before you collect another billion images, it might be worth revisiting the blueprint.

Source: Nature Machine Intelligence study

