
A 1940s Math Trick Just Unlocked the Hidden Rules Written Inside Your DNA

Inside the nucleus of every cell, DNA doesn’t float freely. It’s wound, folded, bundled into a dense tangle of proteins and genetic material called chromatin, which compacts roughly two metres of DNA into a space just a few millionths of a metre across. Within that tangle are tiny domains, each about 100 nanometres wide (smaller than the wavelength of visible light, for context), and these domains are, in a real sense, where the action is. They control which genes get switched on, which stay silent, and in doing so they govern whether a cell stays healthy or tips toward cancer, whether it ages normally or doesn’t. The problem, for researchers who have spent years squinting at them through powerful microscopes, is that watching something happen and understanding why it happens are very different things.

That gap, between observation and explanation, is one of the oldest frustrations in science. And it turns out the mathematics needed to close it has been sitting in a 1944 paper by a German-American mathematician, waiting for someone to notice it was exactly what was needed.

The mathematician was Kurt Otto Friedrichs, who later received the US National Medal of Science. He described mathematical objects he called mollifiers, tools designed to smooth out particularly jagged or noisy functions by softening their sharpest features. The word itself comes from the Latin mollire, to soften. Friedrichs wasn’t thinking about DNA when he wrote that paper; he was doing pure mathematics. But Vivek Shenoy, a materials scientist at the University of Pennsylvania’s School of Engineering, and his team have now built mollifiers into the architecture of a neural network, and the results could reshape how scientists study some of the most complex systems in nature.
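The basic operation is simple to sketch: convolve the rough function with a smooth bump that vanishes outside a small window, so every point becomes a gentle weighted average of its neighbours. Here is a minimal NumPy illustration using Friedrichs's classic bump function; the signal, noise level, and smoothing width `eps` are arbitrary choices for demonstration, not anything from the paper:

```python
import numpy as np

def friedrichs_kernel(eps, n=201):
    """Friedrichs's bump: exp(-1/(1-(x/eps)^2)) inside |x| < eps, zero outside."""
    x = np.linspace(-eps, eps, n)
    k = np.zeros_like(x)
    inside = np.abs(x) < eps
    k[inside] = np.exp(-1.0 / (1.0 - (x[inside] / eps) ** 2))
    return k / k.sum()  # normalise so smoothing preserves the signal's mean

def mollify(signal, eps, dx):
    """Smooth a sampled signal by convolving it with the bump kernel."""
    n = max(3, int(2 * eps / dx) | 1)  # odd number of taps spanning [-eps, eps]
    kernel = friedrichs_kernel(eps, n)
    return np.convolve(signal, kernel, mode="same")

# A smooth curve buried in noise, then mollified back toward the curve.
x = np.linspace(0, 1, 1000)
dx = x[1] - x[0]
noisy = np.sin(2 * np.pi * x) + 0.3 * np.random.default_rng(0).normal(size=x.size)
smooth = mollify(noisy, eps=0.03, dx=dx)
```

The key property, which Friedrichs proved, is that this smoothing can be made as gentle as you like: as `eps` shrinks, the mollified function converges back to the original.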

The technique tackles a particularly thorny class of problem called an inverse partial differential equation. “Solving an inverse problem is like looking at ripples in a pond and working backward to figure out where the pebble fell,” says Shenoy. “You can see the effects clearly, but the real challenge is inferring the hidden cause.”
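Shenoy's pebble analogy can be made concrete with a toy inverse problem, far simpler than the chromatin equations in the paper. Suppose a quantity decays at an unknown rate k, we observe its noisy trajectory, and we want to work backward to k. All the numbers below are invented for illustration:

```python
import numpy as np

# Forward model: u(t) = exp(-k * t), observed with small multiplicative noise.
rng = np.random.default_rng(2)
t = np.linspace(0, 5, 100)
k_true = 0.8
obs = np.exp(-k_true * t) * (1 + 0.02 * rng.normal(size=t.size))

# Inverse step: log(u) = -k * t, so a least-squares line fit recovers the hidden rate.
k_est = -np.polyfit(t, np.log(obs), 1)[0]
```

The "effect" (the decaying curve) is easy to see; the "cause" (the rate k) must be inferred. Real inverse PDE problems have the same shape but with far more unknowns, and with noise that differentiation amplifies rather than averages away.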

The Problem with Working Backward

Partial differential equations, or PDEs, are the mathematical language scientists use to describe how systems change across both space and time. They underpin weather forecasting, materials modelling, heat transfer, and, in the Shenoy Lab, the way chromatin organises itself inside living cells. The inverse version of these equations asks the harder question: given what you can observe, what were the hidden rules that produced it? What reaction rates, what forces, what parameters are actually driving the system you’re watching?

“For years, we’ve used these equations to study how chromatin organises itself inside living cells,” Shenoy says. “But we kept running into the same problem: We could see the structures and model their formation, but we could not reliably infer the epigenetic processes driving this system, namely the chemical changes that help control which genes are active. The more we tried to optimise the existing approach, the clearer it became that the mathematics itself needed to change.”

The standard way AI systems handle these problems is through a method called recursive automatic differentiation, which calculates, layer by layer, how quantities change through a neural network. It works reasonably well for simpler cases. But for higher-order equations, especially when the underlying data is noisy (and biological data almost always is), the process starts to amplify errors rather than correct them. Think of trying to measure the gradient of a jagged mountain ridge by zooming in repeatedly: each step magnifies whatever roughness was there before.

Ananyae Kumar Bhartari, a co-first author on the study who completed Penn Engineering’s Scientific Computing master’s programme, spent considerable time thinking the problem lay with the neural network’s design. “We initially assumed the issue had to do with the neural network’s architecture,” Bhartari says. “But, after carefully adjusting the network, we eventually realised the bottleneck was recursive automatic differentiation itself.”

Once the true culprit was identified, the fix was perhaps surprising in its simplicity: smooth the signal before you try to measure it. A mollifier layer, inserted into the network, does exactly that, softening the jagged input just enough that the subsequent differentiation becomes stable and reliable. “That let us solve these equations more reliably, without the same computational burden,” says Bhartari.
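The "smooth before you measure" idea is easy to demonstrate numerically. Below, a finite-difference derivative of a noisy signal is wildly wrong, while the same derivative taken after smoothing tracks the true one; a Gaussian kernel stands in here for a true mollifier, and all parameters are illustrative rather than from the paper:

```python
import numpy as np

x = np.linspace(0, 1, 2000)
dx = x[1] - x[0]
rng = np.random.default_rng(1)
noisy = np.sin(2 * np.pi * x) + 0.05 * rng.normal(size=x.size)

# Differentiating raw noisy data: point-to-point jitter is divided by dx,
# so even small noise swamps the true derivative.
raw_deriv = np.gradient(noisy, dx)

# Smooth first (Gaussian kernel as a simple mollifier stand-in), then differentiate.
sigma = 0.01
k = np.exp(-0.5 * ((np.arange(-60, 61) * dx) / sigma) ** 2)
k /= k.sum()
smooth = np.convolve(noisy, k, mode="same")
smooth_deriv = np.gradient(smooth, dx)

true_deriv = 2 * np.pi * np.cos(2 * np.pi * x)
```

The amplification gets worse with every extra derivative, which is why higher-order PDEs make the instability so severe and why fixing it at the source pays off.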

It is worth pausing on that last part. Much of the recent progress in AI has come from throwing more computing power at problems: bigger models, more data, faster chips. Vinayak Vinayak, a doctoral candidate in materials science and the paper’s other co-first author, is direct about the distinction. “Modern AI often advances by scaling up computation. But some scientific challenges require better mathematics, not just more compute.”

From Watching to Understanding

For the chromatin problem specifically, mollifier layers mean something quite concrete. The domains Shenoy’s group studies are, as he puts it, “just 100 nanometres in size, but because accessibility determines gene expression, and gene expression governs cell identity, function, aging and disease, these domains play a critical role in biology and health.” Being able to watch those domains has been possible for years. Being able to infer the epigenetic reaction rates that drive their formation, the chemical changes that happen at molecular scale and determine how genes are regulated, has been stubbornly out of reach. Mollifier layers could change that, shifting chromatin science from structural description to dynamic modelling.

The study, published in Transactions on Machine Learning Research and set to be presented at NeurIPS 2026, has implications that reach well beyond genetics. Inverse PDE problems are common in materials science, fluid mechanics, and climate modelling, and in any field where scientists are trying to work backward from measurable patterns to hidden causes. The researchers suggest their framework could offer a more stable and computationally efficient approach across all of these domains.

Vinayak’s view of where this eventually leads is, by the standards of academic papers, fairly ambitious. “If we can track how these reaction rates evolve during aging, cancer or development,” he says, “this creates the potential for new therapies: If reaction rates control chromatin organisation and cell fate, then altering those rates could redirect cells to desired states.” Shenoy frames it in simpler terms: “If you understand the rules that govern a system, you now have the possibility of changing it.” A 1940s softening trick, it turns out, might be what finally makes that possible.


Source: Transactions on Machine Learning Research (TMLR) — Mollifier Layers: Enabling Efficient High-Order Derivatives in Inverse PDE Learning

Frequently Asked Questions

Why couldn’t scientists just use more computing power to solve these DNA problems?

The difficulty isn’t really about processing speed. Recursive automatic differentiation, the standard AI technique for these calculations, becomes mathematically unstable when data is noisy and the equations are high-order, meaning more computation just amplifies the errors rather than correcting them. The mollifier approach solves the underlying instability rather than trying to overpower it, which is why it also ends up being more computationally efficient, not less.

What does it actually mean to “infer hidden reaction rates” inside a cell?

Inside the nucleus, chemical modifications to proteins and DNA (the epigenetic layer) happen at specific rates that determine whether a stretch of DNA is accessible for gene expression or locked away. Scientists can observe the resulting chromatin structures, but the reaction rates themselves can’t be directly measured. Inverse PDE learning works backward from the observed structure to estimate those rates mathematically, a bit like reconstructing a recipe from tasting the finished dish.

Could this approach help with cancer research specifically?

That’s one of the applications the Penn team is most interested in. Chromatin organisation changes in cancer cells in ways that alter gene expression, and if mollifier layers can reliably infer the epigenetic reaction rates driving those changes, it opens up a potential target: if you can identify which reaction rates are aberrant, you might eventually find ways to correct them. The research is at an early stage, but the logic connects directly to how cancer biologists think about cell fate and identity.

Is there a risk that smoother data means less accurate results?

It’s a reasonable concern, and the researchers addressed it carefully. Mollification doesn’t erase features in the data; it softens the sharpest edges just enough to make differentiation stable. The mathematical theory behind mollifiers, which dates to Friedrichs’s 1944 work, includes precise controls on how much smoothing occurs. Too much and you lose real signal; too little and the instability returns. Getting that balance right was a central part of the technical work in the paper.



