The Neural Network That Was Taught Physics Before It Saw Any Data

A month. That is roughly how long Viktor Lilja used to wait just to teach a computer enough about light to be useful. Not the training itself, mind you, but the grind before it: generating the training data, one painstaking point at a time, each one taking anywhere from ten minutes to a full hour to compute. Up to 40,000 of them for a single network.

And then, sometimes, you would realise you needed more. So you started again.

“It might take us a whole month to generate enough data to train the neural network. Then if you realise that you need to add more things, it can take another month,” says Lilja, a doctoral student at Chalmers University of Technology in Sweden. His group designs optical components in a field called nanophotonics, where light is bent and steered on scales smaller than a single wavelength, and where the rules that govern ordinary lenses stop being much help. To get around the limits of natural materials, the team builds artificial ones inside supercomputers, leaning on neural networks to predict how each design will behave. The trouble is feeding those networks. They are hungry, and they are slow learners.

What Lilja and his colleagues have now done is give the machine a head start. Instead of letting it puzzle out the laws of physics from scratch, the way it always had to before, they simply taught it the rules first.

Teaching a Machine the Laws It Cannot Break

The idea sounds almost too obvious. An optical component has to obey electromagnetism, the same Maxwell’s equations that physics undergraduates sweat over. Yet a standard neural network knows none of this when it begins. It learns by example, inferring the underlying physics from thousands of worked cases, essentially reinventing the wheel every time it is trained. The Chalmers approach, published in Laser & Photonics Reviews, bakes the physics in before training starts, building the network around a piece of theory called the quasinormal mode expansion.

Quasinormal modes are, roughly, the natural resonances of a leaky system: the frequencies at which a tiny structure prefers to ring when light hits it, accounting for the fact that some of that light always escapes. Frame the problem in those terms and a great deal follows automatically. The network is guaranteed to respect energy conservation and causality, two things a black-box model can quietly violate. And it stops wasting effort learning what physicists already know.
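To make the idea concrete, here is a minimal Python sketch of a resonance expansion. It is not the Chalmers group's code. It assumes the simplest textbook form, in which the response is a background plus a sum of complex poles, one per mode. The function name qnm_spectrum, the softplus trick and every number are illustrative assumptions. Forcing each decay rate to be positive keeps every pole in the lower half of the complex-frequency plane (under an assumed exp(-iωt) time convention), which is one way a model can be built so that a causality-type constraint holds by construction rather than by luck.

```python
import numpy as np

def qnm_spectrum(omega, centers, raw_widths, residues, background=0.0):
    """Toy resonance expansion: background + sum_n residue_n / (omega - pole_n)."""
    omega = np.asarray(omega, dtype=float)
    # softplus keeps every decay rate positive, whatever raw value is supplied
    gammas = np.logaddexp(0.0, np.asarray(raw_widths, dtype=float))
    # poles sit below the real axis (assumed exp(-i*omega*t) time convention)
    poles = np.asarray(centers, dtype=float) - 1j * gammas
    residues = np.asarray(residues, dtype=complex)
    terms = residues[None, :] / (omega[:, None] - poles[None, :])
    return background + terms.sum(axis=1), poles

# Hypothetical single resonance at frequency 1.0 (arbitrary units)
omega = np.linspace(0.5, 1.5, 2001)
response, poles = qnm_spectrum(omega, centers=[1.0], raw_widths=[-3.0], residues=[1.0])
intensity = np.abs(response) ** 2
peak_frequency = omega[np.argmax(intensity)]  # sits at the real part of the pole
decay_rate = -poles[0].imag                   # half the full width at half maximum
```

In this picture a network only has to predict a handful of pole positions and strengths per structure, not an entire spectrum point by point, which is a plausible reading of why less training data is needed.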

“When we fed the super-brain information about the laws of physics, it immediately got much smarter. Our calculations now take one tenth of the time previously required,” says Philippe Tassin, a professor in the Department of Physics and Astronomy at Chalmers, who led the work. Thirty days of data generation, in other words, collapsed to three.

Why the Physicist Defers to the Network

There is a nice irony buried in all this, which is that Tassin understands the equations perfectly well and still cannot do what his own network does. He teaches electromagnetism. He knows it inside out. But knowing the equations and being able to read off a material’s behaviour from its shape are two very different things, and the second one, it turns out, is where the machine pulls ahead. The physics is simply too tangled for a human eye. So you build the tool, you hand it the rulebook, and then you watch it draw conclusions you could not have drawn yourself. For the photonic crystal slabs the team tested, the network needed only about 160 training examples to hit its target accuracy, roughly a tenth of the data that a conventional network demanded. On messier free-form designs it still got by on around a third.

And once trained, it is quick. Frighteningly so. “Once we’d trained the network, we could ask it to examine any structure at all and get the optical properties in a millisecond. With these new networks, we get better estimates and avoid obvious errors,” says Lilja. The team also ran the process backwards, asking the network to dream up a structure with a desired set of resonances; the design converged in under a second.
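The article does not say how the team ran the process backwards. One common approach, sketched here in Python under loud assumptions, is to freeze a trained model and use gradient descent to adjust the design until its predicted resonances match the desired ones. The function toy_surrogate is an invented polynomial standing in for a trained network, and the two design parameters, the learning rate and the step count are all hypothetical.

```python
import numpy as np

def toy_surrogate(design):
    """Hypothetical stand-in for a trained network: design -> two resonance frequencies."""
    d0, d1 = design
    return np.array([1.0 - 0.5 * d0 + 0.1 * d1**2,
                     1.4 - 0.3 * d1 + 0.1 * d0 * d1])

def mismatch(design, target):
    """Squared error between predicted and desired resonances."""
    return float(np.sum((toy_surrogate(design) - target) ** 2))

def inverse_design(target, start, learning_rate=1.0, steps=2000, eps=1e-6):
    """Gradient descent on the design, with central finite-difference gradients."""
    design = np.array(start, dtype=float)
    for _ in range(steps):
        grad = np.zeros_like(design)
        for i in range(design.size):
            bump = np.zeros_like(design)
            bump[i] = eps
            grad[i] = (mismatch(design + bump, target)
                       - mismatch(design - bump, target)) / (2 * eps)
        design -= learning_rate * grad
    return design

target = toy_surrogate(np.array([0.3, 0.5]))  # resonances we want to hit
found = inverse_design(target, start=[0.6, 0.2])
final_error = mismatch(found, target)
```

Because each evaluation of a trained model is so cheap, thousands of such trial-and-adjust steps fit comfortably inside a second, which is consistent with the speed the team reports.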

It is not flawless. On the more complicated metasurfaces, the network missed a few of the weaker resonances, probably because even tens of thousands of examples cannot fully map such a vast space of possible shapes. The authors are upfront about it. Still, the predictions it does make can be checked against the real eigenmodes of Maxwell’s equations, and they line up, which is more than you can say for most machine-learning shortcuts.

The payoff is breadth as much as speed. Because quasinormal modes describe almost any resonant optical system, the same trick should carry across to all sorts of devices: thinner camera and eyeglass lenses, certainly, but also the photonic crystals being eyed for shuttling information between quantum computers, where light at optical frequencies might one day carry data that today travels as fragile electrical signals. Chalmers happens to be building Sweden’s first larger quantum machine a few departments over. The geography is convenient.

For now, Tassin is content with the hours saved, which in a field this slow is no small thing. “Now that we can work so much faster, we can speed up design development for optical components.” A month down to three days is the sort of change that does not just make the work quicker; it makes work possible that nobody would have bothered attempting before.

Source: Laser & Photonics Reviews, DOI 10.1002/lpor.202502769

Frequently Asked Questions

Why does teaching a neural network physics make it faster?

A standard network has to infer the rules of electromagnetism from scratch by studying thousands of worked examples, which is slow and data-hungry. By building known physics into the network before training begins, the Chalmers team spared it that step, cutting the training data it needed to as little as a tenth of what a conventional network requires. For the simpler structures tested, the model learns from a couple of hundred examples rather than thousands, and it returns an answer in about a millisecond once trained.

Is it true that a physicist cannot do what this network does?

In a sense, yes. The researcher who led the work teaches electromagnetism and knows the governing equations thoroughly, but reading a material’s optical behaviour directly from its shape is a different problem that defeats human intuition. The physics is too tangled to eyeball, which is precisely why the network, fed the same rules, can draw conclusions a person cannot.

Could this speed up quantum computers?

Possibly, though indirectly. The same method can design the photonic crystals being explored for carrying information between quantum computers using light rather than fragile electrical signals. Chalmers is building Sweden’s first larger quantum machine in a neighbouring department, so the two strands of research may well converge.

What’s stopping this from designing any optical device perfectly?

On the most complex free-form designs, the network still misses some of the weaker resonances, because even tens of thousands of training examples cannot fully cover such an enormous space of possible shapes. The authors are candid about this limit. The reassurance is that its predictions can be checked against the true physics, so errors are visible rather than hidden.
