Paving the way to a brand-new class of computers that will drive unprecedented innovation and scientific breakthroughs.
Occupying the space of two professional basketball courts and weighing 600 tons, the massive Aurora exascale supercomputer is taking shape at the U.S. Department of Energy’s (DOE) Argonne National Laboratory.
Long rows of 8-foot-tall black cabinets loom large inside a new wing of the white-walled data center that houses Argonne’s other supercomputers. Aurora’s computing racks are interspersed with cabinets containing a special water-cooling system. Countless red and blue cables curl in and out of each cabinet, piping water from beneath the data center floor to cool the system. Electrical wiring supplies power from the mechanical room above, and more than 300 miles of networking cables connect the system together.
Work has been underway for several years to get the infrastructure in place to operate the world-class system. Built by Intel and Hewlett Packard Enterprise, Aurora will be theoretically capable of delivering more than two exaflops of computing power. That means it should be able to do more than 2 billion billion calculations per second, making Aurora one of the fastest supercomputers in the world.
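To put "two exaflops" in concrete terms, the unit arithmetic can be sketched in a few lines of Python. This is only an illustration: the workload size and the 1-teraflop comparison machine below are hypothetical values chosen for scale, not figures from Argonne, and the calculation assumes an idealized constant rate at the theoretical peak.

```python
# Unit prefixes as exact integers, so nothing is lost to float rounding.
BILLION = 10**9
EXA = 10**18

# Theoretical peak quoted in the article: "more than two exaflops".
AURORA_PEAK_EXAFLOPS = 2


def ops_per_second(exaflops: int) -> int:
    """Convert a rate in exaflops to operations per second."""
    return exaflops * EXA


def seconds_to_finish(total_ops: int, rate_ops_per_second: int) -> float:
    """Time to complete total_ops at a constant rate (an idealization)."""
    return total_ops / rate_ops_per_second


aurora_rate = ops_per_second(AURORA_PEAK_EXAFLOPS)

# "2 billion billion" and "2 exa" are the same number.
assert aurora_rate == 2 * BILLION * BILLION

# Hypothetical workload and hypothetical 1-teraflop machine, for scale only.
workload_ops = 10**21
teraflop_machine_rate = 10**12

print(seconds_to_finish(workload_ops, aurora_rate))            # 500.0
print(seconds_to_finish(workload_ops, teraflop_machine_rate))  # 1000000000.0
```

Under these assumptions, a job that would occupy the hypothetical 1-teraflop machine for a billion seconds (roughly three decades) would take a little over eight minutes at a sustained two exaflops. Real applications rarely reach theoretical peak, so this is an upper bound on speed rather than a prediction.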
When Aurora is completed next year, its high computing speed and artificial intelligence capabilities will enable science that is impossible today.
Researchers from across the world use the supercomputers at the Argonne Leadership Computing Facility (ALCF) to tackle a wide range of scientific problems. Examples include designing more efficient airplanes, investigating the mysteries of the cosmos, modeling the impacts of climate change and accelerating the discovery of new materials. The ALCF is a DOE Office of Science user facility.
“With Aurora, we’re building a machine that will be able to do artificial intelligence, big data analysis and high-fidelity simulations at a scale we haven’t seen before,” said ALCF Director Michael Papka. “It is going to be a game-changing tool for the scientific community.”
The process of deploying a new supercomputer begins with the major facility upgrades necessary to operate the system, including utility-scale electrical and mechanical work.
“Preparing for a new supercomputer requires years of planning, coordination and collaboration,” said Susan Coghlan, ALCF project director for Aurora. “Aurora is our largest and most powerful supercomputer yet, so we’ve had to do some substantial facility upgrades to get ready for the system. We’ve transformed our building with the addition of new data center space, mechanical rooms and equipment that significantly increased our power and cooling capacity.”
Because Aurora is a liquid-cooled system, Argonne had to upgrade its cooling capacity to pump 44,000 gallons of water through a complex loop of pipes that connects to cooling towers, chillers, heat exchangers, a filtration system and other components. With pipes ranging from 4 inches to 30 inches in diameter, the cooling system keeps the water at the required temperature, pressure and purity levels as it passes through the Aurora hardware.
The electrical room, which is located on the floor above the data center, is reinforced to support the weight of an airplane. It contains 14 substations that provide 60 megawatts of electricity to power not only Aurora but also future Argonne computing systems and the entire building’s day-to-day electricity needs. The electrical room is outfitted with a large ceiling hatch so the substations can be lowered in (and lifted out if needed) by construction cranes.
“When we’re doing utility-scale work to upgrade power and cooling capacity, we aim to create an infrastructure that can support future systems as well,” Coghlan said.
Once the major facility upgrades were in place, the team moved on to improvements like making sure power is delivered to the right locations at the right voltage, installing heavy-duty floor tiles to support the massive supercomputer, and putting in pipes to link the water system to Aurora.
The installation continues this fall with the phased delivery of Aurora’s computing engine: Intel’s state-of-the-art GPUs (graphics processing units) and CPUs (central processing units). Six GPUs and two CPUs are housed in each sleek, thin compute blade, which holds the hardware components that make up a single machine. Blades will be inserted vertically into each of Aurora’s cabinets. The final phase will upgrade the system with high-bandwidth memory.
Construction work at this scale always brings challenges, but many Aurora facility upgrades were carried out during the COVID-19 pandemic, which created unforeseen issues related to contractor access and supply chain disruptions. The Aurora team has been building the supercomputer piece by piece as components have become available.
“We’ve basically assembled and built the entire system on the data center floor,” Coghlan said. “Instead of waiting for every part to be ready at once, we decided to take whatever parts and pieces we could get. We’ve had to be agile, adapting and modifying the plan on the fly to look at what we can build and what we have to push off to a later date.”
Having much of the physical system and supporting infrastructure in place has allowed Argonne, Intel and Hewlett Packard Enterprise to test and fine-tune various components, such as the storage system and the cooling loop, ahead of the supercomputer’s deployment so that it’s ready for science on day one.
“It’s been amazing to see our building go from a construction zone to a new state-of-the-art data center,” Papka said. “We can’t wait to power on Aurora, open it up to the research community, and work with them to take their science to new levels.”
The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines. Supported by the U.S. Department of Energy’s (DOE’s) Office of Science, Advanced Scientific Computing Research (ASCR) program, the ALCF is one of two DOE Leadership Computing Facilities in the nation dedicated to open science.
Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.
The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.