British physicists and computer scientists are playing a key role in facing one of the biggest computing problems in the world: how to process the massive data volumes expected from the world’s biggest particle physics experiment, the Large Hadron Collider (LHC), at CERN in Switzerland. When the LHC becomes operational in 2007 it will produce Petabytes (millions of Gigabytes) of data. To manage this, researchers have been creating a Grid to distribute the processing and storage of data around the world. Yesterday, March 15th 2005, the LHC Computing Grid (LCG) project announced that its massive computing Grid now includes more than 100 sites in 31 countries.
This makes it the world’s largest international scientific Grid. The UK is the biggest single contributor to the LCG, with more than a fifth of the Grid’s processing power at its 16 sites. The name ‘Grid’ is an analogy with the electricity grid. Users can obtain a resource such as electricity, or in this case computer processing, from a variety of sources to supply their needs, without needing to know where it comes from. This means that a scientist can be based anywhere in the world and request calculations on data from CERN, which will then be performed across numerous sites, countries and even continents. The sites participating in the LCG project are primarily universities and research laboratories. They contribute more than 10,000 central processing units (CPUs) and a total of nearly 10 million Gigabytes of storage capacity on disk and tape. More than 2,000 of these CPUs are in the UK, along with one million Gigabytes of storage capacity. LCG receives substantial support from the EU-funded project Enabling Grids for E-sciencE (EGEE), which is a major contributor to the operations of the LCG project.
The UK contribution to LCG is through the £33m GridPP project, funded by the Particle Physics and Astronomy Research Council (PPARC). Professor Tony Doyle, GridPP Project Leader, said, “LCG has taken Grid technology up to the next level in terms of scale. At GridPP, we’ve managed to create the largest science Grid in the UK by collaborating exceptionally well with the CERN-based team and others in LCG and EGEE. We all recognise that continued partnership in the next couple of years will be critical in terms of ensuring that the Grid operates effectively and efficiently.”
The LHC is a particle accelerator that will recreate conditions from moments after the Big Bang in order to study the fundamental properties of sub-atomic particles. The LCG project was launched in 2003 and is growing rapidly. The Grid operated by the LCG project is already being tested by the four major experiments that will use the LHC, namely ALICE, ATLAS, CMS and LHCb, to simulate the computing conditions expected once the LHC is fully operational. As a result, the LCG partners are achieving record-breaking results for high-speed data transfer, distributed processing and storage. Already, other scientific applications from disciplines such as biomedicine and geophysics are being tested on this unique computing infrastructure, with the support of the EGEE project.
Grid computing is a term used for many varieties of distributed computing. For the LCG project, the objective is to unite the computing capacity that exists in scientific organizations around the globe. This requires special middleware – the software that allows seamless operations across multiple institutional domains – so that users of the Grid perceive it as a single resource. Underlying the middleware is the basic infrastructure of this Grid, which consists of extremely high-speed networks, clusters of hundreds of computers at the participating sites, and banks of disk servers and tape silos for data storage, also distributed around the globe.
The LCG Project Leader Les Robertson, based at CERN’s IT Department, said: “We are well ahead of our original schedule for reaching 100 sites, and thanks is due to the many partner sites around the world for their contribution to this success – making a Grid like this is a truly collaborative effort.”
The Global Grid Forum, which is a community-initiated forum of thousands of individuals from industry and research leading the global standardization effort for Grid computing, is meeting in Seoul this week. The Chair of the GGF, Mark Linesch, described LCG’s 100-site milestone as “great news for Grids, and great news for science. Without doubt the LCG project is pushing the envelope for what an international science Grid can do.”
Despite the record-breaking scale of the LCG project today, Robertson notes that the current processing capacity of this Grid is estimated to be just 5% of the long-term needs of the LHC. Therefore, the LCG will continue to grow rapidly over the coming two years, both by adding sites and increasing resources available at existing sites. In addition, the exponential increase in processor speed and disk storage capacity inherent to the IT industry will help to achieve the LHC’s ambitious computing goals. An overview of the current status of the LCG project, listing all participating sites, can be found at http://goc.grid-support.ac.uk/gppmonWorld/cert_maps/CE.html
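As a rough illustration of the scale implied by the 5% estimate, the short Python sketch below works through the arithmetic using only the round figures quoted in this release. The function names are illustrative, and the result ignores the expected gains in processor speed, so it should be read as an indicative upper bound on the number of today's CPUs rather than a projection.

```python
# Back-of-envelope sketch using only the round figures quoted in this release.
# Both inputs are estimates ("more than 10,000", "just 5%"), so the outputs
# are indicative only.

CURRENT_CPUS = 10_000      # CPUs contributed by LCG sites (a lower bound)
CAPACITY_FRACTION = 0.05   # current capacity as a share of long-term LHC needs


def scale_up_factor(fraction_of_need: float) -> float:
    """Return how many times current capacity must grow to meet the need."""
    if not 0 < fraction_of_need <= 1:
        raise ValueError("fraction_of_need must be in (0, 1]")
    return 1.0 / fraction_of_need


def equivalent_cpus(current_cpus: int, fraction_of_need: float) -> int:
    """Express the long-term need in today's CPUs.

    Assumption: processor speed stays fixed, which overstates the count
    because future processors are expected to be faster.
    """
    return round(current_cpus * scale_up_factor(fraction_of_need))


if __name__ == "__main__":
    print(f"Growth needed: about {scale_up_factor(CAPACITY_FRACTION):.0f}x")
    print(f"In today's CPUs: about {equivalent_cpus(CURRENT_CPUS, CAPACITY_FRACTION):,}")
```

On these figures the Grid would need roughly twenty times its present capacity, which is why both new sites and faster hardware are expected to contribute.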
Notes for Editors
CERN, the European Organization for Nuclear Research, has its headquarters in Geneva. At present, its Member States are Austria, Belgium, Bulgaria, the Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Italy, Netherlands, Norway, Poland, Portugal, Slovakia, Spain, Sweden, Switzerland and the United Kingdom. India, Israel, Japan, the Russian Federation, the United States of America, Turkey, the European Commission and UNESCO have Observer status.
The mission of the LHC Computing Grid (LCG) project is to build and maintain a data storage and analysis infrastructure for the entire high energy physics community that will use the LHC. Discovering new fundamental particles and analysing their properties with the LHC accelerator is possible only through statistical analysis of the massive amounts of data gathered by the LHC detectors ATLAS, CMS, ALICE and LHCb, and detailed comparison with compute-intensive theoretical simulations. The goals of the LCG project include: developing software components to support the physics application software in a Grid environment; developing and deploying computing services based on a distributed Grid model; managing users and their rights in an international, heterogeneous and non-centralized Grid environment; and managing acquisition, installation, and capacity planning for the large number of commodity hardware components that form the physical platform for the LCG project. The LCG project relies on advanced networking infrastructures such as the GEANT network, a multi-gigabit pan-European data communications network supported by 26 National Research and Education Networks. For more information see http://www.cern.ch/lcg.
GridPP is a six-year PPARC project with additional associated funding from HEFCE, SHEFC and the European Union. A collaboration of twenty UK Universities and research institutes and CERN, it will provide the UK’s contribution to the Large Hadron Collider Computing Grid. For more information see http://www.gridpp.ac.uk. The GridPP Collaboration involves: The University of Birmingham; The University of Bristol; Brunel University; CERN, European Particle Physics Laboratory; The University of Cambridge; Council for the Central Laboratory of the Research Councils; The University of Durham; The University of Edinburgh; The University of Glasgow; Imperial College London; Lancaster University; The University of Liverpool; The University of Manchester; Oxford University; Queen Mary, University of London; Royal Holloway, University of London; The University of Sheffield; The University of Sussex; University of Wales Swansea; The University of Warwick; University College London.
The Enabling Grids for E-sciencE (EGEE) project is funded by the European Commission and aims to build on recent advances in grid technology and develop a service grid infrastructure that is available to scientists 24 hours a day. The project aims to provide researchers in both academia and industry with access to major computing resources, independent of their geographic location. The EGEE project identifies a wide range of scientific disciplines and their applications and supports a number of them for deployment. To date there are five different scientific applications running on the EGEE Grid infrastructure. For more information see http://public.eu-egee.org/.
From PPARC