When experiments are run at CERN’s Large Hadron Collider particle accelerator, it’s a tremendous event. The world’s largest machine has been responsible for discovering numerous new subatomic particles, including the ultra-elusive Higgs boson.
And lately, its data has been hinting tantalisingly at new physics beyond the Standard Model – the best set of equations we have to explain how the Universe works.
But it turns out the world only ever sees less than one percent of the data the LHC actually generates.
Physicists use the 26.7-kilometre (16.6-mile) LHC tunnel to accelerate particles almost to light speed, and smash them together as hard as they can to see if they can find anything new in the resulting shower of particles.
Those particles collide at an incredible rate – initially about 30 million collisions per second between proton bunches, each of which contains about 120 billion protons.
That results in an insane amount of data. On its website, CERN notes that one billion collisions per second generate one petabyte of data per second.
And that’s kind of a problem – because storing that much data, never mind analysing it, is impossible.
“If we wanted to keep all 30 million events per second we would need about 2,000 petabytes to store a typical 12-hour run. For a typical running year of 150 days uptime, this would mean almost 400,000 PetaByte = 400 ExaByte per year – a huge amount of data we would not be able to store,” CERN research physicist Andreas Hoecker told ScienceAlert.
“Even worse, with the available (significant!) CPU capacity, we would not be able to process the data as fast as they come in.”
So an educated decision had to be made about what data makes the storage cut. And the amount kept is tiny compared to the total number of events. For every 30 million collisions, just 1,200 are saved.
That’s only 0.004 percent of the total data generated – the other 99.996 percent is lost forever.
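The arithmetic behind those percentages can be checked in a few lines. The sketch below uses only the figures quoted above; the function names are illustrative, and the per-event size is merely what Hoecker's storage estimate implies if a petabyte is taken as 10^15 bytes, not an official CERN number.

```python
# Figures quoted in the article.
COLLISIONS_PER_SECOND = 30_000_000  # initial event rate
SAVED_PER_SECOND = 1_200            # events that make the storage cut
RUN_HOURS = 12                      # length of a typical run
RUN_STORAGE_PB = 2_000              # storage needed to keep every event of one run


def kept_fraction(saved: int, total: int) -> float:
    """Fraction of events that survive the selection."""
    return saved / total


def implied_event_size_bytes(storage_pb: float, hours: float, rate: float) -> float:
    """Average bytes per event implied by the quoted storage figure.

    Assumes 1 PB = 10**15 bytes (an assumption, not stated in the article).
    """
    events = rate * hours * 3600
    return storage_pb * 10**15 / events


frac = kept_fraction(SAVED_PER_SECOND, COLLISIONS_PER_SECOND)
size = implied_event_size_bytes(RUN_STORAGE_PB, RUN_HOURS, COLLISIONS_PER_SECOND)

print(f"kept:      {frac * 100:.3f} percent")        # 0.004 percent
print(f"discarded: {(1 - frac) * 100:.3f} percent")  # 99.996 percent
print(f"implied event size: {size / 1e6:.1f} MB")    # 1.5 MB
```

In other words, each recorded collision would take up roughly a megabyte and a half, which is why keeping all 30 million per second is out of the question.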
That sounds pretty terrifying when you consider what we could be missing.
But don’t panic. Hoecker says there’s very little chance that any data that could point to new physics outside the Standard Model has ended up in the discard pile, even accounting for its huge size.
“Most of the interesting processes we know of are fairly rare. For example, the production of a Higgs boson is a very rare process. With the maximum LHC collision intensity achieved so far we are producing roughly 1 Higgs boson per second,” Hoecker explained.
“Other interesting physics processes are less rare, but still only with a rate of up to a few hundred per second. We use ‘triggers’, which are fast online algorithms based on custom hardware and software, to select the interesting channels out of the majority of less interesting ones.”
These triggers, designed by many physicists with different interests, are based on the properties of the particles the researchers are looking for, which are predominantly heavier particles such as the Higgs boson, top quark, and W and Z bosons.
These heavy particles decay almost immediately into lighter particles – but the lost mass is translated into momentum. The higher momenta of these lighter particles then act as a sort of signature that can be picked up by algorithms looking for particular events.
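As a rough illustration of the idea, a momentum-based selection might look something like the toy sketch below. It is not how the LHC experiments implement their triggers, which combine custom hardware with many channel-specific criteria; the Event layout, the 25 GeV threshold and the function names are all hypothetical, chosen only to show the principle of keeping events that contain a high-momentum particle.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """Hypothetical record of one collision."""
    event_id: int
    momenta_gev: List[float]  # momenta of the outgoing particles, in GeV


# Hypothetical cut-off, picked purely for illustration.
MOMENTUM_THRESHOLD_GEV = 25.0


def passes_trigger(event: Event, threshold: float = MOMENTUM_THRESHOLD_GEV) -> bool:
    """Keep the event if any outgoing particle exceeds the momentum threshold."""
    return any(p > threshold for p in event.momenta_gev)


def select(events: List[Event], threshold: float = MOMENTUM_THRESHOLD_GEV) -> List[Event]:
    """Return only the events that pass the toy trigger."""
    return [e for e in events if passes_trigger(e, threshold)]


# Made-up example events: two "soft" collisions and one with energetic particles.
events = [
    Event(1, [0.4, 1.2, 0.9]),
    Event(2, [45.0, 38.5, 2.1]),
    Event(3, [3.0, 0.7]),
]

kept = select(events)
print([e.event_id for e in kept])  # [2]
```

Only the event containing high-momentum particles survives; the two soft ones are discarded, which is the trade-off the rest of the article discusses.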
It is possible, as a Forbes article suggested earlier this year, that the low-momentum data that gets thrown out – from what are known as “soft” collisions between protons – could contain important clues. But Hoecker doesn’t think this is very likely.
“If the new particles are light, one would wonder why they weren’t discovered at previous, less energetic colliders,” he said. “The LHC builds on top of decades of searches for new particles (with many discoveries on the way) and extends those searches to higher energy and rarer processes.”
But the way to guard against missing something is not to keep impossibly huge amounts of mostly useless data, but to build new detectors for the Large Hadron Collider as our understanding and technological capabilities improve.
Even if we could store all that data, Hoecker notes, doing so would still be a huge waste of computing resources – although some really good physics does inevitably get thrown away.
An upgrade of the Large Hadron Collider planned for a 2025 launch will allow physicists to improve the trigger, reducing the amount of good data that gets the chuck.
But, Hoecker said, the Large Hadron Collider is not really wasting anything. It was designed to study physics at the energy frontier, and that is exactly what it is doing – and it has been finding some really interesting stuff along the way, including new physics.
“Finding the Higgs boson is – in effect – new physics,” Hoecker said.
“The fact that the Higgs boson is not physics beyond the Standard Model and that it was postulated doesn’t alter the wonder a physicist feels when observing and measuring it.”