CERN releases 300TB of Large Hadron Collider data into open access

Cancel your plans for this weekend! CERN just dropped 300 terabytes of hot collider data on the world and you know you want to take a look.

Kati Lassila-Perini, a physicist who works on the Compact Muon Solenoid (!) detector, gave a refreshingly straightforward explanation for this huge release.

“Once we’ve exhausted our exploration of the data, we see no reason not to make them available publicly,” she said in a news release accompanying the data. “The benefits are numerous, from inspiring high school students to the training of the particle physicists of tomorrow. And personally, as CMS’s data preservation coordinator, this is a crucial part of ensuring the long-term availability of our research data.”

Amazing that this perspective is not more widely held — though I suspect it is, by the scientists at least, if not the publishers and department heads who must think of the bottom line.

The data itself is from 2011, much of it from protons colliding at 7 TeV (teraelectronvolts, you know) and producing those wonderful fountains of rare particles we all love to fail to understand. All told, it’s about half the total data collected by the CMS detector, and makes up about 2.5 inverse femtobarns. But who’s counting?
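For anyone who is counting: an inverse femtobarn is a unit of integrated luminosity, and multiplying it by a process's cross-section gives a rough expected number of events (N = σ × L). Here is a minimal sketch of that arithmetic. The 1 picobarn cross-section is a made-up round number chosen for illustration, not a CMS measurement, and the function name is hypothetical.

```python
# Rough event-count arithmetic: N = cross-section * integrated luminosity.
# The cross-section used below is an illustrative assumption, not a measured value.

FB_PER_PB = 1000.0  # 1 picobarn = 1000 femtobarns


def expected_events(cross_section_pb: float, luminosity_inv_fb: float) -> float:
    """Return the expected number of events for a cross-section given in
    picobarns and an integrated luminosity given in inverse femtobarns."""
    cross_section_fb = cross_section_pb * FB_PER_PB  # convert pb -> fb
    return cross_section_fb * luminosity_inv_fb      # fb * fb^-1 = a plain count


release_luminosity_inv_fb = 2.5      # figure quoted in the article
hypothetical_cross_section_pb = 1.0  # assumption, for illustration only

print(expected_events(hypothetical_cross_section_pb, release_luminosity_inv_fb))
```

Under those assumptions, a process with a 1 picobarn cross-section would show up roughly 2,500 times in this release, before accounting for detector efficiency or event selection.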

There’s both the raw data from the detectors (so you can verify the results) and also “derived” datasets that are easier to work with. And don’t worry, CERN is providing the tools to do so as well. There’s a whole CERN Linux environment ready for booting up in a virtual machine, and a bunch of scripts and apps (some are on GitHub, too).

Just messing around in the same computing environment used by researchers plumbing the depths of the universe would be an interesting way to spend a few labs in a college physics course. There are even “masterclasses,” data sets and tools specially curated for high school kids.

This is only the latest of several data dumps, but it’s also by far the largest. CERN has also published a more detailed explanation of the types of data and how they can be accessed.
