Daniel Murnane

Position: Learning the Language of Reality: A Multi-tasking, Multi-scale Physics Language Model for High Energy Physics
Categories: Fellows, Postdoc Fellows 2023
Location: Niels Bohr Institute, University of Copenhagen

Abstract:

The search for new physics beyond the Standard Model at the Large Hadron Collider (LHC) at CERN has so far proved elusive, despite the billion-euro machinery and extremely sensitive detectors used in the experiment. To address this, I propose a project to develop a novel machine learning (ML) approach called a Physics Language Model (PLM).

The PLM is a graph neural network (GNN) that maintains multiple scales of information about the energy deposits across the ATLAS detector at the LHC. Instead of discarding fine details, as is currently done, the PLM uses a hierarchical structure to pay attention to the most relevant scales and features of the physics data. The model can also be trained on a variety of physics tasks; in other domains, such as protein property prediction, this multi-task approach has been shown to outperform single-task models. Novel developments in the field of high energy physics (HEP) should be expected to feed back into Biological and Chemical Language Models and improve them.
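
As a rough illustration of what "paying attention to the most relevant scales" could mean, the sketch below mean-pools per-cell features into coarser clusters and then lets each cell weight its own fine-scale features against its cluster's coarse-scale features with a softmax. It is plain NumPy, not a GNN and not the proposed PLM; the function names, the two-scale setup, the cluster assignment, and the scoring vector are all hypothetical choices made for this example.

```python
import numpy as np

def pool_to_clusters(cell_feats, cluster_ids, n_clusters):
    """Mean-pool fine-grained cell features into coarse cluster features."""
    sums = np.zeros((n_clusters, cell_feats.shape[1]))
    np.add.at(sums, cluster_ids, cell_feats)  # unbuffered scatter-add per cluster
    counts = np.bincount(cluster_ids, minlength=n_clusters).reshape(-1, 1)
    return sums / np.maximum(counts, 1)  # empty clusters stay at zero

def attend_over_scales(cell_feats, cluster_ids, n_clusters, score_vec):
    """Mix fine and coarse features per cell using softmax weights over scales."""
    coarse = pool_to_clusters(cell_feats, cluster_ids, n_clusters)
    # Shape (n_cells, 2, n_features): scale 0 is the cell, scale 1 its cluster.
    stacked = np.stack([cell_feats, coarse[cluster_ids]], axis=1)
    scores = stacked @ score_vec  # one score per cell per scale
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    mixed = (weights[..., None] * stacked).sum(axis=1)
    return mixed, weights

# Toy input: 6 hypothetical detector cells, 4 features each, grouped into 3 clusters.
rng = np.random.default_rng(0)
cells = rng.normal(size=(6, 4))
cluster_ids = np.array([0, 0, 1, 1, 2, 2])
score_vec = rng.normal(size=4)  # stands in for a learned scoring vector

mixed, weights = attend_over_scales(cells, cluster_ids, 3, score_vec)
```

Each row of `weights` sums to one and indicates how much that cell draws on its fine-scale versus its coarse-scale representation, so fine details are kept available rather than discarded at the pooling step.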

The current HEP paradigm is to work on a discrete task in the physics analysis chain, using only the scale and granularity of the data produced in the previous stage. Modern ML models, and large language models (LLMs) such as GPT in particular, are a complete inversion of this paradigm. They instead gain expressivity from learning emergent patterns in the fine details of many datasets and tasks. Given my role as Machine Learning Forum Convener for ATLAS, and my current collaborations with Berkeley Lab, DeepMind, Columbia University, the University of Copenhagen and Georgia Tech on this topic, I believe the time has come to use the available data, physics tasks, and large-scale compute to build a prototype PLM.

The PLM could greatly increase the discovery power for new physics at the LHC by making use of the data that is currently discarded. This is a unique opportunity, as algorithm choices for the High Luminosity LHC (HL-LHC) upgrade will be finalized within 18 months. If trends in natural language ML can be captured in physics, a PLM can also be expected to keep growing in power with increasing dataset and model size.