Real-time data analysis in particle physics

From Scholarpedia
Vladimir Vava Gligorov (2024), Scholarpedia, 19(3):53035. doi:10.4249/scholarpedia.53035 revision #201433

Curator: Vladimir Vava Gligorov

Particle physics experiments which produce more data than can be read out by their detectors or recorded to permanent storage have to make irreversible decisions about how to reduce this data volume. Such decisions have to be made quickly with respect to the timescale on which the experiment generates data, in practice within fractions of a second. For this reason they are often referred to as occurring in "real time". The most common example is using information from one part of the detector to infer whether or not it is interesting to read out data from the detector as a whole. Because this decision involves giving a "trigger" to the detector electronics, any real-time processing of data within particle physics is usually referred to as "triggering".

The data volumes faced by today's particle physics experiments are so large, however, that the data must be further reduced even once data from the whole detector has been read out. A significant part of the trigger system then consists of looking at the detector data as a whole and reconstructing high-level physics objects in order to make better and more efficient decisions about which parts of the data to keep and which to discard. In some cases these high-level objects are the complete signal candidates which we are trying to search for, or whose properties we wish to measure. Reconstructing the complete signal candidates in real time also opens the possibility of a more sophisticated compression of the detector data, based on whether individual electronic signals in specific parts of the detector are associated with the high-level signal candidates or not. A high-level trigger which reconstructs complete signal candidates and measures their properties with the best achievable accuracy and precision is said to perform a real-time analysis of the data.

Note that while this entry is written from the perspective of collider experiments, most of the concepts translate to other types of experiments as well.


Data rates and volumes

Particle physics experiments may need to process data in real time in order to reduce the data rate, data volume, or most frequently both. Historically, both the rate at which data is produced and the overall volume of data from a single experiment would often exceed data rates and volumes in entire other scientific or industrial domains put together. The necessity of real-time processing therefore arises not only from the cost and impracticability of storing all this data, but from the even greater impracticability of distributing it for analysis by individual physicists.

As an example, we can look at the real-time processing of ALEPH, one of the four LEP [1] experiments. LEP was an electron-positron collider, and as the ALEPH detector paper (Decamp, D. et al. (1990)) notes:

Given the luminosity of LEP and the corresponding event rate, no specific type of physics events need to be selected -- the trigger must only reduce the background to a manageable level

Even so, the data rates and volumes faced by ALEPH were formidable in the context of 1989 technology. An unprocessed random ALEPH event would be around 30 kBytes in size, while an event containing a $Z^0$ boson would be several hundreds of kBytes. As a crossing of the LEP particle bunches occurred every 22 $\mu$s, this meant a total unprocessed data rate of 1.36 GBytes per second. LEP ran for around $10^6$ seconds per year and consumer hard drives cost around $10^4$ dollars per GByte in 1989. Storing all data to either tape or disk was therefore out of the question even without taking into account that server-grade and backed-up storage costs up to an order of magnitude more per GByte than individual consumer disks. Transmitting this data would have been similarly impractical at a time when top-of-the-line Ethernet connections had speeds in the hundreds of kBytes per second. Both the data rate and volume therefore had to be reduced by around five orders of magnitude to bring them into the realm of what could be stored and distributed.
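The arithmetic behind these figures can be checked with a short sketch. It uses only the round numbers quoted above (30 kBytes per random event, one bunch crossing every 22 $\mu$s, $10^6$ seconds of running per year, $10^4$ dollars per GByte). The variable names are illustrative, and decimal units (1 GByte = $10^9$ bytes) are an assumption of the sketch.

```python
# Back-of-the-envelope ALEPH numbers, using the round figures quoted in the text.
# Decimal units are assumed: 1 kByte = 1e3 bytes, 1 GByte = 1e9 bytes.
EVENT_SIZE_BYTES = 30e3          # unprocessed random event
CROSSING_PERIOD_S = 22e-6        # one bunch crossing every 22 microseconds
RUN_SECONDS_PER_YEAR = 1e6       # approximate LEP running time per year
DISK_COST_PER_GBYTE = 1e4        # 1989 consumer hard drive price in dollars

# Rate = event size / time between events.
rate_bytes_per_s = EVENT_SIZE_BYTES / CROSSING_PERIOD_S
# Volume = rate * running time, converted to GBytes.
volume_gbytes_per_year = rate_bytes_per_s * RUN_SECONDS_PER_YEAR / 1e9
storage_cost_dollars = volume_gbytes_per_year * DISK_COST_PER_GBYTE

print(f"data rate: {rate_bytes_per_s / 1e9:.2f} GBytes/s")
print(f"yearly volume: {volume_gbytes_per_year:.2e} GBytes")
print(f"storage cost: {storage_cost_dollars:.2e} dollars per year")
```

Under these assumptions a year of unfiltered data comes to roughly $1.4 \times 10^6$ GBytes, which at 1989 consumer disk prices would cost on the order of $10^{10}$ dollars to store.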

Data rates, data volumes, bunch crossings, and events
The data volume of an experiment is the total number of bytes of information which it produces while recording data. The data rate is the volume divided by the time over which the data is recorded. This is often expressed with reference to a nominal unit of data called an "event". For example, a typical particle collider uses "bunches" of particles in counter-rotating beams and one or more collisions occur at each "bunch crossing". An event typically refers to the full data recorded by the detector during one bunch crossing. More generally, one event corresponds to one decision to read the detector out, in other words to one trigger. The distinction between data volume and rate is useful because processing many small events or processing a few large events has different implications for the balance between arithmetic computation and memory manipulation in the processing architecture.

The ratio between generated data volumes and storage costs has remained rather similar between LEP and the LHC [2] collider (LHC Study Group (1995)) built almost twenty years later. The LHC is a proton-proton collider in which bunch crossings take place every 25 ns. If we consider the smallest of the main LHC experiments, LHCb, a random event is around 30 kBytes on average after the zero-suppression performed in the detector electronics (Aaij, R. et al. (2014)). This gives an overall data rate of around 1.2 TBytes per second, around three orders of magnitude larger than the ALEPH zero-suppressed rate. The ATLAS and CMS detectors (Aad, G. et al. (2008), Chatrchyan, S. et al. (2008)) have raw data rates which are one to two orders of magnitude larger still. Over the same period hard disk prices have significantly decreased and consumer drives were roughly 0.1 dollars per GByte when the LHC started up in 2009, around five orders of magnitude smaller than in 1989. Similarly, typical Ethernet connections in 2009 had speeds of hundreds of MBytes per second, around three orders of magnitude better than at the time of LEP. Once again, we conclude that the real-time processing at the LHC must reduce the data volume by around four orders of magnitude.
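The LHCb rate and the comparisons with 1989 can be reproduced in a few lines. This is a sketch that takes the 30 kBytes per 22 $\mu$s figure quoted earlier as the ALEPH reference rate, which is an assumption made here for illustration; the helper name `orders_of_magnitude` is likewise hypothetical.

```python
import math

# Round figures quoted in the text; decimal units assumed.
LHCB_EVENT_SIZE_BYTES = 30e3     # zero-suppressed random event
LHC_CROSSING_PERIOD_S = 25e-9    # one bunch crossing every 25 ns
# Assumed ALEPH reference: 30 kBytes per 22 microsecond bunch crossing.
ALEPH_RATE_BYTES_PER_S = 30e3 / 22e-6

def orders_of_magnitude(ratio):
    """Return log10 of a ratio, i.e. the number of orders of magnitude."""
    return math.log10(ratio)

lhcb_rate = LHCB_EVENT_SIZE_BYTES / LHC_CROSSING_PERIOD_S
rate_growth = orders_of_magnitude(lhcb_rate / ALEPH_RATE_BYTES_PER_S)
price_drop = orders_of_magnitude(1e4 / 0.1)   # dollars per GByte, 1989 vs 2009

print(f"LHCb rate: {lhcb_rate / 1e12:.1f} TBytes/s")
print(f"LHCb / ALEPH rate: {rate_growth:.1f} orders of magnitude")
print(f"disk price drop: {price_drop:.1f} orders of magnitude")
```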

These data volumes can also be put in the context of the global internet traffic [3] around the time that the experiments started taking data. ALEPH's 1.3 GByte of data per second dwarfed the 1 TByte of data per month being transmitted in 1990 on networks which preceded the internet. By the time of the LHC in 2009 worldwide internet data rates had grown to around 15 EBytes per month, which at around 5.8 TBytes per second was still smaller than the instantaneous zero-suppressed data rates of the ATLAS and CMS detectors, and a few factors larger than that of LHCb. It was not until around 2014-2015 that global internet dataflow finally outgrew the data rates produced by the LHC experiments. This serves to underline the point that even if particle physics experiments could record all their data, distributing it to the hundreds (in the case of LEP) or thousands (in the case of the LHC) of physicists working on these experiments would require a network infrastructure orders of magnitude bigger than the internet as a whole.
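The conversion from monthly traffic volumes to average rates is simple enough to verify directly. The sketch below assumes a 30-day month and decimal units; the function name is illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumes a 30-day month

def monthly_volume_to_rate(bytes_per_month):
    """Convert a monthly data volume into an average rate in bytes per second."""
    return bytes_per_month / SECONDS_PER_MONTH

traffic_1990 = monthly_volume_to_rate(1e12)    # 1 TByte per month
traffic_2009 = monthly_volume_to_rate(15e18)   # 15 EBytes per month

print(f"1990: {traffic_1990 / 1e3:.0f} kBytes/s")
print(f"2009: {traffic_2009 / 1e12:.1f} TBytes/s")
```

The 1990 figure works out to a few hundred kBytes per second on average, several thousand times below ALEPH's unprocessed rate.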

Types of real-time architectures

When applied to data processing generally, the term "real time" [4] is typically interpreted to mean a commitment to process data within a specific timeframe. It is also intuitively interpreted to mean that the data is processed in fractions of a second, or more generally on timescales which are comparable to or shorter than others in the system. For example, in the case of autonomous vehicles, real-time processing should happen on timescales shorter than those on which the environment around the vehicle changes, so that the vehicle can react before it e.g. hits a pedestrian who stepped out into the road. The term "real time" has a somewhat more nuanced meaning in particle physics. To understand this better it is helpful to first briefly discuss the types of architectures used for real-time processing within our domain. These architectures can be roughly divided according to whether they have a fixed or variable latency. The most typical fixed-latency architectures used in particle physics are FPGAs or ASICs, while the most typical variable-latency architectures are CPUs and GPUs.

Latency and synchronicity
A fixed-latency architecture executes a certain sequence of instructions at a specific clock frequency. Consequently the time taken to process a given set of inputs and produce the corresponding outputs is always the same. Fixed-latency architectures are "synchronous": inputs which arrive in a certain order are processed and produce outputs in the same order. A variable-latency architecture reads and processes inputs as they arrive, and the time to produce the corresponding outputs can vary from one set of inputs to another. Variable-latency architectures do not need to give their output in a fixed amount of time but are rather optimized so that the average set of inputs is processed at an average frequency. Variable-latency architectures are usually asynchronous, meaning that outputs can be produced in a different order to the inputs. If a variable-latency architecture needs to be synchronous, a sorting of the outputs is necessary, and this step will also impose an upper limit to how long a given set of inputs can be processed for. Either architecture may and usually does implement buffers to temporarily hold data while waiting for a processing step to complete.
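The difference between synchronous and asynchronous output can be made concrete with a toy model. In this sketch input number i arrives at time i, there are assumed to be enough parallel workers that no input waits, and the latencies are invented; `completion_order` and `resynchronise` are hypothetical helper names.

```python
import heapq

def completion_order(latencies):
    """Return input indices in the order in which their processing finishes.

    Input i arrives at time i and takes latencies[i] time units.
    """
    finish_times = [(arrival + latency, arrival)
                    for arrival, latency in enumerate(latencies)]
    return [index for _, index in sorted(finish_times)]

def resynchronise(finished_indices):
    """Re-emit outputs in input order, buffering any that finished early."""
    pending, next_expected, emitted = [], 0, []
    for index in finished_indices:
        heapq.heappush(pending, index)
        # Release every buffered output that is next in the input sequence.
        while pending and pending[0] == next_expected:
            emitted.append(heapq.heappop(pending))
            next_expected += 1
    return emitted

fixed = completion_order([3, 3, 3, 3])       # fixed latency: order preserved
variable = completion_order([5, 1, 1, 4])    # variable latency: order scrambled
print(fixed, variable, resynchronise(variable))
```

In the variable-latency case the slow first input holds back the two outputs that finished before it, which is why a sorting step in practice comes with an upper limit on processing time.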
Hardware and software (or high-level) triggers
The term "hardware trigger" is typically used to describe fixed-latency processing architectures, whether FPGA- or ASIC-based, which process inputs from specific fixed subsets of the detector. On the other hand "software trigger" typically refers to CPU server farms, or more recently GPU farms or CPU servers with GPU coprocessors, which receive inputs from most or all parts of the detector and can choose which parts of the input to process in the processing sequence itself. Software triggers are also often known as "high-level" triggers (HLT). It is traditional to hear hardware triggers described as inflexible; however, the reality is somewhat more nuanced. The key element which makes hardware triggers more rigid than software triggers is not the processor itself, since FPGAs can be reprogrammed or repurposed. It is the fixed dataflow which controls which parts of the trigger architecture receive data from which parts of the detector, and hence greatly constrains the possible processing logic.

Particle physics experiments often use a mixture of fixed and variable latency architectures. One reason is that the cost of optical links to fully read a detector out, particularly if they need to be radiation-hard, can be comparable to or even higher than the cost of processors with which to subsequently analyse that data in real-time. So if it is possible to reduce the data rate by simple criteria based on a small part of the detector, typically in experiments which are looking for rare and distinctive signals, it may be significantly more economical to begin real-time processing with a fixed-latency hardware trigger. It may also be physically impossible to fully read out the detector, particularly in cases where the cables which carry the data would have to be put inside the active detector volume, spoiling its resolution. On the other hand wherever economically and physically possible experiments achieve maximal flexibility by fully reading out the detector into a variable-latency architecture which performs all real-time processing steps. These schemas are illustrated in Figure 1. The diagram has been simplified by omitting lossless compression, zero suppression, and other pre-processing of the data which may happen in the readout electronics of each detector component.

Figure 1: Illustration of two typical trigger architectures in particle physics. Left: a two-stage architecture with a fixed-latency hardware trigger followed by a variable-latency software trigger. The hardware trigger typically receives partial granularity data from a subset of detectors to limit the data rate being transmitted. Right: a single stage architecture with a full detector readout into a variable-latency software trigger. Note that the software triggers may themselves be composed of multiple computing architectures, for example CPUs and GPUs, which perform different parts of the real-time processing. Such single-stage architectures are typically referred to as "triggerless".

The general definition of "real time" given at the start of this section applies best to fixed-latency architectures such as detector readouts or hardware trigger systems. In this case there is a commitment to process or read data within a specific timeframe potentially leading to a decision to discard the data or to trigger further processing. This timeframe is typically microseconds, dictated by the buffer available within the FPGA or ASIC board to store data from the rest of the detector while waiting for the hardware trigger decision.
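The link between buffer size and latency can be sketched as follows, assuming a simple pipeline buffer that stores one entry per bunch crossing. The buffer depth used here is a hypothetical value chosen for illustration, not a figure from any specific experiment.

```python
def latency_budget_seconds(buffer_depth_events, crossing_period_s):
    """Maximum decision time before the oldest buffered crossing is overwritten."""
    return buffer_depth_events * crossing_period_s

# Hypothetical front-end buffer holding 160 bunch crossings at 25 ns spacing.
budget = latency_budget_seconds(buffer_depth_events=160, crossing_period_s=25e-9)
print(f"latency budget: {budget * 1e6:.1f} microseconds")
```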

The typical definition of real time does not map well onto variable-latency processing architectures used in particle physics. Not only is there no commitment to process data within a specific timeframe, but the average time to process a set of inputs may run into hundreds of milliseconds or even seconds, which is not what people might intuitively understand as "real" time. This is possible because the data is no longer buffered in the electronics of an FPGA or ASIC board, but rather in the orders of magnitude larger memory or even hard disks of CPU and GPU servers. Variable-latency processing in particle physics does however share an important characteristic with quicker kinds of real-time processing, which is that the data is irreversibly processed before it is written to permanent storage.

Seen from this perspective even variable-latency processing respects the second intuitive definition of real-time, which is that the data is processed on a timescale shorter than others in the system: in this case, the timescale on which the data can be kept for manual inspection by physicists. Some variable-latency systems stretch even this definition by using hard disk buffers so large that the data can be buffered for days and even weeks at a time. At that point there is enough time for physicists to inspect the data and intervene to change the processing, and such processing is only "real-time" in a very qualified sense.
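How long a disk buffer can hold data follows from its size and the rate at which it fills. The sketch below uses invented numbers (a 10 PByte buffer filling at a steady 10 GBytes per second) purely to show the scale involved.

```python
def buffer_hold_time_days(buffer_bytes, input_rate_bytes_per_s):
    """Time for which data can sit in a buffer that fills at a steady rate."""
    seconds = buffer_bytes / input_rate_bytes_per_s
    return seconds / 86400

# Hypothetical numbers: a 10 PByte disk buffer filled at 10 GBytes per second.
days = buffer_hold_time_days(buffer_bytes=10e15, input_rate_bytes_per_s=10e9)
print(f"buffer holds {days:.1f} days of data")
```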

Returning to our earlier discussion of data rates and volumes, we should note that it understated the difficulty because we did not take into account that particle physics experiments are planned years if not decades before they are constructed. Their construction therefore inevitably involves projecting network, storage, and computing costs into the far future. In the case of our example detectors, the ALEPH letter of intent was written in 1981 while it started taking data in 1989; the LHCb letter of intent was written in 1995 while it started taking data in 2009. In both cases the real-time data processing had to be overdesigned in order for the experiment to remain feasible under conservative assumptions about technology evolution. In LHCb's case the output rate of the triggers for physics analysis was initially expected (The LHCb Collaboration (2003)) to be around 200 Hz, with another 1.8 kHz of triggers used for detector calibration. In reality a combination of delays to the LHC, strong commercial technology evolution, and a flexible variable-latency trigger system which could evolve with these meant that LHCb could output over 10 kHz of triggers for physics analysis by the end of 2018 (Aaij, R. et al. (2019)). This in turn allowed a significant expansion in LHCb's physics programme. Although not all experiments can reap similar physics benefits from an overdesigned data processing, the real-time data processing is also the only part of a detector which can be meaningfully expanded and improved without simply replacing it by a better detector. The long-term trend within particle physics is therefore towards doing more and more of the real-time processing in variable-latency and software triggers.

From triggering to real-time analysis

Having considered the physical and financial constraints on the types of real-time architectures, we can now turn to the physics goals those architectures serve. Particle physics experiments can be grouped according to how the particles being studied are produced: in the decays of long-lived Standard Model particles, by colliding a beam of particles with a fixed target, or by colliding two beams of particles. Fixed target or collider experiments can be further grouped according to whether the particles are produced in the interactions of leptons (typically electrons and positrons), hadrons, or a mixture of leptons and hadrons. These categories are relevant for real-time data processing because the production mechanism strongly influences two key parameters for the real-time system: the fraction of all events which contain signal particles, and the fraction of information in each event which comes from the signal particles or is necessary for characterizing the signal particles. These parameters in turn decide the balance between event selection and data compression as a means for reducing the data volume in the real-time processing.

Event selection and data compression
Event selection is the process of deciding, based on the values of certain criteria determined in real-time, whether a given event is interesting for further analysis. Event selection can bias the nature of the selected events. For example, if you select events based on the presence of a high energy deposit in the calorimeter, these events will not only have a greater average calorimeter energy, but will also contain a greater fraction of those underlying physical processes which tend to produce high energy calorimeter deposits.
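The calorimeter example can be illustrated with a toy sample. All event values, the two process labels, and the threshold below are invented for this sketch.

```python
# Toy events: (underlying process, calorimeter energy in GeV). Values are invented.
events = [
    ("soft", 2.0), ("soft", 3.0), ("soft", 4.0), ("soft", 12.0),
    ("soft", 1.0), ("soft", 5.0), ("hard", 15.0), ("hard", 30.0),
]
ENERGY_THRESHOLD_GEV = 10.0   # hypothetical trigger threshold

def hard_fraction(sample):
    """Fraction of events in the sample that come from the 'hard' process."""
    return sum(1 for process, _ in sample if process == "hard") / len(sample)

def mean_energy(sample):
    """Average calorimeter energy of the sample in GeV."""
    return sum(energy for _, energy in sample) / len(sample)

# Event selection: keep only events above the energy threshold.
selected = [event for event in events if event[1] > ENERGY_THRESHOLD_GEV]

print(f"hard fraction: {hard_fraction(events):.2f} -> {hard_fraction(selected):.2f}")
print(f"mean energy:   {mean_energy(events):.1f} -> {mean_energy(selected):.1f} GeV")
```

Both the average energy and the mix of underlying processes change after selection, which is the bias described above.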

Lossless data compression is typically implemented throughout the processing chain; lossy data compression goes further by reducing the information within each selected event. A simple example of data compression is zero-suppression, but more sophisticated data compression algorithms remove hits from parts of the detector and save only the high-level physics objects which could be reconstructed from those hits in real-time. Because data compression applies keep-or-reject criteria to a large fraction of the event data in real-time, it can introduce more complex biases than event selection. A real-time processing system may use event selection, data compression, or a mixture of the two to reduce the data volume.
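As a minimal sketch of zero-suppression, assume a detector component read out as a list of integer channel values, most of which are at or near zero. The readout values and the threshold are invented.

```python
def zero_suppress(channel_values, threshold):
    """Keep only (channel index, value) pairs whose value exceeds the threshold."""
    return [(index, value)
            for index, value in enumerate(channel_values)
            if value > threshold]

# Invented readout of ten channels, mostly empty or noise.
raw = [0, 1, 0, 0, 37, 42, 0, 1, 0, 0]
hits = zero_suppress(raw, threshold=5)
print(hits)
```

Since each surviving value is stored together with its channel index, this only saves space when a small fraction of channels is above threshold.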

To illustrate this point, let's have a look at the triggers of some current and past experiments with different production regimes. LEP was a typical example of a lepton collider, where events in the sense of bunch crossings occurred every 22 $\mu$s, but in most cases the electron and positron particles elastically scattered or did not interact at all. The goal of the real-time processing was to identify the events where something interesting occurred, for example the production of a $Z^0$ boson, and then to record the full detector information for these events. The triggering logic is nicely summarised in the ALEPH detector paper (Decamp, D. et al. (1990)):

Typical events are complex, with 20 charged particles on average plus a similar number of neutrals, distributed over the entire solid angle. The expected event rate is very low, especially at energies above the $Z^0$ pole. Therefore this detector was conceived to cover as much of the total solid angle as practically possible and to collect from each event a maximum amount of information.

Note that by "event rate" the ALEPH authors meant what we have been calling "signal rate" in this article. There is then an explicit link between the fact that the interesting signal is produced rarely and the fact that the detector is designed to be hermetic ("cover as much of the total solid angle") and collect a maximum of information about each triggered event. It is worth expanding this logic because it underpins the trigger design of many other experiments as well.

Since any processing of the data may introduce biases, for example by preferentially selecting particles with a certain amount of energy or momentum, it is desirable to do as little processing as possible before writing data to permanent storage. The logic is that while biases introduced in real-time are irreversible, any mistakes made in the analysis of permanently recorded data can be undone by repeating the analysis from scratch. This holds first and foremost for the accuracy and resolution with which your detector can measure a particle's properties such as momentum and energy. These parameters can typically be improved over a detector's lifetime by increasingly sophisticated alignment and calibration techniques, but you can only benefit from such improvements if you saved the underlying detector information from which these high-level particle properties can be recomputed.

Similarly, it is desirable to record the full detector information for any event which the trigger considered interesting, because it allows new analysis techniques to be developed after the data is collected, in turn improving the physics reach of the experiment. For example, when searching for a hypothetical new particle, you might not know all the ways in which it can decay when the detector is designed. If you can trigger on some common feature of all these decays you can increase the number of decay modes which you search for over time, and therefore the physics reach of your experiment. If you do believe that you have found a new particle, understanding the correlation between that particle's decay products and the other particles reconstructed in your detector also helps to rule out fake signals. Alternatively, if the signal is real, such correlations are indispensable for understanding the physical mechanism by which the signal was produced. This again motivates keeping all detector information for the analysis of triggered events.

Common trigger signatures of new particles
Although we don't always know how a hypothesized new particle will decay, there are some general principles which allow trigger systems to select interesting events which might contain such particles for later analysis. If the hypothesized particle has a large mass compared to other particles produced in the experiment, when it decays it will convert that mass into momentum for its decay products. Almost irrespective of the total number of decay products, one of them will on average end up with a large momentum compared to other particles produced in the experiment. Therefore, the presence of "large energy" within an event is a good generic trigger signature for heavy hypothetical particles. This can take the form of an unusually high momentum charged particle trajectory, a high energy calorimeter deposit, or a large energy imbalance in the detector indicating the presence of an energetic invisible particle. Similarly, if the hypothesized particle has a long lifetime compared to other particles produced in the experiment, it will decay far away from where it is produced. Consequently, its decay products will appear to come from a different part of the detector than typical particles, and the presence of such "displaced vertices" can be a good generic trigger signature.
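These generic signatures can be written down as a set of simple threshold checks. The sketch below is purely illustrative: the `EventSummary` fields, the threshold values, and the example events are all invented, and a real trigger would compute such quantities from reconstructed detector data.

```python
from dataclasses import dataclass

@dataclass
class EventSummary:
    """Hypothetical high-level quantities a trigger might compute for one event."""
    max_track_momentum_gev: float
    max_calo_deposit_gev: float
    energy_imbalance_gev: float
    max_vertex_displacement_mm: float

# Invented thresholds; real values depend on the detector and collider.
THRESHOLDS = {
    "high_momentum_track": ("max_track_momentum_gev", 20.0),
    "high_energy_deposit": ("max_calo_deposit_gev", 30.0),
    "energy_imbalance": ("energy_imbalance_gev", 40.0),
    "displaced_vertex": ("max_vertex_displacement_mm", 1.0),
}

def fired_signatures(event):
    """Return the names of all generic signatures that the event satisfies."""
    return [name for name, (field, cut) in THRESHOLDS.items()
            if getattr(event, field) > cut]

typical = EventSummary(3.0, 5.0, 2.0, 0.05)
heavy = EventSummary(45.0, 60.0, 10.0, 0.02)
long_lived = EventSummary(4.0, 6.0, 1.0, 7.5)
for label, event in [("typical", typical), ("heavy", heavy), ("long-lived", long_lived)]:
    print(label, fired_signatures(event))
```

An event would be kept if at least one signature fires; the typical event here fires none.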

This general philosophy also guided the design of triggers for the LHC's two general purpose detectors: ATLAS and CMS. Similarly to the LEP experiments, ATLAS and CMS were primarily designed to make precision measurements of known Standard Model particles, find the top quark and Higgs boson as the remaining pieces of the Standard Model, and search for putative new particles beyond the Standard Model. Because the top quark and Higgs boson are so much heavier than any other Standard Model particles, their trigger signatures are very similar to those of new heavy particles beyond the Standard Model. The vast majority of ATLAS and CMS analyses are triggered by finding a common trigger-level signature of "something interesting" and then recording all detector information for these selected events, including all the information about particles which are not related to the signal.

The difference between these experiments and LEP or other lepton colliders arises from the production environment. The total inelastic cross-section for a hadron collider rises with the collision energy, whereas it falls for a lepton collider. The Tevatron or LHC inelastic cross-sections are roughly one million times larger than for LEP, and ten thousand times larger than for lower energy lepton colliders such as KEKB or PEP-II (Toge, N. et al. (1995), Hutton, A. and Zisman, M.S. (1991)). So the trigger systems at hadron colliders have far more data to process right from the start. At lepton colliders the trigger is mostly discriminating between an inelastic interaction and beam-induced backgrounds: not only are the particles from an inelastic interaction much more energetic, but they typically come from a different place than beam-induced backgrounds.

By contrast, at a hadron collider the trigger is mostly discriminating between different kinds of inelastic interactions. The particles always come from the same place and because hadrons have a complex internal structure mediated by the strong force, inelastic hadron interactions always produce large numbers of particles as well as sometimes producing interesting signal. While interesting signals like the Higgs boson are still heavier and therefore produce more energetic decay products than the particles produced in an average inelastic proton-proton collision, this distinction is much fuzzier than at a lepton collider.

The other major difference in the lepton and hadron production environments is the composition of the signal-like events selected by the trigger system. In a lepton collider, if something interesting happened this is generally because one or two signal(-like) particles were produced and decayed inside the detector volume. So when a trigger selects an event as interesting, most of the information recorded by the detector for that event is relevant to the analysis of the signal. On the other hand in a hadron collider, most particles produced in an interesting event are not related to the signal or its decay products. Furthermore, at the LHC a typical event will contain multiple independent proton-proton inelastic interactions. Even if the event contains an interesting signal, it will have been produced in only one of these interactions, so by definition much if not most of the detector information in interesting events is not related to the signal we wish to study.

Signal candidate
A collection of physics objects reconstructed in the detector which are combined and selected as coming from the same physics process. Most typically this is a group of charged particle trajectories and/or neutral particle energy deposits which are postulated to come from the decay of a hypothetical signal particle. A physics analysis will typically associate a set of high level inferred properties (mass, lifetime, momentum vector, ...) with a signal candidate, as well as the charged or neutral particles which the signal candidate was made out of, and the raw signals in different parts of the detector which those particles were themselves made out of.

The event-selection approach to triggering breaks down if the data rate reduction of four to five orders of magnitude cannot be achieved without throwing away a large fraction of the signal: either due to irreducible backgrounds or due to overly abundant signals. It is then necessary to go beyond event selection and implement more fine-grained data compression and reduction techniques in real-time. Basic data compression methods like zero suppression are local to individual detector components and have a very simple physics meaning, but they cannot reduce the data volume by multiple orders of magnitude. Instead, the data must be reduced by fully reconstructing the detector and making real-time inferences about which particles are related to the signal and in what way. These inferences in turn allow the information saved for each event to be precisely targeted: full information about the signal including all data about its decay products in individual parts of the detector; high-level physics information about related particles; and aggregated high-level information about the rest of the event. This in turn allows a further one to two orders of magnitude reduction in the data volume, which then makes it possible to store this data long term and distribute it to physics analysts. Because it relies on a full detector reconstruction and calculates high-level quantities in real time, this kind of data compression and reduction is referred to as "real-time analysis".
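The three levels of detail described above can be sketched as a function which decides what to save for each particle. The event layout and the `role` labels are invented; inferring each particle's relation to the signal is the hard part and is simply assumed to have been done upstream.

```python
def persist_event(particles):
    """Reduce an event according to each particle's inferred relation to the signal.

    Each particle is a dict with keys 'role', 'momentum' and 'hits'; the role
    ('signal', 'related' or 'other') is assumed to have been inferred upstream.
    """
    others = [p for p in particles if p["role"] == "other"]
    return {
        # Full information, including raw detector hits.
        "signal": [p for p in particles if p["role"] == "signal"],
        # High-level physics information only.
        "related": [{"momentum": p["momentum"]}
                    for p in particles if p["role"] == "related"],
        # Aggregated information about the rest of the event.
        "rest": {"count": len(others),
                 "total_momentum": sum(p["momentum"] for p in others)},
    }

event = [
    {"role": "signal", "momentum": 12.0, "hits": [101, 102, 103]},
    {"role": "related", "momentum": 3.0, "hits": [201, 202]},
    {"role": "other", "momentum": 1.0, "hits": [301, 302]},
    {"role": "other", "momentum": 2.0, "hits": [401]},
]
saved = persist_event(event)
hits_before = sum(len(p["hits"]) for p in event)
hits_after = sum(len(p["hits"]) for p in saved["signal"])
print(saved["rest"], hits_before, hits_after)
```

In this toy event only the raw hits of the signal particle survive, while the other particles are reduced to a momentum value or to a summary.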

Examples of real-time analysis

Figure 2: Comparison between the number of dijet events reconstructed in real-time (black points), the number of events selected by any single-jet trigger (thicker, blue line), and the events selected by single-jet triggers but corrected for the trigger prescale factors (thinner, red line) as a function of the dijet invariant mass.

In the case of both ATLAS and CMS, real-time analysis grew out of a need to enable searches for new particles in domains where the irreducible Standard Model backgrounds saturate the classical trigger bandwidth. A typical case is that of relatively light new particles, for example dark matter particles, decaying into a pair of hadronic jets. Unless these particles have a long lifetime, there is an irreducible background from QCD jets produced in inelastic proton-proton collisions, and this background grows in size exponentially as the dijet mass decreases. Such short-lived particles decaying into a pair of QCD jets satisfy none of the "common trigger signatures of new particles" which we introduced earlier. This is illustrated in Figure 2 for the ATLAS analysis (The ATLAS Collaboration (2016)). A similar plot can be found in the corresponding CMS analysis (Khachatryan, V. et al. (2016)). The red curve represents the dijet mass spectrum which would be selected by regular triggers based on event selection if their output rate were not an issue. The blue curve shows the dijet mass spectrum which is actually selected by triggers based on event selection once the downscaling used to reduce their rate to the allowed maximum is taken into account. As the plot makes clear, below around 800 GeV of dijet mass, the need to keep the trigger rate down significantly limits the analysis sensitivity.

After selecting a dijet signal candidate, the ATLAS trigger compressed the event by discarding information not associated with this candidate. This in turn reduced the event size by more than an order of magnitude and allowed the dijet real-time analysis trigger to operate without having to randomly discard ("downscale" in the jargon of the field) a fixed proportion of events in the single-jet event selection triggers. Neither ATLAS nor CMS executed a full real-time reconstruction of their detectors during this period, nor did they calibrate and align the detectors fully in real-time. While this reduced the jet resolution somewhat compared to the best performance which could have been achieved in a classical analysis relying on reprocessed events selected by a trigger, the impact on the analysis sensitivity was negligible compared to the gain in data sample size from performing the analysis in real-time.
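Downscaling and the corresponding correction can be sketched in a few lines. For reproducibility this toy version keeps every N-th event instead of discarding events at random, and the prescale factor of 100 is an invented value.

```python
def prescale(events, factor):
    """Keep every `factor`-th event; a deterministic stand-in for random downscaling."""
    return events[::factor]

def prescale_corrected_count(n_selected, factor):
    """Estimate of how many events fired the trigger before downscaling."""
    return n_selected * factor

fired = list(range(1000))   # 1000 events that fired a hypothetical trigger
kept = prescale(fired, factor=100)
print(len(kept), prescale_corrected_count(len(kept), factor=100))
```

The corrected count recovers the original number of triggers, but the statistical precision of any measurement is still set by the much smaller number of events actually kept.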

Figure 3: The data fit to the reconstructed mass of (left) $D^0\to K^-\pi^+$ and (right) $D^+\to K^-\pi^+\pi^+$ candidates selected by LHCb real-time analysis in 2015. Fit components are indicated in the legend.

Similar physics use-cases exist in LHCb, specifically in searches for light dark matter decaying into dilepton final states (Aaij, R. et al. (2018)), where the Drell-Yan electroweak background is irreducible in the same way as the QCD dijet background in the ATLAS and CMS analyses. The majority of real-time analyses in LHCb are however to be found in charm physics and hadron spectroscopy, where the issue is not irreducible backgrounds but rather irreducible signals. This is illustrated in Figure 3, which shows two LHCb charm signals reconstructed by real-time analysis in 2015 (Aaij, R. et al. (2016)). The signal purity which could be achieved in real time is quite high, but the LHC simply produces too many charm hadrons: when colliding protons at 13 TeV between 2015 and 2018, over six hundred thousand charm hadrons were produced and decayed inside the LHCb detector acceptance every second! While many of those decays are not of interest to physics analysis, individual decays which were of interest to physics analysis occurred hundreds and in some cases tens of times per second. These arguments hold even more strongly for the upgraded LHCb detector (Aaij, R. et al. (2018)), most of whose physics programme fully relies on real-time analysis for triggering.

In the examples considered so far, the primary use of real-time analysis was to pick out some subset of interesting particles, typically the signal candidate, and record these while discarding the rest of the event. In ALICE however, real-time analysis is used in a somewhat different way. Unlike in ATLAS, CMS, or LHCb, most of the data volume in ALICE comes from a single detector component: the time projection chamber, or TPC. This detector allows an incredibly precise charged particle reconstruction and momentum resolution, as each particle leaves dozens of individual hits in the TPC while traversing the detector. For the same reason, the TPC data volume is more than one order of magnitude too big to record. ALICE therefore uses real-time analysis to compress the data in the TPC, by identifying hits associated with very low momentum particles, typically caused by material interactions or beam-gas collisions, which are not interesting for physics analysis. These hits are then removed, reducing the TPC data volume by more than an order of magnitude. However, once this is done, the rest of the event is fully recorded for analysis.
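The hit-removal idea described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the ALICE algorithm: the `Hit` class, the `compress_hits` function, the momentum threshold, and the choice to keep hits not attached to any track are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Hit:
    hit_id: int
    track_id: Optional[int]  # None if the hit is not attached to any track


def compress_hits(hits: List[Hit], track_momentum_gev: Dict[int, float],
                  min_momentum_gev: float) -> List[Hit]:
    """Drop hits attached to tracks below the momentum threshold.

    Hits with no associated track are kept (an assumption of this sketch).
    """
    return [
        h for h in hits
        if h.track_id is None
        or track_momentum_gev[h.track_id] >= min_momentum_gev
    ]


# Toy event: track 1 is a very low-momentum particle, track 2 is not.
hits = [Hit(0, 1), Hit(1, 1), Hit(2, 2), Hit(3, None)]
track_momentum_gev = {1: 0.02, 2: 1.5}

kept = compress_hits(hits, track_momentum_gev, min_momentum_gev=0.05)
print([h.hit_id for h in kept])
print(f"kept {len(kept)} of {len(hits)} hits")
```

The key design point is that the decision is made per hit rather than per event: the event itself is always retained, and only the hits judged uninteresting are dropped.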

What all versions and implementations of real-time analysis have in common is that once an event is tagged as interesting, only a small subset of that event's data is recorded to permanent storage. In other words, the binary decision to record an event becomes a spectrum of "which information about the event should be recorded", with traditional full event selection as one endpoint. This in turn means that it is important to make the real-time processing of the data as accurate and as precise as possible, because there will be little opportunity to improve the physics performance later. For this reason both LHCb and ALICE, which rely on real-time analysis for much or all of their physics programme, have developed ways to spatially align and calibrate their detectors in real time, thus ensuring that the data is always processed with the detector in its optimal condition.
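One way to picture this spectrum is as a per-trigger setting that determines which reconstructed objects are written out. The sketch below is purely illustrative: the `Persistence` levels, the `objects_to_record` function, the object names, and the sizes are assumptions made for this example and do not correspond to any experiment's actual data model.

```python
from enum import Enum
from typing import Dict, Set


class Persistence(Enum):
    CANDIDATE_ONLY = "candidate_only"
    CANDIDATE_PLUS_EXTRAS = "candidate_plus_extras"
    FULL_EVENT = "full_event"  # the traditional event-selection endpoint


def objects_to_record(event_sizes_kb: Dict[str, float], candidate: Set[str],
                      extras: Set[str], level: Persistence) -> Dict[str, float]:
    """Return the subset of the event (name -> size in kB) to be stored."""
    if level is Persistence.FULL_EVENT:
        keep = set(event_sizes_kb)
    elif level is Persistence.CANDIDATE_PLUS_EXTRAS:
        keep = candidate | extras
    else:
        keep = set(candidate)
    return {n: s for n, s in event_sizes_kb.items() if n in keep}


# Toy event with made-up object names and sizes.
event = {"jet_1": 2.0, "jet_2": 2.0, "primary_vertex": 0.5,
         "calorimeter_cells": 600.0, "tracker_hits": 400.0}
candidate = {"jet_1", "jet_2"}
extras = {"primary_vertex"}

sizes = {}
for level in Persistence:
    kept = objects_to_record(event, candidate, extras, level)
    sizes[level] = sum(kept.values())
    print(f"{level.value}: {sorted(kept)} -> {sizes[level]} kB")
```

Whatever is not selected at this stage is lost permanently, which is why the quality of the real-time reconstruction matters so much.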


  • Decamp, D et al. (1990). ALEPH: a detector for electron-positron annihilations at LEP Nucl. Instrum. Methods Phys. Res., A 294: 121-178. [5]
  • LHC Study Group, (1995). The Large Hadron Collider - conceptual design CERN-AC-95-05-LHC. [6]
  • Aaij, R et al. (2014). LHCb Detector Performance Int. J. Mod. Phys. A 30: 1530022. [7]
  • Aad, G et al. (2008). The ATLAS Experiment at the CERN Large Hadron Collider JINST 3: S08003. [8]
  • Chatrchyan, S et al. (2008). The CMS experiment at the CERN LHC. The Compact Muon Solenoid experiment JINST 3: S08004. [9]
  • LHCb Collaboration, (1995). LHCb Trigger System Technical Design Report CERN-LHCC-2003-031. [10]
  • Aaij, R et al. (2019). Design and performance of the LHCb trigger and full real-time reconstruction in Run 2 of the LHC JINST 14: 04. [11]
  • Toge, N et al. (1995). KEK B-factory Design Report KEK-Report-95-7. [12]
  • Hutton (1991). PEP-II: an asymmetric B factory based on PEP Conference Record of the 1991 IEEE Particle Accelerator Conference 1: 84-86. [13]
  • The ATLAS collaboration, (2016). Search for light dijet resonances with the ATLAS detector using a Trigger-Level Analysis in LHC pp collisions at 13 TeV ATLAS-CONF-2016-030. [14]
  • Khachatryan, V et al. (2016). Search for narrow resonances in dijet final states at 8 TeV with the novel CMS technique of data scouting Phys. Rev. Lett. 117: 031802. [15]
  • Aaij, R et al. (2018). Search for Dark Photons Produced in 13 TeV $pp$ Collisions Phys. Rev. Lett. 120: 061801. [16]
  • Aaij, R et al. (2016). Measurements of prompt charm production cross-sections in $pp$ collisions at 13 TeV JHEP 03: 159. [17]
  • Aaij, R et al. (2018). Computing Model of the Upgrade LHCb experiment CERN-LHCC-2018-014. [18]