Back to home page

sPhenix code displayed by LXR

 
 

    


Warning, /acts/docs/core/reconstruction/track_fitting.md is written in an unsupported language. File is not indexed.

0001 # Track Fitting
0002 
0003 The track fitting algorithms estimate the track parameters.
0004 It is part of the pattern recognition/track  reconstruction/tracking.
0005 We can run the track fitting algorithms, after we allocated all hits to single tracks with the help of a track finding algorithm.
0006 It is not necessary, that all points of a track are present.
0007 
0008 Currently, we have implementations for three different fitters:
0009 * Kalman Filter (KF)
0010 * Gaussian Sum Filter (GSF)
0011 * Global Chi-Square Fitter (GX2F) [in development]
0012 Even though all of them are least-squares fits, the concepts are quite different.
0013 Therefore, we should not expect identical results from all of them.
0014 
0015 (kf_core)=
0016 ## Kalman Filter (KF) [wip]
0017 The Kalman Filter is an iterative fitter.
0018 It successively combines measurements to obtain an estimate of the track parameters.
0019 The KF needs an estimate as a starting point. The procedure alternates between two methods:
0020 1. Extrapolate the current state to the next surface.
0021 2. Update the extrapolation using the measurement of the new surface.[^billoir]
0022 The meaning of "this surface" and "the next surface" changes with the context.
0023 There are three different interpretations for this.
0024 The KF can give us those three interpretations as sets of track parameters:
0025     * predicted: Uses "older" data (i.e. from the last surfaces) to make the prediction. This prediction is an extrapolation from the old data onto the current surface.
0026     * filtered: Uses the "current" data (i.e. the predicted data updated with the measurement on the current surface). It is some kind of weighted mean.
0027     * smoothed: Uses the "future" data to predict the current parameters. This can only be evaluated if the whole propagation is finished once. This can be done in to ways: one uses backwards-propagation and one does not.
0028 
0029 :::{todo}
0030 Complete Kalman Filter description
0031 :::
0032 
0033 (gsf_core)=
0034 ## Gaussian Sum Filter (GSF)
0035 
0036 The GSF is an extension of the Kalman-Filter that allows to handle non-gaussian errors by modelling the track state as a gaussian mixture:
0037 
0038 $$
0039 p(\vec{x}) = \sum_i w_i \varphi(\vec{x}; \mu_i, \Sigma_i), \quad \sum_i w_i = 1
0040 $$
0041 
0042 A common use case of this is electron fitting. The energy-loss of Bremsstrahlung for electrons in matter are highly non-Gaussian, and thus cannot be modelled accurately by the default material interactions in the Kalman Filter. Instead, the Bremsstrahlung is modelled as a Bethe-Heitler distribution, where $z$ is the fraction of the energy remaining after the interaction ($E_f/E_i$), and $t$ is the material thickness in terms of the radiation length:
0043 
0044 $$
0045 f(z) = \frac{(- \ln z)^{c-1}}{\Gamma(c)}, \quad c = t/\ln 2
0046 $$
0047 
0048 (figBetheHeitler)=
0049 :::{figure} figures/gsf_bethe_heitler_approx.svg
0050 :width: 450px
0051 :align: center
0052 The true Bethe-Heitler distribution compared with a gaussian mixture approximation (in thin lines the individual components are drawn) at t = 0.1 (corresponds to ~ 10mm Silicon).
0053 :::
0054 
0055 To be able to handle this with the Kalman filter mechanics, this distribution is approximated by a gaussian mixture as well (see {numref}`figBetheHeitler`). The GSF Algorithm works then as follows (see also {numref}`figGsf`)
0056 
0057 * On a surface with material, the Bethe-Heitler energy-loss distribution is approximated with a fixed number of gaussian components for each component. Since this way the number of components would grow exponentially with each material interaction, components that are close in terms of their *Kullback–Leibler divergence* are merged to limit the computational cost.
0058 * On a measurement surface, for each component a Kalman update is performed. Afterwards, the component weights are corrected according to each component's compatibility with the measurement.
0059 
0060 (figGsf)=
0061 :::{figure} figures/gsf_overview.svg
0062 :width: 450px
0063 :align: center
0064 Simplified overview of the GSF algorithm.
0065 :::
0066 
0067 ### The Multi-Stepper
0068 To implement the GSF, a special stepper is needed, that can handle a multi-component state internally: The {class}`Acts::MultiEigenStepperLoop`, which is based on the {class}`Acts::EigenStepper` and thus shares a lot of code with it. It interfaces to the navigation as one aggregate state to limit the navigation overhead, but internally processes a multi-component state. How this aggregation is performed can be configured via a template parameter, by default weighted average is used ({struct}`Acts::WeightedComponentReducerLoop`).
0069 
0070 Even though the multi-stepper interface exposes only one aggregate state and thus is compatible with most standard tools, there is a special aborter is required to stop the navigation when the surface is reached, the {struct}`Acts::MultiStepperSurfaceReached`. It checks if all components have reached the target surface already and updates their state accordingly. Optionally, it also can stop the propagation when the aggregate state reaches the surface.
0071 
0072 
0073 ### Using the GSF
0074 
0075 The GSF is implemented in the class {struct}`Acts::GaussianSumFitter`. The interface of its `fit(...)`-functions is very similar to the one of the {class}`Acts::KalmanFitter` (one for the standard {class}`Acts::Navigator` and one for the {class}`Acts::DirectNavigator` that takes an additional `std::vector<const Acts::Surface *>` as an argument):
0076 
0077 ```{doxygenstruct} Acts::GaussianSumFitter
0078 ---
0079 members: fit
0080 outline:
0081 ---
0082 ```
0083 
0084 The fit can be customized with several options. Important ones are:
0085 * *maximum components*: How many components at maximum should be kept.
0086 * *weight cut*: When to drop components.
0087 * *component merging*: How a multi-component state is reduced to a single set of parameters and covariance. The method can be chosen with the enum {enum}`Acts::ComponentMergeMethod`. Two methods are supported currently:
0088     * The *mean* computes the mean and the covariance of the mean.
0089     * *max weight* takes the parameters of component with the maximum weight and computes the variance around these. This is a cheap approximation of the mode, which is not implemented currently.
0090 * *mixture reduction*: How the number of components is reduced to the maximum allowed number. Can be configured via a {class}`Acts::Delegate`:
0091     * *Weight cut*: Keep only the N components with the largest weights. Implemented in {func}`Acts::reduceMixtureLargestWeights`.
0092     * *KL distance*: Merge the closest components until the required amount is reached. The distance measure is the *Kullback-Leibler distance* in the *q/p* component. Implemented in {func}`Acts::reduceMixtureWithKLDistance`.
0093 
0094 :::{note}
0095 A good starting configuration is to use 12 components, the *max weight* merging and the *KL distance* reduction.
0096 :::
0097 
0098 All options can be found in the {struct}`Acts::GsfOptions`:
0099 
0100 ```{doxygenstruct} Acts::GsfOptions
0101 ---
0102 outline:
0103 ---
0104 ```
0105 
0106 If the GSF finds the column with the string identifier *"gsf-final-multi-component-state"* (defined in `Acts::GsfConstants::kFinalMultiComponentStateColumn`) in the track container, it adds the final multi-component state to the track as a `std::optional<Acts::MultiComponentBoundTrackParameters<SinglyCharged>>` object.
0107 
0108 A GSF example can be found in the ACTS Examples Framework [here](https://github.com/acts-project/acts/blob/main/Examples/Scripts/Python/truth_tracking_gsf.py).
0109 
0110 ### Customising the Bethe-Heitler approximation
0111 
0112 The GSF needs an approximation of the Bethe-Heitler distribution as a Gaussian mixture on each material interaction (see above). This task is delegated to a separate class, that can be provided by a template parameter to {struct}`Acts::GaussianSumFitter`, so in principle it can be implemented in different ways.
0113 
0114 However, ACTS ships with the class {class}`Acts::AtlasBetheHeitlerApprox` that implements the ATLAS strategy for this task: To be able to evaluate the approximation of the Bethe-Heitler distribution for different materials and thicknesses, the individual Gaussian components (weight, mean, variance of the ratio $E_f/E_i$) are parametrised as polynomials in $x/x_0$. This class can load files in the ATLAS format that can be found [here](https://gitlab.cern.ch/atlas/athena/-/tree/main/Tracking/TrkFitter/TrkGaussianSumFilter/Data). A default parameterization can be created with {func}`Acts::makeDefaultBetheHeitlerApprox`.
0115 
0116 The {class}`Acts::AtlasBetheHeitlerApprox` is constructed with two parameterizations, allowing to use different parameterizations for different $x/x_0$. In particular, it has this behaviour:
0117 * $x/x_0 < 0.0001$: Return no change
0118 * $x/x_0 < 0.002$: Return a single gaussian approximation
0119 * $x/x_0 < 0.1$: Return the approximation for low $x/x_0$.
0120 * $x/x_0 \geq 0.1$: Return the approximation for high $x/x_0$. The maximum possible value is $x/x_0 = 0.2$, for higher values it is clipped to 0.2 and the GSF emits a warning.
0121 
0122 ### Further reading
0123 
0124 * *Thomas Atkinson*, Electron reconstruction with the ATLAS inner detector, 2006, see [here](https://cds.cern.ch/record/1448253)
0125 * *R Frühwirth*, Track fitting with non-Gaussian noise, 1997, see [here](https://doi.org/10.1016/S0010-4655(96)00155-5)
0126 * *R Frühwirth*, A Gaussian-mixture approximation of the Bethe–Heitler model of electron energy loss by bremsstrahlung, 2003, see [here](https://doi.org/10.1016/S0010-4655(03)00292-3)
0127 
0128 (gx2f_core)=
0129 ## Global Chi-Square Fitter (GX2F)
0130 
0131 In general the *GX2F* is a weighted least squares fit, minimising the
0132 
0133 $$
0134 \chi^2 = \sum_i \frac{r_i^2}{\sigma_i^2}
0135 $$
0136 
0137 of a track.
0138 Here, $r_i$ are our residuals that we weight with $\sigma_i^2$, the covariance of the measurement (a detector property).
0139 Unlike the *KF* and the *GSF*, the *GX2F* looks at all measurements at the same time and iteratively minimises the starting parameters.
0140 
0141 With the *GX2F* we can obtain the final parameters $\vec\alpha_n$ from starting parameters $\vec\alpha_0$.
0142 We set the $\chi^2 = \chi^2(\vec\alpha)$ as a function of the track parameters, but the $\chi^2$-minimisation could be used for many other problems.
0143 Even in the context of track fitting, we are quite free on how to use the *GX2F*.
0144 Especially the residuals $r_i$ can have many interpretations.
0145 Most of the time we will see them as the distance between a measurement and our prediction.
0146 But we can also use scattering angles, energy loss, ... as residuals.
0147 Therefore, the subscript $i$ stands most of the time for a measurement surface, since we want to go over all of them.
0148 
0149 This chapter on the *GX2F* guides through:
0150 - Mathematical description of the base algorithm
0151 - Mathematical description of the multiple scattering
0152 - (coming soon) Mathematical description of the energy loss
0153 - Implementation in ACTS
0154 - Pros/Cons
0155 
0156 ### Mathematical description of the base algorithm
0157 
0158 :::{note}
0159 The mathematical derivation is shortened at some places.
0160 There will be a publication including the full derivation coming soon.
0161 :::
0162 
0163 To begin with, there will be a short overview on the algorithm.
0164 Later in this section, each step is described in more detail.
0165 1. Minimise the $\chi^2$ function
0166 2. Update the initial parameters (iteratively)
0167 3. Calculate the covariance for the final parameters
0168 
0169 But before going into detail, we need to introduce a few symbols.
0170 As already mentioned, we have our track parameters $\vec\alpha$ that we want to fit.
0171 To fit them we, we need to calculate our residuals as
0172 
0173 $$
0174 r_i = m_i - f_i^m(\vec\alpha)
0175 $$
0176 
0177 where $f^m(\vec\alpha)$ is the projection of our propagation function $f(\vec\alpha)$ into the measurement dimension.
0178 Basically, if we have a pixel measurement we project onto the surface, discarding all angular information.
0179 This projection could be different for each measurement surface.
0180 
0181 #### 1. Minimise the $\chi^2$ function
0182 
0183 We expect the minimum of the $\chi^2$ function at
0184 
0185 $$
0186 \frac{\partial\chi^2(\vec\alpha)}{\partial\vec\alpha} = 0.
0187 $$
0188 
0189 To find the zero(s) of this function we could use any method, but we will stick to a modified [Newton-Raphson method](https://en.wikipedia.org/wiki/Newton%27s_method),
0190 since it requires just another derivative of the $\chi^2$ function.
0191 
0192 #### 2. Update the initial parameters (iteratively)
0193 
0194 Since we are using the Newton-Raphson method to find the minimum of the $\chi^2$ function, we need to iterate.
0195 Each iteration (should) give as improved parameters $\vec\alpha$.
0196 While iterating we update a system, therefore we want to bring it in this form:
0197 
0198 $$
0199 \vec\alpha_{n+i} = \vec\alpha_n + \vec{\delta\alpha}_n.
0200 $$
0201 
0202 After some derivations of the $\chi^2$ function and the Newton-Raphson method, we find matrix equation to calculate $\vec{\delta\alpha}_n$:
0203 
0204 $$
0205 [a_{kl}] \vec{\delta\alpha}_n = \vec b
0206 $$
0207 
0208 with
0209 
0210 $$
0211 a_{kl} = \sum_{i=1}^N \frac{1}{\sigma_i^2} \frac{\partial f_i^m(\vec\alpha)}{\partial \alpha_k}\frac{\partial f_i^m(\vec\alpha)}{\partial \alpha_l}\\
0212 $$
0213 
0214 (where we omitted second order derivatives) and
0215 
0216 $$
0217 b_k = \sum_{i=1}^N \frac{r_i}{\sigma_i^2} \frac{\partial f_i^m(\vec\alpha)}{\partial \alpha_k}.
0218 $$
0219 
0220 At first sight, these expression might seem intimidating and hard to compute.
0221 But having a closer look, we see, that those derivatives already exist in our framework.
0222 All derivatives are elements of the Jacobian
0223 
0224 $$
0225 \mathbf{J} = \begin{pmatrix}
0226                  \cdot & \dots & \cdot\\
0227                  \vdots & \frac{\partial f^m(\vec\alpha)}{\partial \alpha_k} & \vdots\\
0228                  \cdot & \dots & \cdot
0229              \end{pmatrix}.
0230 $$
0231 
0232 At this point we got all information to perform a parameter update and repeat until the parameters $\vec\alpha$ converge.
0233 
0234 #### 3. Calculate the covariance for the final parameters
0235 
0236 The calculation of the covariance of the final parameters is quite simple compared to the steps before:
0237 
0238 $$
0239 cov_{\vec\alpha} = [a_{kl}]^{-1}
0240 $$
0241 
0242 Since it only depends on the $[a_{kl}]$ of the last iteration, the *GX2F* does not need an initial estimate for the covariance.
0243 
0244 ### Mathematical description of the multiple scattering
0245 
0246 To describe multiple scattering, the *GX2F* can fit the scattering angles as they were normal parameters.
0247 Of course, fitting more parameters increases the dimensions of all matrices.
0248 This makes it computationally more expensive to.
0249 
0250 But first shortly recap on multiple scattering.
0251 To describe scattering, on a surface, only the two angles $\theta$ and $\phi$ are needed, where:
0252 - $\theta$ is the angle between the extrapolation of the incoming trajectory and the scattered trajectory
0253 - $\phi$ is the rotation around the extrapolation of the incoming trajectory
0254 
0255 This description is only valid for thin materials.
0256 To model thicker materials, one could in theory add multiple thin materials together.
0257 It can be shown, that it is enough to two sets of $\theta$ and $\phi$ on both sides of the material.
0258 We could name them $\theta_{in}$, $\theta_{out}$, $\phi_{in}$, and $\phi_{out}$.
0259 But in the end they are just multiple parameters in our fit.
0260 That's why we will only look at $\theta$ and $\phi$ (like for thin materials).
0261 
0262 By defining residuals and covariances for the scattering angles, we can put them into our $\chi^2$ function.
0263 For the residuals we choose (since the expected angle is 0)
0264 
0265 $$
0266 r_s = \begin{cases}
0267          0 - \theta_s(\vec\alpha) \\
0268          0 - \sin(\theta_{loc})\phi_s(\vec\alpha)
0269       \end{cases}
0270 $$
0271 
0272 with $\theta_{loc}$ the angle between incoming trajectory and normal of the surface.
0273 (We cannot have angle information $\phi$ if we are perpendicular.)
0274 For the covariances we use the Highland form [^cornelissen] as
0275 
0276 $$
0277 \sigma_{scat} = \frac{13.6 \text{MeV}}{\beta c p} Z\prime \sqrt{\frac{x}{X_0}} \left( 1 + 0.038 \ln{\frac{x}{X_0}} \right)
0278 $$
0279 
0280 with
0281 - $x$ ... material layer with thickness
0282 - $X_0$ ... radiation length
0283 - $p$ ... particle momentum
0284 - $Z\prime$ ... charge number
0285 - $\beta c$ ... velocity
0286 
0287 Combining those terms we can write our $\chi^2$ function including multiple scattering as
0288 
0289 $$
0290 \chi^2 = \sum_{i=1}^N \frac{r_i^2}{\sigma_i^2} + \sum_{s}^S \left(\frac{\theta_s^2}{\sigma_s^2} + \frac{\sin^2{(\theta_{loc})}\phi_s^2}{\sigma_s^2}\right)
0291 $$
0292 
0293 Note, that both scattering angles have the same covariance.
0294 
0295 ### (coming soon) Mathematical description of the energy loss [wip]
0296 
0297 :::{todo}
0298 Write *GX2F*: Mathematical description of the energy loss
0299 
0300 The development work on the energy loss has not finished yet.
0301 :::
0302 
0303 ### Implementation in ACTS
0304 
0305 The implementation is in some points similar to the KF, since the KF interface was chosen as a starting point.
0306 This makes it easier to replace both fitters with each other.
0307 The structure of the *GX2F* implementation follows coarsely the mathematical outline given above.
0308 It is best to start reading the implementation from `fit()`:
0309 1. Set up the fitter:
0310    - Actor
0311    - Aborter
0312    - Propagator
0313    - Variables we need longer than one iteration
0314 2. Iterate
0315    1. Update parameters
0316    2. Propagate through geometry
0317    3. Collect:
0318        - Residuals
0319        - Covariances
0320        - Projected Jacobians
0321    4. Loop over collection and calculate and sum over:
0322        - $\chi^2$
0323        - $[a_{kl}]$
0324        - $\vec b$
0325    5. Solve $[a_{kl}] \vec{\delta\alpha}_n = \vec b$
0326    6. Check for convergence
0327 3. Calculate covariance of the final parameters
0328 4. Prepare and return the final track
0329 
0330 #### Configuration
0331 
0332 Here is a simplified example of the configuration of the fitter.
0333 
0334 ```cpp
0335 template <typename traj_t>
0336 struct Gx2FitterOptions {
0337 Gx2FitterOptions( ... ) : ... {}
0338 
0339 Gx2FitterOptions() = delete;
0340 
0341 ... 
0342 //common options:
0343 // geoContext, magFieldContext, calibrationContext, extensions,
0344 // propagatorPlainOptions, referenceSurface, multipleScattering,
0345 // energyLoss, freeToBoundCorrection
0346 
0347 /// Max number of iterations during the fit (abort condition)
0348 size_t nUpdateMax = 5;
0349 
0350 /// Disables the QoP fit in case of missing B-field
0351 bool zeroField = false;
0352 
0353 /// Check for convergence (abort condition). Set to 0 to skip.
0354 double relChi2changeCutOff = 1e-7;
0355 };
0356 ```
0357 
0358 Common options like the geometry context or toggling of the energy loss are similar to the other fitters.
0359 For now there are three *GX2F* specific options:
0360 1. `nUpdateMax` sets an abort condition for the parameter update as a maximum number of iterations allowed.
0361 We do not really want to use this condition, but it stops the fit in case of poor convergence.
0362 2. `zeroField` toggles the q/p-fit.
0363 If there is no magnetic field, we get no q/p-information.
0364 This would crash the fitter when needing matrix inverses.
0365 When this option is set to `true`, most of the matrices will omit the q/p-rows and -columns.
0366 3. `relChi2changeCutOff` is the desired convergence criterion.
0367 We compare at each step of the iteration the current to the previous $\chi^2$.
0368 If the relative change is small enough, we finish the fit.
0369 
0370 ### Pros/Cons
0371 
0372 There are some reasons for and against the *GX2F*.
0373 The biggest issue of the *GX2F* is its performance.
0374 Currently, the most expensive part is the propagation.
0375 Since we need to do a full propagation each iteration, we end up with at least 4-5 full propagation.
0376 This is a lot compared to the 2 propagations of the *KF*.
0377 However, since the *GX2F* is a global fitter, it can easier resolve left-right-ambiguous measurements, like in the TRT (Transition Radiation Tracker – straw tubes).
0378 
0379 [^billoir]: [https://twiki.cern.ch/twiki/pub/LHCb/ParametrizedKalman/paramKalmanV01.pdf](https://twiki.cern.ch/twiki/pub/LHCb/ParametrizedKalman/paramKalmanV01.pdf)
0380 [^cornelissen]: [https://cds.cern.ch/record/1005181/files/thesis-2006-072.pdf#page=80](https://cds.cern.ch/record/1005181/files/thesis-2006-072.pdf)