Revision as of 09:14, 4 June 2019

Abstract

Introduction

The CLAS12 is a charged and neutral particle detector constructed in Hall B of Jefferson Lab used to reconstruct nuclear physics interactions that result when GeV energy electrons interact with select targets. CLAS12's precursor, the original CLAS (CEBAF Large Acceptance Spectrometer), was built to facilitate measurements of excited states of nucleons and characterize nucleon-nucleon correlations. Contemporaneous with the 12GeV upgrade to Jefferson Lab's accelerator, CLAS was removed and CLAS12 was built in its place.\autocite{hallBMainPage}

CLAS12 is designed to not only continue the mission of its predecessor, but to also explore even deeper physics such measuring the generalized parton distributions of nucleons and searching for exotic forms of matter. It is currently slated to perform experiments extending for more than ten years into the future and will likely continue to contribute even after that.\autocite{hallBMainPage} Figure \ref{fig:clas12-design} shows a model of the detector with labels detailing some of the sub detector's and other components. It is important to note here that CLAS12 is not a singular detector, but instead a suite of detectors which can be correlated together in order to more precisely reconstruct the interactions that occur within the physics target.

Alongside the construction of CLAS12, several software packages have been created to handle simulating events, reconstructing events from detector data, and visualizing the reconstructed interactions. For simulation, the CLAS Collaboration has developed a package that provides all of the necessary information to GEMC\autocite{gemc} (GEant4 Monte-Carlo), a general program designed to make interfacing with Geant4\autocite{geant4} easier. The program used for viewing events is called CED (CLAS Event Display), and is a continuation of a program originally designed for use with the CLAS. Most importantly for this document, as it is the topic of discussion, is the reconstruction software. Known as clas12-offline-software\autocite{coatjava}, it is a collection of various (mostly) java programs used to process the data collected from CLAS12 and reconstruct the observed particle mass, momentum, and location within the detector.

This thesis is two-fold in purpose. First, it is intended to provide an understanding of the current methods used to reconstruct particle trajectories from the drift chamber sub-detectors of CLAS12. This includes both a theoretical understanding of the underlying math and physics as well as the actual implementation of those ideas. The second purpose is to present an additional method of reconstruction which takes the already existing drift chamber reconstruction output and further refines it so that interactions which arise from a rasterized beam may be reconstructed more accurately.

It should be noted that, in the context of this thesis, the primary motivation behind the development of this new reconstruction algorithm is to allow for the reconstruction of SIDIS (Semi-Inclusive Deep-Inelastic Scattering) events from a particular physics target which contains two polarized target regions separated by some distance. More details will be given about the target in the Simulation and Reconstruction chapter. SIDIS events are those in which one only considers the scattering particle, an electron in this case, and the highest energy hadron produced by the interaction, if any. The core idea is that when the scattering particle interacts with a parton in the struck particle, there will be a primary product hadron accompanied by a hadron shower. The kinematics of this primary product and the scattering particle are believed to encode most of the information about the struck parton.

IMG

Theory

This chapter is meant to provide sufficient knowledge and background to understand the general workings of the CLAS12 drift chambers, the most critical elements of the reconstruction process, and the rasterization of the scattering electrons. With regards to the drift chambers, particular attention is paid to the geometry and physical construction as that directly effects the acceptance of the detector. The elements of the most importance discussed here are the concepts of Runge-Kutta approximations and Kalman Fitters. Both are used heavily in the reconstruction process. And finally, the discussion about rasterized beams gives background as to why this project exists to begin with.

Drift Chambers

The CLAS12 drift chambers are designed to determine the momentum of charged particles moving through them. The following sections provide a brief overview of drift chambers in general.

Typical Construction and Principle of Operation

Drift chambers are charged particle detectors that are designed to measure the trajectory of a particle's path through the detector. This is the primary measurement used to calculate quantities of scientific interest by means of inverse kinematics.

Typically, drift chambers are constructed out of a chamber filled with an ionizable gas and planes of conducting wire. The wire planes are arranged to be ideally perpendicular to the particle's direction of motion and come in two alternating types: cathode planes and anode (sense) planes. The planes are designed to create a nearly-uniform static electric field which will drive electrons towards the sense planes. As described in the next paragraph, the sense wires collect ionized electrons, producing an electronic signal. That electronic signal propagates in both directions along the sense wire and away from the location of the ionization. The signal's propagation is measured by a pair of coupled timers at either end of the wire. It is common to have multiple sets of layers which are oriented differently, though still perpendicularly to the direction of motion, to increase the accuracy of the reconstructed trajectories.\autocite{sauli77}

The general principle of producing an electronic signal in a drift chamber is as follows. A charged particle moves through the drift chamber and ionizes atoms along its path. The electrons, which have been separated from these atoms, are now accelerated towards the anode wires (sense wires). If the anode wire is very thin, the electric field near to it becomes very strong, causing the electrons to pick up a great deal of speed and cause a Townsend avalanche. A Townsend avalanche is a cascading series of ionizations which help to amplify the signal.\autocite{leo87} This signal being a voltage pulse traveling towards either end of the wire from the point where the electrons hit. Using the coupled timers at the ends, it is then possible to use the difference in time of signal propagation to calculate the position along the wire of the hit.

This position along the axis of the wire is only one dimensional information about a particle traveling through 3D space. It is, however, possible to couple the timing information from multiple nearby sense wires and a measurement of the time the liberated electrons take to reach the sense wire to calculate the distance of closest approach (DOCA) to each of them. This then gives a position along the axis of each wire as well as a radius perpendicular to that axis at that point. If all of this information is known perfectly, then the vector component of the path which lies in the plane of a circle defined by the aforementioned radius will be tangent to that circle. Combining that information with the change in hit position along the axis of each wire allows for the ultimate measurement of the particle's path through the detector.

Addition of a Magnetic Field

The inclusion of a magnetic field into a drift chamber allows for the reconstruction of not just the path of the particle, but also the magnitude of its momentum. A uniform magnetic field perpendicular to the particle's direction of motion, for example, would cause the path to bend into some section of a circle, thus changing the expected hit position along the wires of the sense planes. Using these hits, it is then possible to reconstruct the radius of curvature of the path. With the assumption the particle carries the usual elementary charge, it is then possible to calculate the particle's momentum by equating the central force and magnetic force as shown in equation \ref{eq:momentumFromForces}.

\begin{equation} \frac{mv^2}{r}=qvB\implies{}mv=qBr \label{eq:momentumFromForces} \end{equation}

CLAS12

CLAS12's drift chambers are the primary source of path information for interaction reconstruction within the entire detector suite and are the focus of this document. As such, it is important to make as clear as possible their construction.

CLAS12's Drift Chamber Construction

The CLAS12 Detector's drift chamber is broken up into 18 separate sub-chambers as shown in figure \ref{fig:regionsAndSectors}. There are six chambers in each of three regions which are approximately 2.1, 3.3, and 4.5 meters away from the physics target position. Within each region, the six triangular chambers are arranged to be in symmetric sectors around the beam axis with angular coverage between 5\textdegree{} and 40\textdegree{} as measured at the target location with respect to the beam axis.\autocite{dcSpecSheet}

\begin{figure} \includegraphics[width=\textwidth]{./Img/DAF-19-03-24-DC-RegionsAndSectors.png} \caption{The three regions and six sectors per region in the drift chambers.\autocite{clas12DriftChambers}} \label{fig:regionsAndSectors} \end{figure}

Within each region and sector are two superlayers of wire planes which are arranged such that the axes of the wires lie in parallel planes that are rotated by 12\textdegree{} with respect to each other. Within each superlayer are 6 layers of sense wires. Each layer has 112 sense wires which have a hexagonal arrangement of cathode wires around them. These hexagonal arrangements are referred to as cells They grow in size from region 1 to region 3 and provide a spatial resolution of approximately 300\textmu{}m.\autocite{dcSpecSheet}

Magnetic Fields

The CLAS12 detector has two magnetic fields, one solenoidal field at the target and one toroidal field centered around the second drift chamber region. The solenoidal field points primarily in the direction of the beamline with a central value of 5 Tesla, and is designed to prevent Moller electrons, a primary background, from entering the detector.\autocite{solenoidSpecSheet} The toroidal field is actually the result of six smaller coils, one in each sector, which all contribute to create a field that is primarily in the phi (azimuthal about the beam line) direction.\autocite{toroidSpecSheet} This is designed to bend particles either into or away from the beam line through the drift chambers which allows for the reconstruction of the particle's momentum. See figure \ref{fig:magneticFields}.

\begin{figure} \includegraphics[width=\textwidth]{./Img/DAF-19-03-24-DC-MagneticFieldMagnitude.png} \caption{The strength of both the solenoidal (left) and toroidal (center) magnetic fields.} \label{fig:magneticFields} \end{figure}

Track Reconstruction Elements

This section describes the theory behind and implementation of several of the more opaque components of the reconstruction algorithms.

Coordinate Systems

The collaboration has defined two coordinate systems in addition to the lab system to reduce the complexity of the reconstruction process within the drift chambers. The lab coordinate system is defined by a right-handed system such that the positive y-direction is upwards, against the pull of gravity and the positve z-direction is down the beam line. This results in the positive x-direction bisecting what is known as sector one as shown in figure \ref{fig:labCoordinateSystem}.

\begin{figure} \includegraphics[width=\textwidth]{./Img/DAF-19-03-24-DC-LabCoordinateSystem.png} \caption{The lab coordinate system and sectors as seen when looking down the beam line.} \label{fig:labCoordinateSystem} \end{figure}

The sector coordinate system is defined by rotating the sector of interest into the position of sector one. This rotation will naturally be some multiple of 60\textdegree{} and will only take into account $\pm{}30$\textdegree about the 0\textdegree angle. The tilted sector coordinate system takes a sector coordinate system and rotates it about the y-axis by 25\textdegree{} such that the z-direction is now perpendicular to the plane of the drift chambers. Transformations between these different coordinate systems are typically handled by rotation matrices. The matrices \ref{mat:rotx}, \ref{mat:roty}, and \ref{mat:rotz} operate on vectors to rotate about the x, y, and z axes respectively. \autocite{arfken13}

\begin{equation} \begin{bmatrix} 1 & 0 & 0\\ 0 & cos(\theta) & -sin(\theta)\\ 0 & sin(\theta) & cos(\theta) \label{mat:rotx} \end{bmatrix} \end{equation} \begin{equation} \begin{bmatrix} cos(\theta) & 0 & -sin(\theta)\\ 0 & 1 & 0\\ sin(\theta) & 0 & cos(\theta) \label{mat:roty} \end{bmatrix} \end{equation} \begin{equation} \begin{bmatrix} 1 & 0 & 0\\ 0 & cos(\theta) & -sin(\theta)\\ 0 & sin(\theta) & cos(\theta) \label{mat:rotz} \end{bmatrix} \end{equation}

It follows then that to transform from the tilted sector coordinate system to the sector coordinate system, one would use matrix \ref{mat:roty} with $\theta=25\degree{}$. Likewise, to transform from the sector coordinate system for sector n to the lab coordinate system, one would use matrix \ref{mat:rotz} with $\theta=(n-1)30\degree{}$.

Runge-Kutta Approximation

The Runge-Kutta approximations are a family of numerical methods used to approximate solutions to ordinary differential equations. The simplest member of this family is Euler's method as shown in equations \ref{eq:eulerMethod:1} and \ref{eq:eulerMethod:2}. Here it is important to see that $f(t,y)=\frac{df}{dt}$ and h is the step size in t. It is straightforward to see that it is possible to achieve an arbitrary level of accuracy by allowing h to be arbitrarily small.\autocite{griffiths10}

\begin{equation} x_{n+1}=x_{n}+hf(t_{n},y{n}) \label{eq:eulerMethod:1} \end{equation} \begin{equation} t_{n+1}=t_{n}+h \label{eq:eulerMethod:2} \end{equation}

However, it is often either not possible or not feasible to allow h to become arbitrarily small. This has motivated the use of the higher order Runge-Kutta methods as they are more algorithmically efficient at giving approximations of the same quality. This allows for larger step sizes and thus fewer steps. Of these methods, the 4th order method, often referred to simply as "the Runge-Kutta method," is used in the drift chamber reconstruction process. At its core, the 4th order method is a weighted average of 4 different approximations as seen in equations \ref{eq:rungeKutta:1} through \ref{eq:rungeKutta:6}.\autocite{griffiths10}

\begin{equation} k_1=f(t_n,x_n) \label{eq:rungeKutta:1} \end{equation} \begin{equation} k_2=f(t_n+\frac{h}{2},x_n+h\frac{k_1}{2}) \label{eq:rungeKutta:2} \end{equation} \begin{equation} k_3=f(t_n+\frac{h}{2},x_n+h\frac{k_2}{2}) \label{eq:rungeKutta:3} \end{equation} \begin{equation} k_4=f(t_n+h,x_n+hk_3) \label{eq:rungeKutta:4} \end{equation} \begin{equation} x_{n+1}=x_n+\frac{h}{6}(k_1+2k_2+2k_3+k_4) \label{eq:rungeKutta:5} \end{equation} \begin{equation} t_{n+1}=t_n+h \label{eq:rungeKutta:6} \end{equation}

It is important to note here that all members of the Runge-Kutta family do accumulate errors as they step through more iterations. The error in a single step, called the Local Truncation Error, is of the order $O(h^{5})$, and the accumulated error is of the order $O(h^{4})$.\autocite{griffiths10}

Jefferson Lab's software team has implemented the 4th order algorithm into an object called a Swimmer. This Swimmer is given maps of the magnetic fields and a state vector which incudes the position, momentum, and charge of the particle of interest. The swimmer can then be called upon to swim (step) the particle until it meets certain position criteria such as a particular value along the z-axis.

Kalman Fitter

%TODO: NEED TO INCLUDE GENERAL KALMAN FITTING AS WELL AS THE SPECIFICS FROM KFITTER.JAVA

Rasterized Beam

A rastered beam is the intentional movement of the incident electrons, ideally translating without rotation, over the surface of the target. The primary purpose of this is to distribute the energy deposited into the target more uniformly and avoid localized target heating that would result in physical changes to the target. In the case of a cryogenic target, overheating can cause localized target boiling, reducing the effective length of the target, and result in a lower luminosity of the experiment. In the case of a polarized target, overheating can impact the ability of the target to be polarized resulting in a need to recover target polarization through annealing.

As already stated, the ideal case is a raster that purely translates the incident electrons in the plane perpendicular to the electron momentum without any rotation. The underlying assumption used to determine the kinematics of an observed scattering event is that the incident electrons were directed parallel to a coordinate axis. In the CLAS12, the reference coordinate system aligns the z-axis such that it points parallel to the incident electrons momentum. The scattered electron's azimuthal angle with respect to this axis is used to quantify kinematic variables in the reaction and the cross section. Most notably, this angle is used to determine the amount of four-momentum that was transferred to the target. It follows that an angular offset in the incident electron beam would be an error that propagates to the measured kinematics and cross-sections.

Unfortunately, it is nigh impossible to be able to achieve pure translation in a real lab setting. It is preferable then to simply minimize the changes in incident angle to such a degree that the contribution to systematic errors are minimal. In the common case of using a dipole magnet to create raster behavior, it is sufficient that the angular size of the target relative to the magnets position be quite small. This ensures that any particle that hits the target will also have a small relative change in angle compared to any other particle.

Simulation and Reconstruction

This chapter describes the steps taken to determine the efficacy of the proposed reconstruction algorithm. Included are a description of the physical setup of the actual polarized SIDIS dual target, a description of the simulations done to gather test data, and a discussion of both the current and proposed reconstruction algorithms. The ultimate goal of the proposed reconstruction algorithm is to be able to confidently state which of the two subtargets any given event spawned from.

Physical Setup

Figure \ref{fig:forestTarget} provides a general description of the target used in the simulation. In the simulation, the setup is centered in the middle of the solenoid magnet with a gap of four centimeters between the two targets. Each target is two centimeters long and has a radius of one centimeter. Each of the targets can be independently poalarized and may contain either NH\textsubscript{3} or ND\textsubscript{3}.

\begin{figure} \includegraphics[width=\textwidth]{./Img/TF_RGC_DualPoltTargFig_11-16-2017.png} \caption{A description of the general geometry of the polarized dual target.\autocite{forestTarget}} \label{fig:forestTarget} \end{figure}

Simulation

All of the data used to analyze the efficacy of the proposed reconstruction algorithm was generated using GEMC version 4a.2.4. GEMC\autocite{gemc} (GEant4 Monte-Carlo) takes in configuration files called GCards. These GCards direct the program to load in certain tagged releases of the files (geometry, magnetic fields, etc) required to simulate the desired experiment. The CLAS12 software team has created a repository containing various different revisions of tags which can be used.\autocite{clas12Tags} A sample GCard can be found in Appendix A.

Additionally, the GCards allow the user to set certain parameters within the simulation. Of particular importance are scaling the magnetic fields and specifying the manner in which the simulated particles are generated. The magnetic field scales are set by simply passing in the desired scale factor. The event generator is somewhat more complicated. It is broken into four different components: beam momentum, momentum spread, beam position, and position spread. The beam momentum has four parameters: simulated particle, momentum magnitude, momentum polar angle, and momentum azimuthal angle. The momentum spread has three parameters: plus or minus momentum spread, plus or minus polar angle, and plus or minus azimuthal angle. The beam position has three parameters: x-position, y-position, and z-position. the position spread has two parameters: spread radius in the x-y plane and z-position spread. A full description of all available options can be found at https://gemc.jlab.org/gemc/html/documentation/options.html.

It should be noted here that the parameters for the momentum were provided by Dr. Tony Forest\autocite{forestTarget} and that they are indicative of the expected kinematic range of the actual experiment.

\begin{table}[!ht] \centering \begin{tabular}{|c|c|c|c|c|} \hline & e- In Up & e- In Down & e- Out Up & e- Out Down \\ \hline Toroid Scale & -1.0 & -1.0 & 1.0 & 1.0 \\ \hline $|P|$ & $4.75\pm{}3.25GeV$ & $4.75\pm{}3.25GeV$ & $4.75\pm{}3.25GeV$ & $4.75\pm{}3.25GeV$ \\ \hline $\phi$ & $22.5\pm{}17.5\degree$ & $22.5\pm{}17.5\degree$ & $22.5\pm{}17.5\degree$ & $22.5\pm{}17.5\degree$\\ \hline $\theta$ & $0.0\pm{}180.0\degree$ & $0.0\pm{}180.0\degree$ & $0.0\pm{}180.0\degree$ & $0.0\pm{}180.0\degree$\\ \hline $X$ & $0.0cm$ & $0.0cm$ & $0.0cm$ & $0.0cm$ \\ \hline $Y$ & $0.0cm$ & $0.0cm$ & $0.0cm$ & $0.0cm$ \\ \hline $R$ & $1.0cm$ & $1.0cm$ & $1.0cm$ & $1.0cm$ \\ \hline $Z$ & $-3.0\pm{}1.0cm$ & $3.0\pm{}1.0cm$ & $-3.0\pm{}1.0cm$ & $3.0\pm{}1.0cm$ \\ \hline \hline & \textpi{}+ Up & \textpi{}+ Down & \textpi{}- Up & \textpi{}- Down \\ \hline Toroid Scale & 1.0 & 1.0 & -1.0 & -1.0 \\ \hline $|P|$ & $1.8\pm{}1.0GeV$ & $1.8\pm{}1.0GeV$ & $1.8\pm{}1.0GeV$ & $1.8\pm{}1.0GeV$ \\ \hline $\phi$ & $45.0\pm{}45.0\degree$ & $45.0\pm{}45.0\degree$ & $45.0\pm{}45.0\degree$ & $45.0\pm{}45.0\degree$\\ \hline $\theta$ & $0.0\pm{}180.0\degree$ & $0.0\pm{}180.0\degree$ & $0.0\pm{}180.0\degree$ & $0.0\pm{}180.0\degree$\\ \hline $X$ & $0.0cm$ & $0.0cm$ & $0.0cm$ & $0.0cm$ \\ \hline $Y$ & $0.0cm$ & $0.0cm$ & $0.0cm$ & $0.0cm$ \\ \hline $R$ & $1.0cm$ & $1.0cm$ & $1.0cm$ & $1.0cm$ \\ \hline $Z$ & $-3.0\pm{}1.0cm$ & $3.0\pm{}1.0cm$ & $-3.0\pm{}1.0cm$ & $3.0\pm{}1.0cm$ \\ \hline \end{tabular} \caption{Descriptions of the non-default parameters used generate the eight data sets. Inbending and outbending are shortened to In and Out respectively. Upstream and Downstream are shortened to Up and Down respectively.} \label{tab:gemcParams} \end{table}

Four pairs of data sets were generated for this analysis. Each pair consists of one 5000 event data set generated uniformly within each subtarget and was later combined to make a single 10000 event data set. The four ultimate data sets are labled as such: electron-inbending, electron-outbending, postive pion, and negative pion. The two electron set parameters differ only by the scale of the toroid. In one case it is set such that electrons bend into the beam line and in the other case they bend away. In the case of the pions, the toroid scale is set such that particles always bend into the beamline. Table \ref{tab:gemcParams} shows all of the non-default parameters used to generate each data set.

After GEMC has been run, the resulting EVIO (Event Input Output) files must be converted to HIPO (High Performance Output) files. This is done by calling one of the utilities packaged in clas12-offline-software, evio2hipo. Once that is done, the upstream and downstream components of each pair of data sets are merged using another clas12-offline-software program, hipo-utils. Examples for running all of these can be found in Appendix A. All that remains then is to run the HIPO files through the two pre-existing reconstruction algorithms and then the proposed algorithm.

Current Reconstruction Algorithms

There are currently two drift chamber reconstruction algorithms. The first, called Hit Based, feeds into the second, called Time Based. The following paragraphs are brief descriptions of both of them. Appendix B holds lengthy descriptions of both algorithms.

The Hit Based algorithm is the first pass at reconstruction and is largely used to lay groundwork for the Time Based algorithm. In broad strokes, the Hit Based algorithm works as by first collecting all of the drift chamber hits and grouping them by their superlayer. Those groups are then processed and a linear fit is attmepted. That fit is then used to find a position-momentum state vector between the superlayers of each region. Finally, a combination of a Kalman Fitter and the Runge-Kutta based Swimmer are used to reconstructed the trajectory of the particle and, thus, the interaction vertex.

The Time Based algorithm works by taking the tracks already reconstructed by the Hit Based reconstruction and refining them further. First, the groups of hits are regathered and some of the likely-extraneous hits are culled using timing information. Those groups then undergo another test to check whether or not the entire group should be thrown out or not. The Kalman-Fitter and Swimmer are then used again to find the new trajectories and interaction vertices.

Both of these reconstruction algorthims are run on each HIPO file by calling the notsouseful-util program that is packaged as a part of clas12-offline-software. Once again, an example of of the command to call the program can be found in Appendix A.

Proposed Reconstruction Algorithm

The Raster Based algorithm takes the output of both the Time and Hit Based algorithms and attempts to fit them to an off-axis beam position. This is achieved by performing a grid search through a small region of momentum space about a state vector provided by older algorithms. The actual vertex is found by swimming all of these slightly varied state vectors to the beam line and calculating the DOCA. Please note that the source code for this is found in its entirety in Appendix C.

This program currently only exists in a fork off of Jefferson Lab's official GitHub repository. The address for this is https://github.com/friadavi/clas12-offline-software. Should the reader wish to run the program, they may run the fritracking program which is packaged with the rest of the clas12-offline-software in the aforementioned fork. An example of calling the program may be found in Appendix A.

For this analysis, the data sets reconstructed by the Hit and Time based algorithms were reconstructed multiple times while the two grid search parameters, span and samples, were varied independently. The span was varied, incusively, between values of 0.001 and 0.015 (0.1 and 1.5\%). The number of samples was varied, again inclusively, between 3 and 11. This allows for an analysis of the reconstruction as a function of the two independently.

Parameters

This algorithm takes a variety of input parameters which control both the extent and granularity of the two searches involved as well as various other variables. These parameters are listed and explained below.

\begin{description} \item[Input]The name and path to the input file to reconstruct. \item[Output]The name and path of the output file. Any preexisting file will be overwritten. \item[Engine]The output from which engine to reconstruct. The options at present are Hit Based and Time Based. In the event that this parameter is not specified, the algortihm will preferentially take the Time Based if available, and Time Based if not. \item[Solenoid]The scale factor of the magnetic field generated by the solenoid. \item[Toroid]The scale factor of the magnetic field generated by the toroid. \item[Grid Search Span]The span in each cartesian dimension in momentum space about which the grid search will take place. This value is provided as a percentage such that the span of the search is $p_{x,y,z}(1\pm{}value)$. \item[Grid Search Samples]The number of evenly spaced sample points to take in each momentum dimension. This includes the edges of the span, as such this number should never be less than two. It is recommended to always use an odd number, that way the starting value is included in the search. \item[Vertex Search Upper Bound]The upper bound in the z-direction in which to search for the DOCA to the beam postion. \item[Vertex Search Lower Bound]The lower bound in the z-direction in which to search for the DOCA to the beam postion. \item[Vertex Samples]The number of evenly spaced sample points to take in each iteration of the vertex search. This includes upper and lower bounds, as such this number should not be less than two. \item[Vertex Iterations]The number of recursive iterations to go through in the vertex search. \item[Threads]The maximum number of threads that can be used in the reconstruction. \end{description}

Overarching Algorithm

The algorithm begins by parsing commandline options; it stop the reconstruction if any errors occur. After that, the magnetic field maps are initialized and scaled if necessary. Then, for each event within the input file, the reconstruction algorithm is run and the output, if any, is written to the output file.

Reconstruction

This begins by reading in whichever output banks are required, as dictated by the Engine commandline parameter. Then the beam position is read in from the simulation bank. If the event is real data, then the beam position is assumed to be at $x=0, y=0$. There is currently an assumed radius of error of 0.5mm in the beam position. The algorithm then collects a state vector output by the Hit Based and Time Based algorithm. This state vector's position is just upstream of the region 1 drift chamber and will be the starting point of the search. The grid search is run and the output is written to the data banks.

Grid Search

This sub-algorithm starts by computing the upper and lower bounds of the momentum space in which to search. Arrays to store the output of each search point are then created and a ThreadPool is established to allow for the multithreading of the vertex searches for each grid search sample point. A vertex search job thread is then submitted for every sample point in the three dimensional momentum search space. For the Grid Search Samples value n, there are $n^{3}$ sample points, each evenly spaced in all three dimesions.

After all submitted jobs have been completed, the algorithm searches through all of the outputs and identifies interaction vertices which are either within the beam line's margin of error or a local minimum. All other points are culled. Any points whose interaction vertex z value is outside of the vertex search bounds is culled. Finally, the algorithm searches through all surviving interaction vertices and selects for output the one whose initial starting momentum deviated the least from the Hit Based or Time Based momentum.

Vertex Search

This sub-algorithm begins by swimming from the state vector provided by the grid search to each sample z value within the search region. The samples points are evenly spaced throughout the region and include the boundaries. The swim outputs are collected, and the DOCA for each one is calculated. All local minima amongst the DOCAs are added to a list. If this is the last iteration, the algorithm returns the output which corresponds to the smallest DOCA, failing gracefully if no swims performed successfully.

In the event that there are more iterations to go, this algorithm recursively calls itself on each local minima. The bounds of the recursive search search are as follows: if the minimum is at the a boundary of the search region it is recentered on the boundary without changing the actual distance spanned and the function is called without decrementing the iteration count. Otherwise, the search region is recentered on the point associated with the local minimum and the boundaries are shrunk to coincide with the nearest neighbor sample points. The iteration count is decremented and the function is called. The ouput of each new search is collected and the one with the smallest DOCA is returned.

@@ Line 154: / Line 154: @@
 Unfortunately, it is nigh impossible to be able to achieve pure translation in a real lab setting. It is preferable then to simply minimize the changes in incident angle to such a degree that the contribution to systematic errors are minimal. In the common case of using a dipole magnet to create raster behavior, it is sufficient that the angular size of the target relative to the magnets position be quite small. This ensures that any particle that hits the target will also have a small relative change in angle compared to any other particle.
-=Algorithms=
+=Simulation and Reconstruction=
-This chapter contains descriptions of the previously existing drift chamber reconstruction algorithms, Hit Based and Time Based, as well as a description of the algorithm that is the focus of this document, Raster Based.
+This chapter describes the steps taken to determine the efficacy of the proposed reconstruction algorithm. Included are a description of the physical setup of the actual polarized SIDIS dual target, a description of the simulations done to gather test data, and a discussion of both the current and proposed reconstruction algorithms. The ultimate goal of the proposed reconstruction algorithm is to be able to confidently state which of the two subtargets any given event spawned from.
-==Definition of Terms==
+==Physical Setup==
-There are several terms which must be first understood before reading further:
+Figure \ref{fig:forestTarget} provides a general description of the target used in the simulation. In the simulation, the setup is centered in the middle of the solenoid magnet with a gap of four centimeters between the two targets. Each target is two centimeters long and has a radius of one centimeter. Each of the targets can be independently poalarized and may contain either NH\textsubscript{3} or ND\textsubscript{3}.
-\begin{description}
-\item[DOCA]Distance of Closest Approach, i.e. the shortest distance between some point in space and some defined path.
-\item[Hit]A representation of a measurement from an individual wire.
-\item[Cluster]A representation of groups of physically close hits.
-\item[Segment]A representation of a linear fit to a cluster.
-\item[Cross]A representation of a position and momentum unit vector of the particle's trajectory at the midplane between two superlayers as reconstructed from the segments of both superlayers.
-\item[Trajectory]A general position and momentum vector at some point in space.
-\item[Track]A representation of a reconstructed path through the detector.
-\item[Road]A representation of a collection of segments.
-\item[Swimmer]A software tool utilizing a fourth order Runge-Kutta approximation to step, or "swim", a particle's trajectory through the detector.
-\end{description}
-==Hit Based Algorithm==
-The Hit Based algorithm takes the raw drift chamber hit positions, but not any time information beyond that which gives the position along any given wire, and attempts to perform track reconstruction. In general, this is simply used as a step to then perform the Time Based reconstruction.
-===Hits to Clusters===
-Raw hits are all initially sorted into clumps. Clumps are the rawest form of a cluster as they are created by independently taking each superlayer in each sector and adding the hits found therein to the clump. The clump is then processed to remove hits caused by noise. This is determined by whether or not the DOCA for that hit is greater than the size of the cell. The list of clumps is then processed into a list of clusters by removing any clumps which have less than four total hits within them. A linear fit is then performed on the hits which comprise each cluster. The clusters are split into multiple clusters if necessary by means of checking the \textchi{}\textsuperscript{2} of a linear fit. Following that step, the program updates the raw hits with various additional information and stores them as well as the clusters.
-===Clusters to Segments===
+\begin{figure}
-Segments are built by first excluding any clusters with more than fourteen hits from consideration. Then, for each remaining cluster, the information from the linear fit is taken from the tests done in the previous step and set in the segment accordingly. Some segments may then be culled if their average DOCA between the raw hit and the cluster fit line is too large.
+\includegraphics[width=\textwidth]{./Img/TF_RGC_DualPoltTargFig_11-16-2017.png}
+\caption{A description of the general geometry of the polarized dual target.\autocite{forestTarget}}
+\label{fig:forestTarget}
+\end{figure}
-===Segments to Crosses===
+==Simulation==
-Crosses are created by looping over all sectors and regions in order and identifying segments in the first superlayer of each region. For each of those segments, it then loops through again to find a corresponding segment in the second superlayer. A check is done to see if the segments are both physically close and have a consistent slope from one to the other. If the segments pass both of those requirements, the cross is created.
+All of the data used to analyze the efficacy of the proposed reconstruction algorithm was generated using GEMC version 4a.2.4. GEMC\autocite{gemc} (GEant4 Monte-Carlo) takes in configuration files called GCards. These GCards direct the program to load in certain tagged releases of the files (geometry, magnetic fields, etc) required to simulate the desired experiment. The CLAS12 software team has created a repository containing various different revisions of tags which can be used.\autocite{clas12Tags} A sample GCard can be found in Appendix A.
-Crosses are then arranged into short lists which correspond to possible tracks. To do this, all of the crosses are first sorted into a list containing only crosses from its region. Then, for every possible combination of three crosses such that there is always a cross from each region, the algorithm performs an approximate fit. This fit returns the trajectory of the fit at each of the cross positions. If the angle between any pair of corresponding crosses and trajectories is too great, the triplet of crosses is not considered for a track reconstruction. Those do that pass this test are then checked again. This time a projection of the track onto the x-y plane is checked for linearity. Any triplets remaining are now bases for track candidates.
+Additionally, the GCards allow the user to set certain parameters within the simulation. Of particular importance are scaling the magnetic fields and specifying the manner in which the simulated particles are generated. The magnetic field scales are set by simply passing in the desired scale factor. The event generator is somewhat more complicated. It is broken into four different components: beam momentum, momentum spread, beam position, and position spread. The beam momentum has four parameters: simulated particle, momentum magnitude, momentum polar angle, and momentum azimuthal angle. The momentum spread has three parameters: plus or minus momentum spread, plus or minus polar angle, and plus or minus azimuthal angle. The beam position has three parameters: x-position, y-position, and z-position. the position spread has two parameters: spread radius in the x-y plane and z-position spread. A full description of all available options can be found at https://gemc.jlab.org/gemc/html/documentation/options.html.
-===Crosses to Tracks Part I===
+It should be noted here that the parameters for the momentum were provided by Dr. Tony Forest\autocite{forestTarget} and that they are indicative of the expected kinematic range of the actual experiment.
-The algorithm now takes the triplets of crosses from the previous step and attempts to fit actual tracks from them. For each triplet, the first step is to perform another fit, this time parametric. This fit is formed by constraining the parametric trajectory to be tangent to the momentum unit vector of each cross as it passes through them. Any triplets which fail to produce a satisfactory trajectory are culled. At this point the algorithm splits finding track candidates into two branches. The first is the case where the cross triplet is indeed a full triplet without any members missing, and the torus magnet has been scaled down to a value near zero. This is looking for straight tracks in the torus region. Otherwise the algorithm will search for linked helices, one from the torus region and one from the solenoid.
-====Straight Tracks====
+\begin{table}[!ht]
-This sub-algorithm first fits the projections of the crosses onto both the x-z and y-z planes to lines. If the fit is ok for each of those, the track is then assumed to be straight and the interaction vertex is assumed to be wherever the linear fits intersect the $z=0$ plane. The particle is also assumed to be a muon. This track candidate is then further refined by passing it through a Kalman fitter. The candidate is then added to the list of track candidates.
+\centering
+\begin{tabular}{|c|c|c|c|c|}
+\hline
+				& e- In Up					& e- In Down				& e- Out Up					& e- Out Down			\\
+\hline
+Toroid Scale	& -1.0						&	-1.0					& 1.0						& 1.0					\\
+\hline
+$|P|$			& $4.75\pm{}3.25GeV$		& $4.75\pm{}3.25GeV$		& $4.75\pm{}3.25GeV$		& $4.75\pm{}3.25GeV$	\\
+\hline
+$\phi$			& $22.5\pm{}17.5\degree$	& $22.5\pm{}17.5\degree$	& $22.5\pm{}17.5\degree$	& $22.5\pm{}17.5\degree$\\
+\hline
+$\theta$		& $0.0\pm{}180.0\degree$	& $0.0\pm{}180.0\degree$	& $0.0\pm{}180.0\degree$	& $0.0\pm{}180.0\degree$\\
+\hline
+$X$				& $0.0cm$					& $0.0cm$					& $0.0cm$					& $0.0cm$				\\
+\hline
+$Y$				& $0.0cm$					& $0.0cm$					& $0.0cm$					& $0.0cm$				\\
+\hline
+$R$				& $1.0cm$					& $1.0cm$					& $1.0cm$					& $1.0cm$				\\
+\hline
+$Z$				& $-3.0\pm{}1.0cm$			& $3.0\pm{}1.0cm$			& $-3.0\pm{}1.0cm$			& $3.0\pm{}1.0cm$		\\
+\hline
+\hline
+				& \textpi{}+ Up 			& \textpi{}+ Down 			& \textpi{}- Up 			& \textpi{}- Down		\\
+\hline
+Toroid Scale	& 1.0						& 1.0						& -1.0						& -1.0					\\
+\hline
+$|P|$			& $1.8\pm{}1.0GeV$			& $1.8\pm{}1.0GeV$			& $1.8\pm{}1.0GeV$			& $1.8\pm{}1.0GeV$		\\
+\hline
+$\phi$			& $45.0\pm{}45.0\degree$	& $45.0\pm{}45.0\degree$	& $45.0\pm{}45.0\degree$	& $45.0\pm{}45.0\degree$\\
+\hline
+$\theta$		& $0.0\pm{}180.0\degree$	& $0.0\pm{}180.0\degree$	& $0.0\pm{}180.0\degree$	& $0.0\pm{}180.0\degree$\\
+\hline
+$X$				& $0.0cm$					& $0.0cm$					& $0.0cm$					& $0.0cm$				\\
+\hline
+$Y$				& $0.0cm$					& $0.0cm$					& $0.0cm$					& $0.0cm$				\\
+\hline
+$R$				& $1.0cm$					& $1.0cm$					& $1.0cm$					& $1.0cm$				\\
+\hline
+$Z$				& $-3.0\pm{}1.0cm$			& $3.0\pm{}1.0cm$			& $-3.0\pm{}1.0cm$			& $3.0\pm{}1.0cm$		\\
+\hline
+\end{tabular}
+\caption{Descriptions of the non-default parameters used generate the eight data sets. Inbending and outbending are shortened to In and Out respectively. Upstream and Downstream are shortened to Up and Down respectively.}
+\label{tab:gemcParams}
+\end{table}
-====Linked Helices====
+Four pairs of data sets were generated for this analysis. Each pair consists of one 5000 event data set generated uniformly within each subtarget and was later combined to make a single 10000 event data set. The four ultimate data sets are labled as such: electron-inbending, electron-outbending, postive pion, and negative pion. The two electron set parameters differ only by the scale of the toroid. In one case it is set such that electrons bend into the beam line and in the other case they bend away. In the case of the pions, the toroid scale is set such that particles always bend into the beamline. Table \ref{tab:gemcParams} shows all of the non-default parameters used to generate each data set.
-First, track candidates that don't have at least five segments within their crosses are culled. Then four initial fits are performed, one for each combination of one of the region one segments being coupled with one of the region three segments. In these fits, the slopes of the segments are used to calculate the momentum and charge of the particle. Those values are then input into a swimmer which will swim from the region one cross to thex-y plane of the region two cross and then on to the x-y plane of the region three cross. If the \textchi{}\textsuperscript{2} is bad or the integrated magnetic field over the path is zero, the candidate is culled.
-The momentum and charge are reconstructed, again, from the second segment of the region 1 and 3 crosses. If the momentum is greater than 11GeV, it is set to 11GeV. A state vector is formed from combining the region 1 cross information with the momentum. A Kalman fitter is created and attempts to fit the crosses. If this fails, the candidate is culled. The \textchi{}\textsuperscript{2} of the fit between the Kalman fit and the crosses is calcualted, and the candidate is culled if the value is too large. One final check to see if the Kalman fit either failed or if the final state vector produced by it is done. If passed, the candidate is added to the list of track candidates.
+After GEMC has been run, the resulting EVIO (Event Input Output) files must be converted to HIPO (High Performance Output) files. This is done by calling one of the utilities packaged in clas12-offline-software, evio2hipo. Once that is done,  the upstream and downstream components of each pair of data sets are merged using another clas12-offline-software program, hipo-utils. Examples for running all of these can be found in Appendix A. All that remains then is to run the HIPO files through the two pre-existing reconstruction algorithms and then the proposed algorithm.
-===Crosses to Tracks Part II===
+==Current Reconstruction Algorithms==
-Now that the sub-algorithms have finished, the overall algorithm now attempts to extrapolate from the existing information to see if any other track candidates can be constructed. First, any tracks overlapping the track with the best \textchi{}\textsuperscript{2} are culled. The algorithm then constructs roads out of all of the currently existing segments. This process, in essence, simply orders the segments in terms of sectors and superlayers then finds sequences of them which could indicate a track. A list of crosses belonging to those roads is then constructed. Missing road segments, called pseudosegments, are extrapolated from the existing segments.
+There are currently two drift chamber reconstruction algorithms. The first, called Hit Based, feeds into the second, called Time Based. The following paragraphs are brief descriptions of both of them. Appendix B holds lengthy descriptions of both algorithms.
-The pseudosegments are added to the preexisting list of segments, and the process of making crosses is undertaken again. These new crosses are combined with the old ones and used to create new cross triplets. These triplets are then fed into the track candidate finder, as per Crosses to Tracks Part I. Once again, overlapping tracks are culled. If any track candidates have survived, they are written to the Hit Based data banks.
+The Hit Based algorithm is the first pass at reconstruction and is largely used to lay groundwork for the Time Based algorithm. In broad strokes, the Hit Based algorithm works as by first collecting all of the drift chamber hits and grouping them by their superlayer. Those groups are then processed and a linear fit is attmepted. That fit is then used to find a position-momentum state vector between the superlayers of each region. Finally, a combination of a Kalman Fitter and the Runge-Kutta based Swimmer are used to reconstructed the trajectory of the particle and, thus, the interaction vertex.
-==Time Based Algorithm==
+The Time Based algorithm works by taking the tracks already reconstructed by the Hit Based reconstruction and refining them further. First, the groups of hits are regathered and some of the likely-extraneous hits are culled using timing information. Those groups then undergo another test to check whether or not the entire group should be thrown out or not. The Kalman-Fitter and Swimmer are then used again to find the new trajectories and interaction vertices.
-The Time Based algorithm takes the output of the Hit Based Algorithm and includes more time information in order to refine the reconstruction.
-===Hits to Clusters===
+Both of these reconstruction algorthims are run on each HIPO file by calling the notsouseful-util program that is packaged as a part of clas12-offline-software. Once again, an example of of the command to call the program can be found in Appendix A.
-This reconstruction begins by recomposing the clusters from the Hit Based Reconstruction and further refining them. First, for each cluster the algorithm removes secondary hits, which are defined as hits that lie within the same layer as another hit and fail a test comparing the relative DOCA's. The clusters are then refit with the aforementioned \textchi{}\textsuperscript{2} test if necessary to ensure that the most accurate clusters have been found. Next, the algorithm attempts to resolve the left-right ambiguity in the cluster's linear fit, i.e. determine on which side of which wires the particle passed. Finally, the clusters and associated hits are updated with this new information.
-===Clusters to Segments===
+==Proposed Reconstruction Algorithm==
-Each cluster is subjected to a test wherein the average DOCA of the hits within the cluster is calculated and then compared to the average cell size of the cells where those hits were recorded. Clusters which fail this test are culled. The clusters then undergo the same process as the Hit Based clusters. All of the Hit Based tracks are now read into memory and the hit-segment associations are set accordingly.
+The Raster Based algorithm takes the output of both the Time and Hit Based algorithms and attempts to fit them to an off-axis beam position. This is achieved by performing a grid search through a small region of momentum space about a state vector provided by older algorithms. The actual vertex is found by swimming all of these slightly varied state vectors to the beam line and calculating the DOCA. Please note that the source code for this is found in its entirety in Appendix C.
-===Reconstructing Tracks===
+This program currently only exists in a fork off of Jefferson Lab's official GitHub repository. The address for this is https://github.com/friadavi/clas12-offline-software. Should the reader wish to run the program, they may run the fritracking program which is packaged with the rest of the clas12-offline-software in the aforementioned fork. An example of calling the program may be found in Appendix A.
-The algorithm now begins track reconstruction, failing if the Hit Based Tracks were not saved correctly. First, all of the Hit Based track information stored in the data banks are read into an array, and all of the segment-track associations are set. The algorithm now considers each track, failing if the track has less than four segments, and creating crosses in the same fashion as the Hit Based algorithm. These tracks are now run through a Kalman fitter. The track is culled of the interaction vertex is too large. Finally, for each surviving track, the overall trajectory is calculated and associated hits and segements are set accordingly.
-==Raster Based Algorithm==
+For this analysis, the data sets reconstructed by the Hit and Time based algorithms were reconstructed multiple times while the two grid search parameters, span and samples, were varied independently. The span was varied, incusively, between values of 0.001 and 0.015 (0.1 and 1.5\%). The number of samples was varied, again inclusively, between 3 and 11. This allows for an analysis of the reconstruction as a function of the two independently.
-The Raster Based algorithm takes the output of both the Time and Hit Based algorithms and attempts to fit them to an off-axis beam position. This is achieved by performing a grid search through a small region of momentum space about a state vector provided by older algorithms. The actual vertex is found by swimming all of these slightly varied state vectors to and finding their DOCAs to the beam position.
 ===Parameters===
@@ Line 226: / Line 249: @@
 \item[Toroid]The scale factor of the magnetic field generated by the toroid.
 \item[Grid Search Span]The span in each cartesian dimension in momentum space about which the grid search will take place. This value is provided as a percentage such that the span of the search is $p_{x,y,z}(1\pm{}value)$.
-\item[Grid Search Samples]The number of evenly spaced sample points to take in each dimension. This includes the edges of the span, as such this number should never be less than two. It is recommended to always use an odd number, that way the starting value is included in the search.
+\item[Grid Search Samples]The number of evenly spaced sample points to take in each momentum dimension. This includes the edges of the span, as such this number should never be less than two. It is recommended to always use an odd number, that way the starting value is included in the search.
 \item[Vertex Search Upper Bound]The upper bound in the z-direction in which to search for the DOCA to the beam postion.
 \item[Vertex Search Lower Bound]The lower bound in the z-direction in which to search for the DOCA to the beam postion.
@@ Line 234: / Line 257: @@
 \end{description}
-===Overaching Alogrithm===
+===Overarching Algorithm===
 The algorithm begins by parsing commandline options; it stop the reconstruction if any errors occur. After that, the magnetic field maps are initialized and scaled if necessary. Then, for each event within the input file, the reconstruction algorithm is run and the output, if any, is written to the output file.

Difference between revisions of "DF Thesis"