broken track simulation
- class module0_flow.misc.broken_track_sim.BrokenTrackSim(**params)
Bases: h5flow.core.H5FlowStage
Generates a realistic broken track distribution by randomly translating reconstructed track hits and removing hits that cross disabled sections of the anode plane.
The algorithm is:
select a random “source” track within the event passing a length selection cut
translate the random track in x,y such that the track is still contained
mask off hits that fall on disabled channels
re-run track reconstruction on new hit distribution
label new tracks as broken according to the overlap of the new track hits with the old source track and the distance of their endpoints
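The translation and masking steps above can be sketched as follows. This is a minimal illustration, not the stage's implementation: the function names, the `disabled_xy` input, and the pixel-index snapping are all assumptions made for the example.

```python
import numpy as np

def random_contained_shift(lo, hi, det_lo, det_hi, rng):
    """Draw a 1D shift that keeps the interval [lo, hi] inside [det_lo, det_hi]."""
    return rng.uniform(det_lo - lo, det_hi - hi)

def translate_and_mask(hit_xy, dx, dy, disabled_xy, pixel_pitch):
    """Shift hits by (dx, dy) and drop those that land on disabled pixels.

    hit_xy:      (N, 2) array of hit x, y positions [mm]
    disabled_xy: (K, 2) array of disabled pixel centers [mm] (hypothetical input)
    pixel_pitch: pixel spacing [mm], used to snap positions to pixel indices
    """
    shifted = hit_xy + np.array([dx, dy])
    # Snap positions to integer pixel indices so they can be compared exactly
    hit_idx = np.round(shifted / pixel_pitch).astype(int)
    disabled_idx = {tuple(p) for p in np.round(disabled_xy / pixel_pitch).astype(int)}
    keep = np.array([tuple(i) not in disabled_idx for i in hit_idx], dtype=bool)
    return shifted[keep], keep
```

The surviving hits would then be passed back through track reconstruction, and the `keep` mask records which source hits were removed.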
- Parameters:
path: str, path to output datasets within HDF5 file
generate_2track_joint_pdf: bool, flag to generate an output .npz file that can be used by the track merging reconstruction
joint_pdf_filename: str, path of output .npz file (if generated)
pdf_bins: list of list, bin description for each parameter in output pdf, each formatted as (log10(min), log10(max), nbins)
rand_track_length_cut: float, track length cut for source track [mm]
broken_track_distance_cut: float, cut on the distance of the 2nd-closest new track endpoint from the closest source endpoint to label a track as broken
tracks_dset_name: str, path to input tracks dataset
hit_drift_dset_name: str, path to charge hit drift data
hits_dset_name: str, path to input charge hits dataset
All of tracks_dset_name, hits_dset_name, and hit_drift_dset_name are required in the cache.
Requires Geometry and DisabledChannels resources in workflow.
offset datatype (1:1 with event):
    id u4, unique identifier
    dx f8, x translation applied to event
    dy f8, y translation applied to event
    i_track i8, index of track within event used as source
label datatype (1:1 with new track dataset):
    id u4, unique identifier
    match u1, 1 if new track is matched to the source track
    broken u1, 1 if new track is broken
    neighbor i4, index of neighboring track
    hit_frac f4, fraction of hits that came from source track
    true_endpoint_d f4(2,), minimum distance endpoints to source track endpoints
    neighbor_deflection_angle f4, deflection angle of track and its neighbor
    neighbor_transverse_sin2theta f4, transverse endpoint angle of track to its neighbor
    neighbor_missing_length f4, missing length of track to its neighbor
    neighbor_overlap f4, overlap of track and its neighbor
    neighbor_sin2theta f4, angle of track and its neighbor
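For reference, the two datatypes listed above can be written as NumPy structured dtypes. This is a sketch transcribed from the field lists; the authoritative definitions are the class attributes offset_dtype and new_track_label_dtype, and the variable names below are chosen for the example.

```python
import numpy as np

# Field names and types transcribed from the documented "offset" datatype
offset_dtype = np.dtype([
    ('id', 'u4'), ('dx', 'f8'), ('dy', 'f8'), ('i_track', 'i8'),
])

# Field names and types transcribed from the documented "label" datatype
label_dtype = np.dtype([
    ('id', 'u4'), ('match', 'u1'), ('broken', 'u1'), ('neighbor', 'i4'),
    ('hit_frac', 'f4'), ('true_endpoint_d', 'f4', (2,)),
    ('neighbor_deflection_angle', 'f4'),
    ('neighbor_transverse_sin2theta', 'f4'),
    ('neighbor_missing_length', 'f4'),
    ('neighbor_overlap', 'f4'),
    ('neighbor_sin2theta', 'f4'),
])

# Example: allocate labels for three new tracks and flag the second as broken
labels = np.zeros(3, dtype=label_dtype)
labels['broken'][1] = 1
```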
The new tracklets dataset datatype is the same as TrackletReconstruction.tracklet_dtype.
- apply_translation(hits, rand_x, rand_y)
- class_version = 3.1.0
- default_pdf_bins = [(), (), (0, 3, 30), (), ()]
- find_matching_tracks(new_tracks, rand_tracks, rand_x, rand_y, track_ids, hits_track_idx)
- finish(source_name)
- generate_random_translation(rand_tracks)
- init(source_name)
- missing_track_segments = 200
- new_track_dtype
- new_track_label_dtype
- offset_dtype
- run(source_name, source_slice, cache)
- select_random_track(tracks)
- setup_reco()
- truth_hit_frac_cut = 0.8
- class module0_flow.misc.broken_track_sim.TrackletMerger(**params)
Bases: h5flow.core.H5FlowStage
Merges existing tracks with neighbors based on a multi-dimensional likelihood ratio metric. The observables used in the likelihood estimation are:
sin^2(theta): angle between the two track segments
transverse distance: maximum transverse displacement of track from the axis of the first track [mm]
missing length: length of line segment between closer two endpoints that crosses active pixels [mm]
overlap: quadrature sum of 1D overlap of tracks in x, y, and z [mm]
delta-dQ/dx: difference in raw dQ/dx [mV]
Requires an input histogram .npz file consisting of 4 arrays:
'{sig}': an array of shape: (N0, N1, ... N4) representing the number of signal events in each bin of the 5 observables
'{sig}_bins': an array of 5 arrays each with shape: Ni+1 representing the bin edges
'{bkg}': an array of shape: (N0, N1, ... N4) representing the number of background events in each bin of the 5 observables
'{bkg}_bins': an array of 5 arrays each with shape: Ni+1 representing the bin edges
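A file in roughly this layout could be produced with NumPy as sketched below. The helper name is hypothetical, the default array names are taken from default_pdf_sig_name and default_pdf_bkg_name, and storing the bin edges as an object array (one entry per observable) is an assumption about how "an array of 5 arrays" is represented.

```python
import numpy as np

def save_joint_pdf(filename, sig_samples, bkg_samples, bin_edges,
                   sig_name='rereco', bkg_name='origin'):
    """Histogram (n_events, 5) observable arrays and save them to an .npz file.

    bin_edges: sequence of 5 edge arrays, one per observable (length Ni + 1)
    """
    sig_hist, _ = np.histogramdd(sig_samples, bins=bin_edges)
    bkg_hist, _ = np.histogramdd(bkg_samples, bins=bin_edges)
    # Edge arrays may differ in length, so store them in an object array
    # (assumption: loading then requires allow_pickle=True)
    edges = np.empty(len(bin_edges), dtype=object)
    for i, e in enumerate(bin_edges):
        edges[i] = np.asarray(e)
    np.savez(filename, **{
        sig_name: sig_hist, sig_name + '_bins': edges,
        bkg_name: bkg_hist, bkg_name + '_bins': edges,
    })
```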
The selection is performed by normalizing the input histograms to a PDF, calculating the signal/background likelihood ratio, and rescaling to a normalized metric between 0 and 1. The p-value (or inefficiency) of this metric is calculated based on the signal histogram. The track merging selection cut is applied on this p-value, e.g. a pvalue_cut = 0.05 will result in a 95% selection efficiency for merging neighboring tracks (at least for the sample used to generate the input histograms).
- Parameters:
pdf_filename: str, path to .npz file containing multi-dimensional pdf (more details above)
pdf_sig_name: str, name of array in .npz file containing the “signal” histogram
pdf_bkg_name: str, name of array in .npz file containing the “background” histogram
pvalue_cut: float, p-value/inefficiency used as cut for likelihood ratio
max_neighbors: int, number of neighbor tracks to attempt merge procedure
track_charge_dset_name: str, path to input charge dataset (1:1 with track hits, requires 'q' field)
hit_drift_dset_name: str, path to charge hit drift data
hits_dset_name: str, path to input charge hits dataset
track_hits_dset_name: str, path to input track-referred charge hits dataset
tracks_dset_name: str, path to input track dataset
merged_dset_name: str, path to output track dataset
All of hits_dset_name, hit_drift_dset_name, track_hits_dset_name, and tracks_dset_name are required in the cache.
Requires both Geometry and DisabledChannels resources in workflow.
merged datatype is the same as the TrackletReconstruction.tracklet_dtype.
Example config:
    track_merge:
        classname: TrackletMerger
        requires:
            - 'combined/tracklets'
            - name: 'combined/track_hits'
              path: ['combined/tracklets', 'charge/hits']
            - name: 'combined/track_hit_drift'
              path: ['combined/tracklets', 'charge/hits', 'combined/hit_drift']
        params:
            merged_dset_name: 'combined/tracklets/merged'
            hit_drift_dset_name: 'combined/hit_drift'
            hits_dset_name: 'charge/hits'
            track_charge_dset_name: 'charge/hits'
            tracks_dset_name: 'combined/tracklets'
            pdf_filename: 'joint_pdf.npz'
            pvalue_cut: 0.10
            max_neighbors: 5
- static calc_2track_deflection_angle(tracks, neighbor)
- static calc_2track_missing_length(tracks, neighbor, missing_track_segments, pixel_x, pixel_y, disabled_channel_lut, cathode_region, pixel_pitch=None)
- static calc_2track_overlap(tracks, neighbor)
- static calc_2track_sin2theta(tracks, neighbor)
- static calc_2track_transverse_sin2theta(tracks, neighbor)
- cathode_region = 15
- class_version = 3.1.0
- static closest_trajectories(tracks0, tracks1)
- Parameters:
tracks0 – track dtype of shape: (..., M,)
tracks1 – track dtype of shape: (..., M,)
- Returns:
start and end points of closest trajectory segments and points of closest approach, shape: (..., M, 3)
- static create_groups(mask)
Combine the masks of an n x n adjacency matrix such that the mask of row i is equal to the OR of the rows that can be reached from i and the rows that can reach i. E.g.:
    arr = [[1,0,1], [0,1,0], [0,0,1]]
    new_arr = create_groups(arr)
    new_arr # [[1,0,1], [0,1,0], [1,0,1]]
and:
    arr = [[0,1,0], [0,0,1], [1,1,0]]
    new_arr = create_groups(arr)
    new_arr # [[1,1,1], [1,1,1], [0,1,1]]
- Parameters:
mask – adjacency matrix (shape: (..., n, n))
- Returns:
updated adjacency matrix (shape: (..., n, n))
- default_hit_drift_dset_name = combined/track_hit_drift
- default_hits_dset_name = charge/hits
- default_max_neighbors = 5
- default_merged_dset_name = combined/tracklets/merged
- default_pdf_bkg_name = origin
- default_pdf_filename = joint_pdf-2_0_1.npz
- default_pdf_sig_name = rereco
- default_pvalue_cut = 0.1
- default_track_charge_dset_name = charge/hits
- default_track_hits_dset_name = combined/track_hits
- default_tracks_dset_name = combined/tracklets
- static find_k_neighbor(tracks, mask=None, k=1)
Find k-th neighbor based on endpoint distance and require no overlap:
tracks is an (N,M) array of tracks
mask is boolean of same shape as tracks
mask true indicates a valid track to search for neighbors
- init(source_name)
- static load_r_values(filename, sig_key, bkg_key)
Load the N-D pdf histogram from an .npz file. Loads and normalizes the histograms stored under {sig_key} and {bkg_key} with bins stored under {key}_bins to create a PDF. The likelihood ratio (R) is then calculated and converted to a normalized value between 0-1 (r) with the following transformation:
    r = 1 - e^(-R)
Bins with 0 entries are assigned an R-value of 0.
- Parameters:
filename – path to .npz file with arrays
sig_key – name of “signal” histogram in .npz file
bkg_key – name of “background” histogram in .npz file
- Returns:
tuple of r histogram (shape: (N0, N1, ...)), r bins in each dimension (shape: (D, Ni)), an array of possible r values (shape: (1001,)), and corresponding p-values (shape: (1001,))
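The normalization, the r = 1 - e^(-R) transformation, and the p-value lookup described above can be sketched as follows. The function name is hypothetical and two points are assumptions: a bin counts as empty if either histogram has no entries there, and the p-value at a threshold is the fraction of signal with r strictly below it.

```python
import numpy as np

def r_and_pvalues(sig_hist, bkg_hist, n_steps=1001):
    """Return the r histogram, candidate r thresholds, and their p-values."""
    sig_hist = np.asarray(sig_hist, dtype=float)
    bkg_hist = np.asarray(bkg_hist, dtype=float)
    sig_pdf = sig_hist / sig_hist.sum()
    bkg_pdf = bkg_hist / bkg_hist.sum()
    # Likelihood ratio R = signal / background; empty bins get R = 0
    # (assumption: "empty" means zero entries in either histogram)
    R = np.zeros_like(sig_pdf)
    filled = (sig_hist > 0) & (bkg_hist > 0)
    R[filled] = sig_pdf[filled] / bkg_pdf[filled]
    r = 1 - np.exp(-R)
    # p-value at threshold t: fraction of signal in bins with r < t
    r_values = np.linspace(0, 1, n_steps)
    order = np.argsort(r, axis=None)
    r_sorted = r.ravel()[order]
    cum_sig = np.concatenate([[0.0], np.cumsum(sig_pdf.ravel()[order])])
    p_values = cum_sig[np.searchsorted(r_sorted, r_values, side='left')]
    return r, r_values, p_values
```

Under this definition, requiring a p-value above 0.05 discards at most 5% of the signal sample, which matches the 95% efficiency quoted for pvalue_cut = 0.05.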
- static make_missing_segment(start1, end1, start2, end2)
- merged_dtype
- missing_track_segments = 150
- static poca(start_xyz0, end_xyz0, start_xyz1, end_xyz1)
Finds the scale factor to point of closest approach of two lines each defined by 2 3D points. The scale factor is a number between 0 and 1 representing the position along the line. To extract the 3D point of closest approach on each line:
    s0, s1 = poca(start0, end0, start1, end1) # shape: (N, 1)
    poca0 = (1 - s0) * start0 + s0 * end0 # shape: (N, 3)
    poca1 = (1 - s1) * start1 + s1 * end1
- Parameters:
{start,end}_xyz(i) – start/end point of line i, shape: (..., N, 3)
- Returns:
tuple of line segment 0 and 1, shape: (..., N, 1)
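For orientation, the scale factors can be computed with the standard closest-approach formula for two lines. The sketch below is an independent illustration, not the module's poca implementation: the function name, the handling of parallel lines, and the final clipping to [0, 1] are assumptions (clipping each factor separately is only an approximation of the true segment-to-segment solution).

```python
import numpy as np

def poca_sketch(start0, end0, start1, end1):
    """Scale factors of the closest approach of two lines, shape (..., N, 1)."""
    u = end0 - start0    # direction of line 0
    v = end1 - start1    # direction of line 1
    w = start0 - start1  # offset between the start points
    a = np.sum(u * u, axis=-1, keepdims=True)
    b = np.sum(u * v, axis=-1, keepdims=True)
    c = np.sum(v * v, axis=-1, keepdims=True)
    d = np.sum(u * w, axis=-1, keepdims=True)
    e = np.sum(v * w, axis=-1, keepdims=True)
    denom = a * c - b * b
    # Parallel lines have no unique solution: pin s0 = 0 and project onto line 1
    parallel = np.abs(denom) < 1e-12
    safe = np.where(parallel, 1.0, denom)
    s0 = np.where(parallel, 0.0, (b * e - c * d) / safe)
    s1 = np.where(parallel, e / np.where(c == 0, 1.0, c), (a * e - b * d) / safe)
    # Restrict to the segments (assumption, see lead-in)
    return np.clip(s0, 0, 1), np.clip(s1, 0, 1)
```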
- run(source_name, source_slice, cache)
- static score_neighbor(r, r_bins, statistic_bins, p_bins, *params)
Calculates a p-value based on a binned, multi-dimensional PDF
- Parameters:
r – likelihood ratio, shape: (N,)*D
r_bins – bin edge for each parameter, shape: (D, N+1)
statistic_bins – bins for statistic, range 0-1, shape: (n,)
p_bins – bins for p value range 0-1, shape: (n,)
*params – array of parameters to use to calculate p-value, requires D parameters in the same sequence as listed in the bins, each with the same shape
- Returns:
array of same shape as the params arrays with a p-value between 0-1
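One way to realize a lookup with this signature is sketched below: each parameter is digitized along its own axis, the r value of the resulting bin is read off, and the p-value is interpolated from the (statistic_bins, p_bins) table. The function name is hypothetical, and both the clamping of out-of-range values to the edge bins and the linear interpolation are assumptions rather than documented behavior.

```python
import numpy as np

def score_neighbor_sketch(r, r_bins, statistic_bins, p_bins, *params):
    """Look up a p-value for each entry of the D parameter arrays."""
    idx = []
    for edges, values, n in zip(r_bins, params, r.shape):
        # Bin index of each value along this observable's axis
        i = np.digitize(values, edges) - 1
        # Out-of-range values are clamped to the first/last bin (assumption)
        idx.append(np.clip(i, 0, n - 1))
    r_val = r[tuple(idx)]
    # Map the r statistic to its p-value by linear interpolation (assumption)
    return np.interp(r_val, statistic_bins, p_bins)
```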