gnn_tracking.preprocessing.point_cloud_builder

Build point clouds from the input data files.

Module Contents

Classes

PointCloudBuilder

Build point clouds, that is, read the input data files and convert them to PyTorch Geometric data objects (without any edges yet).

Functions

get_truth_edge_index(→ numpy.ndarray)

Get edge index for all edges, connecting hits of the same particle_id.

Attributes

DEFAULT_FEATURES

_DEFAULT_FEATURE_SCALE

gnn_tracking.preprocessing.point_cloud_builder.get_truth_edge_index(pids: numpy.ndarray) → numpy.ndarray

Get edge index for all edges, connecting hits of the same particle_id. To save space, only edges in one direction are returned.
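The idea of a one-directional truth edge index can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the library's code: the function name `truth_edge_index_sketch` is hypothetical, every particle ID (including 0) is treated the same way, and "one direction" is taken to mean that only pairs with source index smaller than target index are kept.

```python
import numpy as np


def truth_edge_index_sketch(pids: np.ndarray) -> np.ndarray:
    """Hypothetical illustration; not the gnn_tracking implementation."""
    pids = np.asarray(pids)
    # same[i, j] is True when hits i and j carry the same particle ID.
    same = pids[:, None] == pids[None, :]
    # Keep only pairs with i < j, so each edge appears in one direction
    # and self-loops are excluded.
    upper = np.triu(same, k=1)
    # Stack source and target indices into an array of shape (2, n_edges).
    return np.stack(np.nonzero(upper))


edges = truth_edge_index_sketch(np.array([1, 2, 1, 2, 1]))
print(edges)
```

Note that the dense comparison matrix in this sketch needs memory quadratic in the number of hits, so it is only suitable for small inputs.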

gnn_tracking.preprocessing.point_cloud_builder.DEFAULT_FEATURES = ('r', 'phi', 'z', 'eta_rz', 'u', 'v', 'charge_frac', 'leta', 'lphi', 'lx', 'ly', 'lz', 'geta', 'gphi')

gnn_tracking.preprocessing.point_cloud_builder._DEFAULT_FEATURE_SCALE

class gnn_tracking.preprocessing.point_cloud_builder.PointCloudBuilder(*, outdir: str | pathlib.PurePath, indir: str | pathlib.PurePath, detector_config: pathlib.PurePath, n_sectors: int, redo: bool = True, pixel_only: bool = True, sector_di: float = 0.0001, sector_ds: float = 1.1, measurement_mode: bool = False, thld: float = 0.5, remove_noise: bool = False, write_output: bool = True, log_level=logging.INFO, collect_data: bool = True, feature_names: tuple = DEFAULT_FEATURES, feature_scale: tuple = _DEFAULT_FEATURE_SCALE, add_true_edges: bool = False)

Build point clouds, that is, read the input data files and convert them to PyTorch Geometric data objects (without any edges yet).

Parameters:
  • outdir – Directory for the output files

  • indir – Directory for the input files

  • detector_config – Path to the detector configuration file

  • n_sectors – Total number of sectors

  • redo – Re-compute the point cloud even if it is found

  • pixel_only – Construct tracks only from pixel layers

  • sector_di – The intercept offset for the extended sector

  • sector_ds – The slope offset for the extended sector

  • measurement_mode – Produce statistics about the sectorization

  • thld – Transverse momentum (pt) threshold for measurements

  • remove_noise – Remove hits with particle_id==0

  • write_output – Store the point clouds in a torch .pt file

  • log_level – Specify INFO (0) or DEBUG (>0)

  • collect_data – Collect data in memory

  • feature_names – Names of features to add

  • feature_scale – Scale of features

  • add_true_edges – Add true edges to the point cloud
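A minimal usage sketch, based only on the signature and parameter descriptions above. All paths and the sector count are placeholders, and `build_point_clouds` is a hypothetical helper introduced for illustration; the import is deferred into the helper so that the configuration can be inspected without gnn_tracking installed.

```python
from pathlib import Path

# Keyword-only arguments, as required by the constructor signature.
# Every path and value below is a placeholder, not a recommended setting.
builder_kwargs = dict(
    indir=Path("path/to/input_events"),
    outdir=Path("path/to/point_clouds"),
    detector_config=Path("path/to/detector_config_file"),
    n_sectors=8,
    redo=False,          # skip point clouds that already exist
    pixel_only=True,     # use pixel layers only
    remove_noise=False,  # keep hits with particle_id == 0
)


def build_point_clouds(kwargs: dict, start=None, stop=None):
    """Hypothetical helper: construct the builder and process a file range."""
    from gnn_tracking.preprocessing.point_cloud_builder import PointCloudBuilder

    builder = PointCloudBuilder(**kwargs)
    builder.process(start=start, stop=stop)
    return builder
```

Calling `build_point_clouds(builder_kwargs, start=0, stop=10)` would then process the first files found in `indir`, assuming the directories and the detector configuration file exist.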

calc_eta(r: numpy.ndarray, z: numpy.ndarray) → numpy.ndarray

Compute pseudorapidity (spatial).
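For reference, spatial pseudorapidity is conventionally defined from the polar angle θ of the hit position relative to the z axis as η = −ln tan(θ/2). The sketch below applies that textbook definition; `eta_from_rz` is a hypothetical name, and whether the library method uses exactly this formulation is an assumption.

```python
import numpy as np


def eta_from_rz(r: np.ndarray, z: np.ndarray) -> np.ndarray:
    """Textbook pseudorapidity from cylindrical r and z (illustrative)."""
    # Polar angle measured from the +z axis; arctan2 handles z <= 0.
    theta = np.arctan2(r, z)
    # eta = -ln(tan(theta / 2)); equivalent to arcsinh(z / r) for r > 0.
    return -np.log(np.tan(theta / 2.0))


r = np.array([1.0, 1.0, 2.0])
z = np.array([0.0, 1.0, -2.0])
eta = eta_from_rz(r, z)
```

A hit in the transverse plane (z = 0) has η = 0, and η changes sign with z.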

restrict_to_subdetectors(hits: pandas.DataFrame, cells: pandas.DataFrame) → tuple[pandas.DataFrame, pandas.DataFrame]

Rename (volume, layer) pairs with an integer label.

append_features(hits: pandas.DataFrame, particles: pandas.DataFrame, truth: pandas.DataFrame, cells: pandas.DataFrame) → pandas.DataFrame

Add additional features to the hits dataframe and return it.

sector_hits(hits: pandas.DataFrame, sector_id: int, particle_id_counts: dict[int, int]) → pandas.DataFrame

Break an event into (optionally) extended sectors.

_get_edge_index(particle_id: numpy.ndarray) → torch.Tensor

to_pyg_data(hits: pandas.DataFrame) → torch_geometric.data.Data

Build the output data structure.

get_measurements() → dict[str, float]

process(start: int | None = None, stop: int | None = None, ignore_loading_errors=False)

Process input files from self.input_files and write output files to self.output_files.

Parameters:
  • start – index of first file to process

  • stop – index of last file to process (or None). Can be higher than total number of files.

  • ignore_loading_errors – if True, ignore errors when loading event

Returns: