`gnn_tracking.metrics.cluster_metrics`#

Metrics evaluating the quality of clustering, i.e., the usefulness of the algorithm for tracking.

Module Contents#

Classes#

`ClusterMetricType`	Function type that calculates a clustering metric.
`TrackingMetrics`	Typed dictionary holding the tracking metrics listed under its attributes.

Functions#

`tracking_metric_df`(→ pandas.DataFrame)	Label clusters as double majority/perfect/LHC.
`count_tracking_metrics`(→ TrackingMetrics)	Calculate TrackingMetrics from cluster and hit information.
`tracking_metrics`(→ dict[float, TrackingMetrics])	Calculate 'custom' metrics for matching tracks and hits.
`tracking_metrics_data`(→ dict[float, TrackingMetrics])	Convenience function to apply tracking_metrics to a Data object.
`tracking_metrics_vs_pt`(→ pandas.DataFrame)	Calculate tracking metrics for pt slices.
`tracking_metrics_vs_eta`(→ pandas.DataFrame)	Calculate tracking metrics for eta slices.
`flatten_track_metrics`(→ dict[str, float])	Flatten the result of custom_metrics by using pt suffixes to arrive at a flat dictionary, rather than a nested one.
`count_hits_per_cluster`(→ numpy.ndarray)	Count number of hits per cluster
`hits_per_cluster_count_to_flat_dict`(→ dict[str, float])	Turn result array from count_hits_per_cluster into a dictionary with cumulative counts.
`_sklearn_signature_wrap`(→ ClusterMetricType)	A decorator to make an sklearn cluster metric function accept the arguments from ClusterMetricType.

Attributes#

`_tracking_metrics_nan_results`
`common_metrics`

class gnn_tracking.metrics.cluster_metrics.ClusterMetricType#

Bases: Protocol

Function type that calculates a clustering metric.

__call__(*, truth: numpy.ndarray, predicted: numpy.ndarray, pts: numpy.ndarray, reconstructable: numpy.ndarray, pt_thlds: list[float]) → float | dict[str, float]#
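
As an illustration of the protocol above, here is a minimal sketch of a function that satisfies the documented keyword-only signature. The `ClusterMetricType` class below is a local stand-in that mirrors the signature shown here (it is not imported from the package), and `noise_fraction` is a hypothetical metric invented for the example.

```python
from __future__ import annotations

from typing import Protocol

import numpy as np


class ClusterMetricType(Protocol):
    """Local stand-in mirroring the documented call signature."""

    def __call__(
        self,
        *,
        truth: np.ndarray,
        predicted: np.ndarray,
        pts: np.ndarray,
        reconstructable: np.ndarray,
        pt_thlds: list[float],
    ) -> float | dict[str, float]: ...


def noise_fraction(
    *,
    truth: np.ndarray,
    predicted: np.ndarray,
    pts: np.ndarray,
    reconstructable: np.ndarray,
    pt_thlds: list[float],
) -> float:
    """Hypothetical metric: fraction of hits carrying a negative (noise) label."""
    return float(np.mean(predicted < 0))


# Any callable with a matching keyword-only signature satisfies the protocol.
metric: ClusterMetricType = noise_fraction
value = metric(
    truth=np.array([1, 1, 2, 2]),
    predicted=np.array([0, 0, -1, 1]),
    pts=np.array([1.0, 1.0, 0.5, 0.5]),
    reconstructable=np.array([True, True, True, True]),
    pt_thlds=[0.9],
)
```

A metric may ignore arguments it does not need, as `noise_fraction` does here; the protocol only fixes the names and keyword-only calling convention.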

class gnn_tracking.metrics.cluster_metrics.TrackingMetrics#

Bases: TypedDict

Typed dictionary holding the tracking metrics listed below.

n_particles: int#

n_cleaned_clusters: int#

perfect: float#

double_majority: float#

lhc: float#

fake_perfect: float#

fake_double_majority: float#

fake_lhc: float#
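
Because the class derives from TypedDict, an instance is a plain dictionary at runtime and its fields are read by key. The sketch below re-declares the fields listed above in a local stand-in class so that it runs without the package; all values are made up for illustration.

```python
from typing import TypedDict


class TrackingMetrics(TypedDict):
    """Local stand-in with the fields documented above."""

    n_particles: int
    n_cleaned_clusters: int
    perfect: float
    double_majority: float
    lhc: float
    fake_perfect: float
    fake_double_majority: float
    fake_lhc: float


# Made-up values, for illustration only.
example: TrackingMetrics = {
    "n_particles": 8,
    "n_cleaned_clusters": 10,
    "perfect": 0.5,
    "double_majority": 0.75,
    "lhc": 0.875,
    "fake_perfect": 0.5,
    "fake_double_majority": 0.25,
    "fake_lhc": 0.125,
}

# Fields are accessed like ordinary dictionary entries.
lhc_value = example["lhc"]
```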

gnn_tracking.metrics.cluster_metrics._tracking_metrics_nan_results: TrackingMetrics#

gnn_tracking.metrics.cluster_metrics.tracking_metric_df(h_df: pandas.DataFrame, predicted_count_thld=3) → pandas.DataFrame#

Label clusters as double majority/perfect/LHC.

Parameters:

h_df – Hit information dataframe
predicted_count_thld – Number of hits a cluster must have to be considered a valid cluster

Returns:

cluster dataframe with columns such as “double_majority” etc.
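
This page does not spell out what "perfect", "double majority" and "LHC" mean. The sketch below labels a single cluster under definitions that are an assumption here, taken from common usage in tracking studies: perfect means the cluster contains exactly the hits of one particle, double majority means more than 50% of the cluster's hits come from one particle and more than 50% of that particle's hits are in the cluster, and LHC means more than 75% of the cluster's hits come from one particle. The function `label_cluster`, its arguments, and the inclusive size threshold are hypothetical and not part of the package.

```python
from collections import Counter


def label_cluster(cluster_pids, particle_sizes, predicted_count_thld=3):
    """Label one cluster under the ASSUMED definitions from the lead-in.

    cluster_pids: truth particle ID of every hit in the cluster
    particle_sizes: mapping particle ID -> total number of hits of that particle
    """
    n_hits = len(cluster_pids)
    # Assumption: a cluster is valid if it has at least this many hits.
    valid = n_hits >= predicted_count_thld
    # Majority particle: the particle ID contributing the most hits.
    maj_pid, maj_hits = Counter(cluster_pids).most_common(1)[0]
    cluster_purity = maj_hits / n_hits  # share of the cluster from that particle
    particle_share = maj_hits / particle_sizes[maj_pid]  # share of the particle found
    return {
        "valid": valid,
        "perfect": valid and cluster_purity == 1.0 and particle_share == 1.0,
        "double_majority": valid and cluster_purity > 0.5 and particle_share > 0.5,
        "lhc": valid and cluster_purity > 0.75,
    }


particle_sizes = {7: 4, 9: 5}
full = label_cluster([7, 7, 7, 7], particle_sizes)      # all hits of particle 7
mixed = label_cluster([7, 7, 7, 9, 9], particle_sizes)  # purity 0.6, share 0.75
small = label_cluster([7, 7], particle_sizes)           # below the size threshold
```

Under these assumed definitions, `full` satisfies all three criteria, `mixed` is a double-majority match only, and `small` is rejected for having too few hits.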

gnn_tracking.metrics.cluster_metrics.count_tracking_metrics(c_df: pandas.DataFrame, h_df: pandas.DataFrame, c_mask: numpy.ndarray, h_mask: numpy.ndarray) → TrackingMetrics#

Calculate TrackingMetrics from cluster and hit information.

Parameters:

c_df – Output dataframe from tracking_metric_df
h_df – Hit information dataframe
c_mask – Cluster mask
h_mask – Hit mask

Returns:

TrackingMetrics typed dictionary.

gnn_tracking.metrics.cluster_metrics.tracking_metrics(*, truth: numpy.ndarray, predicted: numpy.ndarray, pts: numpy.ndarray, reconstructable: numpy.ndarray, eta: numpy.ndarray, pt_thlds: Iterable[float], predicted_count_thld=3, max_eta=4) → dict[float, TrackingMetrics]#

Calculate ‘custom’ metrics for matching tracks and hits.

Parameters:

truth – Truth labels/PIDs for each hit
predicted – Predicted labels/cluster index for each hit. Negative labels are interpreted as noise (because this is how DBSCAN outputs it) and are ignored
pts – true pt value of particle belonging to each hit
reconstructable – Whether the hit belongs to a “reconstructable track” (this usually implies a cut on the number of layers that are being hit etc.)
eta – true pseudorapidity of particle belonging to each hit
pt_thlds – pt thresholds to calculate the metrics for
predicted_count_thld – Minimal number of hits in a cluster for it to not be rejected.
max_eta – Maximum eta value to count

Returns:

See TrackingMetrics
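
The parameter descriptions above imply two filters on the predicted labels: negative labels are ignored as noise, and clusters with too few hits are rejected. The following is a minimal sketch of such a filter, not the package's implementation. The helper `valid_hit_mask` is hypothetical, and treating the threshold as inclusive (at least `predicted_count_thld` hits) is an assumption.

```python
import numpy as np


def valid_hit_mask(predicted: np.ndarray, predicted_count_thld: int = 3) -> np.ndarray:
    """Boolean mask of hits in a non-noise cluster that is large enough."""
    # Negative labels mark noise (the DBSCAN convention mentioned above).
    not_noise = predicted >= 0
    # Count the hits of every non-noise cluster label.
    labels, counts = np.unique(predicted[not_noise], return_counts=True)
    # Assumption: the threshold is inclusive.
    large_labels = labels[counts >= predicted_count_thld]
    return not_noise & np.isin(predicted, large_labels)


predicted = np.array([0, 0, 0, 1, 1, -1, -1, -1])
mask = valid_hit_mask(predicted)
```

Here only the three hits of cluster 0 survive: cluster 1 has two hits, and the three hits labeled -1 are noise even though there are three of them.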

gnn_tracking.metrics.cluster_metrics.tracking_metrics_data(data: torch_geometric.data.Data, labels, pt_thlds: Iterable[float], predicted_count_thld=3, max_eta=4) → dict[float, TrackingMetrics]#

Convenience function to apply tracking_metrics to a Data object.

Parameters:

data – Data object
labels – Predicted labels/cluster index for each hit. Negative labels are treated as noise
pt_thlds – pt thresholds to calculate the metrics for
predicted_count_thld – Minimal number of hits in a cluster for it to not be rejected.
max_eta – Maximum eta value to count

gnn_tracking.metrics.cluster_metrics.tracking_metrics_vs_pt(h_dfs: list[pandas.DataFrame], c_dfs: list[pandas.DataFrame], pts: list[float], *, max_eta: float = 4.0) → pandas.DataFrame#

Calculate tracking metrics for pt slices.

Parameters:

h_dfs – List of hit dataframes for different batches (see tracking_metric_df)
c_dfs – List of cluster dataframes for different batches (see tracking_metric_df)
pts – List of pt points to calculate the metrics for
max_eta – Maximum eta value to count

Returns:

Dataframe with tracking metrics for each pt slice

gnn_tracking.metrics.cluster_metrics.tracking_metrics_vs_eta(h_dfs: list[pandas.DataFrame], c_dfs: list[pandas.DataFrame], etas: list[float], pt_thld: float = 0.9) → pandas.DataFrame#

Calculate tracking metrics for eta slices.

Parameters:

h_dfs – List of hit dataframes for different batches (see tracking_metric_df)
c_dfs – List of cluster dataframes for different batches (see tracking_metric_df)
etas – Eta points to calculate metrics for
pt_thld – pt threshold

Returns:

Dataframe with tracking metrics for each eta slice

gnn_tracking.metrics.cluster_metrics.flatten_track_metrics(custom_metrics_result: dict[float, dict[str, float]]) → dict[str, float]#

Flatten the result of custom_metrics by using pt suffixes to arrive at a flat dictionary, rather than a nested one.
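
The flattening of a nested dictionary keyed by pt threshold can be sketched as follows. This is an illustration only: the function name `flatten_nested_metrics` and the key format `<metric>_pt<threshold>` are assumptions, since the exact suffix format used by the package is not documented here.

```python
from __future__ import annotations


def flatten_nested_metrics(nested: dict[float, dict[str, float]]) -> dict[str, float]:
    """Turn {pt: {name: value}} into {"name_pt<pt>": value}."""
    flat: dict[str, float] = {}
    for pt, metrics in nested.items():
        for name, value in metrics.items():
            # Hypothetical suffix format; the real one may differ.
            flat[f"{name}_pt{pt:.1f}"] = value
    return flat


nested = {
    0.0: {"perfect": 0.5, "lhc": 0.75},
    0.9: {"perfect": 0.625, "lhc": 0.875},
}
flat = flatten_nested_metrics(nested)
```

A flat dictionary of this kind is convenient for loggers that expect one scalar per key.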

gnn_tracking.metrics.cluster_metrics.count_hits_per_cluster(predicted: numpy.ndarray) → numpy.ndarray#

Count number of hits per cluster.

gnn_tracking.metrics.cluster_metrics.hits_per_cluster_count_to_flat_dict(counts: numpy.ndarray, min_max=10) → dict[str, float]#

Turn result array from count_hits_per_cluster into a dictionary with cumulative counts.

Parameters:

counts – Result from count_hits_per_cluster
min_max – Pad the counts with zeros to at least this length
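
The combination of counting, zero padding and cumulative counts described above can be sketched as below. The layout of the counts array is not specified on this page; the sketch assumes it is a histogram in which entry k - 1 holds the number of clusters with exactly k hits. Both function names and the dictionary key format are hypothetical.

```python
from __future__ import annotations

import numpy as np


def cluster_size_histogram(predicted: np.ndarray) -> np.ndarray:
    """ASSUMED layout: hist[k - 1] = number of clusters with exactly k hits."""
    _, sizes = np.unique(predicted, return_counts=True)
    # bincount index 0 (clusters with zero hits) is always empty, so drop it.
    return np.bincount(sizes)[1:]


def cumulative_counts(counts: np.ndarray, min_max: int = 10) -> dict[str, float]:
    """Pad with zeros to at least min_max entries, then accumulate."""
    padded = np.pad(counts, (0, max(0, min_max - len(counts))))
    cumulative = np.cumsum(padded)
    # Hypothetical key format: number of clusters with at most k hits.
    return {f"n_clusters_leq_{k}": float(c) for k, c in enumerate(cumulative, start=1)}


predicted = np.array([0, 0, 0, 1, 1, 2])  # cluster sizes 3, 2 and 1
hist = cluster_size_histogram(predicted)
flat = cumulative_counts(hist, min_max=5)
```

Padding to a fixed minimum length keeps the set of dictionary keys stable across events, which matters when the values are logged or averaged. Note that this sketch would count a noise label such as -1 as an ordinary cluster.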

gnn_tracking.metrics.cluster_metrics._sklearn_signature_wrap(func: Callable) → ClusterMetricType#

A decorator to make an sklearn cluster metric function accept the arguments from ClusterMetricType.
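
A decorator of this kind can be sketched as follows. This is not the package's implementation: `signature_wrap` is a hypothetical name, and `agreement` is a stand-in for an sklearn clustering metric, which typically takes two positional arguments `(labels_true, labels_pred)`.

```python
import functools


def signature_wrap(func):
    """Adapt func(labels_true, labels_pred) to the keyword-only metric signature."""

    @functools.wraps(func)
    def wrapped(*, truth, predicted, pts, reconstructable, pt_thlds):
        # The extra arguments are accepted for interface compatibility and ignored.
        return func(truth, predicted)

    return wrapped


def agreement(labels_true, labels_pred):
    """Stand-in metric: fraction of positions where both labelings agree."""
    matches = sum(t == p for t, p in zip(labels_true, labels_pred))
    return matches / len(labels_true)


wrapped_agreement = signature_wrap(agreement)
score = wrapped_agreement(
    truth=[1, 1, 2, 2],
    predicted=[1, 1, 2, 3],
    pts=[1.0, 1.0, 0.5, 0.5],
    reconstructable=[True, True, True, True],
    pt_thlds=[0.9],
)
```

Wrapping lets metrics with different native signatures be stored and called uniformly, for example as values of a dictionary of named metrics.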

gnn_tracking.metrics.cluster_metrics.common_metrics: dict[str, ClusterMetricType]#
