gnn_tracking.postprocessing.dbscanscanner#

Module Contents#

Classes#

OCScanResults

Results of DBSCANHyperParamScanner and friends.

DBSCANHyperParamScanner

Scan for hyperparameters of DBSCAN. Use this scanner for validation.

DBSCANHyperParamScannerFixed

Scan grid for hyperparameters of DBSCAN. While DBSCANHyperParamScanner is for use in validation steps, this is for use in detailed testing.

DBSCANPerformanceDetails

Get information about detailed performance for fixed DBSCAN parameters.

Functions#

dbscan(→ numpy.ndarray)

Convenience wrapper around sklearn's DBSCAN implementation.

gnn_tracking.postprocessing.dbscanscanner.dbscan(graphs: numpy.ndarray, eps=0.99, min_samples=1) → numpy.ndarray#

Convenience wrapper around sklearn’s DBSCAN implementation.
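
As a rough illustration of what a wrapper like this typically does, the sketch below calls scikit-learn's DBSCAN directly with the same default eps and min_samples. The helper name dbscan_sketch, the argument name points, and the toy coordinates are assumptions made for illustration; this is not the library's implementation.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def dbscan_sketch(points: np.ndarray, eps: float = 0.99, min_samples: int = 1) -> np.ndarray:
    """Hypothetical stand-in: cluster the rows of `points`, return one label per row."""
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)


# Toy input: two tight groups that lie far apart in a 2D space.
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = dbscan_sketch(points)
```

With min_samples=1 every point is a core point, so no row is labelled as noise and each row ends up in some cluster.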

class gnn_tracking.postprocessing.dbscanscanner.OCScanResults(df: pandas.DataFrame)#

Results of DBSCANHyperParamScanner and friends.

property df: pandas.DataFrame#
property df_mean: pandas.DataFrame#

Mean and std grouped by hyperparameters.
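
A grouped mean and standard deviation of this kind can be sketched with a pandas groupby. The column names (eps, min_samples and the metric column) and the numbers are hypothetical; the actual layout of df is not documented here.

```python
import pandas as pd

# Hypothetical scan results: one row per (trial, batch).
df = pd.DataFrame(
    {
        "eps": [0.2, 0.2, 0.5, 0.5],
        "min_samples": [1, 1, 2, 2],
        "double_majority_pt0.9": [0.70, 0.80, 0.60, 0.50],
    }
)

# Mean and standard deviation of each metric per hyperparameter pair.
df_mean = df.groupby(["eps", "min_samples"]).agg(["mean", "std"])
```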

get_foms(guide='double_majority_pt0.9') → dict[str, float]#

Get figures of merit.

get_n_best_trials(n: int, guide='double_majority_pt0.9') → list[dict[str, float]]#

class gnn_tracking.postprocessing.dbscanscanner.DBSCANHyperParamScanner(*, eps_range=(0, 1), min_samples_range=(1, 4), n_trials=10, keep_best=0, n_jobs: int | None = None, guide: str = 'double_majority_pt0.9', pt_thlds=(0.0, 0.5, 0.9, 1.5), max_eta: float = 4.0)#

Bases: gnn_tracking.postprocessing.clusterscanner.ClusterScanner

Scan for hyperparameters of DBSCAN. Use this scanner for validation. Even with few trials, it will eventually sample the best region more finely, because it keeps the best trials from the previous epoch (make sure to choose a non-zero keep_best).

Parameters:
  • eps_range – Range of DBSCAN radii to scan

  • min_samples_range – Range (INCLUSIVE!) of minimum number of samples for DBSCAN

  • n_trials – Total number of trials

  • keep_best – Keep this number of the best (eps, min_samples) pairs from the current epoch and make sure to scan over them again in the next epoch.

  • n_jobs – Number of jobs to use for parallelization

  • guide – Report tracking metrics for parameters that maximize this metric

  • pt_thlds – List of pT thresholds for the tracking metrics

  • max_eta – Max eta for tracking metrics
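
The interplay between n_trials and keep_best can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names sample_trials and keep_best_trials, the uniform random sampling, the dictionary keys, and the stand-in scores are all hypothetical.

```python
import random


def sample_trials(eps_range, min_samples_range, n_trials, kept=()):
    """Hypothetical sampler: reuse the `kept` trials, draw the rest at random."""
    trials = [dict(t) for t in kept][:n_trials]
    while len(trials) < n_trials:
        trials.append(
            {
                "eps": random.uniform(*eps_range),
                # randint includes both endpoints, matching the inclusive range.
                "min_samples": random.randint(*min_samples_range),
            }
        )
    return trials


def keep_best_trials(scored, keep_best):
    """Return the `keep_best` trials with the highest guide score."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [trial for trial, _score in ranked[:keep_best]]


random.seed(0)
first_epoch = sample_trials((0.0, 1.0), (1, 4), n_trials=10)
# Stand-in scores; in practice these would be the guide metric of each trial.
scored = [(trial, 1.0 - abs(trial["eps"] - 0.3)) for trial in first_epoch]
kept = keep_best_trials(scored, keep_best=3)
second_epoch = sample_trials((0.0, 1.0), (1, 4), n_trials=10, kept=kept)
```

With keep_best=0 nothing carries over between epochs, so every epoch would start from a fresh random sample and the best region would not be revisited.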

get_results() → OCScanResults#
get_foms() → dict[str, float]#
_get_best_trials() → list[dict[str, float]]#
_reset_trials() → None#
reset()#

Reset the results. Will be automatically called every time we run on a batch with i_batch == 0.

__call__(data: torch_geometric.data.Data, out: dict[str, torch.Tensor], i_batch: int, *, progress=False)#
class gnn_tracking.postprocessing.dbscanscanner.DBSCANHyperParamScannerFixed(trials: list[dict[str, float]], *, n_jobs: int | None = None, pt_thlds=(0.0, 0.5, 0.9, 1.5), max_eta: float = 4.0)#

Bases: DBSCANHyperParamScanner

Scan grid for hyperparameters of DBSCAN. While DBSCANHyperParamScanner is for use in validation steps, this is for use in detailed testing.

Parameters:
  • trials – List of trials to run

  • n_jobs – Number of jobs to use for parallelization

  • pt_thlds – List of pT thresholds for the tracking metrics

  • max_eta – Max eta for tracking metrics
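
A fixed grid of trials with the type list[dict[str, float]] could be built along these lines. The key names eps and min_samples are an assumption based on the (eps, min_samples) pairs mentioned for DBSCANHyperParamScanner, and the grid values are arbitrary.

```python
from itertools import product

# Arbitrary example values for the two DBSCAN hyperparameters.
eps_values = [0.1, 0.2, 0.3]
min_samples_values = [1, 2]

# One dict per grid point; the key names are an assumption.
trials = [
    {"eps": eps, "min_samples": min_samples}
    for eps, min_samples in product(eps_values, min_samples_values)
]
```

A list like this would presumably be passed as the trials argument of DBSCANHyperParamScannerFixed.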

_reset_trials() → None#

class gnn_tracking.postprocessing.dbscanscanner.DBSCANPerformanceDetails(eps: float, min_samples: int)#

Bases: DBSCANHyperParamScanner

Get information about detailed performance for fixed DBSCAN parameters. See get_results for outputs.

Parameters:
  • eps – DBSCAN epsilon

  • min_samples – DBSCAN min_samples

__call__(data: torch_geometric.data.Data, out: dict[str, torch.Tensor], i_batch: int) → None#
get_results() → tuple[list[pandas.DataFrame], list[pandas.DataFrame]]#

Get results.

Returns:

Tuple of (h_dfs, c_dfs), where h_dfs is a list of dataframes with information about all hits and c_dfs is a list of dataframes with information about all clusters. See tracking_metric_df for details on the contents of both dataframes.

get_foms() → dict[str, float]#