gnn_tracking.postprocessing.dbscanscanner
#
Module Contents#
Classes#
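OCScanResults | Results of DBSCANHyperParamScanner and friends. |
DBSCANHyperParamScanner | Scan for hyperparameters of DBSCAN. Use this scanner for validation. |
DBSCANHyperParamScannerFixed | Scan grid for hyperparameters of DBSCAN. While DBSCANHyperParamScanner is for use in validation steps, this is for use in detailed testing. |
DBSCANPerformanceDetails | Get information about detailed performance for fixed DBSCAN parameters. |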
Functions#
dbscan | Convenience wrapper around sklearn's DBSCAN implementation. |
- gnn_tracking.postprocessing.dbscanscanner.dbscan(graphs: numpy.ndarray, eps=0.99, min_samples=1) → numpy.ndarray #
Convenience wrapper around sklearn’s DBSCAN implementation.
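As a rough illustration of what such a wrapper might look like, the sketch below fits sklearn's `DBSCAN` with the same default values as the signature above and returns one cluster label per input row. The name `dbscan_sketch` and the toy coordinates are assumptions made for this example; the body is not taken from the gnn_tracking source.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def dbscan_sketch(graphs: np.ndarray, eps=0.99, min_samples=1) -> np.ndarray:
    """Hypothetical stand-in for the wrapper: cluster rows, return labels."""
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(graphs)


# Toy input (made up): two well-separated pairs of points.
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = dbscan_sketch(points)
```

With `min_samples=1` every point is a core point, so no point is labelled as noise (`-1`).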
- class gnn_tracking.postprocessing.dbscanscanner.OCScanResults(df: pandas.DataFrame)#
Results of DBSCANHyperParamScanner and friends.
- property df: pandas.DataFrame#
- property df_mean: pandas.DataFrame#
Mean and std grouped by hyperparameters.
- get_foms(guide='double_majority_pt0.9') → dict[str, float] #
Get figures of merit.
- get_n_best_trials(n: int, guide='double_majority_pt0.9') → list[dict[str, float]] #
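The sketch below illustrates the idea behind `df_mean` and `get_n_best_trials` with plain pandas: aggregate per-batch metrics by hyperparameter pair, then pick the pairs with the highest mean value of the guide metric. The column names `eps` and `min_samples`, the layout of the aggregated columns, and all numbers are assumptions for illustration; the actual class may organise its dataframe differently.

```python
import pandas as pd

guide = "double_majority_pt0.9"

# Made-up per-batch results: two batches for each hyperparameter pair.
df = pd.DataFrame(
    {
        "eps": [0.2, 0.2, 0.4, 0.4],
        "min_samples": [1, 1, 2, 2],
        guide: [0.80, 0.90, 0.60, 0.70],
    }
)

# Mean and std of every metric, grouped by hyperparameters.
df_mean = df.groupby(["eps", "min_samples"]).agg(["mean", "std"])

# The n best hyperparameter pairs according to the mean of the guide metric.
n = 1
best_trials = (
    df_mean[(guide, "mean")]
    .nlargest(n)
    .index.to_frame(index=False)
    .to_dict("records")
)
```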
- class gnn_tracking.postprocessing.dbscanscanner.DBSCANHyperParamScanner(*, eps_range=(0, 1), min_samples_range=(1, 4), n_trials=10, keep_best=0, n_jobs: int | None = None, guide: str = 'double_majority_pt0.9', pt_thlds=(0.0, 0.5, 0.9, 1.5), max_eta: float = 4.0)#
Bases: gnn_tracking.postprocessing.clusterscanner.ClusterScanner
Scan for hyperparameters of DBSCAN. Use this scanner for validation. Even with few trials, it will eventually sample the best region more finely, because it keeps the best trials from the previous epoch (make sure to choose a non-zero keep_best).
- Parameters:
eps_range – Range of DBSCAN radii to scan
min_samples_range – Range (INCLUSIVE!) of minimum number of samples for DBSCAN
n_trials – Total number of trials
keep_best – Keep this number of the best (eps, min_samples) pairs from the current epoch and make sure to scan over them again in the next epoch.
n_jobs – Number of jobs to use for parallelization
guide – Report tracking metrics for parameters that maximize this metric
pt_thlds – List of pT thresholds for the tracking metrics
max_eta – Max eta for tracking metrics
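To make the interplay of `n_trials`, `keep_best` and the inclusive `min_samples_range` concrete, here is a minimal sketch of how a trial list for one epoch could be assembled: the best pairs from the previous epoch are carried over and the remaining slots are filled with random samples. The helper `propose_trials`, the dictionary keys and the use of uniform random sampling are assumptions; the library may choose its trials differently.

```python
import random


def propose_trials(eps_range, min_samples_range, n_trials, best_trials, rng):
    """Hypothetical helper: carry over previous best trials, then sample new ones."""
    trials = list(best_trials)[:n_trials]
    while len(trials) < n_trials:
        trials.append(
            {
                "eps": rng.uniform(*eps_range),
                # randint includes both endpoints, matching the INCLUSIVE range
                "min_samples": rng.randint(*min_samples_range),
            }
        )
    return trials


rng = random.Random(0)
# Made-up best pair from the previous epoch (keep_best=1).
previous_best = [{"eps": 0.35, "min_samples": 1}]
trials = propose_trials(
    (0, 1), (1, 4), n_trials=10, best_trials=previous_best, rng=rng
)
```

With `keep_best=0` the list of carried-over trials is empty and every epoch starts from fresh samples, which is why a non-zero value is needed for the scan to refine its best region.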
- get_results() → OCScanResults #
- get_foms() → dict[str, float] #
- _get_best_trials() → list[dict[str, float]] #
- _reset_trials() → None #
- reset()#
Reset the results. Will be automatically called every time we run on a batch with i_batch == 0.
- __call__(data: torch_geometric.data.Data, out: dict[str, torch.Tensor], i_batch: int, *, progress=False)#
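The reset-on-first-batch behaviour can be pictured with a small stand-in class: results accumulate across batches and are cleared whenever a call arrives with `i_batch == 0`. `BatchAccumulator` is a hypothetical toy written for this example and shares nothing with the real scanner beyond that calling pattern.

```python
class BatchAccumulator:
    """Hypothetical stand-in for a scanner; not the gnn_tracking class."""

    def __init__(self):
        self.results = []

    def reset(self):
        self.results = []

    def __call__(self, value, i_batch):
        # A new pass over the data starts at batch 0, so drop stale results.
        if i_batch == 0:
            self.reset()
        self.results.append(value)


acc = BatchAccumulator()
for epoch in range(2):
    for i_batch, value in enumerate([1.0, 2.0, 3.0]):
        acc(value + 10 * epoch, i_batch)
```

After the loop, only the values from the second pass remain in `acc.results`.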
- class gnn_tracking.postprocessing.dbscanscanner.DBSCANHyperParamScannerFixed(trials: list[dict[str, float]], *, n_jobs: int | None = None, pt_thlds=(0.0, 0.5, 0.9, 1.5), max_eta: float = 4.0)#
Bases: DBSCANHyperParamScanner
Scan grid for hyperparameters of DBSCAN. While DBSCANHyperParamScanner is for use in validation steps, this is for use in detailed testing.
- Parameters:
trials – List of trials to run
n_jobs – Number of jobs to use for parallelization
pt_thlds – List of pT thresholds for the tracking metrics
max_eta – Max eta for tracking metrics
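A `trials` list for a fixed grid could be built along the following lines. The keys `eps` and `min_samples` are an assumption based on the (eps, min_samples) pairs mentioned for DBSCANHyperParamScanner, and the grid values are arbitrary.

```python
from itertools import product

# Arbitrary grid values chosen for illustration.
eps_values = [0.2, 0.4, 0.6]
min_samples_values = [1, 2]

# One dict per grid point; the key names are assumed, not confirmed by the docs.
trials = [
    {"eps": eps, "min_samples": min_samples}
    for eps, min_samples in product(eps_values, min_samples_values)
]

# With gnn_tracking installed, the list would then be passed on, e.g.:
# scanner = DBSCANHyperParamScannerFixed(trials, n_jobs=4)
```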
- _reset_trials() → None #
- class gnn_tracking.postprocessing.dbscanscanner.DBSCANPerformanceDetails(eps: float, min_samples: int)#
Bases: DBSCANHyperParamScanner
Get information about detailed performance for fixed DBSCAN parameters. See get_results for outputs.
- Parameters:
eps – DBSCAN epsilon
min_samples – DBSCAN min_samples
- __call__(data: torch_geometric.data.Data, out: dict[str, torch.Tensor], i_batch: int) → None #
- get_results() → tuple[list[pandas.DataFrame], list[pandas.DataFrame]] #
Get results
- Returns:
Tuple of (h_dfs, c_dfs), where h_dfs is a list of dataframes with information about all hits and c_dfs is a list of dataframes with information about all clusters. See tracking_metric_df for details on the contents of both dataframes.
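Since the results arrive as lists of dataframes, one convenient way to work with them is to stack each list into a single dataframe with an extra index level. The sketch below does this with `pandas.concat`; the assumption that the list holds one dataframe per batch, the placeholder column `value` and the numbers are all made up, as the real columns are described under tracking_metric_df.

```python
import pandas as pd

# Placeholder stand-ins for h_dfs; the real column names differ.
h_dfs = [
    pd.DataFrame({"value": [0.1, 0.2]}),
    pd.DataFrame({"value": [0.3]}),
]

# Stack all dataframes, recording the list position in a "batch" index level.
all_hits = pd.concat(
    h_dfs, keys=list(range(len(h_dfs))), names=["batch", "row"]
)
```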
- get_foms() → dict[str, float] #