Evaluation

Toolbox for evaluating the performance of federated online learning approaches, in particular for intrusion detection. Since the evaluators require different kinds of inputs, there is no single interface to implement and call when using one of them; check the individual modules for more information.

At the moment, the toolbox contains a set of classes for the online evaluation of anomaly detection approaches, implemented as customized extensions of the Keras metric API:

  • SlidingWindowEvaluation - Interface class. Implementations must compute incremental updates to the metric.

  • ConfMatrSlidingWindowEvaluation - Evaluator that computes the confusion matrix metrics over a sliding window.

  • TFMetricSlidingWindowEvaluation - Class wrapper for generic tensorflow metrics over a sliding window.
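The shared sliding-window idea behind these classes can be sketched in plain Python. This is an illustrative sketch only, not the `daisy` API: the class and method names mirror the documented interface, but the implementation is a stand-in with no TensorFlow dependency.

```python
from collections import deque

class WindowSketch:
    """Minimal sketch of point-wise sliding-window evaluation."""

    def __init__(self, window_size=5):
        # Bounded deques drop the oldest label automatically once full.
        self.true_labels = deque(maxlen=window_size)
        self.pred_labels = deque(maxlen=window_size)

    def update_state(self, y_true, y_pred):
        # Element-wise: each true/predicted label pair enters the window
        # individually, so batches of any shape can be flattened into it.
        for t, p in zip(y_true, y_pred):
            self.true_labels.append(t)
            self.pred_labels.append(p)

    def result(self):
        # Accuracy over the k most recent predictions only.
        correct = sum(t == p for t, p in zip(self.true_labels, self.pred_labels))
        return correct / len(self.true_labels)

m = WindowSketch(window_size=3)
m.update_state([1, 0, 1, 1], [1, 0, 0, 1])  # the oldest pair falls out
print(m.result())  # accuracy over the last 3 pairs
```

Because the window is bounded, the reported value always reflects the model's recent performance rather than its lifetime average, which is the point of evaluating online detectors this way.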

class daisy.evaluation.ConfMatrSlidingWindowEvaluation(name='conf_matrix_online_evaluation', window_size: int = None, **kwargs)[source]

Bases: SlidingWindowEvaluation

Sliding window evaluation metric that computes the entire confusion matrix, along with most of its derived metrics, over the k most recent predicted binary labels to evaluate the model’s recent performance in a point-wise manner.

result() dict[str, Tensor][source]

Based on the accumulated confusion matrix, computes its derived scalar metrics and returns them.

Returns:

Dictionary of all derived scalar (tensor) confusion matrix metrics.
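The kind of dictionary such a result can contain is easy to illustrate without the library. The following is a hedged sketch with standard confusion-matrix definitions; the function name and the exact set of keys are assumptions, not the metrics the real class returns.

```python
from collections import deque

def confusion_window_metrics(true_labels, pred_labels):
    """Confusion matrix and common derived scalars over a label window.

    Illustrative only: the real class maintains these counts as
    incremental state rather than recounting the window each call.
    """
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / n if n else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall)
              if precision + recall else 0.0,
    }

# A window of four binary label pairs, as bounded deques.
window_true = deque([1, 0, 1, 0], maxlen=4)
window_pred = deque([1, 1, 0, 0], maxlen=4)
print(confusion_window_metrics(window_true, window_pred))
```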

class daisy.evaluation.SlidingWindowEvaluation(name='ad_online_evaluation', window_size: int = None, **kwargs)[source]

Bases: Metric, ABC

Abstract evaluation metric class that extends the existing tensorflow metric base class with a sliding window that collects the k most recent predicted labels to evaluate the model’s recent performance in a point-wise manner.

Note that, depending on the metric, subclasses may also need to extend the non-abstract methods with their own functionality.

merge_state(metrics: Self)[source]

Merges the state from one or more metrics by merging their sliding windows.

Note that this is only possible if the sliding window of the current instance is able to encompass all other windows.

Parameters:

metrics – An iterable of sliding window metrics of the same type.
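The capacity precondition above can be illustrated with bounded deques. This is a dependency-free sketch; `merge_windows` is a hypothetical helper, not part of the `daisy` API.

```python
from collections import deque

def merge_windows(target: deque, others):
    """Merge other sliding windows into target.

    Raises if target's capacity cannot encompass all merged windows,
    mirroring the precondition documented for merge_state.
    """
    total = len(target) + sum(len(o) for o in others)
    if target.maxlen is not None and total > target.maxlen:
        raise ValueError("target window cannot encompass all merged windows")
    for o in others:
        target.extend(o)
    return target

w = deque([1, 0], maxlen=6)
merge_windows(w, [deque([1, 1]), deque([0])])
print(list(w))  # [1, 0, 1, 1, 0]
```

Checking capacity before extending matters: a too-small bounded deque would silently evict the oldest entries instead of failing, corrupting the merged state.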

pred_labels: deque
reset_state()[source]

Resets the sliding window and all the metric’s state variables.

abstractmethod result()[source]

Computes and returns the scalar value(s) of the metric. Idempotent operation based on the underlying state variables and the sliding window.

Returns:

A scalar tensor, or a dictionary of scalar tensors.

true_labels: deque
update_state(y_true, y_pred, *args, **kwargs)[source]

Adds a mini-batch of inputs to the metric, removing old entries if the window is full and adjusting the statistics accordingly. Any tensors are converted into numpy arrays, since data points/pairs are processed element-wise anyway and this simplifies generic handling.

Parameters:
  • y_true – Vector/Tensor containing true labels of inputs.

  • y_pred – Vector/Tensor containing predicted labels of inputs.

  • args – Unsupported positional arguments.

  • kwargs – Unsupported keyword arguments.
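The "adjusting statistics accordingly" part of update_state is the interesting step: when a pair is evicted, its contribution must be subtracted from the running state. A sketch of that incremental adjustment, using running accuracy as a stand-in statistic (illustrative names, not the library's internals):

```python
from collections import deque

class IncrementalWindowAccuracy:
    """Sketch of incremental statistics adjustment on window eviction."""

    def __init__(self, window_size=4):
        self.window_size = window_size
        self.pairs = deque()  # unbounded; eviction is handled manually
        self.correct = 0      # running count, updated incrementally

    def update_state(self, y_true, y_pred):
        for t, p in zip(y_true, y_pred):
            self.pairs.append((t, p))
            self.correct += int(t == p)
            if len(self.pairs) > self.window_size:
                # Undo the evicted pair's contribution instead of
                # recounting the whole window.
                old_t, old_p = self.pairs.popleft()
                self.correct -= int(old_t == old_p)

    def result(self):
        return self.correct / len(self.pairs) if self.pairs else 0.0

m = IncrementalWindowAccuracy(window_size=2)
m.update_state([1, 1, 0], [1, 0, 0])
print(m.result())  # 0.5: of the last 2 pairs, one is correct
```

This is why implementations of SlidingWindowEvaluation can stay O(1) per data point, in contrast to the recompute-everything wrapper below.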

class daisy.evaluation.TFMetricSlidingWindowEvaluation(tf_metric: Metric, window_size: int = None, **kwargs)[source]

Bases: SlidingWindowEvaluation

Wrapper class for all kinds of tensorflow evaluation metrics that operate on true-predicted label comparisons. Uses the provided sliding window to accumulate a subset of the overall data points and evaluates them with the tensorflow metric when called upon. Not very computationally efficient, since cumulative aggregation cannot be supported: not every metric can be computed in a sliding-window manner.

result()[source]

Based on the current window, computes and returns the scalar metric value tensor, or a dict of scalar tensors, resetting the wrapped metric in the process.

Returns:

Tensorflow metric result.
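Because a generic metric cannot be updated incrementally, the wrapper has to reset and recompute over the whole window on every result() call. A dependency-free sketch of that recompute-on-result pattern, where a plain callable stands in for the wrapped TensorFlow metric (all names here are illustrative):

```python
from collections import deque

class MetricWindowWrapper:
    """Sketch: accumulate a window, recompute the wrapped metric on demand."""

    def __init__(self, metric_fn, window_size=4):
        self.metric_fn = metric_fn  # stand-in for a tf.keras metric
        self.true_labels = deque(maxlen=window_size)
        self.pred_labels = deque(maxlen=window_size)

    def update_state(self, y_true, y_pred):
        # Cheap: only the raw labels are buffered, no metric state.
        self.true_labels.extend(y_true)
        self.pred_labels.extend(y_pred)

    def result(self):
        # Full recomputation over the window, O(window_size) per call;
        # this is the efficiency cost the class documentation warns about.
        return self.metric_fn(list(self.true_labels), list(self.pred_labels))

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

w = MetricWindowWrapper(accuracy, window_size=3)
w.update_state([1, 0, 1, 1], [1, 0, 0, 1])
print(w.result())  # accuracy over the last 3 buffered pairs
```

The trade-off is generality for cost: any label-comparison metric fits this wrapper, at the price of re-evaluating the full window on every result() call.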