faster_etapr.etapr.eTaMetrics#

class faster_etapr.etapr.eTaMetrics(preds: list[tuple[int, int]], labels: list[tuple[int, int]], *, theta_p: float = 0.5, theta_r: float = 0.1)[source]#

Bases: object

Defines the enhanced time-aware (eTa) precision, recall, and f1. Moreover, we can also compute the point-wise and point-adjusted versions.

For a motivation to use eTaPR check out the documentation.

preds#

Predictions as a list of ranges.

Type:: list[tuple[int, int]]

labels#

Labels as a list of ranges.

Type:: list[tuple[int, int]]

theta_p#

Precision threshold. Only predictions that overlap by at least theta_p with a detected anomaly are counted as correct. Defaults to 0.5.

Type:: float, optional

theta_r#

Recall threshold. Only anomalies that overlap by at least theta_r with a correct prediction are counted as detected. Defaults to 0.1.

Type:: float, optional

Methods

`f1`	Calculates the F1 score from precision and recall as the harmonic mean.
`from_preds`	Creates an instance from point-wise predictions and labels.
`point_adjust_precision`	Calculates the point-adjusted precision.
`point_adjust_recall`	Calculates the point-adjusted recall.
`point_adjust_scores`	Calculates the point-adjusted recall, precision, and f1.
`point_precision`	Calculates the point-wise precision score.
`point_recall`	Calculates the point-wise recall score.
`point_scores`	Calculates the point-wise (traditional) scores.
`precision`	Calculates the enhanced time-aware precision (eTaP).
`recall`	Calculates the enhanced time-aware recall (eTaR).
`scores`	Calculates the enhanced time-aware (eTa) scores.

f1(precision: float, recall: float) → float[source]#

Calculates the F1 score from precision and recall as the harmonic mean:

\[\mathrm{F1} \triangleq 2 \frac{ \mathrm{PR} \cdot \mathrm{RC} }{ \mathrm{PR} + \mathrm{RC} }\]

Parameters:

precision (float) – Precision score.
recall (float) – Recall score.

Returns:

Returns the F1 score.

Return type:

float
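
The harmonic-mean formula above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the name `f1_score` is hypothetical, and returning 0.0 when both inputs are zero is an assumed convention that the source does not specify.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper)."""
    denominator = precision + recall
    if denominator == 0:
        # Assumed convention: avoid division by zero by returning 0.0.
        return 0.0
    return 2 * precision * recall / denominator


print(f1_score(0.5, 0.5))  # equal precision and recall give the same value back
print(f1_score(1.0, 0.5))  # the harmonic mean is pulled towards the lower score
```

Because the harmonic mean is dominated by the smaller of the two inputs, a detector cannot reach a high F1 by maximizing only precision or only recall.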

from_preds(y_hat, y, theta_p=0.5, theta_r=0.1) → eTaMetrics#

Creates an instance from point-wise predictions and labels.

Parameters:

y_hat (npt.ArrayLike) – Predictions (point-wise).
y (npt.ArrayLike) – Labels (point-wise).
theta_p (float, optional) – Precision threshold. Only predictions that overlap by at least theta_p with a detected anomaly are counted as correct. Defaults to 0.5.
theta_r (float, optional) – Recall threshold. Only anomalies that overlap by at least theta_r with a correct prediction are counted as detected. Defaults to 0.1.

Returns:

Returns an instance.

Return type:

eTaMetrics
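
To illustrate what converting point-wise flags into ranges involves, here is a minimal sketch. The helper name `to_ranges` is hypothetical, and the range convention (inclusive start and inclusive end index) is an assumption; the source does not state which convention the library uses.

```python
def to_ranges(points):
    """Convert binary point-wise flags into (start, end) ranges.

    Assumption: both start and end are inclusive indices.
    """
    ranges = []
    start = None
    for i, flag in enumerate(points):
        if flag and start is None:
            # A new run of 1s begins here.
            start = i
        elif not flag and start is not None:
            # The run ended at the previous index.
            ranges.append((start, i - 1))
            start = None
    if start is not None:
        # Close a run that extends to the end of the sequence.
        ranges.append((start, len(points) - 1))
    return ranges


print(to_ranges([0, 1, 1, 0, 0, 1]))
```

Each maximal run of consecutive 1s becomes one range, which is the form expected by the `preds` and `labels` attributes.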

point_adjust_precision() → float[source]#

Calculates the point-adjusted precision. Precision answers the question of how accurate our predictions are. The point-adjusted precision is calculated in the same way as the point-wise precision (TP / (TP + FP)). However, the predictions are adjusted before calculation using the ground-truth. All predictions for an anomaly are set to 1 if at least one correct prediction for that anomaly segment exists.

Returns:: Returns the point-adjusted precision.
Return type:: float

point_adjust_recall() → float[source]#

Calculates the point-adjusted recall. Recall answers the question of how much of the anomalies is detected. The point-adjusted recall is calculated in the same way as the point-wise recall (TP / (TP + FN)). However, the predictions are adjusted before calculation using the ground-truth. All predictions for an anomaly are set to 1 if at least one correct prediction for that anomaly segment exists.

Returns:: Returns the point-adjusted recall.
Return type:: float

point_adjust_scores() → dict[str, float][source]#

Calculates the point-adjusted recall, precision, and f1. The metrics are calculated in the same way as the point-wise scores but the predictions are adjusted before calculation using the ground-truth. All predictions for an anomaly are set to 1 if at least one correct prediction for that anomaly segment exists.

Returns:

Returns the point-adjusted scores:

point_adjust/recall: point-adjusted recall
point_adjust/precision: point-adjusted precision
point_adjust/f1: point-adjusted f1

Return type:

dict[str, float]
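
The adjustment step described above can be sketched as follows. This is an illustrative reading of the description, not the library's code; `point_adjust` is a hypothetical name and the inputs are assumed to be equally long sequences of 0/1 flags.

```python
def point_adjust(y_hat, y):
    """Set all predictions inside an anomaly segment to 1 if any of them is 1."""
    adjusted = list(y_hat)
    n = len(y)
    i = 0
    while i < n:
        if y[i]:
            # Find the end (exclusive) of the current anomaly segment.
            j = i
            while j < n and y[j]:
                j += 1
            # One hit inside the segment marks the whole segment as predicted.
            if any(adjusted[i:j]):
                adjusted[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return adjusted


labels = [0, 1, 1, 1, 0, 1, 1]
preds = [1, 0, 1, 0, 0, 0, 0]
print(point_adjust(preds, labels))
```

Predictions outside anomaly segments are left untouched, so false positives still count against precision after the adjustment.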

point_precision() → float[source]#

Calculates the point-wise precision score. Precision answers the question of “How many predictions (for anomalies) concern real anomalies?”. In a point-wise manner, we categorize each prediction as a true positive (TP), false positive (FP), true negative (TN), or false negative (FN). Then we can calculate the precision as TP / (TP + FP).

Returns:: Returns the point-wise precision.
Return type:: float

point_recall() → float[source]#

Calculates the point-wise recall score. Recall answers the question of “How much of the anomalies is detected?”. In a point-wise manner, we categorize each prediction as a true positive (TP), false positive (FP), true negative (TN), or false negative (FN). Then we can calculate the recall as TP / (TP + FN).

Returns:: Returns the point-wise recall.
Return type:: float

point_scores() → dict[str, float | int][source]#

Calculates the point-wise (traditional) scores. Each data point can be categorized as either true positive (TP), false positive (FP), true negative (TN) or false negative (FN). Then, we can calculate the metrics as follows:

\begin{align*} \mathrm{RC}^{\mathrm{P}}(\tilde{\mathbf{y}}, \mathbf{y}) & \triangleq \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \\ \mathrm{PR}^{\mathrm{P}}(\tilde{\mathbf{y}}, \mathbf{y}) & \triangleq \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} \\ \mathrm{F1}^{\mathrm{P}}(\tilde{\mathbf{y}}, \mathbf{y}) & \triangleq 2 \frac{\mathrm{PR}^{\mathrm{P}} \cdot \mathrm{RC}^{\mathrm{P}}}{\mathrm{PR}^{\mathrm{P}} + \mathrm{RC}^{\mathrm{P}}} = \frac{2 \mathrm{TP}}{2\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}\\ \mathrm{SEG}^{\mathrm{P}}(\tilde{\mathbf{y}}, \mathbf{y}) & \triangleq \sum_{\mathbf{A}_i \in \mathcal{A}} \mathbb{1}( \sum_{\mathbf{P}_j \in \mathcal{P}} |\mathbf{P}_j \cap \mathbf{A}_i| > 0) \end{align*}

All keys in the return mapping are prefixed with point/.

Returns:

Returns a mapping containing:

point/recall: point-wise recall (TP / (TP + FN))
point/precision: point-wise precision (TP / (TP + FP))
point/f1: point-wise f1 score
point/TP: number of true positives, correctly classified as 1
point/FP: number of false positives, incorrectly classified as 1
point/FN: number of false negatives, incorrectly classified as 0
point/anomalies: total number of anomalies
point/detected_anomalies: number of detected anomalies (at least one point detected)
point/segments: percentage of detected anomalies

Return type:

dict[str, float | int]
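
The point-wise formulas above translate directly into counting code. The sketch below is only an illustration under stated assumptions: `point_counts` is a hypothetical name, its keys deliberately omit the `point/` prefix, and returning 0.0 for an empty denominator is an assumed convention.

```python
def point_counts(y_hat, y):
    """Count TP, FP, FN over 0/1 sequences and derive precision, recall, F1."""
    tp = sum(1 for p, t in zip(y_hat, y) if p and t)
    fp = sum(1 for p, t in zip(y_hat, y) if p and not t)
    fn = sum(1 for p, t in zip(y_hat, y) if not p and t)
    # Assumed convention: a score with an empty denominator is 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return {"TP": tp, "FP": fp, "FN": fn,
            "precision": precision, "recall": recall, "f1": f1}


labels = [0, 1, 1, 1, 0, 1, 1]
preds = [1, 0, 1, 0, 0, 0, 0]
print(point_counts(preds, labels))
```

The F1 value uses the closed form 2TP / (2TP + FP + FN), which equals the harmonic mean of the two scores whenever both are defined.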

precision() → eTaPrecision[source]#

Calculates the enhanced time-aware precision (eTaP). Precision answers the question of “How many predictions (for anomalies) concern real anomalies?”.

The precision $\mathrm{PR}^\mathrm{eTa}$ is calculated as a combination of the detection score $s^\mathrm{PD}$ and the portion score $s^\mathrm{PP}$ as follows:

\[\mathrm{PR}^{\mathrm{eTa}}(\tilde{\mathbf{y}}, \mathbf{y}) \triangleq \sum_{P_j \in \mathcal{P}} \left( \frac{s^{\mathrm{PD}}(P_j) + s^{\mathrm{PD}}(P_j) \cdot s^{\mathrm{PP}}(P_j)}{2} \right) \cdot w_{p},\]

where $\tilde{\mathbf{y}}$ are the predictions, $\mathbf{y}$ the labels, $P_j$ a prediction, $\mathcal{P}$ the set of all predictions and $w_{p}$ a weight for the prediction,

\[w_p = \frac{ \sqrt{|P_j|} }{ \sum_{P_i \in \mathcal{P}} \sqrt{|P_i|} }\]

The sum of the square roots of the lengths of all predictions, $\sum_{P_i \in \mathcal{P}} \sqrt{|P_i|}$, restricts the precision score to the range [0, 1]. Furthermore, it penalizes the detection method for lengthy and frequent incorrect predictions.

The detection score $s^\mathrm{PD}$ of a prediction $P_j$ is defined as:

\[\begin{split}s^{\mathrm{PD}}(P_j) = \begin{cases} 1, & \text{if $P_j \in \mathcal{P}^C$} \\ 0, & \text{otherwise}, \end{cases}\end{split}\]

where $\mathcal{P}^C$ is the set of correct predictions. A prediction $P_j$ belongs to this set, if at least $\theta_p$ of the prediction $P_j$ overlaps with a detected anomaly $A_i \in \mathcal{A}^D$.

Thus, a prediction $P_j$ can only contribute if it is precise enough and belongs to the set of correct predictions $\mathcal{P}^C$. Over all predictions $\mathcal{P}$, it is the ratio of correct predictions $\mathcal{P}^C$ to the number of all predictions $\mathcal{P}$, i.e., $\frac{|\mathcal{P}^C|}{|\mathcal{P}|}$.

The portion score $s^\mathrm{PP}$ is the proportion of a prediction $P_j$ that overlaps with a detected anomaly $A_i$:

\[s^\mathrm{PP}(P_j) = \frac{ \sum_{A_i \in \mathcal{A}} | A_i \cap P_j | }{ | P_j | }\]

Thus, the precision $\mathrm{PR}^\mathrm{eTa}$ is a measure of the quality of the predictions. Only relevant predictions $P_j$, i.e., whose overlapping portions are greater than $\theta_p$, can directly contribute to the overall score. However, incorrect predictions $P_j \notin \mathcal{P}^C$ can impact the score through the weighting scheme $w_p$.

Returns:

Returns a namedtuple containing the

precision
detection score
portion score
number of correct predictions

Return type:

eTaPrecision
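
The weighted combination in the eTaP formula can be sketched once the per-prediction scores are known. The function below assumes that the detection scores $s^\mathrm{PD}$ and portion scores $s^\mathrm{PP}$ have already been determined; deciding which predictions are correct is not shown. The name `eta_precision` and its argument layout are hypothetical.

```python
import math


def eta_precision(lengths, detection, portion):
    """Combine per-prediction scores into eTaP using square-root length weights.

    lengths:   |P_j| for each prediction
    detection: s^PD(P_j), either 0 or 1 (assumed to be given)
    portion:   s^PP(P_j), a value in [0, 1] (assumed to be given)
    """
    total_weight = sum(math.sqrt(n) for n in lengths)
    if total_weight == 0:
        # Assumed convention: no predictions means a precision of 0.0.
        return 0.0
    score = 0.0
    for n, s_pd, s_pp in zip(lengths, detection, portion):
        w_p = math.sqrt(n) / total_weight
        score += (s_pd + s_pd * s_pp) / 2 * w_p
    return score


# One short, fully correct prediction and one long incorrect prediction.
print(eta_precision([4, 16], [1, 0], [1.0, 0.0]))
```

In this example the incorrect prediction contributes nothing directly, yet its weight of 4/6 caps the score at 1/3, which shows how long incorrect predictions are penalized through $w_p$.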

recall() → eTaRecall[source]#

Calculates the enhanced time-aware recall (eTaR). Recall answers the question of “How much of the anomalies is detected?”

The recall $\mathrm{RC}^\mathrm{eTa}$ is calculated as a combination of the detection score $s^\mathrm{RD}$ and the portion score $s^\mathrm{RP}$ as follows:

\[\mathrm{RC}^{\mathrm{eTa}}(\tilde{\mathbf{y}}, \mathbf{y}) \triangleq \frac{1}{|\mathcal{A}|} \sum_{A_i \in \mathcal{A}} \frac{ s^{\mathrm{RD}}(A_i) + s^{\mathrm{RD}}(A_i) \cdot s^{\mathrm{RP}}(A_i) }{2}\]

where $\tilde{\mathbf{y}}$ are the predictions, $\mathbf{y}$ the labels, $A_i$ an anomaly, and $\mathcal{A}$ the set of all anomalies. The recall $\mathrm{RC}^\mathrm{eTa}$ is the average over all anomaly segments $\mathcal{A}$, but only those anomalies $A_i$ that belong to the set of detected anomalies $\mathcal{A}^D$ contribute to the overall score. Thus, the recall is a measure of how well we can detect anomaly segments.

The detection score $s^\mathrm{RD}$ of an anomaly $A_i$ is defined as:

\[\begin{split}s^{\mathrm{RD}}(A_i) = \begin{cases} 1, & \text{if $A_i \in \mathcal{A}^D$}\\ 0, & \text{otherwise}, \end{cases}\end{split}\]

where $\mathcal{A}^D$ is the set of detected anomalies. An anomaly $A_i$ belongs to this set if the portion that overlaps with a correct prediction $P_j \in \mathcal{P}^C$ is greater than $\theta_r$. Hence, the detection score $s^\mathrm{RD}$ indicates whether an anomaly $A_i$ is detected or not.

The portion score $s^\mathrm{RP}$ is the proportion of an anomaly $A_i$ which intersects with a correct prediction $P_j \in \mathcal{P}^C$. It is defined as follows:

\[s^{\mathrm{RP}}(A_i) = \frac{ \sum_{P_j \in \mathcal{P}^C} |A_i \cap P_j| }{ |A_i| }.\]

Returns:

Returns a namedtuple containing the

recall
detection score
portion score
number of detected anomalies

Return type:

eTaRecall
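
The recall formulas above can be illustrated with a short sketch. It assumes that ranges use inclusive start and end indices, that the correct predictions do not overlap each other, and that the detection scores $s^\mathrm{RD}$ are already known. All function names are hypothetical and this is not the library's implementation.

```python
def overlap(a, b):
    """Number of points shared by two inclusive (start, end) ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def recall_portion(anomaly, correct_preds):
    """s^RP: share of the anomaly covered by correct predictions."""
    length = anomaly[1] - anomaly[0] + 1
    return sum(overlap(anomaly, p) for p in correct_preds) / length


def eta_recall(detection, portion):
    """Average of (s^RD + s^RD * s^RP) / 2 over all anomalies."""
    if not detection:
        # Assumed convention: no anomalies means a recall of 0.0.
        return 0.0
    total = sum((d + d * p) / 2 for d, p in zip(detection, portion))
    return total / len(detection)


# Anomaly (10, 19) is half covered by the correct prediction (15, 25).
s_rp = recall_portion((10, 19), [(15, 25)])
print(s_rp)
# One detected anomaly with portion s_rp and one undetected anomaly.
print(eta_recall([1, 0], [s_rp, 0.3]))
```

An undetected anomaly contributes zero regardless of its portion score, because $s^\mathrm{RP}$ only enters the sum multiplied by $s^\mathrm{RD}$.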

scores() → dict[str, float | int][source]#

Calculates the enhanced time-aware (eTa) scores. All keys in the result mapping are prefixed with eta/.

Returns:

Returns a mapping containing:

eta/recall: recall score
eta/recall_detection: detection score of the recall
eta/recall_portion: portion score of the recall
eta/detected_anomalies: number of detected anomalies
eta/precision: precision score
eta/precision_detection: detection score of the precision
eta/precision_portion: portion score of the precision
eta/correct_predictions: number of correct predictions
eta/f1: f1 score (harmonic mean of precision and recall)
eta/TP: number of true positives (points counted)
eta/FP: number of false positives (points counted)
eta/FN: number of false negatives (points counted)
eta/wrong_predictions: number of wrong predictions
eta/missed_anomalies: number of undetected anomalies
eta/anomalies: total number of anomalies
eta/segments: percentage of detected anomalies

Return type:

dict[str, float | int]