Classification Metrics
- Accuracy: Fraction of correct predictions: (TP + TN) / total. Simple but can be misleading with imbalanced classes.
- Precision: Fraction of positive predictions that are correct: TP / (TP + FP). High precision means few false positives.
  - average: 'binary' | 'macro' | 'micro' | 'weighted' - Averaging strategy for multi-class (default: 'binary'); see the sketch after this list.
- Recall: Fraction of actual positives correctly identified: TP / (TP + FN). Also called sensitivity or true positive rate.
- F1 Score: Harmonic mean of precision and recall: 2·P·R / (P + R). Balances precision and recall into a single metric.
- Fβ Score: Weighted harmonic mean: (1 + β²)·P·R / (β²·P + R). β > 1 weights recall higher; β < 1 weights precision higher.
- Confusion Matrix: Entry [i, j] is the count of samples with true label i predicted as j. Shape: [nClasses, nClasses].
- Classification Report: Per-class precision, recall, F1, and support, plus macro/weighted averages.
- ROC AUC: Area Under the ROC Curve. Measures the ability to distinguish between classes across all thresholds. 1.0 = perfect, 0.5 = random.
- ROC Curve: The Receiver Operating Characteristic curve; returns false positive rate, true positive rate, and thresholds.
- Precision-Recall Curve: Precision and recall computed across different thresholds.
- PR AUC: Area under the precision-recall curve. Summarizes the curve into a single number.
- Log Loss: Logarithmic loss (cross-entropy). Measures the quality of predicted probabilities. Lower is better.
- Matthews Correlation Coefficient (MCC): Balanced measure for binary classification, even with imbalanced classes. Range: [−1, 1].
- Cohen's Kappa: Agreement between predicted and true labels, adjusted for chance agreement.
- Balanced Accuracy: Average of recall over the classes. Handles class imbalance better than standard accuracy.
- Hamming Loss: Fraction of labels that are incorrectly predicted. Equal to 1 − accuracy for single-label classification.
- Jaccard Score: Jaccard similarity (intersection over union) for classification.
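A minimal sketch of what the averaging strategies mean, written in plain TypeScript rather than the deepbox API (function names here are illustrative): per-class precision is read off the confusion matrix column by column, 'macro' then averages the per-class values equally, while 'micro' pools the counts first (which, for single-label problems, equals accuracy).

// Plain TypeScript illustration of 'macro' vs 'micro' averaging (not the deepbox API).
// cm[i][j] = number of samples with true class i predicted as class j.
function perClassPrecision(cm: number[][]): number[] {
  return cm.map((_, k) => {
    const tp = cm[k][k];                                      // predicted k and truly k
    const predictedK = cm.reduce((s, row) => s + row[k], 0);  // column sum = all predicted as k
    return predictedK === 0 ? 0 : tp / predictedK;
  });
}

function macroPrecision(cm: number[][]): number {
  const p = perClassPrecision(cm);
  return p.reduce((s, v) => s + v, 0) / p.length;             // unweighted mean over classes
}

function microPrecision(cm: number[][]): number {
  const tp = cm.reduce((s, row, k) => s + row[k], 0);         // diagonal = all correct predictions
  const total = cm.flat().reduce((s, v) => s + v, 0);         // every prediction counts as positive for some class
  return tp / total;                                          // for single-label data this equals accuracy
}

const counts = [
  [50, 10],
  [5, 35],
];
perClassPrecision(counts); // [50/55, 35/45] ≈ [0.909, 0.778]
macroPrecision(counts);    // ≈ 0.843
microPrecision(counts);    // 85/100 = 0.85

'weighted' works like 'macro' but weights each class's score by its support (number of true samples) instead of averaging the classes equally.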
Accuracy
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Where:
- TP = True Positives
- TN = True Negatives
- FP = False Positives
- FN = False Negatives
Precision
Precision = TP / (TP + FP)
Where:
- FP = False alarms (predicted positive but actually negative)
Recall (Sensitivity)
Recall = TP / (TP + FN)
Where:
- FN = Missed detections (actually positive but predicted negative)
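These three formulas reduce to a few lines of arithmetic. A plain-TypeScript sketch (not the deepbox API; names are illustrative), using the counts from the code example at the end of this section:

// Binary-classification rates from raw confusion-matrix counts (illustrative only).
interface Counts { tp: number; tn: number; fp: number; fn: number }

const accuracyOf = ({ tp, tn, fp, fn }: Counts) => (tp + tn) / (tp + tn + fp + fn);
const precisionOf = ({ tp, fp }: Counts) => tp / (tp + fp); // few false positives → high precision
const recallOf = ({ tp, fn }: Counts) => tp / (tp + fn);    // few missed positives → high recall

const c: Counts = { tp: 4, tn: 2, fp: 1, fn: 1 };           // counts from the example at the end of this section
accuracyOf(c);  // 0.75
precisionOf(c); // 0.8
recallOf(c);    // 0.8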
F1 Score
F1 = 2 · Precision · Recall / (Precision + Recall)
Where:
- F1 = Harmonic mean of precision and recall
Fβ Score
Fβ = (1 + β²) · Precision · Recall / (β² · Precision + Recall)
Where:
- β = Weight on recall relative to precision; β > 1 favors recall, β < 1 favors precision
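F1 and Fβ follow directly from precision and recall; the sketch below is plain TypeScript (not the deepbox API), with β = 1 recovering F1:

// Fβ from precision and recall; β = 1 gives the ordinary F1 score (illustrative only).
const fbeta = (p: number, r: number, beta = 1): number => {
  const b2 = beta * beta;
  return p + r === 0 ? 0 : ((1 + b2) * p * r) / (b2 * p + r); // guard against 0/0
};

fbeta(0.8, 0.8);      // F1 = 0.8
fbeta(0.9, 0.6, 2);   // ≈ 0.643, pulled toward recall
fbeta(0.9, 0.6, 0.5); // ≈ 0.818, pulled toward precision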
Log Loss (Cross-Entropy)
LogLoss = −(1/N) · Σᵢ [ yᵢ·log(pᵢ) + (1 − yᵢ)·log(1 − pᵢ) ]
Where:
- N = Number of samples
- yᵢ = True label for sample i (0 or 1)
- pᵢ = Predicted probability for sample i
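A plain-TypeScript sketch of the same sum (not the deepbox API), clipping probabilities so log(0) never occurs:

// Binary log loss: average negative log-likelihood of the true labels (illustrative only).
function logLossOf(yTrue: number[], pPred: number[], eps = 1e-15): number {
  const n = yTrue.length;
  let total = 0;
  for (let i = 0; i < n; i++) {
    const p = Math.min(Math.max(pPred[i], eps), 1 - eps);    // clip to (0, 1) so log stays finite
    total += yTrue[i] === 1 ? -Math.log(p) : -Math.log(1 - p);
  }
  return total / n;                                          // lower is better
}

logLossOf([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.3]); // ≈ 0.198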
Matthews Correlation Coefficient
MCC = (TP·TN − FP·FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))
Where:
- MCC = Range [−1, 1]; 0 = random, 1 = perfect
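A direct translation of the formula into plain TypeScript (not the deepbox API):

// Matthews Correlation Coefficient from confusion-matrix counts (illustrative only).
function mccOf(tp: number, tn: number, fp: number, fn: number): number {
  const num = tp * tn - fp * fn;
  const den = Math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn));
  return den === 0 ? 0 : num / den; // return 0 by convention when a marginal count is zero
}

mccOf(4, 2, 1, 1); // 7/15 ≈ 0.467, counts from the code example below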
Jaccard Index (IoU)
J = |A ∩ B| / |A ∪ B| = TP / (TP + FP + FN)
Where:
- J = Intersection over Union of the predicted and true positive sets
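For the binary case, the Jaccard score is the overlap between predicted and actual positives; a plain-TypeScript sketch (not the deepbox API):

// Jaccard / IoU for the positive class: |intersection| / |union| (illustrative only).
const jaccardOf = (tp: number, fp: number, fn: number): number =>
  tp + fp + fn === 0 ? 0 : tp / (tp + fp + fn);

jaccardOf(4, 1, 1); // 4/6 ≈ 0.667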
import { accuracy, precision, recall, f1Score, confusionMatrix, rocAucScore, classificationReport } from "deepbox/metrics";
import { tensor } from "deepbox/ndarray";

const yTrue = tensor([0, 1, 1, 0, 1, 0, 1, 1]);
const yPred = tensor([0, 1, 0, 0, 1, 1, 1, 1]);

accuracy(yTrue, yPred);  // 0.75
precision(yTrue, yPred); // TP / (TP + FP)
recall(yTrue, yPred);    // TP / (TP + FN)
f1Score(yTrue, yPred);   // harmonic mean of P and R

const cm = confusionMatrix(yTrue, yPred);
// [[2, 1],  ← true 0: 2 correct, 1 misclassified
//  [1, 4]]  ← true 1: 1 misclassified, 4 correct

// ROC AUC (requires probability scores)
const yScore = tensor([0.1, 0.9, 0.4, 0.2, 0.8, 0.6, 0.95, 0.7]);
rocAucScore(yTrue, yScore); // Area under ROC curve

// Full report
const report = classificationReport(yTrue, yPred);
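As a sanity check, the 0.75 accuracy above can be reproduced without the library by comparing the raw label arrays directly (plain TypeScript, illustrative only):

// Count positions where the predicted label equals the true label.
const t = [0, 1, 1, 0, 1, 0, 1, 1];
const p = [0, 1, 0, 0, 1, 1, 1, 1];
const correct = t.filter((v, i) => v === p[i]).length; // 6 matches
correct / t.length;                                    // 6 / 8 = 0.75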