Scalers
StandardScaler
Standardize features by removing the mean and scaling to unit variance: z = (x − μ) / σ. Each feature is transformed independently. This is the most common scaler for models that are sensitive to feature scale (SVM, logistic regression, neural networks). It centers and rescales the data but does not change the shape of its distribution, so skewed features stay skewed.
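The formula can be illustrated on plain arrays. The sketch below is not deepbox's implementation: the helper names `columnStats` and `standardize` are hypothetical, and the use of the population standard deviation (dividing by n) and the fallback for constant columns are assumptions.

```typescript
// Illustrative sketch of z = (x - mean) / std, applied per column.
// Assumption: population standard deviation (divide by n).
function columnStats(X: number[][]): { mean: number[]; std: number[] } {
  const n = X.length;
  const d = X[0].length;
  const mean: number[] = [];
  const std: number[] = [];
  for (let j = 0; j < d; j++) {
    let sum = 0;
    for (let i = 0; i < n; i++) sum += X[i][j];
    const mu = sum / n;
    let sq = 0;
    for (let i = 0; i < n; i++) sq += (X[i][j] - mu) ** 2;
    mean.push(mu);
    std.push(Math.sqrt(sq / n));
  }
  return { mean, std };
}

function standardize(
  X: number[][],
  stats: { mean: number[]; std: number[] }
): number[][] {
  // A constant column has std 0; dividing by 1 instead avoids NaN.
  return X.map(row =>
    row.map((x, j) => (x - stats.mean[j]) / (stats.std[j] || 1))
  );
}

const X = [[1, 2], [3, 4], [5, 6], [7, 8]];
const stats = columnStats(X); // mean = [4, 5], std = [sqrt(5), sqrt(5)]
const Z = standardize(X, stats); // each column now has mean 0 and std 1
```

Keeping the statistics separate from the transform makes it possible to reuse the training-set mean and standard deviation on new data.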
MinMaxScaler
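Scale features to a given range [min, max] (default [0, 1]). For the default range, x' = (x − x_min) / (x_max − x_min); other ranges rescale and shift this value. Preserves the shape of the original distribution, since the transform is linear. Sensitive to outliers: a single extreme value sets x_min or x_max and compresses the remaining values.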
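A minimal sketch of min-max scaling on one column, including the outlier sensitivity. The function `minMaxScale` and its `lo`/`hi` parameters are hypothetical names for illustration and are not taken from deepbox.

```typescript
// Illustrative sketch: map a column linearly onto [lo, hi].
function minMaxScale(col: number[], lo = 0, hi = 1): number[] {
  const xMin = Math.min(...col);
  const xMax = Math.max(...col);
  const span = xMax - xMin || 1; // constant column: avoid division by zero
  return col.map(x => lo + ((x - xMin) / span) * (hi - lo));
}

const clean = minMaxScale([1, 3, 5, 7]); // [0, 1/3, 2/3, 1]
const withOutlier = minMaxScale([1, 3, 5, 7, 1000]); // first four values fall below 0.01
const symmetric = minMaxScale([1, 3, 5, 7], -1, 1); // [-1, -1/3, 1/3, 1]
```

With the outlier present, the four ordinary values occupy less than 1% of the output range, which is the compression the description warns about.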
RobustScaler
Scale features using statistics robust to outliers: x' = (x − median) / IQR. Uses median and interquartile range instead of mean and standard deviation.
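The following sketch shows the median/IQR formula on a single column. It is an illustration only: `quantile` and `robustScale` are hypothetical helpers, and computing quartiles by linear interpolation between order statistics is an assumption, since the quantile method deepbox uses is not stated here.

```typescript
// Quantile of an ascending-sorted array using linear interpolation (assumed method).
function quantile(sorted: number[], q: number): number {
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// x' = (x - median) / IQR for one column.
function robustScale(col: number[]): number[] {
  const sorted = col.slice().sort((a, b) => a - b);
  const median = quantile(sorted, 0.5);
  const iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25) || 1; // guard IQR = 0
  return col.map(x => (x - median) / iqr);
}

// median = 3, Q1 = 2, Q3 = 4, IQR = 2
const scaled = robustScale([1, 2, 3, 4, 100]); // [-1, -0.5, 0, 0.5, 48.5]
```

The outlier stays an outlier after scaling, but it no longer influences how the other four values are spaced.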
MaxAbsScaler
Scale each feature by its maximum absolute value: x' = x / max(|x|). Scales to [−1, 1] range. Does not shift/center the data (preserves sparsity).
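As a small illustration of why sparsity is preserved, the sketch below divides a column by its largest absolute value without subtracting anything. `maxAbsScale` is a hypothetical helper and not deepbox code.

```typescript
// Illustrative sketch: x' = x / max(|x|) for one column.
function maxAbsScale(col: number[]): number[] {
  const maxAbs = Math.max(...col.map(x => Math.abs(x))) || 1; // all-zero column stays zero
  return col.map(x => x / maxAbs);
}

const sparseCol = [0, -4, 0, 2, 0];
const scaledSparse = maxAbsScale(sparseCol); // [0, -1, 0, 0.5, 0]
```

Because there is no centering step, every zero in the input is still zero in the output.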
PowerTransformer
Apply a power transform (Yeo-Johnson or Box-Cox) to make data more Gaussian-like. Useful when features have skewed distributions. Box-Cox requires strictly positive values; Yeo-Johnson also accepts zero and negative values.
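The two transforms can be written out for a fixed exponent. This sketch takes λ (`lambda`) as an argument; a real power transformer typically estimates λ per feature, which is not shown here. The function names `yeoJohnson` and `boxCox` are hypothetical and not part of deepbox's API.

```typescript
// Yeo-Johnson transform of a single value for a given lambda.
function yeoJohnson(x: number, lambda: number): number {
  if (x >= 0) {
    return lambda === 0
      ? Math.log(x + 1)
      : (Math.pow(x + 1, lambda) - 1) / lambda;
  }
  return lambda === 2
    ? -Math.log(1 - x)
    : -(Math.pow(1 - x, 2 - lambda) - 1) / (2 - lambda);
}

// Box-Cox transform of a single value; defined only for x > 0.
function boxCox(x: number, lambda: number): number {
  if (x <= 0) throw new Error("Box-Cox requires strictly positive input");
  return lambda === 0 ? Math.log(x) : (Math.pow(x, lambda) - 1) / lambda;
}

const skewed = [0, 1, 10, 100, 1000];
// With lambda = 0, Yeo-Johnson reduces to log(x + 1) for non-negative x,
// which pulls in the long right tail.
const compressed = skewed.map(x => yeoJohnson(x, 0));
```

With λ = 1 both transforms leave the shape of the data unchanged (Yeo-Johnson returns x itself, Box-Cox returns x − 1), which is a quick sanity check for an implementation.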
QuantileTransformer
Transform features to follow a uniform or normal distribution using quantile information. Robust to outliers but can distort correlations between features.
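A simplified sketch of the uniform case: each value is mapped to its position among the sorted training values, with linear interpolation between neighbours. The helpers `fitQuantiles` and `toUniform` are hypothetical, and this version uses every training value as a quantile, which is an assumption rather than a description of deepbox's behaviour. The normal-output case would additionally apply an inverse normal CDF, which is omitted.

```typescript
// "Fit": remember the sorted training values of one column.
function fitQuantiles(col: number[]): number[] {
  return col.slice().sort((a, b) => a - b);
}

// "Transform": map x to [0, 1] according to where it falls in the sorted values.
function toUniform(x: number, sorted: number[]): number {
  const n = sorted.length;
  if (x <= sorted[0]) return 0;
  if (x >= sorted[n - 1]) return 1;
  // First index whose value is >= x; here sorted[i - 1] < x <= sorted[i].
  let i = 1;
  while (sorted[i] < x) i++;
  const frac = (x - sorted[i - 1]) / (sorted[i] - sorted[i - 1]);
  return (i - 1 + frac) / (n - 1);
}

const sortedTrain = fitQuantiles([1000, 3, 1, 4, 2]);
const uniform = [1, 2, 3, 4, 1000].map(x => toUniform(x, sortedTrain));
// uniform is [0, 0.25, 0.5, 0.75, 1]: the outlier lands at 1 without squeezing the rest
```

Only the ordering of values matters here, which is why the result is robust to outliers and also why distances between values, and therefore linear correlations, are not preserved.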
Normalizer
Normalize samples (rows) to unit norm: x' = x / ‖x‖. Supports L1, L2, and max norms. Operates on rows, not columns.
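To make the row-wise behaviour concrete, the sketch below normalizes each sample by its own norm. `rowNorm`, `normalizeRows`, and the `NormKind` type are hypothetical names, not deepbox's API, and leaving all-zero rows unchanged is an assumption.

```typescript
type NormKind = "l1" | "l2" | "max";

// Norm of a single row (sample).
function rowNorm(row: number[], kind: NormKind): number {
  const abs = row.map(v => Math.abs(v));
  if (kind === "l1") return abs.reduce((s, v) => s + v, 0);
  if (kind === "max") return Math.max(...abs);
  return Math.sqrt(row.reduce((s, v) => s + v * v, 0));
}

// Divide every row by its own norm; no statistics are shared across rows.
function normalizeRows(X: number[][], kind: NormKind = "l2"): number[][] {
  return X.map(row => {
    const norm = rowNorm(row, kind) || 1; // assumption: all-zero rows stay unchanged
    return row.map(v => v / norm);
  });
}

const rows = [[3, 4], [0, 0], [-1, 1]];
const l2 = normalizeRows(rows); // first row becomes [0.6, 0.8]
const l1 = normalizeRows(rows, "l1"); // last row becomes [-0.5, 0.5]
const mx = normalizeRows(rows, "max"); // first row becomes [0.75, 1]
```

Since each row is scaled using only its own values, nothing needs to be learned from training data, unlike the column-wise scalers above.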
Scaler API (Common Methods)
- .fit(X: Tensor) — Learn scaling parameters from training data
- .transform(X: Tensor) — Apply learned transformation
- .fitTransform(X: Tensor) — fit() + transform() in one call
- .inverseTransform(X: Tensor) — Reverse the transformation
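The four methods above can be sketched with a toy scaler. The class `ArrayMinMaxScaler` is hypothetical: it mirrors the documented method names but works on plain `number[][]` values rather than a deepbox `Tensor`, and it is not the library's implementation. It shows the usual pattern of fitting on training data and reusing the learned parameters on new data.

```typescript
// Hypothetical plain-array scaler mirroring the documented method names.
class ArrayMinMaxScaler {
  private min: number[] = [];
  private span: number[] = [];

  // Learn per-column minimum and range from the training data only.
  fit(X: number[][]): void {
    const d = X[0].length;
    this.min = [];
    this.span = [];
    for (let j = 0; j < d; j++) {
      const col = X.map(row => row[j]);
      const lo = Math.min(...col);
      const hi = Math.max(...col);
      this.min.push(lo);
      this.span.push(hi - lo || 1); // constant column: avoid division by zero
    }
  }

  // Apply the learned parameters; nothing is recomputed here.
  transform(X: number[][]): number[][] {
    return X.map(row => row.map((x, j) => (x - this.min[j]) / this.span[j]));
  }

  fitTransform(X: number[][]): number[][] {
    this.fit(X);
    return this.transform(X);
  }

  // Undo transform() using the same learned parameters.
  inverseTransform(X: number[][]): number[][] {
    return X.map(row => row.map((z, j) => z * this.span[j] + this.min[j]));
  }
}

const trainSet = [[1, 10], [3, 20], [5, 30]];
const testSet = [[2, 40]];

const scaler = new ArrayMinMaxScaler();
const trainScaled = scaler.fitTransform(trainSet); // parameters come from trainSet
const testScaled = scaler.transform(testSet); // [[0.25, 1.5]]
const testBack = scaler.inverseTransform(testScaled); // [[2, 40]]
```

The second test value maps to 1.5 because 40 lies outside the training range; calling `fit` again on the test data would hide this and leak test-set statistics into preprocessing.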
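Standard
z = (x − μ) / σ
Where:
- μ = Feature mean
- σ = Feature standard deviation
MinMax
x' = (x − x_min) / (x_max − x_min)
Where:
- x_min, x_max = Feature min/max from training data
Robust
x' = (x − median) / IQR
Where:
- median = Feature median
- IQR = Q3 − Q1 (interquartile range)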
    import { StandardScaler, MinMaxScaler, RobustScaler } from "deepbox/preprocess";
    import { tensor } from "deepbox/ndarray";

    const X = tensor([[1, 2], [3, 4], [5, 6], [7, 8]]);

    // StandardScaler: mean=0, std=1
    const ss = new StandardScaler();
    ss.fit(X);
    const XStd = ss.transform(X); // Standardized
    const XOrig = ss.inverseTransform(XStd); // Back to original

    // MinMaxScaler: scale to [0, 1]
    const mms = new MinMaxScaler();
    const XNorm = mms.fitTransform(X); // fit + transform in one call

    // RobustScaler: robust to outliers
    const rs = new RobustScaler();
    const XRobust = rs.fitTransform(X);

Choosing a Scaler
- StandardScaler — Default choice for most models. Works best when features are roughly Gaussian and free of extreme outliers.
- MinMaxScaler — When you need bounded output (e.g., [0, 1] for neural networks).
- RobustScaler — When data contains outliers that would distort mean/std.
- MaxAbsScaler — When you want to preserve sparsity (zero values remain zero).
- PowerTransformer — When features are skewed and you need Gaussian-like input.
- QuantileTransformer — When you need a specific output distribution regardless of input.
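One way to see the effect of outliers on this choice is to compare how far apart ordinary values end up after each linear scaler. The sketch below is a plain-array illustration and does not use deepbox; the variable names are hypothetical, and the population standard deviation and linearly interpolated quartiles are assumptions.

```typescript
const col = [1, 2, 3, 4, 100]; // four ordinary values and one outlier
const sortedCol = col.slice().sort((a, b) => a - b);

// Denominators used by the three linear scalers.
const mean = col.reduce((s, v) => s + v, 0) / col.length;
const std = Math.sqrt(col.reduce((s, v) => s + (v - mean) ** 2, 0) / col.length);
const range = sortedCol[sortedCol.length - 1] - sortedCol[0];

// Quartiles by linear interpolation between order statistics (assumed method).
const quartile = (p: number): number => {
  const pos = (sortedCol.length - 1) * p;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sortedCol[lo] + (sortedCol[hi] - sortedCol[lo]) * (pos - lo);
};
const iqr = quartile(0.75) - quartile(0.25);

// Each scaler is of the form (x - shift) / denominator, so the gap between
// the ordinary values 1 and 4 after scaling is 3 / denominator.
const inlierSpread = {
  minMax: 3 / range, // about 0.03
  standard: 3 / std, // about 0.077
  robust: 3 / iqr, // 1.5
};
```

In this example the outlier inflates the range and the standard deviation, so MinMaxScaler and StandardScaler leave the ordinary values packed closely together, while the IQR ignores it.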