deepbox/ml
Dimensionality Reduction
Reduce the number of features while preserving as much of the data's variance as possible. PCA finds orthogonal directions of maximum variance by eigendecomposition of the covariance matrix (or, equivalently, by SVD of the centered data matrix). It is used for noise reduction, visualization, speeding up downstream models, and mitigating the curse of dimensionality.
PCA
Principal Component Analysis. Centers the data, computes the SVD of the centered matrix, and projects onto the top-k right singular vectors (the principal components). The components are mutually orthogonal and ordered by decreasing explained variance. The explainedVarianceRatio property gives the fraction of total variance each component captures; use it to decide how many components to keep.
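To make the covariance view concrete, here is a minimal plain-array sketch that builds the sample covariance matrix and finds its leading eigenvector. It uses power iteration rather than the SVD described above, purely to stay short, and it does not use or reflect the deepbox implementation. The names covariance and topEigenpair are hypothetical helpers introduced for illustration.

```typescript
type Mat = number[][];

// Sample covariance matrix ([d, d]) of an [n, d] data matrix.
function covariance(X: Mat): Mat {
  const n = X.length;
  const mean = X[0].map((_, j) => X.reduce((acc, row) => acc + row[j], 0) / n);
  return mean.map((_, a) =>
    mean.map((__, b) =>
      X.reduce((acc, row) => acc + (row[a] - mean[a]) * (row[b] - mean[b]), 0) / (n - 1)
    )
  );
}

// Power iteration: repeatedly apply C and renormalize to approach the
// eigenvector with the largest eigenvalue. Assumes the all-ones start vector
// is not orthogonal to that eigenvector and that C is not the zero matrix.
function topEigenpair(C: Mat, iterations = 200): { value: number; vector: number[] } {
  let v = C.map(() => 1);
  let value = 0;
  for (let it = 0; it < iterations; it++) {
    const w = C.map((row) => row.reduce((acc, c, j) => acc + c * v[j], 0));
    value = Math.sqrt(w.reduce((acc, x) => acc + x * x, 0));
    v = w.map((x) => x / value);
  }
  return { value, vector: v };
}

// Small 2-feature toy dataset.
const points: Mat = [
  [2.5, 2.4],
  [0.5, 0.7],
  [2.2, 2.9],
  [1.9, 2.2],
  [3.1, 3.0],
  [2.3, 2.7],
];

const cov = covariance(points);
const first = topEigenpair(cov);
const totalVariance = cov[0][0] + cov[1][1]; // trace = sum of all eigenvalues

console.log(first.vector); // unit-length direction of maximum variance
console.log(first.value / totalVariance); // share of variance along that direction
```

Power iteration only recovers the first component; a full PCA would use an SVD or a symmetric eigensolver to obtain all of them at once.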
PCA Projection
Z = (X − μ) W
Where:
- X = Input data, shape [n, d] (n samples, d features)
- μ = Column-wise mean of the training data, shape [d]
- W = Matrix whose columns are the top-k eigenvectors of Cov(X), shape [d, k]
- Z = Reduced data, shape [n, k]
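The projection formula can be sketched with plain arrays as follows. This is an illustrative sketch, not the library's code: columnMeans and project are hypothetical helper names, and W is supplied by hand for a toy dataset whose principal direction is known.

```typescript
type Matrix = number[][];

// Column-wise mean of an [n, d] matrix.
function columnMeans(X: Matrix): number[] {
  return X[0].map((_, j) => X.reduce((acc, row) => acc + row[j], 0) / X.length);
}

// Z = (X − μ) W, with X of shape [n, d], mu of shape [d], W of shape [d, k].
function project(X: Matrix, mu: number[], W: Matrix): Matrix {
  return X.map((row) =>
    W[0].map((_, c) => row.reduce((acc, x, j) => acc + (x - mu[j]) * W[j][c], 0))
  );
}

// Toy data on the line y = x, so the first principal direction is (1, 1)/√2.
const lineData: Matrix = [[1, 1], [2, 2], [3, 3]];
const lineMean = columnMeans(lineData); // [2, 2]
const lineW: Matrix = [[Math.SQRT1_2], [Math.SQRT1_2]]; // shape [2, 1]

const lineZ = project(lineData, lineMean, lineW);
console.log(lineZ); // approximately [[-1.414], [0], [1.414]]
```

Note that new data is centered with the training mean μ, not with its own mean.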
Explained Variance Ratio
ratio_j = λⱼ / Σᵢ λᵢ
Where:
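- λⱼ = Eigenvalue of Cov(X) for the j-th principal component, i.e., the variance of the data along that component
- Σᵢ λᵢ = Sum over all d eigenvalues, i.e., the total variance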
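As a small worked example of the ratio, the sketch below normalizes a list of eigenvalues by their sum. The function name explainedVarianceRatioOf is a hypothetical helper and the eigenvalues are made up for illustration.

```typescript
// ratio_j = λ_j / Σ_i λ_i, assuming `eigenvalues` holds all d eigenvalues of Cov(X).
function explainedVarianceRatioOf(eigenvalues: number[]): number[] {
  const total = eigenvalues.reduce((acc, v) => acc + v, 0);
  return eigenvalues.map((v) => v / total);
}

// Made-up eigenvalues: variance 4 along the first component, 1 along the second.
const exampleRatios = explainedVarianceRatioOf([4, 1]);
console.log(exampleRatios); // [0.8, 0.2]
```

The ratios always sum to 1 when every eigenvalue is included, so keeping only the first component here would retain 80% of the variance.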
Reconstruction
X̂ = ZWᵀ + μ
Where:
- X̂ = Approximate reconstruction of the original data, shape [n, d]; exact when all d components are kept
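The reconstruction step can be sketched the same way. This is a plain-array illustration under assumed inputs, not the library's inverseTransform; reconstruct is a hypothetical helper, and W, μ, and Z are hand-picked toy values.

```typescript
type Rows = number[][];

// X̂ = Z Wᵀ + μ, with Z of shape [n, k], W of shape [d, k], mu of shape [d].
function reconstruct(Z: Rows, W: Rows, mu: number[]): Rows {
  return Z.map((z) =>
    mu.map((m, j) => m + z.reduce((acc, zc, c) => acc + zc * W[j][c], 0))
  );
}

const recW: Rows = [[Math.SQRT1_2], [Math.SQRT1_2]]; // one component along (1, 1)/√2
const recMu = [2, 2];

// Scores for two points. The original point (3, 1) centers to (1, −1), which is
// orthogonal to the component, so its score is 0 and it reconstructs to the mean.
const recZ: Rows = [[0], [Math.SQRT2]];
const xHat = reconstruct(recZ, recW, recMu);
console.log(xHat); // approximately [[2, 2], [3, 3]]
```

The first row shows why reconstruction is lossy when k < d: any part of a sample that lies outside the kept components is discarded by the projection and cannot be recovered.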
Constructor Parameters & Properties
- nComponents: number — Number of principal components to keep. If < 1 (e.g., 0.95), interpreted as the fraction of variance to retain.
- .explainedVarianceRatio: number[] — Fraction of total variance captured by each component
- .components: Tensor — Principal component vectors, shape [nComponents, nFeatures]
- .transform(X) — Project new data into the reduced space
- .inverseTransform(Z) — Reconstruct approximate original data from reduced representation
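One common way to turn a variance fraction into a component count is to take the smallest k whose cumulative explained variance reaches the target. The sketch below illustrates that rule on made-up ratios. It is an assumption about how such a threshold is typically resolved, not a description of what the library does internally, and componentsForVariance is a hypothetical helper.

```typescript
// Smallest k such that the first k ratios sum to at least `target`.
// Assumes `ratios` is sorted in decreasing order, as PCA produces it.
function componentsForVariance(ratios: number[], target: number): number {
  let cumulative = 0;
  for (let k = 0; k < ratios.length; k++) {
    cumulative += ratios[k];
    if (cumulative >= target) return k + 1;
  }
  return ratios.length; // target not reachable: keep everything
}

// Made-up explained variance ratios for a 4-feature dataset.
const sampleRatios = [0.6, 0.25, 0.1, 0.05];
console.log(componentsForVariance(sampleRatios, 0.5)); // 1
console.log(componentsForVariance(sampleRatios, 0.9)); // 3
```

Targets that sit exactly on a cumulative sum can be sensitive to floating-point rounding, so a small tolerance may be worth adding in practice.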
pca.ts
import { PCA } from "deepbox/ml";
import { tensor } from "deepbox/ndarray";
import { loadBreastCancer } from "deepbox/datasets";

// ── Basic usage ──
const X = tensor([
  [2.5, 2.4],
  [0.5, 0.7],
  [2.2, 2.9],
  [1.9, 2.2],
  [3.1, 3.0],
  [2.3, 2.7],
]);

const pca = new PCA({ nComponents: 1 });
pca.fit(X);

const XReduced = pca.transform(X); // shape: [6, 1]
const XRecovered = pca.inverseTransform(XReduced); // Approximate reconstruction
console.log(pca.explainedVarianceRatio); // roughly [0.97]: one component captures about 97% of the variance

// ── Real-world: reduce 30D breast cancer data to 2D ──
const { data } = loadBreastCancer();
const pca2 = new PCA({ nComponents: 2 });
pca2.fit(data);

const X2d = pca2.transform(data);
console.log(X2d.shape); // [569, 2]
console.log(pca2.explainedVarianceRatio); // [0.xx, 0.xx]