deepbox/ml

Naive Bayes

Probabilistic classifiers based on Bayes' theorem with the 'naive' conditional independence assumption: features are assumed independent given the class label. Although this strong assumption rarely holds, Naive Bayes often performs surprisingly well in practice. Training is O(n·d): a single pass over the data that computes per-class statistics. Prediction is O(k·d), where k is the number of classes.
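To make the O(n·d) claim concrete, here is a minimal sketch of that single pass, collecting per-class feature means and variances, which are the only statistics a Gaussian Naive Bayes model needs. It operates on plain number arrays rather than deepbox tensors and is not the library's internal implementation.

// Sketch: one pass over the data collecting per-class feature sums and
// squared sums, from which means and (biased) variances follow.
// Assumes plain number[][] / number[] inputs.
function perClassStats(X: number[][], y: number[]) {
  const d = X[0].length;
  const stats = new Map<number, { n: number; sum: number[]; sumSq: number[] }>();
  for (let i = 0; i < X.length; i++) {
    let s = stats.get(y[i]);
    if (!s) {
      s = { n: 0, sum: new Array(d).fill(0), sumSq: new Array(d).fill(0) };
      stats.set(y[i], s);
    }
    s.n++;
    for (let j = 0; j < d; j++) {
      s.sum[j] += X[i][j];
      s.sumSq[j] += X[i][j] * X[i][j];
    }
  }
  // Convert the running sums into per-class, per-feature mean and variance.
  return [...stats.entries()].map(([label, s]) => ({
    label,
    mean: s.sum.map((v) => v / s.n),
    variance: s.sumSq.map((v, j) => v / s.n - (s.sum[j] / s.n) ** 2),
  }));
}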

GaussianNB

Gaussian Naive Bayes. Models each feature within each class as a Gaussian distribution N(μ_c,j, σ²_c,j). During training, estimates μ and σ² per class per feature. During prediction, applies Bayes' rule with the Gaussian likelihood. Extremely fast and requires very little training data. Works best when features are roughly normally distributed within each class.
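A minimal toy example of the fit/predict API shown in naive-bayes.ts below. Passing nested number arrays to tensor is an assumption here; adjust to however your data is constructed.

import { GaussianNB } from "deepbox/ml";
import { tensor } from "deepbox/ndarray";

// Two well-separated classes: class 0 clusters near 0, class 1 near 5.
// Assumes tensor() accepts nested number arrays.
const X = tensor([
  [0.1, 0.2],
  [0.3, 0.1],
  [5.0, 5.2],
  [4.8, 5.1],
]);
const y = tensor([0, 0, 1, 1]);

const nb = new GaussianNB();
nb.fit(X, y);

// Each query point should be assigned to the nearest cluster: [0, 1].
console.log(nb.predict(tensor([[0.2, 0.0], [5.1, 4.9]])));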

Bayes' Theorem

P(y|x) = P(x|y) · P(y) / P(x)

Where:

  • P(y|x) = Posterior — probability of class y given features x
  • P(x|y) = Likelihood — probability of features given class
  • P(y) = Prior — relative frequency of class y
  • P(x) = Evidence — normalizing constant, identical for every class
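A quick numeric check with hypothetical values for a two-class problem: with priors P(y=0) = 0.6, P(y=1) = 0.4 and likelihoods P(x|y=0) = 0.2, P(x|y=1) = 0.5, the posterior for class 1 is 0.4·0.5 / (0.6·0.2 + 0.4·0.5) = 0.2 / 0.32 ≈ 0.625. The same computation in code:

// Hypothetical priors and likelihoods for a two-class problem.
const prior = [0.6, 0.4];
const likelihood = [0.2, 0.5]; // P(x | y = 0), P(x | y = 1)

// P(x) is the same for both classes, so it only normalizes the result.
const evidence = prior[0] * likelihood[0] + prior[1] * likelihood[1];
const posterior = prior.map((p, c) => (p * likelihood[c]) / evidence);
console.log(posterior); // [0.375, 0.625] -> class 1 is more probable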

Naive Independence Assumption

P(x|y) = ∏ⱼ P(xⱼ|y)

Where:

  • xⱼ = Feature j, treated independently
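In practice the product over features is evaluated as a sum of log-probabilities, because multiplying many values below 1 quickly underflows floating point. A small sketch of the equivalence, with made-up per-feature probabilities:

// Multiplying many small per-feature probabilities underflows quickly;
// summing their logarithms is numerically stable and order-preserving.
const featureProbs = [0.02, 0.1, 0.005, 0.3]; // hypothetical P(xⱼ | y)

const product = featureProbs.reduce((acc, p) => acc * p, 1);
const logSum = featureProbs.reduce((acc, p) => acc + Math.log(p), 0);

console.log(product);          // 3e-6
console.log(Math.exp(logSum)); // same value, computed in log-space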

Gaussian Likelihood

P(xⱼ|y=c) = (1 / (σ_cj·√(2π))) · exp(−(xⱼ − μ_cj)² / (2σ²_cj))

Where:

  • μ_cj = Mean of feature j in class c
  • σ_cj = Std dev of feature j in class c
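Putting the three formulas together, here is a from-scratch sketch (not the library's implementation) of the per-class score log P(y=c) + Σⱼ log P(xⱼ|y=c), using the Gaussian likelihood above in log form.

// Log of the Gaussian density N(x; mean, variance) for a single feature.
function gaussianLogPdf(x: number, mean: number, variance: number): number {
  return -0.5 * Math.log(2 * Math.PI * variance) - ((x - mean) ** 2) / (2 * variance);
}

// Unnormalized log-posterior of one sample for one class: log prior plus
// the sum of per-feature Gaussian log-likelihoods (the naive assumption).
function classScore(x: number[], prior: number, mean: number[], variance: number[]): number {
  let score = Math.log(prior);
  for (let j = 0; j < x.length; j++) {
    score += gaussianLogPdf(x[j], mean[j], variance[j]);
  }
  return score;
}

Prediction picks the class with the highest score; the shared log P(x) term can be dropped because it does not change which class wins.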
naive-bayes.ts
import { GaussianNB } from "deepbox/ml";
import { tensor } from "deepbox/ndarray";
import { loadIris } from "deepbox/datasets";
import { trainTestSplit } from "deepbox/preprocess";
import { accuracy } from "deepbox/metrics";

// Load the Iris dataset and hold out 30% of it for evaluation.
const { data, target } = loadIris();
const [XTrain, XTest, yTrain, yTest] = trainTestSplit(data, target, {
  testSize: 0.3,
  randomState: 42,
});

// Fit per-class Gaussian statistics, then classify the held-out samples.
const nb = new GaussianNB();
nb.fit(XTrain, yTrain);
const yPred = nb.predict(XTest);

console.log(accuracy(yTest, yPred)); // ~0.95+