Learning Rate Schedulers
Learning rate scheduling is critical for training neural networks effectively. A fixed learning rate is rarely optimal — you typically want a high rate early for fast progress and a lower rate later for fine-tuning. This example demonstrates all 8 Deepbox schedulers: StepLR (decay every N epochs), MultiStepLR (decay at specific milestones), ExponentialLR (multiplicative decay), CosineAnnealingLR (cosine curve from max to min), LinearLR (linear ramp), ReduceLROnPlateau (reduce when metric stalls), OneCycleLR (1cycle policy with warmup), and WarmupLR (gradual warmup before the main schedule). Each scheduler is stepped through 20 epochs and the LR trajectory is printed.
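The source snippet below works through StepLR, CosineAnnealingLR, and OneCycleLR. For the other five, here is a hedged sketch of what construction might look like. The class names all appear in this example's prose or imports, but every option name in this sketch (milestones, gamma, startFactor, endFactor, totalIters, factor, patience, warmupSteps) is an assumption patterned on the StepLR and CosineAnnealingLR calls in the snippet, so verify them against the deepbox/optim documentation before relying on them.
import { Linear, Sequential, ReLU } from "deepbox/nn";
import { SGD, MultiStepLR, ExponentialLR, LinearLR, ReduceLROnPlateau, WarmupLR } from "deepbox/optim";

const net = new Sequential(new Linear(2, 8), new ReLU(), new Linear(8, 1));

// MultiStepLR: multiply the LR by gamma at each milestone epoch (options assumed)
const optA = new SGD(net.parameters(), { lr: 0.1 });
const multiStep = new MultiStepLR(optA, { milestones: [10, 15], gamma: 0.1 });

// ExponentialLR: multiply the LR by gamma after every epoch (options assumed)
const optB = new SGD(net.parameters(), { lr: 0.1 });
const expo = new ExponentialLR(optB, { gamma: 0.9 });

// LinearLR: ramp the LR linearly between two factors (options assumed)
const optC = new SGD(net.parameters(), { lr: 0.1 });
const linear = new LinearLR(optC, { startFactor: 0.1, endFactor: 1.0, totalIters: 5 });

// ReduceLROnPlateau: shrink the LR when a monitored metric stalls (options assumed);
// plateau schedulers are typically stepped with the metric, e.g. plateau.step(valLoss)
const optD = new SGD(net.parameters(), { lr: 0.1 });
const plateau = new ReduceLROnPlateau(optD, { factor: 0.5, patience: 3 });

// WarmupLR: ramp up from a small LR before the main schedule takes over (options assumed)
const optE = new SGD(net.parameters(), { lr: 0.1 });
const warmup = new WarmupLR(optE, { warmupSteps: 5 });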
Deepbox Modules Used
deepbox/nn
deepbox/optim
What You Will Learn
- StepLR/MultiStepLR for simple milestone-based decay
- CosineAnnealingLR for smooth decay — popular in modern training
- OneCycleLR ramps up then down — often gives best results
- WarmupLR prevents early training instability with large learning rates
- Call scheduler.step() after each epoch, or after each optimizer step for OneCycleLR (see the sketch after this list)
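To make the last bullet concrete, here is a minimal sketch of where step() goes in a training loop. The batch loop body is a stand-in (no forward/backward shown); the scheduler calls reuse only classes and options that appear in this example's snippet, and the per-batch stepping of OneCycleLR assumes totalSteps is set to the total number of optimizer updates.
import { Linear, Sequential, ReLU } from "deepbox/nn";
import { SGD, StepLR, OneCycleLR } from "deepbox/optim";

const net = new Sequential(new Linear(2, 8), new ReLU(), new Linear(8, 1));
const epochs = 20;
const batchesPerEpoch = 10;

// Epoch-based schedulers (StepLR, CosineAnnealingLR, ...): step once per epoch
const opt = new SGD(net.parameters(), { lr: 0.1 });
const perEpoch = new StepLR(opt, { stepSize: 5, gamma: 0.5 });
for (let epoch = 0; epoch < epochs; epoch++) {
  for (let b = 0; b < batchesPerEpoch; b++) {
    // ... forward pass, backward pass, opt.step() would go here ...
  }
  perEpoch.step(); // advance the schedule after the epoch finishes
}

// OneCycleLR: step once per optimizer update, with totalSteps = total updates
const opt2 = new SGD(net.parameters(), { lr: 0.01 });
const cyc = new OneCycleLR(opt2, { maxLr: 0.1, totalSteps: epochs * batchesPerEpoch });
for (let epoch = 0; epoch < epochs; epoch++) {
  for (let b = 0; b < batchesPerEpoch; b++) {
    // ... optimizer update here ...
    cyc.step(); // advance the 1cycle schedule every batch
  }
}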
Source Code
16-lr-schedulers/index.ts
import { Linear, Sequential, ReLU } from "deepbox/nn";
import { SGD, StepLR, CosineAnnealingLR, OneCycleLR, ExponentialLR, WarmupLR } from "deepbox/optim";

const model = new Sequential(new Linear(2, 8), new ReLU(), new Linear(8, 1));

// StepLR: halve the LR every 5 epochs (gamma = 0.5)
const opt1 = new SGD(model.parameters(), { lr: 0.1 });
const sched1 = new StepLR(opt1, { stepSize: 5, gamma: 0.5 });
console.log("StepLR:");
for (let i = 0; i < 20; i++) {
  if (i % 5 === 0) console.log(`  Epoch ${i}: lr = ${opt1.lr.toFixed(6)}`);
  sched1.step();
}

// CosineAnnealingLR: smooth cosine decay
const opt2 = new SGD(model.parameters(), { lr: 0.1 });
const sched2 = new CosineAnnealingLR(opt2, { tMax: 20, etaMin: 0.001 });
console.log("\nCosineAnnealingLR:");
for (let i = 0; i < 20; i++) {
  if (i % 5 === 0) console.log(`  Epoch ${i}: lr = ${opt2.lr.toFixed(6)}`);
  sched2.step();
}

// OneCycleLR: 1cycle policy
const opt3 = new SGD(model.parameters(), { lr: 0.01 });
const sched3 = new OneCycleLR(opt3, { maxLr: 0.1, totalSteps: 20 });
console.log("\nOneCycleLR:");
for (let i = 0; i < 20; i++) {
  if (i % 5 === 0) console.log(`  Epoch ${i}: lr = ${opt3.lr.toFixed(6)}`);
  sched3.step();
}
Console Output
$ npx tsx 16-lr-schedulers/index.ts
StepLR:
Epoch 0: lr = 0.100000
Epoch 5: lr = 0.050000
Epoch 10: lr = 0.025000
Epoch 15: lr = 0.012500
CosineAnnealingLR:
Epoch 0: lr = 0.100000
Epoch 5: lr = 0.075000
Epoch 10: lr = 0.050500
Epoch 15: lr = 0.013000
OneCycleLR:
Epoch 0: lr = 0.010000
Epoch 5: lr = 0.078000
Epoch 10: lr = 0.100000
Epoch 15: lr = 0.034000
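As a sanity check on the StepLR trajectory above: with stepSize 5 and gamma 0.5, the schedule is lr(t) = lr0 * gamma^floor(t / stepSize), which you can reproduce directly:
// Reproduce the printed StepLR values: lr(t) = lr0 * gamma ** floor(t / stepSize)
for (const t of [0, 5, 10, 15]) {
  console.log(`Epoch ${t}: lr = ${(0.1 * 0.5 ** Math.floor(t / 5)).toFixed(6)}`);
}
// -> 0.100000, 0.050000, 0.025000, 0.012500 (matches the StepLR output above)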