Example 31
beginner
DataLoader
Batching
Training

DataLoader — Batching & Shuffling

Training on the entire dataset at once is often impractical. The DataLoader class splits your data into mini-batches for stochastic gradient descent. This example demonstrates creating a DataLoader with a custom batch size, iterating over batches in a for-of loop, enabling shuffling for randomized training, using dropLast to discard an incomplete final batch, and running inference without labels. Each iteration yields an [X_batch, y_batch] pair of tensors whose first dimension equals the batch size, except for a final batch that may be smaller when the sample count is not a multiple of the batch size. The DataLoader handles the tensor slicing and optional shuffling automatically.
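To make the slicing step concrete, here is a minimal sketch of mini-batching over plain arrays. It is not Deepbox's implementation: the `miniBatches` generator and the `features` and `labels` names are hypothetical, and plain arrays stand in for tensors.

```typescript
// Illustrative only: plain arrays stand in for Deepbox tensors.
// miniBatches is a hypothetical helper, not part of the Deepbox API.
function* miniBatches<X, Y>(
  xs: X[],
  ys: Y[],
  batchSize: number,
): Generator<[X[], Y[]]> {
  if (xs.length !== ys.length) {
    throw new Error("features and labels must have the same length");
  }
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  for (let start = 0; start < xs.length; start += batchSize) {
    // slice() clamps the end index, so the final batch may be shorter.
    const end = start + batchSize;
    yield [xs.slice(start, end), ys.slice(start, end)];
  }
}

const features = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14]];
const labels = [0, 1, 0, 1, 0, 1, 0];

// Collect the number of rows in each batch while iterating.
const batchSizes: number[] = [];
for (const [xBatch, yBatch] of miniBatches(features, labels, 3)) {
  batchSizes.push(xBatch.length);
  console.log(`X rows: ${xBatch.length}, y rows: ${yBatch.length}`);
}
```

With 7 samples and a batch size of 3, the loop sees two full batches and one batch holding the single remaining sample.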

Deepbox Modules Used

  • deepbox/datasets
  • deepbox/ndarray

What You Will Learn

  • DataLoader splits data into mini-batches for SGD training
  • Use shuffle: true to randomize the sample order each epoch, which prevents ordering bias
  • dropLast: true discards the final batch if it's smaller than batchSize
  • Iterate with for-of: each iteration yields [X_batch, y_batch] tensors
  • Set batchSize based on available memory: larger batches give smoother gradient estimates, smaller batches give noisier ones
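The per-epoch shuffling mentioned in the list can be pictured as drawing a fresh permutation of sample indices before each pass. The sketch below uses a Fisher-Yates shuffle. How Deepbox shuffles internally is not stated here, so treat this as one plausible approach; `shuffledIndices` and `sampleCount` are hypothetical names.

```typescript
// Hypothetical helper: returns the indices 0..n-1 in random order
// using the Fisher-Yates shuffle. Not taken from Deepbox.
function shuffledIndices(
  n: number,
  random: () => number = Math.random,
): number[] {
  const idx = Array.from({ length: n }, (_, i) => i);
  for (let i = n - 1; i > 0; i--) {
    // Pick a position in [0, i] and swap it into slot i.
    const j = Math.floor(random() * (i + 1));
    [idx[i], idx[j]] = [idx[j], idx[i]];
  }
  return idx;
}

const sampleCount = 7;
const epochOrders: number[][] = [];
for (let epoch = 0; epoch < 3; epoch++) {
  // A new permutation per epoch: same samples, different order.
  const order = shuffledIndices(sampleCount);
  epochOrders.push(order);
  console.log(`epoch ${epoch}: ${order.join(", ")}`);
}
```

Every epoch still visits each sample exactly once. Only the order, and therefore the composition of each mini-batch, changes.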

Source Code

31-dataloader/index.ts
import { DataLoader } from "deepbox/datasets";
import { tensor } from "deepbox/ndarray";

console.log("=== DataLoader ===\n");

const X = tensor([[1,2],[3,4],[5,6],[7,8],[9,10],[11,12],[13,14]]);
const y = tensor([0, 1, 0, 1, 0, 1, 0]);

// Basic batching
const loader = new DataLoader(X, y, { batchSize: 3, shuffle: false });

console.log("Batch size: 3, 7 samples");
let batch = 0;
for (const [xBatch, yBatch] of loader) {
  console.log(`Batch ${batch}: X shape [${xBatch.shape}], y shape [${yBatch.shape}]`);
  batch++;
}
console.log(`Total batches: ${batch}`);

// With shuffling
const shuffled = new DataLoader(X, y, { batchSize: 3, shuffle: true });
console.log("\nShuffled batches:");
for (const [xBatch] of shuffled) {
  console.log(`  X: ${xBatch.toString()}`);
}

// Drop incomplete last batch
const dropped = new DataLoader(X, y, { batchSize: 3, dropLast: true });
console.log("\nWith dropLast: true");
let dropCount = 0;
for (const [xBatch] of dropped) { dropCount++; }
console.log(`Batches: ${dropCount} (last incomplete batch dropped)`);

Console Output

$ npx tsx 31-dataloader/index.ts
=== DataLoader ===

Batch size: 3, 7 samples
Batch 0: X shape [3, 2], y shape [3]
Batch 1: X shape [3, 2], y shape [3]
Batch 2: X shape [1, 2], y shape [1]
Total batches: 3

Shuffled batches:
  X: [[9,10],[3,4],[11,12]]
  X: [[1,2],[13,14],[7,8]]
  X: [[5,6]]

With dropLast: true
Batches: 2 (last incomplete batch dropped)
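
The batch counts in the output follow from simple arithmetic: 7 samples at batch size 3 give ceil(7 / 3) = 3 batches, or floor(7 / 3) = 2 when the incomplete batch is dropped. The helper below is a hypothetical illustration of that rule, not a Deepbox function.

```typescript
// Hypothetical helper: number of batches one pass over the data produces.
function batchCount(
  sampleCount: number,
  batchSize: number,
  dropLast = false,
): number {
  const exact = sampleCount / batchSize;
  // Dropping the incomplete batch rounds down; keeping it rounds up.
  return dropLast ? Math.floor(exact) : Math.ceil(exact);
}

console.log(batchCount(7, 3));       // 3
console.log(batchCount(7, 3, true)); // 2
console.log(batchCount(6, 3, true)); // 2 (no incomplete batch to drop)
```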