Example 17
beginner
Preprocessing
Encoders
Categorical Data

Preprocessing — Encoders

Machine learning models require numeric input, but real-world data often contains categorical variables such as 'red', 'green', 'blue' or 'small', 'medium', 'large'. Deepbox provides five encoders for this conversion: LabelEncoder maps each category to an integer (for example red→0, green→1, blue→2); OneHotEncoder creates one binary column per category; OrdinalEncoder encodes categories as integers that preserve a meaningful order; MultiLabelBinarizer handles samples that carry several labels at once; and LabelBinarizer converts a single column of categories into a binary matrix. The source code below exercises LabelEncoder, OneHotEncoder, and OrdinalEncoder. Each encoder supports .fit(), .transform(), and .inverseTransform() for round-trip conversion.
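The fit / transform / inverseTransform round trip can be sketched in plain TypeScript without any library. The class name SimpleLabelEncoder is hypothetical and is not the Deepbox implementation; it assumes integer codes are assigned in first-seen order, which may differ from how Deepbox orders its classes.

```typescript
// Hypothetical stand-in for illustration only; NOT the Deepbox LabelEncoder.
class SimpleLabelEncoder {
  private classes: string[] = [];
  private index = new Map<string, number>();

  // Learn the distinct categories. Assumption: codes follow first-seen order.
  fit(values: string[]): this {
    this.classes = Array.from(new Set(values));
    this.index = new Map<string, number>();
    this.classes.forEach((label, code) => this.index.set(label, code));
    return this;
  }

  // Map each category to its integer code; reject categories not seen in fit().
  transform(values: string[]): number[] {
    return values.map((v) => {
      const code = this.index.get(v);
      if (code === undefined) throw new Error(`Unseen category: ${v}`);
      return code;
    });
  }

  // Map integer codes back to the original category strings.
  inverseTransform(codes: number[]): string[] {
    return codes.map((code) => {
      const label = this.classes[code];
      if (label === undefined) throw new Error(`Unknown code: ${code}`);
      return label;
    });
  }
}

const sketchEncoder = new SimpleLabelEncoder().fit(["red", "green", "blue", "red"]);
const sketchCodes = sketchEncoder.transform(["blue", "red"]);
const sketchLabels = sketchEncoder.inverseTransform(sketchCodes);
console.log(sketchCodes);  // [2, 0]
console.log(sketchLabels); // ["blue", "red"]
```

Throwing on unseen categories is one possible design choice; a real encoder might instead offer an option to ignore or bucket them.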

Deepbox Modules Used

  • deepbox/ndarray
  • deepbox/preprocess

What You Will Learn

  • LabelEncoder maps categories to integers, which suits tree-based models
  • OneHotEncoder creates one binary column per category, which linear models and neural nets need for unordered categories so that integer codes do not imply a ranking
  • OrdinalEncoder preserves ordering for ordinal features (small < medium < large)
  • All encoders support .inverseTransform() for decoding predictions back to labels
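The difference between one-hot and ordinal encoding in the list above can be illustrated with two small helpers. This is a minimal sketch in plain TypeScript; oneHotRows and ordinalCodes are hypothetical names, not Deepbox functions, and both assume the caller passes the category list explicitly.

```typescript
// Hypothetical helpers for illustration only; NOT part of Deepbox.

// One-hot: one column per category, with a single 1 in each row.
function oneHotRows(values: string[], categories: string[]): number[][] {
  return values.map((v) => {
    const idx = categories.indexOf(v);
    if (idx === -1) throw new Error(`Unseen category: ${v}`);
    return categories.map((_, j) => (j === idx ? 1 : 0));
  });
}

// Ordinal: the caller supplies the order, so larger codes mean "later in the order".
function ordinalCodes(values: string[], order: string[]): number[] {
  return values.map((v) => {
    const idx = order.indexOf(v);
    if (idx === -1) throw new Error(`Unseen category: ${v}`);
    return idx;
  });
}

const colorRows = oneHotRows(["red", "blue"], ["red", "green", "blue"]);
const sizeCodes = ordinalCodes(["large", "small"], ["small", "medium", "large"]);
console.log(colorRows); // [[1, 0, 0], [0, 0, 1]]
console.log(sizeCodes); // [2, 0]
```

With one-hot rows no color is numerically "greater" than another, whereas the ordinal codes deliberately keep small < medium < large.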

Source Code

17-preprocessing-encoders/index.ts
import { tensor } from "deepbox/ndarray";
import { LabelEncoder, OneHotEncoder, OrdinalEncoder, LabelBinarizer } from "deepbox/preprocess";

console.log("=== Preprocessing Encoders ===\n");

// LabelEncoder: categories → integers
const le = new LabelEncoder();
le.fit(tensor(["cat", "dog", "bird", "cat", "dog"]));
const encoded = le.transform(tensor(["cat", "bird", "dog"]));
console.log("LabelEncoder:", encoded.toString());  // [0, 1, 2]
const decoded = le.inverseTransform(encoded);
console.log("Inverse:", decoded.toString());  // ["cat", "bird", "dog"]

// OneHotEncoder: categories → binary columns
const ohe = new OneHotEncoder();
ohe.fit(tensor(["red", "green", "blue", "red"]));
const onehot = ohe.transform(tensor(["red", "blue", "green"]));
console.log("\nOneHotEncoder:");
console.log(onehot.toString());  // [[1,0,0], [0,0,1], [0,1,0]]

// OrdinalEncoder: preserves ordering
const oe = new OrdinalEncoder();
oe.fit(tensor(["small", "medium", "large"]));
console.log("\nOrdinalEncoder:", oe.transform(tensor(["large", "small"])).toString());

Console Output

$ npx tsx 17-preprocessing-encoders/index.ts
=== Preprocessing Encoders ===

LabelEncoder: [0, 1, 2]
Inverse: ["cat", "bird", "dog"]

OneHotEncoder:
[[1, 0, 0],
 [0, 0, 1],
 [0, 1, 0]]

OrdinalEncoder: [2, 0]
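
The introduction also mentions MultiLabelBinarizer, which the listing does not exercise. As a rough illustration of the idea only, the sketch below builds a multi-label indicator matrix in plain TypeScript. The function multiLabelBinarize is hypothetical and does not reflect the Deepbox API; it assumes columns follow the first-seen order of labels.

```typescript
// Hypothetical helper for illustration only; NOT the Deepbox MultiLabelBinarizer.
function multiLabelBinarize(
  samples: string[][]
): { classes: string[]; matrix: number[][] } {
  // Collect every label across all samples. Assumption: first-seen column order.
  const seen = new Set<string>();
  for (const labels of samples) {
    for (const label of labels) seen.add(label);
  }
  const classes = Array.from(seen);

  // Each row gets a 1 in every column whose label is present in that sample.
  const matrix = samples.map((labels) => {
    const present = new Set(labels);
    return classes.map((c) => (present.has(c) ? 1 : 0));
  });
  return { classes, matrix };
}

const tagged = multiLabelBinarize([
  ["action", "comedy"],
  ["drama"],
  ["comedy", "drama"],
]);
console.log(tagged.classes); // ["action", "comedy", "drama"]
console.log(tagged.matrix);  // [[1, 1, 0], [0, 0, 1], [0, 1, 1]]
```

Unlike one-hot encoding, a row here may contain several 1s, because a sample can belong to more than one class.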