Data Analysis & Visualization
This example walks through a full exploratory data analysis (EDA) workflow on an employee dataset of 20 records. You create a DataFrame with columns for name, department, salary, experience, and age, then compute descriptive statistics (mean, standard deviation) with deepbox/stats. Next, you group employees by department and calculate average salaries with groupBy().agg(), select high earners with .filter(), and compute a correlation matrix between salary, experience, and age with corrcoef(). The example produces four SVG visualizations: a scatter plot of salary vs. experience, a histogram of the salary distribution, a bar chart comparing department averages, and a heatmap of the correlation matrix. All plots are generated server-side with the stateful Figure/Axes API, so no browser is required. The source listing below covers the name, department, salary, and experience columns and the scatter plot.
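To make the descriptive-statistics step concrete, here is a minimal plain-TypeScript sketch of the arithmetic behind a mean and a standard deviation, run on the salary column from this example. It is not Deepbox code: `arithmeticMean` and `standardDeviation` are hypothetical helper names, and the `ddof` parameter is an assumption borrowed from the usual population (divide by n) versus sample (divide by n - 1) convention. Which of the two Deepbox's `std` uses by default is not stated here, so the sketch prints both.

```typescript
// Plain TypeScript sketch, not Deepbox code: the arithmetic behind mean and std.
const salaryValues: number[] = [
  95000, 65000, 105000, 55000, 98000, 72000, 68000,
  110000, 58000, 70000, 102000, 71000, 67000, 115000,
  60000, 69000, 108000, 73000, 112000, 66000,
];

// Sum of the values divided by how many there are.
function arithmeticMean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// ddof = 0 divides by n (population); ddof = 1 divides by n - 1 (sample).
function standardDeviation(values: number[], ddof: 0 | 1 = 0): number {
  const m = arithmeticMean(values);
  const sumOfSquares = values.reduce((acc, v) => acc + (v - m) ** 2, 0);
  return Math.sqrt(sumOfSquares / (values.length - ddof));
}

console.log("Mean salary: $" + arithmeticMean(salaryValues).toFixed(2));
console.log("Population std: $" + standardDeviation(salaryValues, 0).toFixed(2));
console.log("Sample std: $" + standardDeviation(salaryValues, 1).toFixed(2));
```

With only 20 records the two conventions differ by a few percent, so it is worth checking which one a library reports before comparing numbers across tools.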
Deepbox Modules Used
- deepbox/dataframe
- deepbox/ndarray
- deepbox/stats
- deepbox/plot
What You Will Learn
- Build DataFrames from plain objects and inspect with .shape, .columns, .head()
- Compute descriptive stats (mean, std) on extracted tensor columns
- Group rows by a column and aggregate with .groupBy().agg()
- Filter rows with arbitrary predicate functions
- Compute correlation matrices and generate SVG plots server-side
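The grouping and filtering steps listed above can be sketched in plain TypeScript to show what a group-by mean and a predicate filter compute. This is an illustration only, not how Deepbox implements `groupBy().agg()` or `.filter()`: the `Employee` interface and the `meanByDepartment` helper are hypothetical names, and the six rows are the first six records of the example dataset.

```typescript
// Plain TypeScript sketch, not Deepbox code: group-by mean and predicate filter.
interface Employee {
  name: string;
  department: string;
  salary: number;
  experience: number;
}

// First six records of the example dataset.
const sampleRows: Employee[] = [
  { name: "Alice", department: "Engineering", salary: 95000, experience: 5 },
  { name: "Bob", department: "Sales", salary: 65000, experience: 3 },
  { name: "Charlie", department: "Engineering", salary: 105000, experience: 8 },
  { name: "David", department: "HR", salary: 55000, experience: 2 },
  { name: "Eve", department: "Engineering", salary: 98000, experience: 6 },
  { name: "Frank", department: "Sales", salary: 72000, experience: 4 },
];

// Accumulate a running sum and count per department, then divide.
function meanByDepartment(
  rows: Employee[],
  field: "salary" | "experience",
): Record<string, number> {
  const totals: Record<string, { sum: number; count: number }> = {};
  for (const row of rows) {
    if (!totals[row.department]) {
      totals[row.department] = { sum: 0, count: 0 };
    }
    totals[row.department].sum += row[field];
    totals[row.department].count += 1;
  }
  const means: Record<string, number> = {};
  for (const dept of Object.keys(totals)) {
    means[dept] = totals[dept].sum / totals[dept].count;
  }
  return means;
}

const salaryMeans = meanByDepartment(sampleRows, "salary");
console.log(salaryMeans);

// A row-level predicate keeps only the rows for which it returns true.
const wellPaid = sampleRows.filter((row) => row.salary > 100000);
console.log(wellPaid.map((row) => row.name));
```

The predicate uses a strict comparison, so a salary of exactly 100000 would not count as a high earner.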
Source Code
import { DataFrame } from "deepbox/dataframe";
import { tensor } from "deepbox/ndarray";
import { Figure } from "deepbox/plot";
import { corrcoef, mean, std } from "deepbox/stats";

const employeeData = new DataFrame({
  name: ["Alice", "Bob", "Charlie", "David", "Eve", "Frank",
         "Grace", "Henry", "Ivy", "Jack", "Kate", "Leo",
         "Mia", "Noah", "Olivia", "Paul", "Quinn", "Rachel",
         "Sam", "Tina"],
  department: ["Engineering", "Sales", "Engineering", "HR",
               "Engineering", "Sales", "Marketing", "Engineering",
               "HR", "Sales", "Engineering", "Marketing",
               "Sales", "Engineering", "HR", "Sales",
               "Engineering", "Marketing", "Engineering", "Sales"],
  salary: [95000, 65000, 105000, 55000, 98000, 72000, 68000,
           110000, 58000, 70000, 102000, 71000, 67000, 115000,
           60000, 69000, 108000, 73000, 112000, 66000],
  experience: [5, 3, 8, 2, 6, 4, 3, 10, 2, 5, 7, 4, 3, 12,
               3, 4, 9, 5, 11, 3],
});

// Descriptive statistics
const salaryTensor = tensor(employeeData.get("salary").toArray());
console.log("Mean salary: $" + Number(mean(salaryTensor).data[0]).toFixed(2));
console.log("Std dev: $" + Number(std(salaryTensor).data[0]).toFixed(2));

// Group by department
const deptStats = employeeData.groupBy("department").agg({
  salary: "mean", experience: "mean"
});
console.log("\nDepartment Averages:");
console.log(deptStats.toString());

// Filter high earners
const highEarners = employeeData.filter(row => row.salary > 100000);
console.log("\nHigh earners (>$100k):", highEarners.shape[0], "employees");

// Correlation analysis
const salaries = employeeData.get("salary").toArray();
const experiences = employeeData.get("experience").toArray();
const corr = corrcoef(tensor([salaries, experiences]));
console.log("\nCorrelation (salary vs experience):");
console.log(corr.toString());

// Generate scatter plot
const fig = new Figure();
const ax = fig.addAxes();
ax.scatter(tensor(experiences), salaryTensor, { color: "#1f77b4" });
ax.setTitle("Salary vs Experience");
ax.setXLabel("Years of Experience");
ax.setYLabel("Salary ($)");
const svg = fig.renderSVG();
console.log("\n✓ Generated salary-vs-experience.svg");
Console Output
Mean salary: $81950.00
Std dev: $20152.34
Department Averages:
┌─────────────┬──────────┬────────────┐
│ department  │ salary   │ experience │
├─────────────┼──────────┼────────────┤
│ Engineering │ 105625.0 │ 8.5        │
│ Sales       │ 68167.0  │ 3.7        │
│ Marketing   │ 70667.0  │ 4.0        │
│ HR          │ 57667.0  │ 2.3        │
└─────────────┴──────────┴────────────┘
High earners (>$100k): 6 employees
Correlation (salary vs experience):
[[1.000, 0.944],
 [0.944, 1.000]]
✓ Generated salary-vs-experience.svg
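The 2×2 matrix in the output holds Pearson correlation coefficients, with each input row treated as one variable. As a cross-check, the following plain-TypeScript sketch computes the same kind of matrix from the salary and experience columns. It is not Deepbox's `corrcoef` implementation: `pearson` and `correlationMatrix` are hypothetical helper names written for this illustration.

```typescript
// Plain TypeScript sketch, not Deepbox code: Pearson correlation of two columns.
const salaryDollars: number[] = [
  95000, 65000, 105000, 55000, 98000, 72000, 68000,
  110000, 58000, 70000, 102000, 71000, 67000, 115000,
  60000, 69000, 108000, 73000, 112000, 66000,
];
const experienceYears: number[] = [
  5, 3, 8, 2, 6, 4, 3, 10, 2, 5, 7, 4, 3, 12,
  3, 4, 9, 5, 11, 3,
];

// r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2)), with dx and dy the
// deviations of each value from its column mean.
function pearson(x: number[], y: number[]): number {
  if (x.length !== y.length || x.length < 2) {
    throw new Error("inputs must have equal length of at least 2");
  }
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Each row is one variable; entry [i][j] is the correlation of row i with row j.
function correlationMatrix(variables: number[][]): number[][] {
  return variables.map((a) => variables.map((b) => pearson(a, b)));
}

const matrix = correlationMatrix([salaryDollars, experienceYears]);
console.log(matrix.map((row) => row.map((v) => v.toFixed(3)).join(", ")));
```

The diagonal is 1 because every variable correlates perfectly with itself, and the matrix is symmetric because the formula does not depend on argument order. The normalization factor cancels in the ratio, so the result is the same whether variances are divided by n or n - 1.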