Scalars, Vectors, and Matrices: The Language of Data
Machine Learning (ML) is built on data, and the language of data is linear algebra. Almost every ML algorithm represents information using scalars, vectors, or matrices. Understanding these basic building blocks is essential before moving to advanced concepts.
Scalars
A scalar is a single number. Scalars often represent:
- A single feature value (e.g., height = 170).
- A model parameter or hyperparameter (e.g., the learning rate).
Formally, scalars are just real numbers: $a \in \mathbb{R}$.
Vectors
A vector is an ordered list of numbers, often representing a data point or a set of features.
- Each entry $x_i$ of a vector $\mathbf{x}$ is a feature value.
- A vector with $n$ entries lives in $n$-dimensional space: $\mathbf{x} \in \mathbb{R}^n$.
Explanation of Vectors
Think of a vector as a row of values in your dataset.
- If you have 3 features (height, weight, age), one person’s data = a vector of 3 numbers.
- In ML, feature vectors are the input to models.
Mini example:
If we describe a student with height = 170 cm, weight = 65 kg, and age = 20: $\mathbf{x} = [170, 65, 20]$.
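In code, this student can be stored as a one-dimensional array. The snippet below is a minimal sketch that assumes NumPy is installed; the variable names `x`, `n`, and `height` are illustrative choices, not fixed conventions.

```python
import numpy as np

# Feature vector for one student: [height (cm), weight (kg), age (years)]
x = np.array([170.0, 65.0, 20.0])

n = x.shape[0]   # number of entries, i.e. the dimension of the vector
height = x[0]    # a single entry is a scalar feature value

print("Dimension n:", n)
print("Number of axes:", x.ndim)
print("Height entry:", height)
```

The array has exactly one axis, which is what makes it a vector rather than a matrix.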
Matrices
A matrix is a 2D array of numbers. In ML, matrices usually represent datasets.
- Each row = one data point (a feature vector).
- Each column = values of one feature across all data points.
Explanation of Matrices
A dataset with $m$ examples and $n$ features forms a matrix $X$:
- $m$ = number of rows = number of examples.
- $n$ = number of columns = number of features.
So: $X \in \mathbb{R}^{m \times n}$.
Mini example:
Suppose we record 3 students with features [height, weight, age]:
$$X = \begin{bmatrix} 170 & 65 & 20 \\ 180 & 75 & 22 \\ 160 & 55 & 19 \end{bmatrix}$$
- 3 rows = 3 students (examples).
- 3 columns = 3 features (height, weight, age).
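The row and column reading of a dataset matrix can be checked directly. This is a small sketch assuming NumPy; the names `m` (number of examples), `n` (number of features), `second_student`, and `heights` are chosen here for illustration.

```python
import numpy as np

# Rows are students (examples); columns are height (cm), weight (kg), age (years)
X = np.array([
    [170.0, 65.0, 20.0],
    [180.0, 75.0, 22.0],
    [160.0, 55.0, 19.0],
])

m, n = X.shape          # m examples (rows), n features (columns)
second_student = X[1]   # one row = one data point (a feature vector)
heights = X[:, 0]       # one column = one feature across all students

print("Shape (m, n):", (m, n))
print("Second student:", second_student)
print("All heights:", heights)
```

Indexing with a single integer selects a row, while `X[:, j]` selects column `j`, which mirrors the "row = example, column = feature" convention described above.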
Hands-on with Python and Rust
```python
import numpy as np

# A scalar
a = 3.14

# A vector (student features: height, weight, age)
x = np.array([170, 65, 20])

# A matrix (3 students × 3 features)
X = np.array([
    [170, 65, 20],
    [180, 75, 22],
    [160, 55, 19]
])

print("Scalar:", a)
print("Vector:", x)
print("Matrix:\n", X)
```
```rust
use ndarray::array;

fn main() {
    // A scalar
    let a: f64 = 3.14;

    // A vector (student features: height, weight, age)
    let x = array![170.0, 65.0, 20.0];

    // A matrix (3 students × 3 features)
    let X = array![
        [170.0, 65.0, 20.0],
        [180.0, 75.0, 22.0],
        [160.0, 55.0, 19.0]
    ];

    println!("Scalar: {}", a);
    println!("Vector: {:?}", x);
    println!("Matrix:\n{:?}", X);
}
```
Connection to ML
- Scalars → individual feature values or hyperparameters.
- Vectors → single data points (feature vectors).
- Matrices → entire datasets.
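One way to see how the three structures nest is to start from a dataset and index into it. The sketch below assumes NumPy and reuses the student data from the examples above; it is an illustration of the relationship, not a required workflow.

```python
import numpy as np

# Matrix: the whole dataset (3 students × 3 features)
X = np.array([
    [170.0, 65.0, 20.0],
    [180.0, 75.0, 22.0],
    [160.0, 55.0, 19.0],
])

x = X[0]      # vector: one data point (the first student's features)
a = X[0, 0]   # scalar: one feature value (the first student's height)

# Number of axes: 2 for a matrix, 1 for a vector, 0 for a scalar
print(np.ndim(X), np.ndim(x), np.ndim(a))
```

Each indexing step removes one axis, so a matrix yields vectors and a vector yields scalars.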
Every ML algorithm (from linear regression to neural networks) starts by manipulating these structures. Mastering them is the first step toward understanding ML.
Next Steps
Continue to Vector Operations: Dot Product, Norms, and Distances.