Skip to content

Matrix Operations - Multiplication, Transpose, and Inverse

Matrix Operations: Multiplication, Transpose, and Inverse

Section titled “Matrix Operations: Multiplication, Transpose, and Inverse”

Matrices are at the heart of machine learning. Operations on matrices power linear regression, neural networks, dimensionality reduction, and much more. In this lesson, we’ll cover three fundamental operations: multiplication, transpose, and inverse.

Given two matrices ARm×nA \in \mathbb{R}^{m \times n} and BRn×pB \in \mathbb{R}^{n \times p}, their product C=ABC = AB is defined as:

Cij=k=1nAikBkjC_{ij} = \sum_{k=1}^n A_{ik} B_{kj}
  • The number of columns of A must equal the number of rows of B.
  • The result has shape m×pm \times p.

::: info Explanation of Multiplication Matrix multiplication represents combining linear transformations.

  • In ML: multiplying dataset matrix XX with parameter vector ww computes predictions.
    :::

Mini Example:

A=[1234],B=[56]A = \begin{bmatrix}1 & 2 \\ 3 & 4\end{bmatrix}, B = \begin{bmatrix}5 \\ 6\end{bmatrix}

Then:

AB=[15+2635+46]=[1739]AB = \begin{bmatrix}1\cdot 5 + 2\cdot 6 \\ 3\cdot 5 + 4\cdot 6\end{bmatrix} = \begin{bmatrix}17 \\ 39\end{bmatrix}

The transpose of a matrix AA flips rows and columns:

(AT)ij=Aji(A^T)_{ij} = A_{ji}

Example:

A=[123456]AT=[142536]A = \begin{bmatrix}1 & 2 & 3 \\ 4 & 5 & 6\end{bmatrix} \quad \Rightarrow \quad A^T = \begin{bmatrix}1 & 4 \\ 2 & 5 \\ 3 & 6\end{bmatrix}

::: info Explanation of Transpose Transpose is essential in ML when switching between “row as samples” vs “column as features”.

  • In regression: XTXX^T X appears in the normal equation.
    :::

The inverse of a square matrix AA (if it exists) is A1A^{-1} such that:

AA1=A1A=IAA^{-1} = A^{-1}A = I

where II is the identity matrix.

  • Not all matrices are invertible (singular matrices don’t have inverses).
  • In ML: matrix inverses appear in the normal equation for linear regression:
    w=(XTX)1XTyw = (X^T X)^{-1} X^T y

Mini Example:

A=[4726]A1=1(4)(6)(7)(2)[6724]=110[6724]A = \begin{bmatrix}4 & 7 \\ 2 & 6\end{bmatrix} \quad \Rightarrow \quad A^{-1} = \frac{1}{(4)(6) - (7)(2)} \begin{bmatrix}6 & -7 \\ -2 & 4\end{bmatrix} = \frac{1}{10} \begin{bmatrix}6 & -7 \\ -2 & 4\end{bmatrix}

::: info Explanation of Inverse Matrix inverse is like division for matrices.

  • But inversion is computationally expensive.
  • In ML practice, we often avoid explicit inversion, using more stable methods (like QR or gradient descent).
    :::

::: code-group

import numpy as np
# Define matrices
A = np.array([[1, 2], [3, 4]])
B = np.array([[5], [6]])
# Multiplication
C = A.dot(B)
# Transpose
AT = A.T
# Inverse
A_inv = np.linalg.inv(np.array([[4, 7], [2, 6]]))
print("A * B =\n", C)
print("Transpose of A =\n", AT)
print("Inverse of [[4,7],[2,6]] =\n", A_inv)
use ndarray::{array, Array2};
use ndarray_linalg::Inverse;
fn main() {
// Define matrices
let a: Array2<f64> = array![[1.0, 2.0], [3.0, 4.0]];
let b: Array2<f64> = array![[5.0], [6.0]];
// Multiplication
let c = a.dot(&b);
// Transpose
let at = a.t().to_owned();
// Inverse
let a2: Array2<f64> = array![[4.0, 7.0], [2.0, 6.0]];
let a_inv = a2.inv().unwrap();
println!("A * B =\n{:?}", c);
println!("Transpose of A =\n{:?}", at);
println!("Inverse of [[4,7],[2,6]] =\n{:?}", a_inv);
}

:::

  • Multiplication → computing predictions, combining transformations.
  • Transpose → appears in regression, covariance, gradient formulas.
  • Inverse → theoretical solution in linear regression (though avoided in practice).

Continue to Special Matrices: Identity, Diagonal, Orthogonal.