Pseudo-Inverse & Ill-Conditioned Systems
In machine learning, we often need to invert matrices, e.g., the normal equations in linear regression: w = (XᵀX)⁻¹ Xᵀ y.
But what if the matrix is not invertible or is ill-conditioned (unstable for inversion)?
This is where the pseudo-inverse and the concept of numerical stability come in.
1. The Moore–Penrose Pseudo-Inverse
If A is not square or not invertible, we use the Moore–Penrose pseudo-inverse A⁺.
Definition:
A⁺ is the unique matrix satisfying the four Penrose conditions:
- A A⁺ A = A
- A⁺ A A⁺ = A⁺
- (A A⁺)ᵀ = A A⁺
- (A⁺ A)ᵀ = A⁺ A
In regression:
Instead of solving
w = (XᵀX)⁻¹ Xᵀ y,
we use
w = X⁺ y,
where X⁺ is the pseudo-inverse (often computed via the SVD).
::: info ML relevance
- Works even if XᵀX is singular (e.g., correlated features, fewer samples than features).
- Used in regularized regression and neural network pseudo-inverse training.
:::
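To make the SVD route concrete, here is a minimal NumPy sketch that builds A⁺ = V Σ⁺ Uᵀ by hand, inverting only the nonzero singular values (the tolerance `tol` is an illustrative choice, not a library default):

```python
import numpy as np

# Rank-deficient matrix: second column is 2x the first
A = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])

# Thin SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Invert only singular values above a tolerance; zero out the rest
tol = 1e-10  # illustrative cutoff
s_plus = np.array([1.0 / sv if sv > tol else 0.0 for sv in s])

# Pseudo-inverse: A_plus = V @ diag(s_plus) @ U^T
A_plus = Vt.T @ np.diag(s_plus) @ U.T

# Agrees with NumPy's built-in implementation
print(np.allclose(A_plus, np.linalg.pinv(A)))  # True
```

Zeroing (rather than inverting) the tiny singular values is exactly what makes the pseudo-inverse well-defined for singular matrices.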
2. Handling Non-Invertible Matrices in Regression
Situations where XᵀX is not invertible:
- Multicollinearity: features are linearly dependent.
- Underdetermined systems: more features than samples.
Solutions:
- Use pseudo-inverse.
- Add regularization (Ridge regression: w = (XᵀX + λI)⁻¹ Xᵀ y).
- Reduce dimensionality (PCA).
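As a sketch of the Ridge fix (λ = 0.1 is an arbitrary illustrative value): adding λI makes XᵀX + λI invertible for any λ > 0, so the normal equations have a unique solution even with perfectly collinear features:

```python
import numpy as np

# Collinear design matrix: X^T X is singular (rank 1)
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])

lam = 0.1  # regularization strength (illustrative choice)

# Ridge normal equations: (X^T X + lam * I) is invertible for any lam > 0
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
print("Ridge weights:", w_ridge)
```

Note that plain `np.linalg.inv(X.T @ X)` would fail here, while the regularized system solves without issue.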
3. Condition Number & Numerical Stability
The condition number of a matrix A (with respect to inversion) is κ(A) = ‖A‖ · ‖A⁻¹‖ = σ_max / σ_min, the ratio of the largest to smallest singular value.
- If κ(A) is large → small input errors cause large output errors.
- High condition number → matrix is ill-conditioned.
::: info ML relevance
- An ill-conditioned XᵀX means the regression weights are highly unstable.
- Regularization (Ridge) reduces condition number.
- QR or SVD are often used instead of direct inversion for stability.
:::
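A tiny illustration of why κ matters (the matrix below is a made-up example): the 2×2 system is nearly singular, and perturbing one entry of b by only 1e-4 flips the solution completely:

```python
import numpy as np

# Nearly parallel rows -> nearly singular, huge condition number
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
b = np.array([2.0, 2.0001])

print("kappa(A):", np.linalg.cond(A))  # on the order of 1e4

x = np.linalg.solve(A, b)  # exact solution is [1, 1]

# Perturb b by 1e-4 in one entry
b_pert = b + np.array([0.0, 1e-4])
x_pert = np.linalg.solve(A, b_pert)  # solution jumps to [0, 2]

print("x:", x, " x_pert:", x_pert)
```

A relative change of about 5e-5 in the input moved the solution by order 1, which is exactly the amplification the condition number predicts.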
Hands-on with Python and Rust
::: code-group

```python [Python]
import numpy as np

# Feature matrix with collinearity
X = np.array([[1, 2], [2, 4], [3, 6]])  # second column is 2x first
y = np.array([1, 2, 3])

# Direct normal equation (fails: X^T X not invertible)
try:
    w = np.linalg.inv(X.T @ X) @ X.T @ y
except np.linalg.LinAlgError:
    print("Matrix is singular, cannot invert.")

# Use pseudo-inverse instead
w_pinv = np.linalg.pinv(X) @ y

# Condition number
cond_num = np.linalg.cond(X)

print("Pseudo-inverse solution:", w_pinv)
print("Condition number of X:", cond_num)
```

```rust [Rust]
use ndarray::{array, Array1, Array2};
use ndarray_linalg::{Norm, PseudoInverse};

fn main() {
    // Feature matrix with collinearity
    let x: Array2<f64> = array![
        [1.0, 2.0],
        [2.0, 4.0],
        [3.0, 6.0]
    ];
    let y: Array1<f64> = array![1.0, 2.0, 3.0];

    // Pseudo-inverse solution
    let x_pinv = x.pinv(1e-8).unwrap();
    let w = x_pinv.dot(&y);

    // Condition number
    let cond_num = x.norm_l2() * x_pinv.norm_l2();

    println!("Pseudo-inverse solution: {:?}", w);
    println!("Condition number of X: {}", cond_num);
}
```

:::
Summary
- Pseudo-inverse (Moore–Penrose) solves regression when XᵀX is not invertible.
- Ill-conditioning → unstable solutions due to large condition numbers.
- Fixes → pseudo-inverse, regularization, SVD/QR-based methods.
Next Steps
Continue to Block Matrices and Kronecker Products.