Sparse Matrices & Efficient Computation
Sparse matrices, where most elements are zero, are prevalent in machine learning (ML) for representing high-dimensional data like graphs, text, and images. Efficient computation with sparse matrices avoids storing and operating on zeros, saving memory and time. In AI, sparse matrices enable scalable algorithms for graph neural networks, recommendation systems, and natural language processing, handling massive datasets without excessive resources.
This lecture in the “Foundations for AI/ML” series (misc-math cluster) explores sparse matrix representations (COO, CSR, CSC), operations (addition, multiplication, transposition), efficient algorithms, and ML applications. We’ll cover intuitions, mathematical formulations, and practical implementations in Python and Rust, providing tools for efficient sparse computing in AI.
1. Intuition Behind Sparse Matrices
Dense matrices store all elements, but in real-world data many are zero (e.g., user-item ratings, word-document counts). Sparse matrices store only the non-zeros, reducing memory from O(n²) to O(nnz), the number of non-zero entries.
Efficiency: Operations skip zeros, speeding computations.
ML Connection
- Graphs: Adjacency matrices are sparse for large networks.
- NLP: TF-IDF matrices are sparse for document collections.
::: info
Sparse matrices are like efficient packing: only carry what's needed, leaving empty space behind for faster travel in ML computations.
:::
Example
- A 1000x1000 float64 matrix with 1% non-zeros: dense storage is 8 MB; COO sparse storage is roughly 160 KB (10,000 values plus two 32-bit index arrays), a ~50x saving.
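That estimate can be checked directly with scipy.sparse; a quick sketch, assuming float64 entries and 1% density (`sparse.random` places the non-zeros uniformly at random):

```python
import numpy as np
from scipy import sparse

n = 1000
# Random 1000x1000 matrix with ~1% non-zero entries, in COO format
A = sparse.random(n, n, density=0.01, format="coo", random_state=0)

dense_bytes = n * n * 8                          # float64: 8 bytes per entry
# COO stores one value plus one row and one column index per non-zero
coo_bytes = A.data.nbytes + A.row.nbytes + A.col.nbytes

print(f"Dense: {dense_bytes / 1e6:.1f} MB")      # 8.0 MB
print(f"COO:   {coo_bytes / 1e3:.0f} KB")
```

The exact sparse figure depends on the index width scipy picks, but it stays far below the dense footprint.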
2. Sparse Matrix Representations
Coordinate List (COO)
Stores triples (row, col, value), one per non-zero.
- Simple to construct; new entries are easy to append.
- Inefficient for arithmetic (entries need sorting and merging).
Compressed Sparse Row (CSR)
- Values: array of the non-zeros, in row-major order.
- Column indices: array giving each value's column.
- Row pointers: array of start offsets, one per row (plus a final end offset).
- Efficient for row access and matrix-vector multiplication.
Compressed Sparse Column (CSC)
Column-major analogue of CSR.
- Efficient column access.
Properties
- COO is flexible for construction.
- CSR/CSC are suited to arithmetic operations.
ML Application
- CSR for graph adjacency matrices in GNNs.
Example: Matrix [[1,0,2],[0,0,0],[3,4,0]], COO: (0,0,1), (0,2,2), (2,0,3), (2,1,4).
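This example can be verified with scipy.sparse; converting the COO form to CSR exposes the three CSR arrays described above:

```python
import numpy as np
from scipy.sparse import coo_matrix

# The matrix [[1, 0, 2], [0, 0, 0], [3, 4, 0]] from the example
row = np.array([0, 0, 2, 2])
col = np.array([0, 2, 0, 1])
val = np.array([1, 2, 3, 4])
A = coo_matrix((val, (row, col)), shape=(3, 3))

csr = A.tocsr()
print("values: ", csr.data)      # [1 2 3 4]
print("indices:", csr.indices)   # column of each stored value
print("indptr: ", csr.indptr)    # row i occupies slots indptr[i]:indptr[i+1]
```

Note how the all-zero middle row costs nothing in values but still gets an indptr entry (indptr[1] == indptr[2]).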
3. Operations on Sparse Matrices
Addition: merge the entry lists in COO; combine matching rows in CSR/CSC.
Multiplication: sparse-sparse products accumulate contributions only where both factors are non-zero.
- CSR matrix-vector: row-wise dot products over stored entries.
Transposition: swap the row and column arrays in COO; a CSR matrix reinterpreted as CSC is its transpose.
Efficiency: work scales with the number of non-zeros rather than with n².
Mathematical Formulation
For sparse A, B with C = AB: c_ij = Σ_k a_ik · b_kj, where only terms with non-zero a_ik (and b_kj) contribute.
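A minimal sketch of that formula in plain Python, using dicts keyed by (row, col) as a stand-in sparse format (`sparse_matmul` is an illustrative helper, not a library function):

```python
def sparse_matmul(A, B):
    """Multiply sparse matrices given as {(i, k): value} dicts.

    Iterates only over non-zero a_ik, pairing each with the non-zero
    entries b_kj of row k in B, per c_ij = sum_k a_ik * b_kj.
    """
    # Index B's non-zeros by row so each a_ik finds its partners quickly
    B_rows = {}
    for (k, j), b in B.items():
        B_rows.setdefault(k, []).append((j, b))

    C = {}
    for (i, k), a in A.items():
        for j, b in B_rows.get(k, []):
            C[(i, j)] = C.get((i, j), 0) + a * b
    return C

A = {(0, 0): 1, (0, 2): 2, (2, 0): 3, (2, 1): 4}
I = {(0, 0): 1, (1, 1): 1, (2, 2): 1}   # 3x3 identity
print(sparse_matmul(A, I))              # recovers A's entries
```

Real libraries use the same idea over CSR/CSC arrays instead of dicts, but the loop structure is identical.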
ML Insight
- Efficient multiplication in sparse neural nets.
4. Sparse Linear Algebra Algorithms
Matrix-Vector Multiplication (SpMV): in CSR, loop over the rows and accumulate products of stored values with the matching vector entries.
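That loop, sketched in plain Python with the CSR arrays named as in Section 2:

```python
def spmv(values, col_idx, row_ptr, v):
    """CSR matrix-vector product: loop rows, accumulate non-zero terms."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for j in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[j] * v[col_idx[j]]
    return y

# Matrix [[1, 0, 2], [0, 0, 3], [4, 5, 0]] in CSR form
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 2, 0, 1]
row_ptr = [0, 2, 3, 5]
print(spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))  # [7.0, 9.0, 14.0]
```

Each zero entry of the matrix is never touched, so the cost is O(nnz) rather than O(n²).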
Solvers: Conjugate Gradient (CG) and GMRES for sparse linear systems.
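For example, SciPy's conjugate-gradient solver works directly on a CSR matrix; the tridiagonal, Laplacian-style test system below is an illustrative assumption:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# Sparse symmetric positive-definite system (1-D Laplacian stencil)
n = 100
A = diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = cg(A, b)   # info == 0 signals convergence
print("converged:", info == 0)
print("residual:", np.linalg.norm(A @ x - b))
```

Each CG iteration needs only one SpMV plus a few vector operations, which is why it suits large sparse systems.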
Eigenproblems: Lanczos iteration for sparse symmetric matrices.
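SciPy exposes a Lanczos-type (ARPACK) routine as scipy.sparse.linalg.eigsh; the tridiagonal test matrix below is chosen because its extreme eigenvalues are known in closed form:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh

# Sparse symmetric tridiagonal matrix; eigenvalues are 2 - 2*cos(k*pi/(n+1))
n = 200
A = diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csr")

w, _ = eigsh(A, k=1, which="LA")        # largest algebraic eigenvalue
exact = 2 + 2 * np.cos(np.pi / (n + 1))
print(w[0], exact)
```

Like CG, the iteration touches A only through SpMV, so it scales with nnz.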
Convergence
Sparse solvers exploit structure: each iteration costs O(nnz) rather than O(n²).
ML Application
- SpMV in GCNs for node embeddings.
5. Sparse Matrix Formats in Libraries
Python: scipy.sparse (COO, CSR, CSC).
Rust: the sprs or nalgebra-sparse crates.
Conversion: coo.tocsr().
ML Connection
- TensorFlow/PyTorch support sparse tensors for efficient ops.
6. Handling Sparse Data in ML
Imputation: decide whether stored zeros mean "missing" or "truly zero" before filling them.
Regularization: L1 penalties drive weights to exact zeros, producing sparse models.
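A quick sketch of L1-induced sparsity with scikit-learn's Lasso; the synthetic data and alpha=0.1 are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
true_w = np.zeros(50)
true_w[:5] = [3.0, -2.0, 1.5, 4.0, -1.0]      # only 5 informative features
y = X @ true_w + 0.1 * rng.standard_normal(200)

# The L1 penalty zeros out most of the 50 coefficients
model = Lasso(alpha=0.1).fit(X, y)
kept = np.sum(model.coef_ != 0)
print(f"non-zero weights: {kept} of {model.coef_.size}")
```

The fitted coefficient vector is itself sparse, which shrinks both the model and its inference cost.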
Compression: CSR for storage.
In ML: Sparse data in recommender systems (user-item).
7. Applications in Machine Learning
- Graph ML: Adjacency sparse for GNNs.
- Recommendation: User-item matrices sparse.
- NLP: Word embeddings, TF-IDF sparse.
- Computer Vision: Sparse convolutions for efficiency.
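As a concrete NLP case, scikit-learn's TfidfVectorizer returns its document-term matrix as a SciPy CSR matrix; the three toy documents below are made up for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "sparse matrices save memory",
    "graph neural networks use sparse adjacency matrices",
    "recommendation systems use sparse user item matrices",
]
X = TfidfVectorizer().fit_transform(docs)      # sparse document-term matrix

print("format:", X.format)                     # csr
print("shape:", X.shape, "non-zeros:", X.nnz)
print(f"density: {X.nnz / (X.shape[0] * X.shape[1]):.2f}")
```

On a real corpus with tens of thousands of vocabulary terms, the density drops far below 1%, and the CSR form is what makes downstream training feasible.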
Challenges
- Sparse operations parallelize less well on GPUs than dense ones.
- Format overhead: index arrays add cost per stored entry.
8. Numerical Sparse Computations
Create sparse matrices and perform basic operations.
::: code-group
```python
from scipy.sparse import coo_matrix, csr_matrix
from sklearn.decomposition import SparsePCA
import numpy as np

# COO creation
row = np.array([0, 0, 1, 2, 2])
col = np.array([0, 2, 2, 0, 1])
value = np.array([1, 2, 3, 4, 5])
coo = coo_matrix((value, (row, col)), shape=(3, 3))
print("COO:", coo.toarray())

# CSR for multiplication
csr = coo.tocsr()
v = np.array([1, 2, 3])
result = csr @ v
print("SpMV:", result)

# Sparse addition
csr2 = csr_matrix([[0, 0, 0], [0, 0, 1], [0, 0, 0]])
sum_sparse = csr + csr2
print("Sparse sum:", sum_sparse.toarray())

# ML: Sparse PCA
data = np.random.rand(100, 50)
spca = SparsePCA(n_components=5)
spca.fit(data)
print("Sparse components shape:", spca.components_.shape)
```

```rust
fn main() {
    // Sparse representation (simplified CSR)
    let values = vec![1.0, 2.0, 3.0, 4.0, 5.0];
    let cols = vec![0, 2, 2, 0, 1];
    let row_ptr = vec![0, 2, 3, 5];
    println!("CSR values: {:?}", values);
    println!("Cols: {:?}", cols);
    println!("Row ptr: {:?}", row_ptr);

    // SpMV: loop rows, accumulate value * vector-entry products
    let v = [1.0, 2.0, 3.0];
    let mut result = [0.0, 0.0, 0.0];
    for i in 0..3 {
        for j in row_ptr[i]..row_ptr[i + 1] {
            result[i] += values[j] * v[cols[j]];
        }
    }
    println!("SpMV: {:?}", result);

    // Sparse addition and sparse PCA omitted for brevity;
    // use the nalgebra-sparse or sprs crates for real sparse work.
}
```
:::
Creates sparse matrices, performs operations.
9. Symbolic Sparse with SymPy
Define sparse matrices symbolically.
::: code-group
```python
from sympy import SparseMatrix, Matrix, symbols

A = SparseMatrix(3, 3, {(0, 0): 1, (0, 2): 2, (1, 2): 3, (2, 0): 4, (2, 1): 5})
print("Sparse A:", A)

v = Matrix(symbols('v1 v2 v3'))
result = A * v
print("Symbolic SpMV:", result)
```

```rust
fn main() {
    // No common symbolic-math crate is assumed here; print the structure.
    println!("Sparse A: non-zeros at (0,0)=1, (0,2)=2, (1,2)=3, (2,0)=4, (2,1)=5");
}
```
:::
10. Challenges in ML Sparse Computation
- GPU Sparsity: Less parallel than dense.
- Dynamic Sparsity: Changing patterns in training.
11. Key ML Takeaways
- Sparse representations save resources: COO, CSR.
- Operations efficient: Skip zeros.
- Formats for tasks: CSR for SpMV.
- ML relies on sparse: Graphs, recs.
- Code handles sparse: Practical ML.
Sparse computation enables scalable AI.
12. Summary
We explored sparse matrix representations, operations, and algorithms, with ML applications. Worked examples and Python/Rust code bridge theory to practice. Essential for efficient ML.
Further Reading
- Davis, Direct Methods for Sparse Linear Systems.
- Saad, Iterative Methods for Sparse Linear Systems.
- Hastie, Elements of Statistical Learning (Ch. 15).
- Rust: the nalgebra-sparse and sprs crates.