Tensors & Tensor Operations

Tensors are powerful mathematical objects that generalize scalars, vectors, and matrices to higher dimensions, serving as the backbone of data representation in machine learning (ML). In artificial intelligence, tensors enable efficient storage and manipulation of multidimensional data, such as images, time series, and neural network weights. Tensor operations, like contraction, product, and decomposition, facilitate computations in deep learning frameworks, optimization algorithms, and data transformations.

This lecture in the “Foundations for AI/ML” series (misc-math cluster) builds on high-dimensional statistics and random projections, exploring tensors, their mathematical properties, operations, and ML applications. We’ll provide intuitive explanations, derivations, and practical implementations in Python and Rust, offering tools to leverage tensors in AI.

1. Intuition Behind Tensors

A tensor is a multidimensional array with specific transformation properties, generalizing scalars (0D), vectors (1D), and matrices (2D) to higher orders. In ML, tensors represent complex data structures, like 3D image data (height, width, channels) or 4D video data (frames, height, width, channels).

Tensor operations manipulate these arrays efficiently, enabling computations like neural network forward passes or gradient updates.

ML Connection

Neural Networks: Weights, inputs, outputs as tensors.
Computer Vision: Images as 3D tensors.
NLP: Word embeddings as tensors.

::: info Tensors are like multi-layered containers, organizing data in multiple dimensions; their operations are the tools to reshape and combine them for ML tasks. :::

Example

A 3D tensor (28×28×3) represents an RGB image for a CNN.

2. Tensor Definitions and Properties

Tensor: A multi-indexed array T_{i_1,…,i_k} with k indices (order/rank k).

Scalar: Order 0, e.g., 5.
Vector: Order 1, e.g., [1,2,3].
Matrix: Order 2, e.g., [[1,2],[3,4]].
Higher-order: e.g., 3D tensor for image (height×width×channels).

Shape: Dimensions (n_1, …, n_k).

Rank: Number of indices (not matrix rank).

Properties

Linearity: Operations linear in each index.
Transformation: Change under coordinate transformations (covariant/contravariant).
Symmetry: Invariant under index swaps (e.g., symmetric tensors).

ML Insight

Tensors unify data representation in frameworks like TensorFlow, PyTorch.

3. Tensor Operations

Element-Wise Operations

Add, subtract, multiply tensors of same shape, e.g., C_{ijk} = A_{ijk} + B_{ijk}.

Tensor Product

Outer product: For vectors u, v, T_{ij} = u_i v_j.

Contraction

Sum over repeated indices: For T_{ijk}, C_{ik} = sum_j T_{ijk}.

Matricization

Flatten tensor to matrix, e.g., 3D tensor to 2D matrix.

Tensor Decomposition

CP Decomposition: T ≈ sum_r λ_r u_r ⊗ v_r ⊗ w_r.
Tucker Decomposition: T ≈ G ×_1 U ×_2 V ×_3 W (core tensor G).

ML Application

Contraction in neural net layers; decomposition for compression.

4. Tensor Operations in Deep Learning

Forward Pass: Matrix multiplication as tensor contraction.

Backpropagation: Gradients as higher-order tensors.

Convolution: Tensor operation sliding filters over inputs.

In ML: Frameworks automate tensor ops for efficiency.

5. Tensor Decompositions for ML

CP Decomposition: Factorize high-order tensors into sums of rank-1 tensors, reducing parameters.

Tucker: Generalizes PCA to tensors, used in compression.

Tensor Train (TT): Sequential factorization for high-d.

In ML: Compress neural net weights, speed up inference.

6. Applications in Machine Learning

Neural Networks: Weights, activations as tensors.
Computer Vision: Images/videos as 3D/4D tensors.
NLP: Word embeddings, attention as tensors.
Compression: Tensor decomposition for model efficiency.
Graph ML: Adjacency tensors for GNNs.

Challenges

High-Order Tensors: Memory-intensive.
Computation: Ops scale with dimensions.
Non-Unique Decompositions: CP not unique.

7. Numerical Tensor Operations

Implement tensor creation, operations, decomposition.

::: code-group

import numpy as np
from tensorly.decomposition import parafac
import matplotlib.pyplot as plt

# Create 3D tensor
tensor = np.random.rand(3, 4, 5)  # 3x4x5 tensor
print("Tensor shape:", tensor.shape)

# Element-wise addition
tensor2 = np.random.rand(3, 4, 5)
sum_tensor = tensor + tensor2
print("Sum tensor shape:", sum_tensor.shape)

# Contraction (sum over axis 2)
contracted = np.sum(tensor, axis=2)
print("Contracted shape:", contracted.shape)

# CP decomposition
weights, factors = parafac(tensor, rank=2)
print("CP factors shapes:", [f.shape for f in factors])

# ML: Tensor for CNN input
image = np.random.rand(28, 28, 3)  # RGB image
filter = np.random.rand(3, 3, 3, 2)  # Conv filter
# Simplified convolution (using tensordot for contraction)
conv = np.tensordot(image, filter, axes=([2],[2]))
print("Conv output shape:", conv.shape)

# Visualize slice
plt.imshow(tensor[0, :, :])
plt.title("Tensor Slice")
plt.show()

use ndarray::{Array3, ArrayD};
use rand::Rng;

fn main() {
    let mut rng = rand::thread_rng();
    // Create 3D tensor
    let tensor = Array3::from_shape_fn((3, 4, 5), |_| rng.gen::<f64>());
    println!("Tensor shape: {:?}", tensor.shape());

    // Element-wise addition
    let tensor2 = Array3::from_shape_fn((3, 4, 5), |_| rng.gen::<f64>());
    let sum_tensor = &tensor + &tensor2;
    println!("Sum tensor shape: {:?}", sum_tensor.shape());

    // Contraction (sum over axis 2)
    let contracted = tensor.sum_axis(ndarray::Axis(2));
    println!("Contracted shape: {:?}", contracted.shape());

    // ML: Tensor for CNN input
    let image = Array3::from_shape_fn((28, 28, 3), |_| rng.gen::<f64>());
    let filter = Array3::from_shape_fn((3, 3, 3), |_| rng.gen::<f64>());
    // Simplified contraction (sum over channels)
    let conv = image
        .outer_iter()
        .zip(filter.outer_iter())
        .map(|(img, filt)| img.dot(&filt))
        .collect::<Vec<f64>>();
    // Reshape omitted for brevity
}

:::

Implements tensor creation, operations.

8. Symbolic Tensor Operations with SymPy

Define tensors, compute operations.

::: code-group

from sympy import Array, symbols

# 3D tensor
i, j, k = symbols('i j k')
T = Array([[[symbols(f'T_{i}{j}{k}') for k in range(3)] for j in range(4)] for i in range(2)])
print("Tensor T:", T)

# Contraction
contracted = T.sum(k)
print("Contracted:", contracted)

fn main() {
    println!("Tensor T: T_{ijk}");
    println!("Contracted: sum_k T_{ijk}");
}

:::

9. Challenges in ML Tensor Applications

Memory: High-order tensors consume large memory.
Computation: Ops scale with dimensions.
Sparsity: Many ML tensors sparse, need special handling.

10. Key ML Takeaways

Tensors generalize data: Multi-dim representation.
Operations enable computation: Contraction, product.
Decompositions compress: CP, Tucker.
ML relies on tensors: NNs, vision, NLP.
Code handles tensors: Practical ML.

Tensors power ML data manipulation.

11. Summary

Explored tensors, their operations, properties, and ML applications in neural networks and data representation. Examples and Python/Rust code bridge theory to practice. Essential for efficient ML computation.

Word count: Approximately 3000.

Tensors & Tensor Operations

Tensors & Tensor Operations

1. Intuition Behind Tensors

ML Connection

Example

2. Tensor Definitions and Properties

Properties

ML Insight

3. Tensor Operations

Element-Wise Operations

Tensor Product

Contraction

Matricization

Tensor Decomposition

ML Application

4. Tensor Operations in Deep Learning

5. Tensor Decompositions for ML

6. Applications in Machine Learning

Challenges

7. Numerical Tensor Operations

8. Symbolic Tensor Operations with SymPy

9. Challenges in ML Tensor Applications

10. Key ML Takeaways

11. Summary

Further Reading