Martingales & Random Walks
Martingales and random walks are fundamental stochastic processes that model sequential random phenomena, widely used in probability theory and machine learning (ML). A martingale represents a sequence where the expected future value, given the past, equals the current value, modeling “fair” processes. Random walks describe sequences of steps with random increments, often used in algorithms and simulations. In ML, these concepts underpin reinforcement learning, stochastic optimization, and convergence analysis of algorithms like stochastic gradient descent (SGD).
This lecture in the “Foundations for AI/ML” series (misc-math cluster) builds on concentration inequalities and information geometry, exploring martingales, random walks, their properties, convergence theorems, and ML applications. We’ll provide intuitive explanations, mathematical derivations, and practical implementations in Python and Rust, offering tools to analyze stochastic processes in AI.
1. Intuition Behind Martingales and Random Walks
Martingales: Model processes where the expected next step, given all prior information, is the current state, like a fair game where you neither gain nor lose on average.
Random Walks: Model a path of random steps, like a drunkard wandering left or right. They generalize to many ML scenarios, such as exploration in reinforcement learning.
ML Connection
- Reinforcement Learning: Martingale arguments analyze the convergence of value-function estimates in MDPs.
- Stochastic Optimization: Random walks describe SGD trajectories.
- Algorithm Analysis: Martingale bounds for convergence.
::: info Martingales are like a gambler’s fair bet, keeping expectations steady; random walks are like a wanderer’s unpredictable path, guiding ML algorithms through randomness. :::
Example
- Martingale: Cumulative winnings in a fair game, E[X_{t+1}|X_1,…,X_t] = X_t.
- Random Walk: SGD parameter updates with random gradients.
2. Definitions and Properties
Martingales
A sequence {X_t} is a martingale with respect to a filtration {F_t} if:
- E[|X_t|] < ∞ (finite expectation).
- X_t is F_t-measurable (adapted).
- E[X_{t+1}|F_t] = X_t.
Submartingale: E[X_{t+1}|F_t] ≥ X_t.
Supermartingale: E[X_{t+1}|F_t] ≤ X_t.
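As a quick numerical sanity check (an illustrative sketch, not part of the formal definition), the fair ±1 walk satisfies the martingale property: conditioning on the realized path and averaging many possible next values recovers the current value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fair +/-1 walk up to time t; condition on the realized path.
t = 50
path = np.cumsum(rng.choice([-1, 1], size=t))
x_t = path[-1]

# Monte Carlo estimate of E[X_{t+1} | F_t]: average over many possible next steps.
next_vals = x_t + rng.choice([-1, 1], size=100_000)
print("X_t:", x_t, "E[X_{t+1}|F_t] (est.):", next_vals.mean())  # estimate is close to X_t
```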
Random Walks
A random walk is S_t = sum_{i=1}^t Z_i, where the Z_i are i.i.d. increments (e.g., Z_i = ±1 with p = 0.5).
Properties:
- Martingale if E[Z_i] = 0.
- Variance: Var(S_t) = t Var(Z_i) for i.i.d.
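A short simulation (a sketch, using NumPy) confirms E[S_t] = 0 and Var(S_t) = t Var(Z_i) for i.i.d. ±1 steps, where Var(Z_i) = 1:

```python
import numpy as np

rng = np.random.default_rng(1)

# 50,000 independent walks of t = 200 fair +/-1 steps each.
t = 200
finals = rng.choice([-1, 1], size=(50_000, t)).sum(axis=1)

print("Empirical E[S_t]:", finals.mean())    # close to 0
print("Empirical Var(S_t):", finals.var())   # close to t = 200
```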
ML Insight
- Martingales model fair processes in RL; random walks model SGD noise.
3. Key Theorems and Bounds
Martingale Convergence Theorem
If {X_t} is a martingale with sup_t E[|X_t|] < ∞, then X_t converges almost surely to a limit X with E[|X|] < ∞.
Optional Stopping Theorem
For a martingale {X_t} and a bounded stopping time τ, E[X_τ] = E[X_0].
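The theorem can be checked by simulation (an illustrative sketch): stop a fair ±1 walk when it first hits ±10, or at a hard cap of 200 steps, so the stopping time is bounded; the mean stopped value should match E[X_0] = 0.

```python
import numpy as np

rng = np.random.default_rng(2)

def stopped_value(K=10, T=200):
    """Fair +/-1 walk, stopped on hitting +/-K or at time T (a bounded stopping time)."""
    steps = rng.choice([-1, 1], size=T)
    s = 0
    for step in steps:
        s += step
        if abs(s) >= K:
            break
    return int(s)

vals = np.array([stopped_value() for _ in range(20_000)])
print("E[X_tau] (est.):", vals.mean())  # close to E[X_0] = 0
```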
Azuma-Hoeffding Inequality
For a martingale {X_t} whose differences Z_t = X_t − X_{t−1} satisfy |Z_t| ≤ c_t:
[ P(|X_n - X_0| \geq a) \leq 2 \exp\left(-\frac{a^2}{2\sum_{t=1}^{n} c_t^2}\right) ]
Random Walk Behavior
- Simple symmetric walk (Z_i = ±1): E[S_t] = 0, Var(S_t) = t.
- Recurrence: simple symmetric walks in 1D and 2D return to the origin with probability 1 (recurrent); in 3D and higher they are transient (Pólya’s theorem).
ML Application
- Azuma-Hoeffding for bounding SGD convergence.
4. Martingales in Algorithm Analysis
Martingales bound deviations in stochastic algorithms.
Example: SGD updates θ_{t+1} = θ_t − η ∇L(θ_t, ξ_t), where ξ_t is a random data sample. The cumulative gradient noise, sum_{s ≤ t} (∇L(θ_s, ξ_s) − E[∇L(θ_s, ξ_s) | F_s]), forms a martingale.
Doob’s Decomposition: any adapted integrable process splits uniquely into a martingale plus a predictable part; for a submartingale the predictable part is increasing.
In ML: used to analyze the convergence of online learning algorithms.
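To make Doob’s decomposition concrete, consider the submartingale S_t² for a fair ±1 walk: since E[S_{t+1}² | F_t] = S_t² + 1, it splits as S_t² = M_t + A_t with predictable increasing part A_t = t and martingale part M_t = S_t² − t. A small simulation (a sketch) checks E[M_t] = E[M_0] = 0:

```python
import numpy as np

rng = np.random.default_rng(3)

t = 400
n_paths = 20_000
finals = rng.choice([-1, 1], size=(n_paths, t)).sum(axis=1)  # S_t for each path
m_final = finals**2 - t  # martingale part M_t = S_t^2 - t

print("E[M_t] (est.):", m_final.mean())  # small relative to E[S_t^2] = 400
```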
5. Random Walks in ML Algorithms
Random walks model:
- SGD: Parameter updates as noisy steps.
- Exploration: RL agents navigating state spaces.
- MCMC: Sampling via random walks (e.g., Metropolis).
ML Connection
- Random-walk mixing rates inform MCMC efficiency.
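As a minimal sketch of random-walk MCMC (function name and step size are illustrative), a random-walk Metropolis sampler targeting a standard normal:

```python
import numpy as np

rng = np.random.default_rng(4)

def metropolis(n_samples, step=1.0):
    """Random-walk Metropolis for a standard normal target."""
    log_p = lambda z: -0.5 * z * z  # log density up to a constant
    x = 0.0
    out = np.empty(n_samples)
    for i in range(n_samples):
        prop = x + step * rng.normal()  # random-walk proposal
        # Accept with probability min(1, p(prop)/p(x))
        if np.log(rng.uniform()) < log_p(prop) - log_p(x):
            x = prop
        out[i] = x
    return out

samples = metropolis(50_000)
print("mean:", samples.mean(), "std:", samples.std())  # roughly 0 and 1
```

The sampler’s trajectory is itself a (Markov) random walk whose stationary distribution is the target; slow mixing of that walk is exactly what limits MCMC efficiency.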
6. Applications in Machine Learning
- Reinforcement Learning: Martingales for value function convergence in MDPs.
- Stochastic Optimization: Random walks model SGD trajectories.
- MCMC Sampling: Random walks for posterior sampling in Bayesian ML.
- Bandits: Martingale bounds for regret analysis.
Challenges
- Non-i.i.d. data breaks martingale assumptions.
- High-dimensional walks are complex to analyze.
7. Numerical Simulations of Martingales and Random Walks
We simulate a random walk and a martingale, run SGD as a noisy walk, and compare the Azuma-Hoeffding bound with an empirical tail probability.
::: code-group

```python
import numpy as np
import matplotlib.pyplot as plt

# Random walk simulation
def random_walk(n_steps):
    steps = np.random.choice([-1, 1], size=n_steps)
    return np.cumsum(steps)

n_steps = 1000
walk = random_walk(n_steps)
plt.plot(walk)
plt.title("Random Walk")
plt.show()

# Martingale (cumulative sum of zero-mean increments)
def martingale(n_steps):
    increments = np.random.normal(0, 1, n_steps)
    return np.cumsum(increments)

mart = martingale(n_steps)
plt.plot(mart)
plt.title("Martingale")
plt.show()

# Azuma-Hoeffding: P(|X_n - X_0| >= a) <= 2 exp(-a^2 / (2 n c^2)) for |Z_t| <= c
def azuma_bound(a, n, c=1):
    return 2 * np.exp(-a**2 / (2 * n * c**2))

# Compare against the +/-1 walk, whose increments are bounded by c = 1
a = 40
bound = azuma_bound(a, n_steps)
finals = np.array([random_walk(n_steps)[-1] for _ in range(2000)])
emp_prob = np.mean(np.abs(finals) >= a)
print(f"Azuma bound P(|S_n| >= {a}):", bound, "Empirical:", emp_prob)

# ML: SGD as random walk
def sgd_simulation(theta_init, n_steps, eta=0.01):
    theta = theta_init
    trajectory = [theta]
    for _ in range(n_steps):
        grad = np.random.normal(0, 1)  # Noisy gradient
        theta -= eta * grad
        trajectory.append(theta)
    return np.array(trajectory)

theta_init = 0
sgd_traj = sgd_simulation(theta_init, n_steps)
plt.plot(sgd_traj)
plt.title("SGD as Random Walk")
plt.show()
```

```rust
use rand::Rng;
use rand_distr::{Distribution, Normal};
use plotters::prelude::*;

fn random_walk(n_steps: usize) -> Vec<f64> {
    let mut rng = rand::thread_rng();
    let mut walk = vec![0.0];
    for _ in 0..n_steps {
        let step = if rng.gen_bool(0.5) { 1.0 } else { -1.0 };
        walk.push(walk.last().unwrap() + step);
    }
    walk
}

fn martingale(n_steps: usize) -> Vec<f64> {
    let mut rng = rand::thread_rng();
    let normal = Normal::new(0.0, 1.0).unwrap();
    let mut mart = vec![0.0];
    for _ in 0..n_steps {
        let increment = normal.sample(&mut rng);
        mart.push(mart.last().unwrap() + increment);
    }
    mart
}

// Azuma-Hoeffding: P(|X_n - X_0| >= a) <= 2 exp(-a^2 / (2 n c^2)) for |Z_t| <= c
fn azuma_bound(a: f64, n: f64, c: f64) -> f64 {
    2.0 * (-a.powi(2) / (2.0 * n * c.powi(2))).exp()
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let n_steps = 1000;
    let walk = random_walk(n_steps);
    let root = BitMapBackend::new("random_walk.png", (800, 600)).into_drawing_area();
    root.fill(&WHITE)?;
    let mut chart = ChartBuilder::on(&root)
        .caption("Random Walk", ("sans-serif", 50))
        .build_cartesian_2d(0f64..n_steps as f64, -50f64..50f64)?;
    chart.draw_series(LineSeries::new(
        (0..=n_steps).map(|i| (i as f64, walk[i])),
        &BLUE,
    ))?;

    let mart = martingale(n_steps);
    let root_m = BitMapBackend::new("martingale.png", (800, 600)).into_drawing_area();
    root_m.fill(&WHITE)?;
    let mut chart_m = ChartBuilder::on(&root_m)
        .caption("Martingale", ("sans-serif", 50))
        .build_cartesian_2d(0f64..n_steps as f64, -50f64..50f64)?;
    chart_m.draw_series(LineSeries::new(
        (0..=n_steps).map(|i| (i as f64, mart[i])),
        &BLUE,
    ))?;

    // Azuma bound vs. empirical tail over many +/-1 walks (increments bounded by c = 1)
    let a = 40.0;
    let bound = azuma_bound(a, n_steps as f64, 1.0);
    let hits = (0..2000)
        .filter(|_| random_walk(n_steps).last().unwrap().abs() >= a)
        .count();
    let emp_prob = hits as f64 / 2000.0;
    println!("Azuma bound P(|S_n| >= {}): {} Empirical: {}", a, bound, emp_prob);

    // SGD as random walk
    let mut rng = rand::thread_rng();
    let normal = Normal::new(0.0, 1.0).unwrap();
    let mut theta = 0.0;
    let eta = 0.01;
    let mut traj = vec![theta];
    for _ in 0..n_steps {
        let grad = normal.sample(&mut rng);
        theta -= eta * grad;
        traj.push(theta);
    }
    let root_sgd = BitMapBackend::new("sgd_walk.png", (800, 600)).into_drawing_area();
    root_sgd.fill(&WHITE)?;
    let mut chart_sgd = ChartBuilder::on(&root_sgd)
        .caption("SGD as Random Walk", ("sans-serif", 50))
        .build_cartesian_2d(0f64..n_steps as f64, -5f64..5f64)?;
    chart_sgd.draw_series(LineSeries::new(
        (0..=n_steps).map(|i| (i as f64, traj[i])),
        &BLUE,
    ))?;

    Ok(())
}
```
:::
The code simulates a random walk, a martingale, and SGD as a noisy walk, and checks the Azuma-Hoeffding bound against an empirical tail probability.
8. Symbolic Derivations with SymPy
We state the martingale property and express the Azuma-Hoeffding bound symbolically. (Note that SymPy’s E is Euler’s constant, not an expectation operator, so the conditional-expectation identity is stated rather than computed.)
::: code-group
```python
from sympy import symbols, exp, limit, oo

# Martingale property (stated, not computed): E[X_{t+1} | F_t] = X_t
print("Martingale: E[X_{t+1}|F_t] = X_t")

# Azuma-Hoeffding bound as a symbolic expression (for |Z_t| <= c):
a, n, c = symbols('a n c', positive=True)
azuma = 2 * exp(-a**2 / (2 * n * c**2))
print("Azuma bound:", azuma)
print("Limit as a -> oo:", limit(azuma, a, oo))  # the bound vanishes for large deviations
```

```rust
fn main() {
    println!("Martingale: E[X_{{t+1}}|F_t] = X_t");
    println!("Azuma bound: 2 exp(-a^2 / (2 n c^2))");
}
```
:::
9. Challenges in ML Applications
- Non-martingale processes: non-i.i.d. data, common in ML, breaks the martingale assumptions.
- High-dimensional walks: convergence analysis becomes complex.
- Stopping times: hard to define and verify in RL.
10. Key ML Takeaways
- Martingales model fairness: the expected next value, given the past, equals the current one.
- Random walks model exploration: SGD noise and RL state-space traversal.
- Azuma-Hoeffding bounds deviations: a core tool in algorithm analysis.
- Convergence theorems guide ML guarantees: almost-sure limits for stochastic iterates.
- Code simulates processes: theory made practical.
Martingales and random walks sharpen the analysis of stochastic ML algorithms.
11. Summary
We explored martingales and random walks: their definitions and properties, the key theorems (martingale convergence, optional stopping, Azuma-Hoeffding), and their applications in reinforcement learning, stochastic optimization, and MCMC. Worked examples and Python/Rust code bridge theory to practice and strengthen stochastic analysis in ML.
Further Reading
- Williams, Probability with Martingales.
- Grimmett, Stirzaker, Probability and Random Processes.
- Sutton, Barto, Reinforcement Learning (Ch. 5).
- Rust: ‘plotters’ for visualization, ‘rand_distr’ for sampling.