Probability Meets Calculus - Continuous Distributions
Probability Meets Calculus - Continuous Distributions
Section titled “Probability Meets Calculus - Continuous Distributions”Probability theory and calculus converge in continuous distributions, where densities replace discrete masses, and integrals compute probabilities over intervals. In artificial intelligence and machine learning, continuous distributions model real-valued data, uncertainties in predictions, and parameters in Bayesian inference. From the uniform distribution’s simplicity to the Gaussian’s ubiquity in central limit theorem applications, these tools enable likelihood calculations, expectation estimations, and variance analyses crucial for algorithms like Gaussian processes and variational autoencoders.
This lecture bridges integration basics with probability, exploring continuous random variables, probability density functions (PDFs), cumulative distribution functions (CDFs), moments like mean and variance, key distributions, and their calculus-based manipulations. We’ll integrate conceptual explanations with ML relevance, illustrated by examples and implementations in Python and Rust, fostering a solid grasp of probabilistic modeling in AI.
1. Intuition for Continuous Random Variables
Section titled “1. Intuition for Continuous Random Variables”Discrete variables take countable values with probabilities; continuous span intervals with densities—probability over point is zero, but intervals have mass.
PDF f(x): Non-negative, ∫_{-∞}^∞ f(x) dx =1, P(a≤X≤b) = ∫_a^b f(x) dx.
ML Connection
Section titled “ML Connection”- Model continuous features (e.g., heights in regression).
- In GANs: Generate from noise distributions.
::: info Continuous probs measure areas under density curves, like fluid volume from depth over length. :::
Example
Section titled “Example”- Uniform [0,1]: f(x)=1 for x in [0,1], P(0.2≤X≤0.5)=0.3.
2. Probability Density Functions (PDFs)
Section titled “2. Probability Density Functions (PDFs)”PDF properties: f(x)≥0, integrates to 1.
Not probability at x, but relative likelihood.
Formal Definition
Section titled “Formal Definition”X continuous if CDF F(x)=P(X≤x) absolutely continuous, F’(x)=f(x).
ML Insight
Section titled “ML Insight”- Kernel density estimation: Approx f from data.
Example: Exponential λ e^{-λx}, x≥0—models waiting times.
3. Cumulative Distribution Functions (CDFs)
Section titled “3. Cumulative Distribution Functions (CDFs)”F(x)= ∫{-∞}^x f(t) dt, monotone increasing, lim{x→-∞} F=0, →∞ F=1.
P(a<X≤b)=F(b)-F(a).
Inverse: Quantile function for sampling.
FTC Link
Section titled “FTC Link”f(x)=F’(x).
ML Application
Section titled “ML Application”- Empirical CDF for goodness-of-fit tests.
Example: Standard normal Φ(x)= ∫_{-∞}^x (1/sqrt(2π)) e^{-t^2/2} dt.
4. Expectations and Moments
Section titled “4. Expectations and Moments”E[X]= ∫ x f(x) dx.
Variance Var(X)= E[(X-μ)^2]= ∫ (x-μ)^2 f(x) dx = E[X^2] - μ^2.
Higher moments: Skewness, kurtosis.
Properties
Section titled “Properties”- Linearity E[aX+bY]=aE[X]+bE[Y].
- For indep, Var(X+Y)=Var(X)+Var(Y).
ML Insight
Section titled “ML Insight”- Loss functions minimize E[error].
Example: Uniform [a,b]: μ=(a+b)/2, Var=(b-a)^2/12.
5. Common Continuous Distributions
Section titled “5. Common Continuous Distributions”- Uniform [a,b]: Constant density.
- Exponential (λ): Memoryless, interarrival times.
- Normal (μ,σ^2): Bell curve, CLT.
- Beta (α,β): [0,1], flexible shapes for proportions.
- Gamma (k,θ): Waiting for k events.
ML Applications
Section titled “ML Applications”- Normal: Assumptions in linear reg.
- Beta: Bayesian priors for binaries.
6. Transformations and Change of Variables
Section titled “6. Transformations and Change of Variables”If Y=g(X), g invertible, f_Y(y)= f_X(g^{-1}(y)) |dg^{-1}/dy|.
For multiple vars, Jacobian determinant.
Example
Section titled “Example”If X~N(0,1), Y=μ+σX ~N(μ,σ^2), scale/shift.
In ML: Normalize data.
7. Joint Distributions and Marginals
Section titled “7. Joint Distributions and Marginals”Joint PDF f(x,y), marginal f_X(x)= ∫ f(x,y) dy.
Conditional f(y|x)= f(x,y)/f_X(x).
Indep if f(x,y)=f_X f_Y.
ML Connection
Section titled “ML Connection”- Multivariate Gaussians in GPs.
8. Numerical Computation in Code
Section titled “8. Numerical Computation in Code”Monte Carlo for expectations, quad for integrals.
::: code-group
import numpy as npfrom scipy.integrate import quadfrom scipy.stats import norm, uniform
# PDF integral checkdef pdf_normal(x): return norm.pdf(x, 0, 1)
integral, _ = quad(pdf_normal, -np.inf, np.inf)print("∫ normal PDF:", integral) # ~1
# Expectation MCsamples = norm.rvs(size=10000)exp_mc = np.mean(samples)print("MC E[X]:", exp_mc) # ~0
# CDFprint("P(X≤1):", norm.cdf(1))
# Uniform variancea, b = 0, 1var = (b - a)**2 / 12print("Uniform var:", var)use rand::Rng;use rand_distr::{Normal, Distribution};
fn main() { // MC integral approx for normal PDF ~1, but hard without quad; simulate expectation let normal = Normal::new(0.0, 1.0).unwrap(); let mut rng = rand::thread_rng(); let mut sum = 0.0; let n = 10000; for _ in 0..n { sum += normal.sample(&mut rng); } println!("MC E[X]: {}", sum / n as f64); // ~0
// Uniform variance let a = 0.0; let b = 1.0; let var = (b - a).powi(2) / 12.0; println!("Uniform var: {}", var);}:::
Computes integrals, moments. Rust uses rand_distr for sampling.
9. Symbolic Manipulations
Section titled “9. Symbolic Manipulations”SymPy for densities, integrals.
::: code-group
from sympy import symbols, integrate, exp, sqrt, pi, oo
x, lam = symbols('x lam', positive=True)pdf_exp = lam * exp(-lam * x)print("∫ exp PDF [0,∞):", integrate(pdf_exp, (x, 0, oo)))
mu, sigma = symbols('mu sigma')pdf_norm = 1/(sigma * sqrt(2*pi)) * exp(-(x - mu)**2 / (2 * sigma**2))exp_norm = integrate(x * pdf_norm, (x, -oo, oo))print("E[X normal]:", exp_norm)// Hardcodedfn main() { println!("∫ exp PDF [0,∞): 1"); println!("E[X normal]: mu");}:::
10. Convolution for Sum of Independents
Section titled “10. Convolution for Sum of Independents”f_{X+Y}(z)= ∫ f_X(x) f_Y(z-x) dx.
For normals: Sum normal.
In ML: Aggregate uncertainties.
11. Bayes’ Theorem in Continuous
Section titled “11. Bayes’ Theorem in Continuous”P(θ|data) ∝ P(data|θ) P(θ), normalize integral.
In Bayesian ML: Posteriors.
12. Key ML Takeaways
Section titled “12. Key ML Takeaways”- PDFs model densities: For continuous predictions.
- CDFs for probs: Cumulative metrics.
- Moments quantify: Mean, var in models.
- Distributions toolbox: Fit data uncertainties.
- Calculus enables: Integrals for normalizations.
Continuous probs power uncertain AI.
13. Summary
Section titled “13. Summary”Explored continuous distributions from PDFs to moments, common types, calculus ops, with ML links. Examples and Python/Rust code. Sets stage for multivariable probability.
Word count: ~2850.
Further Reading
Section titled “Further Reading”- Wasserman, All of Statistics (Ch. 2-3).
- Bishop, Pattern Recognition (Ch. 2).
- 3Blue1Brown: Probability videos.
- Rust: ‘rand_distr’, ‘statrs’.