Probability Meets Calculus - Continuous Distributions

Probability theory and calculus converge in continuous distributions, where densities replace discrete masses, and integrals compute probabilities over intervals. In artificial intelligence and machine learning, continuous distributions model real-valued data, uncertainties in predictions, and parameters in Bayesian inference. From the uniform distribution's simplicity to the Gaussian's ubiquity in central limit theorem applications, these tools enable likelihood calculations, expectation estimates, and variance analyses crucial for algorithms like Gaussian processes and variational autoencoders.

This lecture bridges integration basics with probability, exploring continuous random variables, probability density functions (PDFs), cumulative distribution functions (CDFs), moments like mean and variance, key distributions, and their calculus-based manipulations. We'll integrate conceptual explanations with ML relevance, illustrated by examples and implementations in Python and Rust, fostering a solid grasp of probabilistic modeling in AI.


1. Intuition for Continuous Random Variables

Discrete random variables take countable values with point probabilities; continuous random variables span intervals and are described by densities. The probability of any single point is zero, but intervals carry positive mass.

PDF f(x): Non-negative, ∫_{-∞}^∞ f(x) dx =1, P(a≤X≤b) = ∫_a^b f(x) dx.

ML Connection

  • Model continuous features (e.g., heights in regression).
  • In GANs: Generate from noise distributions.

INFO

Continuous probabilities measure areas under density curves, much as fluid volume comes from integrating cross-sectional depth over length.

Example

  • Uniform [0,1]: f(x)=1 for x in [0,1], P(0.2≤X≤0.5)=0.3.
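
A quick numerical check of this example, as a minimal sketch using scipy.stats (SciPy assumed available):

python
from scipy.stats import uniform

# Uniform on [0, 1]: P(0.2 <= X <= 0.5) = F(0.5) - F(0.2)
dist = uniform(loc=0, scale=1)
print("P(0.2 <= X <= 0.5):", dist.cdf(0.5) - dist.cdf(0.2))  # 0.3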

2. Probability Density Functions (PDFs)

PDF properties: f(x)≥0, integrates to 1.

f(x) is not the probability at x but a relative likelihood; only integrals of f over intervals yield probabilities.

Formal Definition

X is continuous if its CDF F(x) = P(X≤x) is absolutely continuous, in which case F'(x) = f(x) almost everywhere.

ML Insight

  • Kernel density estimation: Approximates f directly from data.

Example: Exponential f(x) = λ e^{-λx}, x ≥ 0; models waiting times.
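
A minimal sketch verifying normalization and the mean 1/λ by numerical integration, with an assumed example rate λ = 2:

python
import numpy as np
from scipy.integrate import quad

lam = 2.0  # example rate, assumed for illustration

def pdf_exp(x):
    return lam * np.exp(-lam * x)

total, _ = quad(pdf_exp, 0, np.inf)
mean, _ = quad(lambda x: x * pdf_exp(x), 0, np.inf)
print("∫ exp PDF [0,∞):", total)  # ~1
print("E[X]:", mean)              # ~1/lam = 0.5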


3. Cumulative Distribution Functions (CDFs)

F(x) = ∫_{-∞}^x f(t) dt, monotone non-decreasing, lim_{x→-∞} F(x) = 0, lim_{x→∞} F(x) = 1.

P(a<X≤b)=F(b)-F(a).

Inverse: The quantile function F^{-1} enables inverse transform sampling (see the sketch below).

f(x)=F'(x).

ML Application

  • Empirical CDF for goodness-of-fit tests.

Example: Standard normal Φ(x)= ∫_{-∞}^x (1/sqrt(2π)) e^{-t^2/2} dt.
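
A minimal sketch computing Φ(1) two ways and using the quantile function for inverse transform sampling (SciPy and NumPy assumed):

python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Phi(1) by integrating the standard normal PDF directly
phi1, _ = quad(lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi), -np.inf, 1)
print("Phi(1) by quad:    ", phi1)
print("Phi(1) by norm.cdf:", norm.cdf(1))  # ~0.8413

# Inverse transform sampling: push uniforms through the quantile function F^{-1}
u = np.random.default_rng(0).uniform(size=5)
print("Normal samples via ppf:", norm.ppf(u))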


4. Expectations and Moments

E[X]= ∫ x f(x) dx.

Variance Var(X)= E[(X-μ)^2]= ∫ (x-μ)^2 f(x) dx = E[X^2] - μ^2.

Higher moments: Skewness (third) and kurtosis (fourth) describe shape.

Properties

  • Linearity E[aX+bY]=aE[X]+bE[Y].
  • For independent X, Y: Var(X+Y)=Var(X)+Var(Y); linearity itself requires no independence (see the check below).
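
Both properties can be verified numerically; a minimal sketch with assumed example distributions:

python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, 100_000)
Y = rng.exponential(2.0, 100_000)  # independent of X

# Linearity of expectation (holds regardless of independence)
print("E[2X + 3Y]:", np.mean(2 * X + 3 * Y), "vs", 2 * np.mean(X) + 3 * np.mean(Y))
# Variance additivity (needs independence)
print("Var(X + Y):", np.var(X + Y), "vs", np.var(X) + np.var(Y))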

ML Insight

  • Loss functions minimize E[error].

Example: Uniform [a,b]: μ=(a+b)/2, Var=(b-a)^2/12.
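
These formulas can be recovered by direct integration; a minimal sketch with an assumed interval [2, 5]:

python
from scipy.integrate import quad

a, b = 2.0, 5.0  # example interval, assumed for illustration
f = 1 / (b - a)  # constant uniform density on [a, b]

mu, _ = quad(lambda x: x * f, a, b)
var, _ = quad(lambda x: (x - mu) ** 2 * f, a, b)
print("mean:", mu, "vs (a+b)/2 =", (a + b) / 2)            # 3.5
print("var:", var, "vs (b-a)^2/12 =", (b - a) ** 2 / 12)   # 0.75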


5. Common Continuous Distributions

  1. Uniform [a,b]: Constant density.
  2. Exponential (λ): Memoryless, interarrival times.
  3. Normal (μ,σ^2): Bell curve, CLT.
  4. Beta (α,β): [0,1], flexible shapes for proportions.
  5. Gamma (k,θ): Waiting time for k events; generalizes the exponential.

ML Applications

  • Normal: Noise assumptions in linear regression.
  • Beta: Bayesian priors for probabilities of binary outcomes (sketch below).
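
A minimal sketch evaluating each family's density at a point via scipy.stats; parameter values are assumed for illustration, and note that SciPy parameterizes expon and gamma by scale (scale = 1/λ and scale = θ, respectively):

python
from scipy.stats import uniform, expon, norm, beta, gamma

x = 0.5
print("Uniform[0,1] pdf:  ", uniform.pdf(x))
print("Exponential(1) pdf:", expon.pdf(x))
print("Normal(0,1) pdf:   ", norm.pdf(x))
print("Beta(2,3) pdf:     ", beta.pdf(x, 2, 3))
print("Gamma(k=2,θ=1) pdf:", gamma.pdf(x, a=2, scale=1))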

6. Transformations and Change of Variables

If Y = g(X) with g invertible and differentiable, f_Y(y) = f_X(g^{-1}(y)) |d g^{-1}/dy|.

For multiple variables, the absolute value of the Jacobian determinant replaces the derivative.

Example

If X ~ N(0,1), then Y = μ + σX ~ N(μ, σ^2): a shift by μ and a scale by σ.

In ML: Standardizing data applies exactly this affine transformation.
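
A minimal sketch checking the change-of-variables formula against SciPy's built-in normal density, with assumed μ = 3, σ = 2:

python
from scipy.stats import norm

mu, sigma = 3.0, 2.0  # assumed example parameters
y = 4.5

# Change of variables for Y = mu + sigma * X: g^{-1}(y) = (y - mu)/sigma,
# and |d g^{-1}/dy| = 1/sigma
f_y = norm.pdf((y - mu) / sigma) / sigma
print("change-of-variables density:", f_y)
print("direct N(mu, sigma^2) pdf:  ", norm.pdf(y, loc=mu, scale=sigma))  # equal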


7. Joint Distributions and Marginals

Joint PDF f(x,y), marginal f_X(x)= ∫ f(x,y) dy.

Conditional f(y|x)= f(x,y)/f_X(x).

Independent if f(x,y) = f_X(x) f_Y(y).

ML Connection

  • Multivariate Gaussians in GPs.
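
A minimal sketch recovering a Gaussian marginal by numerically integrating out y, with an assumed correlation ρ = 0.6:

python
import numpy as np
from scipy.integrate import quad
from scipy.stats import multivariate_normal, norm

rho = 0.6  # assumed correlation
joint = multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]])

# Marginalize: integrate the joint density over y at a fixed x0
x0 = 0.8
marginal, _ = quad(lambda y: joint.pdf([x0, y]), -np.inf, np.inf)
print("numerical marginal f_X(x0):", marginal)
print("exact N(0,1) pdf:          ", norm.pdf(x0))  # Gaussian marginals are Gaussian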

8. Numerical Computation in Code

Monte Carlo estimates expectations from samples; quadrature (quad) computes integrals numerically.

python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm, uniform

# PDF integral check
def pdf_normal(x):
    return norm.pdf(x, 0, 1)

integral, _ = quad(pdf_normal, -np.inf, np.inf)
print("∫ normal PDF:", integral)  # ~1

# Expectation MC
samples = norm.rvs(size=10000)
exp_mc = np.mean(samples)
print("MC E[X]:", exp_mc)  # ~0

# CDF
print("P(X≤1):", norm.cdf(1))

# Uniform variance
a, b = 0, 1
var = (b - a)**2 / 12
print("Uniform var:", var)
rust
use rand_distr::{Distribution, Normal};

fn main() {
    // Monte Carlo estimate of E[X] for X ~ N(0, 1); without a quadrature crate, we sample rather than integrate
    let normal = Normal::new(0.0, 1.0).unwrap();
    let mut rng = rand::thread_rng();
    let mut sum = 0.0;
    let n = 10000;
    for _ in 0..n {
        sum += normal.sample(&mut rng);
    }
    println!("MC E[X]: {}", sum / n as f64);  // ~0

    // Uniform variance
    let a = 0.0;
    let b = 1.0;
    let var = (b - a).powi(2) / 12.0;
    println!("Uniform var: {}", var);
}

The Python code checks PDF normalization by quadrature and estimates moments by Monte Carlo; the Rust version uses rand_distr for sampling.


9. Symbolic Manipulations

SymPy evaluates densities and integrals symbolically.

python
from sympy import symbols, integrate, exp, sqrt, pi, oo

x = symbols('x', real=True)
lam = symbols('lam', positive=True)
pdf_exp = lam * exp(-lam * x)
print("∫ exp PDF [0,∞):", integrate(pdf_exp, (x, 0, oo)))  # 1

mu = symbols('mu', real=True)
sigma = symbols('sigma', positive=True)  # sigma > 0 needed for convergence
pdf_norm = 1/(sigma * sqrt(2*pi)) * exp(-(x - mu)**2 / (2 * sigma**2))
exp_norm = integrate(x * pdf_norm, (x, -oo, oo))
print("E[X normal]:", exp_norm)  # mu
rust
// Rust has no mainstream symbolic math crate; the results derived with SymPy above are hardcoded
fn main() {
    println!("∫ exp PDF [0,∞): 1");
    println!("E[X normal]: mu");
}

10. Convolution for Sum of Independents

f_{X+Y}(z)= ∫ f_X(x) f_Y(z-x) dx.

For independent normals the sum is normal: N(μ1, σ1^2) + N(μ2, σ2^2) yields N(μ1+μ2, σ1^2+σ2^2).

In ML: Aggregating uncertainties, e.g., summing independent noise terms.
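
A minimal sketch evaluating the convolution integral numerically and comparing it to the closed form, with assumed parameters X ~ N(1, 4) and Y ~ N(-1, 1):

python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# X ~ N(1, 2^2) and Y ~ N(-1, 1^2), independent (parameters assumed)
f_x = norm(loc=1, scale=2).pdf
f_y = norm(loc=-1, scale=1).pdf

z = 0.5
conv, _ = quad(lambda x: f_x(x) * f_y(z - x), -np.inf, np.inf)
print("convolution at z:", conv)
# Closed form: X + Y ~ N(1 + (-1), 2^2 + 1^2) = N(0, 5)
print("closed form:     ", norm(loc=0, scale=np.sqrt(5)).pdf(z))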


11. Bayes' Theorem in Continuous

p(θ|data) = p(data|θ) p(θ) / ∫ p(data|θ') p(θ') dθ'; the evidence integral in the denominator normalizes the posterior.

In Bayesian ML: Posteriors over parameters and predictions.
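
A minimal sketch of grid-based posterior normalization for a normal mean, under an assumed N(0,1) prior and unit-variance likelihood:

python
import numpy as np
from scipy.stats import norm

# Assumed setup: observations from N(theta, 1) with prior theta ~ N(0, 1)
data = np.array([0.9, 1.4, 0.6])
theta = np.linspace(-4, 4, 2001)
dtheta = theta[1] - theta[0]

prior = norm.pdf(theta, 0, 1)
likelihood = np.prod(norm.pdf(data[:, None], theta, 1), axis=0)
unnorm = likelihood * prior

# Normalize by the evidence integral, approximated on the grid
posterior = unnorm / (unnorm.sum() * dtheta)
print("posterior mean:", (theta * posterior).sum() * dtheta)  # shrinks toward the prior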


12. Key ML Takeaways

  • PDFs model densities: The basis for continuous predictions.
  • CDFs give probabilities: Cumulative metrics and quantiles.
  • Moments quantify: Mean and variance summarize model behavior.
  • Distributions toolbox: Families to fit data and represent uncertainty.
  • Calculus enables: Integrals for normalization, expectation, and marginalization.

Continuous probability powers uncertainty-aware AI.


13. Summary

We explored continuous distributions from PDFs and CDFs to moments, surveyed common families, and applied calculus operations such as transformations and convolutions, with ML connections throughout and examples in Python and Rust. This sets the stage for multivariable probability.



Further Reading

  • Wasserman, All of Statistics (Ch. 2-3).
  • Bishop, Pattern Recognition (Ch. 2).
  • 3Blue1Brown: Probability videos.
  • Rust: 'rand_distr', 'statrs'.