
Probability

Probability is fundamental to machine learning (ML), enabling models to handle uncertainty and make predictions. This section introduces probability basics, distributions, and Bayes’ theorem, with a Rust lab using the rand and rand_distr crates.

Probability Basics

Probability measures the likelihood of an event, ranging from 0 (impossible) to 1 (certain). For a random variable X, the probability of an outcome x is denoted P(X=x).

Example: For a fair die, P(X=3) = 1/6.
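
To make this concrete, here is a small simulation sketch (separate from the lab below) that estimates P(X=3) empirically with the rand crate; the roll count of 100,000 is an arbitrary illustrative choice.

    rust
    use rand::Rng;

    fn main() {
        let mut rng = rand::thread_rng();
        let rolls = 100_000;

        // Count how often a fair six-sided die shows a 3
        let threes = (0..rolls).filter(|_| rng.gen_range(1..=6) == 3).count();

        // The empirical frequency should approach 1/6 ≈ 0.1667
        println!("Estimated P(X=3): {:.4}", threes as f64 / rolls as f64);
    }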

Probability Distributions

A probability distribution describes how probabilities are distributed over a random variable’s values.

  • Discrete: For finite outcomes (e.g., die rolls). The probability mass function (PMF) gives P(X=x).
  • Continuous: For infinite outcomes (e.g., heights). The probability density function (PDF) defines probabilities over intervals.
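
For the discrete case, a PMF can be written down explicitly. The sketch below (an illustration, not part of the lab) stores the PMF of a fair die in a map and checks that the probabilities sum to 1.

    rust
    use std::collections::HashMap;

    fn main() {
        // PMF of a fair six-sided die: each face has probability 1/6
        let pmf: HashMap<u8, f64> = (1..=6).map(|face| (face, 1.0 / 6.0)).collect();

        // A valid PMF assigns non-negative probabilities that sum to 1
        let total: f64 = pmf.values().sum();
        println!("P(X=3) = {:.4}", pmf[&3u8]);
        println!("Sum of probabilities = {:.4}", total);
    }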

Normal Distribution: A common continuous distribution with PDF:

f(x) = 1/√(2πσ²) · exp(−(x − μ)² / (2σ²))

where μ is the mean and σ is the standard deviation. It is widely used in ML for modeling data.
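
To connect the formula to code, the sketch below (illustrative, not part of the lab) evaluates this density directly; the function name normal_pdf and the chosen evaluation points are assumptions for demonstration.

    rust
    use std::f64::consts::PI;

    // Density of N(mu, sigma^2) at x, following the formula above
    fn normal_pdf(x: f64, mu: f64, sigma: f64) -> f64 {
        let coeff = 1.0 / (2.0 * PI * sigma * sigma).sqrt();
        let exponent = -((x - mu).powi(2)) / (2.0 * sigma * sigma);
        coeff * exponent.exp()
    }

    fn main() {
        // For the standard normal N(0,1), the density peaks at x = 0 with value 1/√(2π) ≈ 0.3989
        println!("f(0) = {:.4}", normal_pdf(0.0, 0.0, 1.0));
        println!("f(1) = {:.4}", normal_pdf(1.0, 0.0, 1.0));
    }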

Bayes’ Theorem

Bayes’ theorem updates probabilities based on new evidence:

P(A|B) = P(B|A) · P(A) / P(B)

In ML, it’s used for classification (e.g., naive Bayes).
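
As a worked example (a sketch with hypothetical numbers, in a toy spam-filter setting), the snippet below applies the formula directly; the prior, likelihoods, and event names are assumptions chosen only for illustration.

    rust
    // Bayes' theorem: P(A|B) = P(B|A) · P(A) / P(B)
    // Hypothetical setup: A = "email is spam", B = "email contains the word 'offer'"
    fn main() {
        let p_a = 0.2;              // prior P(spam)
        let p_b_given_a = 0.6;      // likelihood P("offer" | spam)
        let p_b_given_not_a = 0.05; // P("offer" | not spam)

        // Evidence P(B) via the law of total probability
        let p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a);

        // Posterior P(spam | "offer") = 0.12 / 0.16 = 0.75
        let p_a_given_b = p_b_given_a * p_a / p_b;
        println!("P(spam | \"offer\") = {:.2}", p_a_given_b);
    }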

Lab: Simulating a Normal Distribution with rand and rand_distr

You’ll simulate samples from a normal distribution using the rand crate together with rand_distr (which provides the Normal distribution), illustrating probability in ML.

  1. Edit src/main.rs in your rust_ml_tutorial project:

    rust
    use rand::thread_rng;
    use rand_distr::{Distribution, Normal};
    
    fn main() {
        // Define normal distribution (mean=0, std=1); Normal::new returns a Result
        let normal = Normal::new(0.0, 1.0).expect("valid mean and standard deviation");
        let mut rng = thread_rng();
    
        // Generate 5 samples
        let samples: Vec<f64> = (0..5).map(|_| normal.sample(&mut rng)).collect();
        println!("Samples from N(0,1): {:?}", samples);
    
        // Compute sample mean
        let mean = samples.iter().sum::<f64>() / samples.len() as f64;
        println!("Sample Mean: {}", mean);
    }
  2. Ensure Dependencies:

    • Verify Cargo.toml includes:
      toml
      [dependencies]
      rand = "0.8.5"
      rand_distr = "0.4"
    • Run cargo build.
  3. Run the Program:

    bash
    cargo run

    Expected Output (values vary due to randomness):

    Samples from N(0,1): [0.123, -0.456, 1.789, -0.234, 0.678]
    Sample Mean: ~0.38

Understanding the Results

  • Distribution: The Normal type from rand_distr, driven by rand’s thread_rng, draws samples from the standard normal distribution N(0,1).
  • Mean: The sample mean approximates the true mean (μ=0) and converges toward it as the number of samples grows, as the sketch below illustrates.
  • ML Relevance: Distributions model data uncertainty, used in algorithms like Gaussian naive Bayes.
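
The convergence is easy to see by increasing the sample size. The sketch below (an optional extension of the lab, using the same crates) prints the sample mean for a few sizes; the chosen sizes are arbitrary.

    rust
    use rand::thread_rng;
    use rand_distr::{Distribution, Normal};

    fn main() {
        let normal = Normal::new(0.0, 1.0).expect("valid mean and standard deviation");
        let mut rng = thread_rng();

        // The sample mean should drift toward the true mean of 0 as n grows
        for &n in &[10, 1_000, 100_000] {
            let sum: f64 = (0..n).map(|_| normal.sample(&mut rng)).sum();
            println!("n = {:>6}: sample mean = {:+.4}", n, sum / n as f64);
        }
    }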

This lab prepares you for probabilistic ML models.

Learning from Official Resources

Deepen Rust skills with:

  • The Rust Programming Language (The Book): Free at doc.rust-lang.org/book.
  • Programming Rust: By Blandy, Orendorff, and Tindall.

Next Steps

Continue to Statistics for the statistical methods used in ML, or revisit Calculus.

Further Reading

  • Deep Learning by Goodfellow et al. (Chapter 3)
  • An Introduction to Statistical Learning by James et al. (Prerequisites)
  • rand Documentation: docs.rs/rand