Probability
Probability is fundamental to machine learning (ML), enabling models to handle uncertainty and make predictions. This section introduces probability basics, distributions, and Bayes' theorem, with a Rust lab using the `rand` crate.
Probability Basics
Probability measures the likelihood of an event, ranging from 0 (impossible) to 1 (certain). For a random variable X, P(X = x) denotes the probability that X takes the value x, and the probabilities over all possible outcomes sum to 1.
Example: For a fair die, P(X = i) = 1/6 for each face i = 1, …, 6.
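The die example can be checked empirically. The sketch below is illustrative only: it uses a hand-rolled linear congruential generator (a stand-in so the snippet needs no external crates; the `rand` crate used in the lab below is the idiomatic choice) and a helper name, `estimate_p_six`, of my own invention:

```rust
// Sketch: estimate P(X = 6) for a fair die by simulation.
fn estimate_p_six(trials: u32, seed: u64) -> f64 {
    let mut state = seed;
    let mut sixes = 0u32;
    for _ in 0..trials {
        // Linear congruential generator step (Numerical Recipes constants)
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let roll = (state >> 33) % 6 + 1; // roughly uniform over 1..=6
        if roll == 6 {
            sixes += 1;
        }
    }
    sixes as f64 / trials as f64
}

fn main() {
    let p = estimate_p_six(100_000, 42);
    println!("Estimated P(X = 6) ≈ {:.3} (theory ≈ 0.167)", p);
}
```

With 100,000 trials the estimate should land close to 1/6, illustrating the frequency interpretation of probability.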
Probability Distributions
A probability distribution describes how probabilities are distributed over a random variable’s values.
- Discrete: For countable outcomes (e.g., die rolls). The probability mass function (PMF) gives P(X = x) for each value x.
- Continuous: For uncountable outcomes (e.g., heights). The probability density function (PDF) defines probabilities over intervals.
Normal Distribution: A common continuous distribution with PDF:

f(x) = (1 / (σ√(2π))) · exp(−(x − μ)² / (2σ²))

where μ is the mean and σ is the standard deviation.
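The normal PDF translates directly to code. A minimal sketch (the function name `normal_pdf` is my own):

```rust
use std::f64::consts::PI;

// Normal probability density: (1 / (sigma * sqrt(2*pi))) * exp(-(x - mu)^2 / (2*sigma^2))
fn normal_pdf(x: f64, mu: f64, sigma: f64) -> f64 {
    let z = (x - mu) / sigma;
    (-0.5 * z * z).exp() / (sigma * (2.0 * PI).sqrt())
}

fn main() {
    // The standard normal N(0, 1) peaks at x = 0 with density 1/sqrt(2*pi) ≈ 0.3989
    println!("pdf(0; 0, 1) = {:.4}", normal_pdf(0.0, 0.0, 1.0));
}
```

Evaluating the density at the mean recovers the familiar peak height 1/√(2π).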
Bayes’ Theorem
Bayes' theorem updates probabilities based on new evidence:

P(A | B) = P(B | A) · P(A) / P(B)

where P(A) is the prior, P(B | A) is the likelihood, and P(A | B) is the posterior.
In ML, it’s used for classification (e.g., naive Bayes).
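A quick numeric sketch of Bayes' theorem, using hypothetical numbers (prior P(A) = 0.01, likelihood P(B | A) = 0.9, false-positive rate P(B | ¬A) = 0.05; the function name `posterior` is my own):

```rust
// Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B),
// where P(B) is expanded by total probability over A and not-A.
fn posterior(prior: f64, likelihood: f64, likelihood_not: f64) -> f64 {
    let evidence = likelihood * prior + likelihood_not * (1.0 - prior); // P(B)
    likelihood * prior / evidence
}

fn main() {
    // A rare event (1% prior) stays fairly unlikely even after a positive signal.
    println!("P(A|B) = {:.3}", posterior(0.01, 0.9, 0.05)); // prints P(A|B) = 0.154
}
```

This is the same update rule that naive Bayes classifiers apply per class, with likelihoods factored across features.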
Lab: Simulating a Normal Distribution with rand
You’ll simulate samples from a normal distribution using the rand
crate, illustrating probability in ML.
Edit src/main.rs in your rust_ml_tutorial project (note: with rand 0.8, the Normal distribution lives in the companion rand_distr crate):

```rust
use rand::thread_rng;
use rand_distr::{Distribution, Normal};

fn main() {
    // Define a normal distribution (mean = 0, std = 1)
    let normal = Normal::new(0.0, 1.0).expect("valid parameters");
    let mut rng = thread_rng();

    // Generate 5 samples
    let samples: Vec<f64> = (0..5).map(|_| normal.sample(&mut rng)).collect();
    println!("Samples from N(0,1): {:?}", samples);

    // Compute the sample mean
    let mean = samples.iter().sum::<f64>() / samples.len() as f64;
    println!("Sample Mean: {}", mean);
}
```
Ensure Dependencies:
- Verify Cargo.toml includes:

```toml
[dependencies]
rand = "0.8.5"
rand_distr = "0.4.3"
```

- Run cargo build.
Run the Program:
```bash
cargo run
```
Expected Output (values vary due to randomness):
```
Samples from N(0,1): [0.123, -0.456, 1.789, -0.234, 0.678]
Sample Mean: 0.38
```
Understanding the Results
- Distribution: The samples are drawn from N(0, 1), the standard normal distribution.
- Mean: The sample mean approximates the true mean (μ = 0), converging as the number of samples grows.
- ML Relevance: Distributions model data uncertainty and appear in algorithms like Gaussian naive Bayes.
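The convergence claim can be checked with a quick, dependency-free sketch. It approximates N(0, 1) draws with the classic sum-of-12-uniforms trick over a hand-rolled generator (both are assumptions for portability; in practice you would reuse the `rand`-based lab code):

```rust
// Sketch: the sample mean of approximate N(0, 1) draws approaches 0 as n grows.
fn sample_mean(n: usize, seed: u64) -> f64 {
    let mut state = seed;
    // Uniform(0, 1) via a linear congruential generator's top 53 bits.
    let mut uniform = move || {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (state >> 11) as f64 / (1u64 << 53) as f64
    };
    let mut sum = 0.0;
    for _ in 0..n {
        // Sum of 12 Uniform(0,1) draws minus 6 has mean 0 and variance 1.
        let z: f64 = (0..12).map(|_| uniform()).sum::<f64>() - 6.0;
        sum += z;
    }
    sum / n as f64
}

fn main() {
    for n in [10usize, 1_000, 100_000] {
        println!("n = {:>6}: sample mean = {:+.4}", n, sample_mean(n, 7));
    }
}
```

The printed means should shrink toward 0 as n increases, with the error scaling roughly as 1/√n.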
This lab prepares you for probabilistic ML models.
Learning from Official Resources
Deepen Rust skills with:
- The Rust Programming Language (The Book): Free at doc.rust-lang.org/book.
- Programming Rust: By Blandy, Orendorff, and Tindall.
Next Steps
Continue to Statistics for ML to explore statistical methods, or revisit Calculus.
Further Reading
- Deep Learning by Goodfellow et al. (Chapter 3)
- An Introduction to Statistical Learning by James et al. (Prerequisites)
- rand Documentation: docs.rs/rand