Support Vector Machines
Support Vector Machines (SVMs) are powerful supervised learning models for classification, excelling at finding optimal decision boundaries. This section dives into their theory, derivations, kernel methods, and evaluation, with a Rust lab using linfa. We’ll explore computational intricacies and Rust’s role in optimizing SVM training.
Theory
SVMs classify data by finding the maximum-margin hyperplane that separates classes (e.g., spam vs. not spam). For a feature vector $\mathbf{x} \in \mathbb{R}^n$ with label $y \in \{-1, +1\}$, the classifier predicts $\hat{y} = \text{sign}(\mathbf{w}^T \mathbf{x} + b)$, where $\mathbf{w}$ is the weight vector and $b$ is the bias.
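To see why the optimization below takes the form it does, recall the standard margin geometry (a textbook identity, stated here for completeness):

$$
\text{dist}(\mathbf{x}_i, \text{hyperplane}) = \frac{|\mathbf{w}^T \mathbf{x}_i + b|}{\|\mathbf{w}\|},
\qquad
\text{margin} = \frac{2}{\|\mathbf{w}\|},
$$

so, once the data is rescaled so that the closest points satisfy $y_i(\mathbf{w}^T \mathbf{x}_i + b) = 1$, maximizing the margin is equivalent to minimizing $\|\mathbf{w}\|^2$.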
Derivation: Hard-Margin SVM
For linearly separable data, the hard-margin SVM optimizes:

$$\min_{\mathbf{w}, b} \; \frac{1}{2}\|\mathbf{w}\|^2 \quad \text{subject to} \quad y_i(\mathbf{w}^T \mathbf{x}_i + b) \geq 1, \quad i = 1, \dots, m.$$

This is a constrained optimization problem solved using Lagrange multipliers. The Lagrangian is:

$$L(\mathbf{w}, b, \boldsymbol{\alpha}) = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{m} \alpha_i \left[ y_i(\mathbf{w}^T \mathbf{x}_i + b) - 1 \right],$$

where $\alpha_i \geq 0$ are the Lagrange multipliers. Setting $\partial L / \partial \mathbf{w} = 0$ and $\partial L / \partial b = 0$ gives $\mathbf{w} = \sum_i \alpha_i y_i \mathbf{x}_i$ and $\sum_i \alpha_i y_i = 0$. Substituting back, the dual problem maximizes:

$$W(\boldsymbol{\alpha}) = \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j \, \mathbf{x}_i^T \mathbf{x}_j,$$

subject to $\alpha_i \geq 0$ and $\sum_{i=1}^{m} \alpha_i y_i = 0$.
Under the Hood: The dual problem depends only on the dot products $\mathbf{x}_i^T \mathbf{x}_j$, which linfa computes efficiently using Rust’s optimized linear algebra, avoiding the memory-management bugs that can affect hand-written C++ solvers.
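To make the “dot products only” point concrete, here is a minimal sketch using plain ndarray (not linfa’s internal code) that builds the Gram matrix the dual formulation works with:

```rust
use ndarray::{array, Array2};

fn main() {
    // Feature matrix X: one row per training sample
    let x: Array2<f64> = array![[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]];

    // Gram matrix G = X Xᵀ: entry (i, j) is the dot product x_i · x_j,
    // which is all the information the dual problem needs about the data.
    let gram = x.dot(&x.t());
    println!("{:?}", gram);
}
```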
Soft-Margin SVM
For non-separable data, the soft-margin SVM introduces slack variables $\xi_i \geq 0$ and optimizes:

$$\min_{\mathbf{w}, b, \boldsymbol{\xi}} \; \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{m} \xi_i$$

subject to:

$$y_i(\mathbf{w}^T \mathbf{x}_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0, \quad i = 1, \dots, m,$$

where $C > 0$ controls the trade-off between a wide margin and few margin violations.
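Equivalently (a standard reformulation, not specific to linfa), eliminating the slack variables yields the unconstrained hinge-loss form, which makes the role of $C$ explicit:

$$\min_{\mathbf{w}, b} \; \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{m} \max\bigl(0,\; 1 - y_i(\mathbf{w}^T \mathbf{x}_i + b)\bigr).$$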
Kernel Methods
For non-linear boundaries, SVMs use the kernel trick, mapping data to a higher-dimensional space via a kernel function $K(\mathbf{x}_i, \mathbf{x}_j) = \phi(\mathbf{x}_i)^T \phi(\mathbf{x}_j)$. Common choices are:

- Linear: $K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i^T \mathbf{x}_j$.
- Polynomial: $K(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i^T \mathbf{x}_j + c)^d$.
- Radial Basis Function (RBF): $K(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\gamma \|\mathbf{x}_i - \mathbf{x}_j\|^2)$.
Under the Hood: The kernel trick avoids the explicit mapping $\phi$, computing dot products directly in the transformed space. linfa’s SVM implementation optimizes kernel evaluations, using Rust’s memory safety to handle large kernel matrices, unlike Python’s scikit-learn, which may face memory issues for high-dimensional data.
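As an illustration of what the RBF kernel actually computes, here is a minimal sketch using ndarray directly; the $\gamma$ parameterization is chosen for illustration (linfa’s own Gaussian kernel is configured through gaussian_kernel in the lab below):

```rust
use ndarray::{array, Array1};

// RBF kernel value for two feature vectors: k(a, b) = exp(-gamma * ||a - b||^2)
fn rbf_kernel(a: &Array1<f64>, b: &Array1<f64>, gamma: f64) -> f64 {
    let diff = a - b;
    let squared_distance = diff.dot(&diff);
    (-gamma * squared_distance).exp()
}

fn main() {
    let a = array![1.0, 2.0];
    let b = array![2.0, 1.0];
    // Nearby points give values close to 1, distant points values close to 0
    println!("k(a, b) = {}", rbf_kernel(&a, &b, 0.5));
}
```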
Evaluation
Performance is assessed with:
- Accuracy: Proportion of correct predictions, $\frac{TP + TN}{TP + TN + FP + FN}$.
- Precision, Recall, F1-Score: For imbalanced classes, as in logistic regression.
- ROC-AUC: Area under the ROC curve, measuring class separation.
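The lab below reports only accuracy; as a rough sketch in plain Rust (no linfa metrics API assumed), the other count-based metrics can be computed from boolean predictions like this:

```rust
use ndarray::{array, Array1};

fn main() {
    // Predicted and true labels (true = positive class); values are illustrative
    let pred: Array1<bool> = array![true, false, true, true, false];
    let truth: Array1<bool> = array![true, false, false, true, true];

    // Count true positives, false positives, and false negatives
    let tp = pred.iter().zip(truth.iter()).filter(|(p, t)| **p && **t).count() as f64;
    let fp = pred.iter().zip(truth.iter()).filter(|(p, t)| **p && !**t).count() as f64;
    let fn_ = pred.iter().zip(truth.iter()).filter(|(p, t)| !**p && **t).count() as f64;

    let precision = tp / (tp + fp); // TP / (TP + FP)
    let recall = tp / (tp + fn_);   // TP / (TP + FN)
    let f1 = 2.0 * precision * recall / (precision + recall);
    println!("precision={precision}, recall={recall}, F1={f1}");
}
```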
Under the Hood: SVMs excel in high-dimensional spaces due to the kernel trick, but tuning hyperparameters such as $C$ and the kernel width $\gamma$ is computationally expensive. linfa provides efficient parameter sweeps, leveraging compile-time checks to ensure robust computation, unlike some C++ libraries prone to runtime errors.
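A minimal hand-written sketch of such a sweep, built on the same linfa-svm calls used in the lab below; the candidate kernel widths and the separate train/validation arrays are illustrative assumptions, not a built-in linfa tuner:

```rust
use linfa::prelude::*;
use linfa_svm::Svm;
use ndarray::{Array1, Array2};

/// Try a few Gaussian-kernel widths and return (best_eps, best_accuracy)
/// measured on a held-out validation set.
fn sweep_gaussian_eps(
    x_train: &Array2<f64>,
    y_train: &Array1<bool>,
    x_valid: &Array2<f64>,
    y_valid: &Array1<bool>,
) -> (f64, f64) {
    let train = Dataset::new(x_train.to_owned(), y_train.to_owned());
    let (mut best_eps, mut best_acc) = (0.0, -1.0);
    for &eps in &[0.01, 0.1, 1.0, 10.0] {
        // Fit one model per candidate kernel width
        let model = Svm::<_, bool>::params()
            .gaussian_kernel(eps)
            .fit(&train)
            .unwrap();
        // Held-out accuracy: fraction of matching predictions
        let preds = model.predict(x_valid);
        let acc = preds
            .iter()
            .zip(y_valid.iter())
            .filter(|(p, t)| p == t)
            .count() as f64
            / y_valid.len() as f64;
        if acc > best_acc {
            best_eps = eps;
            best_acc = acc;
        }
    }
    (best_eps, best_acc)
}
```

In practice the same loop would also sweep the class penalties set through pos_neg_weights, and each candidate would be scored by cross-validation rather than a single split.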
Lab: SVM Classification with linfa
You’ll train an SVM classifier with an RBF kernel on a synthetic dataset (two features predicting a binary class) and evaluate accuracy.
Edit src/main.rs in your rust_ml_tutorial project:

```rust
use linfa::prelude::*;
use linfa_svm::Svm;
use ndarray::{array, Array1, Array2};

fn main() {
    // Synthetic dataset: features (x1, x2), binary target (false = class 0, true = class 1)
    let x: Array2<f64> = array![
        [1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [4.0, 5.0], [5.0, 4.0],
        [6.0, 1.0], [7.0, 2.0], [8.0, 3.0], [9.0, 4.0], [10.0, 5.0]
    ];
    let y: Array1<bool> = array![false, false, false, false, false, true, true, true, true, true];

    // Create dataset pairing features with targets
    let dataset = Dataset::new(x.clone(), y.clone());

    // Train SVM with an RBF (Gaussian) kernel;
    // pos_neg_weights sets the C penalty for the positive and negative classes
    let model = Svm::<_, bool>::params()
        .gaussian_kernel(0.1)
        .pos_neg_weights(1.0, 1.0)
        .fit(&dataset)
        .unwrap();

    // Predict classes for the training points
    let predictions = model.predict(&x);
    println!("Predictions: {:?}", predictions);

    // Compute accuracy: fraction of predictions matching the targets
    let accuracy = predictions
        .iter()
        .zip(y.iter())
        .filter(|(p, t)| p == t)
        .count() as f64
        / y.len() as f64;
    println!("Accuracy: {}", accuracy);
}
```
Ensure Dependencies:
- Verify Cargo.toml includes:

  ```toml
  [dependencies]
  linfa = "0.7.1"
  linfa-svm = "0.7.0"
  ndarray = "0.15.0"
  ```

- Run cargo build.
Run the Program:

```bash
cargo run
```
Expected Output (approximate):

```
Predictions: [false, false, false, false, false, true, true, true, true, true], shape=[10], strides=[1], layout=CFcf (0xf), const ndim=1
Accuracy: 1
```
Understanding the Results
- Dataset: Synthetic features ($x_1$, $x_2$) predict binary classes (false/true, corresponding to classes 0 and 1), designed to be separable in a transformed space.
- Model: The SVM with an RBF kernel learns a non-linear boundary, achieving perfect accuracy on the training data.
- Under the Hood: linfa solves the dual problem using sequential minimal optimization (SMO), optimized for Rust’s memory safety and performance. The RBF kernel implicitly maps data to an infinite-dimensional space, enabling complex boundaries, with Rust’s ndarray ensuring efficient kernel computations. Compared to Python’s scikit-learn, Rust’s SVM avoids memory leaks in large-scale kernel matrix operations, critical for high-dimensional datasets.
- Evaluation: Perfect accuracy suggests good separation, but real-world data requires tuning $C$ and $\gamma$ via cross-validation.
This lab introduces advanced classification, preparing for clustering and PCA.
Next Steps
Continue to K-Means Clustering for unsupervised learning, or revisit Decision Trees.
Further Reading
- An Introduction to Statistical Learning by James et al. (Chapter 9)
- Hands-On Machine Learning by Géron (Chapter 5)
- linfa Documentation: github.com/rust-ml/linfa