Python Basics
This section covers essential Python programming concepts for machine learning (ML) tasks. You’ll learn syntax, data types, and libraries, preparing for ML labs with numpy, pandas, and scikit-learn. Basic familiarity with Python helps, but no ML experience is required. Write the code yourself to learn Python, but you can refer to examples in our GitHub repository: https://github.com/ravishankarkumar/aiunderthehood-sample-code.
Why Python for ML?
Python dominates AI/ML because of its simplicity, readability, and massive ecosystem. Its high-level syntax enables rapid prototyping, while libraries like numpy, pandas, and pytorch handle performance-critical operations in optimized C/C++ under the hood.
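As a rough illustration of that division of labor, the sketch below computes the same dot product twice: once with a plain Python loop and once with np.dot, which hands the arithmetic to compiled code. The helper name dot_python and the sample vectors are made up for this example, and it assumes NumPy is installed.

```python
import numpy as np

def dot_python(a, b):
    """Dot product with an explicit Python loop (interpreted, element by element)."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]

loop_result = dot_python(a, b)                           # pure Python
numpy_result = float(np.dot(np.array(a), np.array(b)))  # delegated to NumPy

print("Loop: ", loop_result)
print("NumPy:", numpy_result)
```

Both approaches give 32.0 here. On three elements the speed is indistinguishable; the advantage of the NumPy version typically shows up only on large arrays.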
Basic Syntax
Python’s syntax is simple and beginner-friendly. This program sums a list, showing variables, loops, and functions:
def main():
    numbers = [1, 2, 3, 4, 5]
    total = 0
    for num in numbers:
        total += num
    print("Sum:", total)

if __name__ == "__main__":
    main()
- Variables: Created on assignment and dynamically typed, no type annotations needed (total = 0).
- Lists: [] stores ordered data, commonly used for datasets.
- Loops: for iterates directly over items.
- Functions: def defines functions, with main() as the conventional entry point.
Save the program as main.py and run python main.py to see “Sum: 15”.
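The same ideas carry over when the list is passed in as an argument instead of being hard-coded. In this sketch, sum_list is a hypothetical helper (it is not part of the program above), and Python's built-in sum() serves as a cross-check.

```python
def sum_list(numbers):
    """Add up the items of a list with an explicit loop."""
    total = 0
    for num in numbers:
        total += num
    return total

values = [1, 2, 3, 4, 5]
print("Loop sum:", sum_list(values))   # explicit loop, as in main()
print("Built-in sum:", sum(values))    # Python's built-in equivalent
```

Returning the total rather than printing it inside the function makes the helper reusable in later labs, where the result usually feeds into another computation.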
Data Structures for ML
ML tasks require arrays, matrices, and datasets. Python provides:
- Lists (list): Flexible collections for data.
- Tuples (tuple): Immutable sequences.
- Dictionaries (dict): Key-value pairs for metadata.
- NumPy Arrays: Efficient multidimensional arrays (preferred for ML).
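A minimal sketch of how the three built-in types behave; the variable names and values here are illustrative only.

```python
# A list is mutable: items can be appended or changed in place.
samples = [1.0, 2.0]
samples.append(3.0)

# A tuple is immutable: handy for fixed values such as a shape.
shape = (2, 2)
try:
    shape[0] = 3              # attempting to modify a tuple raises TypeError
except TypeError:
    tuple_is_immutable = True
else:
    tuple_is_immutable = False

# A dict maps keys to values: convenient for metadata.
metadata = {"name": "toy-dataset", "num_classes": 2}

print(samples)
print(shape, tuple_is_immutable)
print(metadata["num_classes"])
```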
Example dataset with Python structures:
dataset = {
    "features": [[1.0, 2.0], [3.0, 4.0]],
    "labels": [0, 1]
}
print("Dataset size:", len(dataset["features"]))
This mimics ML datasets, later used with numpy, pandas, or scikit-learn.
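To give a sense of that later step, here is one way the same dictionary might be converted to NumPy arrays. The names X and y follow a common ML convention but are an assumption of this sketch, which also assumes NumPy is installed.

```python
import numpy as np

dataset = {
    "features": [[1.0, 2.0], [3.0, 4.0]],
    "labels": [0, 1],
}

X = np.array(dataset["features"])  # 2-D array: rows are samples, columns are features
y = np.array(dataset["labels"])    # 1-D array: one label per sample

print("X shape:", X.shape)
print("y shape:", y.shape)
print("Feature means:", X.mean(axis=0))  # mean of each column
```

Once the data is an array, per-column statistics such as the mean become a single call instead of a nested loop.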
Lab: Vector Operations with NumPy
Practice Python by computing the Euclidean distance between two vectors, a key ML operation.
Create vector_distance.py in your project:
import numpy as np

def euclidean_distance(v1, v2):
    """Return the Euclidean (straight-line) distance between two vectors."""
    # Convert inputs to arrays so subtraction works element-wise.
    v1, v2 = np.array(v1), np.array(v2)
    if v1.shape != v2.shape:
        raise ValueError("Vectors must have equal length")
    return np.sqrt(np.sum((v1 - v2) ** 2))

if __name__ == "__main__":
    v1 = [1.0, 2.0, 3.0]
    v2 = [4.0, 5.0, 6.0]
    distance = euclidean_distance(v1, v2)
    print(f"Euclidean Distance: {distance:.3f}")
Dependencies: Install NumPy if not already installed:
pip install numpy
Run:
python vector_distance.py
Expected Output:
Euclidean Distance: 5.196
This builds skills for ML computations, using NumPy for efficient vector operations.
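As a cross-check, the sketch below compares the lab's formula with np.linalg.norm, which computes the L2 norm of a 1-D vector by default. The variable names manual and builtin are chosen for this example only.

```python
import numpy as np

v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([4.0, 5.0, 6.0])

manual = np.sqrt(np.sum((v1 - v2) ** 2))  # formula used in the lab
builtin = np.linalg.norm(v1 - v2)         # NumPy's vector norm (L2 by default)

print(f"Manual:  {manual:.3f}")
print(f"Builtin: {builtin:.3f}")
```

Both lines print 5.196. By hand: each coordinate differs by 3, so the squared differences sum to 9 + 9 + 9 = 27, and the square root of 27 is approximately 5.196.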
Learning from Official Resources
Deepen your Python knowledge with:
- Python Official Tutorial: docs.python.org/tutorial
- Automate the Boring Stuff with Python by Al Sweigart: Beginner-friendly, practical examples.
- Effective Python by Brett Slatkin: Best practices for writing clean, efficient code.
Next Steps
Move to First ML Lab to build your first ML model, or revisit Setup.
Further Reading
- Python Documentation: docs.python.org