Posts
All the articles I've posted.
-
Why Embeddings Matter
A deep dive into what embeddings are, why they matter, and how they power modern AI, semantic search, and RAG-based systems.
-
What LLMs Do at Inference: A Deep Dive Under the Hood
A step-by-step, reference-backed explanation of what happens during LLM inference: tokenization, embeddings, the prefill and decode phases, KV caching, decoding strategies, bottlenecks, and optimizations such as quantization, FlashAttention, and speculative decoding.
-
Transformers in AI
The Architecture That Revolutionized Machine Learning
-
Top-k vs. Nucleus Sampling
Decoding the Secrets of AI Text Generation