AI Under the Hood (AUTH)
Stop chasing the AI noise. AI Under the Hood (AUTH) is a deep-dive resource dedicated to the foundational concepts and full-stack architecture of AI. Learn to build with first-principles reasoning and gain the advantage of hindsight in an evolving field.
AI is a multi-layer, full-stack system. While each layer is deep enough to define an entire career, this guide treats them as a single, connected architecture.
Our Approach: First Principles & Paved Paths
We follow a dual-track approach: we give you a paved path to follow, while grounding every step in first-principles reasoning.
We acknowledge the limits of this approach. Since AI took center stage in late 2022, it has taken the world's finest minds over three years to effectively "tame the beast" through agentic systems. It would be ambitious to claim we can recreate every original idea from scratch in a tutorial.
However, we have the advantage of hindsight. By applying first-principles thinking to established breakthroughs, we give you the "why" behind the "how," ensuring you aren't just following a script, but building a mental model that lasts.
Note: This platform is a living resource and evolves at a rapid pace.
Read the blog posts or head straight over to the sample code.
Recent Blogs
- What LLMs Do at Inference: A Deep Dive Under the Hood
  A step-by-step, reference-backed explanation of what happens during LLM inference: tokenization, embeddings, prefill and decode phases, KV caching, decoding strategies, bottlenecks, and optimizations such as quantization, FlashAttention, and speculative decoding.
- Understanding Tokenizers in AI — A Deep Dive into ChatGPT, Grok, and Gemini
  A complete guide to tokenizers in modern LLMs, covering BPE, WordPiece, SentencePiece, and Unigram, and how ChatGPT, Grok, and Gemini tokenize text. Includes examples, real-world impact, and why tokenization is the foundation of AI.
- Why Embeddings Matter in AI and Large Language Models
  A deep dive into what embeddings are, why they matter, and how they power modern AI, semantic search, and RAG-based systems.
- KV Cache Explained - A Deep Dive into Transformer Optimization
  How the KV cache works and why it is central to efficient Transformer inference.
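As a small taste of the topics above, here is a toy sketch of the merge step behind BPE-style tokenization: repeatedly find the most frequent adjacent pair of tokens and fuse it into one. This is an illustrative simplification only; production tokenizers (such as those used by ChatGPT, Grok, and Gemini) operate on bytes and apply a pretrained merge table rather than recomputing frequencies at runtime.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most common one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with a single merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Start from individual characters and apply two merge steps,
# the way BPE training gradually builds larger subword units.
tokens = list("low lower lowest")
for _ in range(2):
    tokens = merge_pair(tokens, most_frequent_pair(tokens))
print(tokens)
```

After two merges, the frequent substring "low" has been fused into a single token in all three words, which is exactly how subword vocabularies come to contain common stems.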