<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>AI Under the Hood</title>
    <description>Deep dives into AI internals — transformers, inference, embeddings, and everything under the hood.</description>
    <link>https://aiunderthehood.com/</link>
    <item>
      <title>What LLMs Do at Inference: A Deep Dive Under the Hood</title>
      <link>https://aiunderthehood.com/blogs/2025/what-llms-do-at-inference/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/what-llms-do-at-inference/</guid>
      <description>A step-by-step, reference-backed explanation of what happens during LLM inference: tokenization, embeddings, prefill &amp; decode phases, KV caching, decoding strategies, bottlenecks and optimizations like quantization, FlashAttention and speculative decoding.</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>Understanding Tokenizers in AI — A Deep Dive into ChatGPT, Grok, and Gemini</title>
      <link>https://aiunderthehood.com/blogs/2025/12-03-tokenizers/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-03-tokenizers/</guid>
      <description>A complete guide to tokenizers in modern LLMs, covering BPE, WordPiece, SentencePiece, Unigram, and how ChatGPT, Grok, and Gemini tokenize text. Includes examples, real-world impact, and why tokenization is the foundation of AI.</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>Why Embeddings Matter in AI and Large Language Models</title>
      <link>https://aiunderthehood.com/blogs/2025/12-04-why-embeddings-matter/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-04-why-embeddings-matter/</guid>
      <description>A deep dive into what embeddings are, why they matter, and how they power modern AI, semantic search, and RAG-based systems.</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>KV Cache Explained - A Deep Dive into Transformer Optimization</title>
      <link>https://aiunderthehood.com/blogs/2025/12-06-kv-cache-explained/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-06-kv-cache-explained/</guid>
      <description>A Deep Dive into Transformer Optimization</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>Transformers in AI - The Architecture That Revolutionized Machine Learning</title>
      <link>https://aiunderthehood.com/blogs/2025/12-07-transformers-in-ai/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-07-transformers-in-ai/</guid>
      <description>The Architecture That Revolutionized Machine Learning</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>Top-k vs. Nucleus Sampling - Decoding the Secrets of AI Text Generation</title>
      <link>https://aiunderthehood.com/blogs/2025/12-08-top-k-vs-nucleus-sampling/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-08-top-k-vs-nucleus-sampling/</guid>
      <description>Decoding the Secrets of AI Text Generation</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>GPU vs TPU - Decoding the Battle of AI Accelerators in 2025</title>
      <link>https://aiunderthehood.com/blogs/2025/12-09-gpu-vs-tpu/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-09-gpu-vs-tpu/</guid>
      <description>Decoding the Battle of AI Accelerators in 2025</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>Why Does Retrieval-Augmented Generation (RAG) Exist?</title>
      <link>https://aiunderthehood.com/blogs/2025/12-14-why-rag-exists/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-14-why-rag-exists/</guid>
      <description>In the rapidly evolving world of artificial intelligence, large language models (LLMs) like GPT-4 or Grok have transformed how we interact with technology.</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
  </channel>
</rss>