<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>AI Under the Hood</title>
    <description>Deep dives into AI internals — transformers, inference, embeddings, and everything under the hood.</description>
    <link>https://aiunderthehood.com/</link>
    <item>
      <title>What LLMs Do at Inference: A Deep Dive Under the Hood</title>
      <link>https://aiunderthehood.com/blogs/2025/what-llms-do-at-inference/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/what-llms-do-at-inference/</guid>
      <description>A step-by-step, reference-backed explanation of what happens during LLM inference: tokenization, embeddings, prefill &amp; decode phases, KV caching, decoding strategies, bottlenecks and optimizations like quantization, FlashAttention and speculative decoding.</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>Understanding Tokenizers in AI — A Deep Dive into ChatGPT, Grok, and Gemini</title>
      <link>https://aiunderthehood.com/blogs/2025/12-03-tokenizers/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-03-tokenizers/</guid>
      <description>A complete guide to tokenizers in modern LLMs, covering BPE, WordPiece, SentencePiece, Unigram, and how ChatGPT, Grok, and Gemini tokenize text. Includes examples, real-world impact, and why tokenization is the foundation of AI.</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>Why Embeddings Matter in AI and Large Language Models</title>
      <link>https://aiunderthehood.com/blogs/2025/12-04-why-embeddings-matter/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-04-why-embeddings-matter/</guid>
      <description>A deep dive into what embeddings are, why they matter, and how they power modern AI, semantic search, and RAG-based systems.</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>KV Cache Explained - A Deep Dive into Transformer Optimization</title>
      <link>https://aiunderthehood.com/blogs/2025/12-06-kv-cache-explained/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-06-kv-cache-explained/</guid>
      <description>A Deep Dive into Transformer Optimization</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>Transformers in AI - The Architecture That Revolutionized Machine Learning</title>
      <link>https://aiunderthehood.com/blogs/2025/12-07-transformers-in-ai/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-07-transformers-in-ai/</guid>
      <description>The Architecture That Revolutionized Machine Learning</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>Top-k vs. Nucleus Sampling - Decoding the Secrets of AI Text Generation</title>
      <link>https://aiunderthehood.com/blogs/2025/12-08-top-k-vs-nucleus-sampling/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-08-top-k-vs-nucleus-sampling/</guid>
      <description>Decoding the Secrets of AI Text Generation</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>GPU vs TPU - Decoding the Battle of AI Accelerators in 2025</title>
      <link>https://aiunderthehood.com/blogs/2025/12-09-gpu-vs-tpu/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-09-gpu-vs-tpu/</guid>
      <description>Decoding the Battle of AI Accelerators in 2025</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
    <item>
      <title>Why Does Retrieval-Augmented Generation (RAG) Exist?</title>
      <link>https://aiunderthehood.com/blogs/2025/12-14-why-rag-exists/</link>
      <guid isPermaLink="true">https://aiunderthehood.com/blogs/2025/12-14-why-rag-exists/</guid>
      <description>In the rapidly evolving world of artificial intelligence, large language models (LLMs) like GPT-4 or Grok have transformed how we interact with technology.</description>
      <pubDate>Fri, 09 Jan 2026 15:00:15 GMT</pubDate>
    </item>
  </channel>
</rss>