Topic: LLMs
All essays filed under "LLMs".
-
Open Weights Is Not the Same as Open Source AI
A practical distinction between open-weight AI models and truly open source AI systems, and why the difference matters when choosing local LLMs.
-
What LLMs Do at Inference: A Deep Dive Under the Hood
A step-by-step, reference-backed explanation of what happens during LLM inference: tokenization, embeddings, prefill & decode phases, KV caching, decoding strategies, bottlenecks and optimizations like quantization, FlashAttention and speculative decoding.
-
Understanding Tokenizers in AI — A Deep Dive into ChatGPT, Grok, and Gemini
A complete guide to tokenizers in modern LLMs, covering BPE, WordPiece, SentencePiece, Unigram, and how ChatGPT, Grok, and Gemini tokenize text. Includes examples, real-world impact, and why tokenization is the foundation of AI.
-
Why Embeddings Matter in AI and Large Language Models
A deep dive into what embeddings are, why they matter, and how they power modern AI, semantic search, and RAG-based systems.