<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>LLM on Jinying Tech Blog</title>
    <link>https://chejinying.com/tech/tags/llm/</link>
    <description>Recent content in LLM on Jinying Tech Blog</description>
    <generator>Hugo</generator>
    <language>en-us</language>
    <lastBuildDate>Sat, 04 Apr 2026 00:08:50 +0800</lastBuildDate>
    <atom:link href="https://chejinying.com/tech/tags/llm/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>LLM Fundamentals</title>
      <link>https://chejinying.com/tech/posts/llm/overview/</link>
      <pubDate>Sat, 04 Apr 2026 00:08:50 +0800</pubDate>
      <guid>https://chejinying.com/tech/posts/llm/overview/</guid>
      <description>&lt;blockquote&gt;&#xA;&lt;p&gt;This post uses many abbreviations (BPE, FFN, RoPE, etc.). See the &lt;a href=&#34;https://chejinying.com/tech/posts/llm/abbreviations/&#34;&gt;LLM Abbreviations Glossary&lt;/a&gt; for a quick reference.&lt;/p&gt;&#xA;&lt;/blockquote&gt;&#xA;&lt;h1 id=&#34;what-is-llm&#34;&gt;What is LLM?&lt;/h1&gt;&#xA;&lt;p&gt;At its core, LLM is a &lt;strong&gt;next-token predictor&lt;/strong&gt;.&lt;/p&gt;&#xA;&lt;p&gt;Given a sequence of tokens, it predicts the most probable next token. By repeating this (autoregressive generation), it produces coherent text.&lt;/p&gt;&#xA;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;Input:  &amp;#34;The cat sat on the&amp;#34;&#xA;Model:  P(&amp;#34;mat&amp;#34;)=0.23, P(&amp;#34;floor&amp;#34;)=0.18, P(&amp;#34;roof&amp;#34;)=0.07, ...&#xA;Pick:   &amp;#34;mat&amp;#34;&#xA;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Here&amp;rsquo;s the high-level pipeline — every concept in this post maps to one of these stages:&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
