docs: update to 28 architectures (16 families), add GPT-2/Nemotron-H/MiniMax-M2

dndungu · dndungu · commit 1e0f8eafb864 · 2026-03-28T15:19:07.000-07:00
diff --git a/content/_index.html b/content/_index.html
@@ -487,7 +487,7 @@ <h3 style="font-size:1rem;font-weight:600;margin-bottom:16px">Performance journe
   <div class="wrap">
     <div class="section-head">
       <h2>Supported models</h2>
-      <p>24 architectures across 13 model families. Load any GGUF model from HuggingFace.</p>
+      <p>28 architectures across 16 model families. Load any GGUF model from HuggingFace.</p>
     </div>
     <div class="model-grid">
       <div class="model-card"><div class="name">Gemma 3</div><div class="status prod">Production</div></div>
diff --git a/content/docs/reference/migration-v1.md b/content/docs/reference/migration-v1.md
@@ -236,7 +236,7 @@ for usage of deprecated symbols.
 These are additive and do not require migration, but are worth knowing about:
 
 - **Architecture registry** -- `inference.RegisterArchitecture` / `inference.ListArchitectures` for pluggable model support.
-- **24 architectures (13 model families)** -- Llama 3, Gemma 3, Mistral, Qwen 2, Phi 3/4, DeepSeek V3, Falcon, Command R, Mixtral, RWKV, Jamba, Mamba 3, and more.
+- **28 architectures (16 model families)** -- Llama 3/4, Gemma 3/3n, Mistral, Qwen 2, Phi 3/4, DeepSeek V3, GPT-2, Nemotron-H, MiniMax M2, Falcon, Command R, Mixtral, RWKV, Jamba, Mamba 3, Whisper, and more.
 - **Speculative decoding** -- `inference.Model.SpeculativeGenerate` and `generate.WithSpeculativeDraft`.
 - **Paged KV cache** -- `generate.WithPagedKV` for memory-efficient serving.
 - **Prefix caching** -- `generate.WithPrefixCache` for shared system prompt reuse.