content/_index.html
29 additions & 11 deletions
@@ -386,7 +386,7 @@ <h3>Structured Output & Tools</h3>
 <div class="feat">
 <div class="icon">🧮</div>
 <h3>Type-Safe Generics</h3>
-<p>Go 1.25 generics throughout — <code>tensor.Numeric</code> constraint for compile-time type safety across float32, float16, bfloat16, float8, and quantized types.</p>
+<p>Go 1.26 generics throughout — <code>tensor.Numeric</code> constraint for compile-time type safety across float32, float16, bfloat16, float8, and quantized types.</p>
 </div>
 <div class="feat">
 <div class="icon">📊</div>
@@ -427,7 +427,7 @@ <h3>Advanced Serving</h3>
 <div class="wrap">
 <div class="section-head">
 <h2>Faster than Ollama</h2>
-<p>Benchmarked on NVIDIA DGX Spark (GB10), CUDA 13.0, Go 1.25. Gemma 3 1B Q4_K_M, 256 tokens.</p>
+<p>Benchmarked on NVIDIA DGX Spark (GB10), CUDA 13.0, Go 1.26. Gemma 3 1B Q4_K_M, 256 tokens.</p>
 </div>
 <div style="overflow-x:auto">
 <table class="bench-table">
@@ -490,14 +490,24 @@ <h2>Supported models</h2>
 <p>28 architectures across 16 model families. Load any GGUF model from HuggingFace.</p>
 <p style="color:var(--fg3);font-size:.875rem">Uses GGUF as the sole model format. Compatible with llama.cpp, Ollama, LM Studio, and GPT4All model files.</p>
@@ -529,10 +539,18 @@ <h2>CLI included</h2>
 <span class="cmt"># OpenAI-compatible API server</span>
 $ zerfoo serve gemma-3-1b-q4 --port 8080
 
-<span class="cmt"># Query with any OpenAI client</span>
+<span class="cmt"># QuaRot weight fusion for uniform 4-bit quantization</span>
+$ zerfoo run --quarot model.gguf
+
+<span class="cmt"># Train an EAGLE speculative decoding head</span>