Skip to content

Latest commit

 

History

History
68 lines (50 loc) · 2.21 KB

File metadata and controls

68 lines (50 loc) · 2.21 KB

Real-World File Benchmark Summary

47 files, 7.2MB total - Downloaded from GitHub and generated from realistic templates

Results by File Type

Where mzip WINS (8 files)

File Size mzip Comp. Best Winner
apache_log_sample.log 2.3MB 21.39x zstd 18.71x mzip +14%
events.csv 592KB 8.53x zstd 5.71x mzip +49%
app.log 475KB 7.72x bzip2 7.70x mzip +0.3%
lodash.js 545KB 7.69x bzip2 7.69x mzip (tie)
dashboard.html 44KB 34.04x xz 31.46x mzip +8%
handlers.ts 25KB 17.16x zstd 17.16x mzip (tie)
metrics.prom 180KB 8.34x zstd 8.34x mzip (tie)
nginx_access.log 427KB 11.88x bzip2 11.95x bzip2 +0.6%

Where brotli WINS (24 files)

Brotli's 120KB built-in dictionary is optimized for code patterns.

File Type Gap vs mzip
Small code (<30KB) 10-24% smaller
Config files 10-20% smaller
Markdown 17-24% smaller
XML/JSON 13-15% smaller

Where bzip2 WINS (12 files)

File Type Gap vs mzip
Large source code (50-300KB) 0-1% smaller
Python, Go, Java, C code 0.5-1% smaller

Where xz WINS (2 files)

File Gap vs mzip
bootstrap.css 1% smaller
zig_std.zig 1% smaller

Summary

Category mzip Wins Win Rate
Logs (structured) 3/3 100%
CSV/Columnar 1/1 100%
Template-heavy HTML 1/1 100%
Large JS (>500KB) 1/1 100%
Template-heavy TS 1/1 100%
Source code (general) 1/35 3%
Config files 0/10 0%
TOTAL 8/47 17%

Key Insight

mzip excels on structured/templated data (logs, CSV, repeated patterns), achieving 14-49% better compression.

mzip is competitive on source code (within 1-8% of best), but brotli's built-in dictionary wins on small files.

Recommendations

  1. Use mzip for: Logs, CSV, structured data, large files with repetitive patterns
  2. Use brotli for: Small code files, config files, markdown
  3. mzip never catastrophically loses - worst case is ~20% behind brotli on small files