Spec-driven DOCX template filling for institution-issued forms.
Keep the original Word layout, fill only the intended content, and verify the output.
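A DOCX filling framework for Word templates supplied by an organization or school.
The goal is to change only the content without breaking the original template formatting, and to keep the result analyzable, verifiable, and reusable.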
Many real-world Word automation tasks are not “generate a new document from scratch”. They start from a form that an institution has already issued, such as:
- a school training report
- a hospital internship form
- a company internal form
- a government-style submission template
In these cases, users usually need three things at the same time:
- preserve the original template layout
- fill content according to intent
- keep the process explainable and verifiable
This project focuses on that workflow.
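Put differently, many real scenarios are not about generating a Word document from scratch. They involve:
- school training reports
- hospital internship forms
- fixed internal company templates
- formatted submission materials for government bodies or institutions
These scenarios usually require all of the following at once:
- keeping the original template formatting
- filling content according to intent
- keeping the whole process explainable and verifiable
This project is designed for exactly that.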
- Fill DOCX templates from template_spec.json
- Preserve formatting by cloning paragraph styles from the template or a reference document
- Support multiple locator strategies: paraId, text_anchor, bookmark, content_control, xpath
- Separate analysis, filling, and verification
- Support blank template + reference template workflows for long-form sections
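
As a rough illustration of what a locator strategy does, the sketch below resolves three of the listed strategies (paraId, text_anchor, bookmark) against a small hand-written WordprocessingML fragment, using only the standard library. The function name `find_paragraph`, the sample XML, and the dispatch logic are assumptions made for this example and are not the project's actual implementation. The content_control and xpath strategies are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Standard WordprocessingML namespaces (w14 carries the paraId attribute).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W14 = "http://schemas.microsoft.com/office/word/2010/wordml"

# Hypothetical document body, written by hand for this example.
SAMPLE_BODY = f"""
<w:body xmlns:w="{W}" xmlns:w14="{W14}">
  <w:p w14:paraId="1A2B3C4D"><w:r><w:t>Project title:</w:t></w:r></w:p>
  <w:p w14:paraId="5E6F7A8B">
    <w:bookmarkStart w:id="0" w:name="summary"/>
    <w:r><w:t>Summary</w:t></w:r>
    <w:bookmarkEnd w:id="0"/>
  </w:p>
</w:body>
""".strip()


def paragraph_text(p):
    """Concatenate the text of every w:t element inside a paragraph."""
    return "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))


def find_paragraph(body, strategy, value):
    """Return the first paragraph matching the locator, or None."""
    for p in body.iter(f"{{{W}}}p"):
        if strategy == "paraId" and p.get(f"{{{W14}}}paraId") == value:
            return p
        if strategy == "text_anchor" and value in paragraph_text(p):
            return p
        if strategy == "bookmark":
            names = [b.get(f"{{{W}}}name") for b in p.iter(f"{{{W}}}bookmarkStart")]
            if value in names:
                return p
    return None


body = ET.fromstring(SAMPLE_BODY)
title = find_paragraph(body, "text_anchor", "Project title")
summary = find_paragraph(body, "bookmark", "summary")
```

Supporting several strategies matters because institution templates differ: some carry stable bookmarks, while others can only be addressed by visible label text or by the paragraph IDs that Word writes.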
Long fields in real Word forms often do not exist as ready-made paragraphs in the blank template. Instead, the blank template only contains a heading or a placeholder, while the filled version shows:
- how many paragraphs the section should use
- what indentation and spacing look like
- which run style should be cloned
That is why this project supports a reference_source in addition to template_source.
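In real Word templates, a long text field often does not exist in the blank template as a ready-made group of paragraphs. The blank template usually holds only a heading or a single placeholder paragraph, and the actual paragraph count, indentation, line spacing, and run style can only be read from a reference sample.
So the framework takes both:
- template_source as the output base
- reference_source as the reference for styles and long-field structure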
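
The idea of cloning style from a reference can be sketched with the standard library alone. The example below copies the paragraph properties (w:pPr) and run properties (w:rPr) of one reference paragraph onto new paragraphs, one per line of input text. The helper names `clone_paragraph` and `render_long_field`, the sample XML, and the one-paragraph-per-line rule are all assumptions for illustration. The project's own rendering logic may differ.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)

# Hypothetical paragraph taken from a filled reference document.
REFERENCE_PARAGRAPH = f"""
<w:p xmlns:w="{W}">
  <w:pPr><w:ind w:firstLine="420"/><w:spacing w:line="360"/></w:pPr>
  <w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:t>Sample text from the reference.</w:t></w:r>
</w:p>
""".strip()


def clone_paragraph(reference, text):
    """Build a new paragraph that reuses the reference's pPr and rPr."""
    new_p = ET.Element(f"{{{W}}}p")
    ppr = reference.find(f"{{{W}}}pPr")
    if ppr is not None:
        new_p.append(copy.deepcopy(ppr))  # indentation, spacing, alignment
    run = ET.SubElement(new_p, f"{{{W}}}r")
    rpr = reference.find(f"{{{W}}}r/{{{W}}}rPr")
    if rpr is not None:
        run.append(copy.deepcopy(rpr))  # font size and other run formatting
    ET.SubElement(run, f"{{{W}}}t").text = text
    return new_p


def render_long_field(reference, text):
    """Emit one styled paragraph per non-empty line of the input text."""
    return [clone_paragraph(reference, part) for part in text.split("\n") if part.strip()]


reference = ET.fromstring(REFERENCE_PARAGRAPH)
paragraphs = render_long_field(reference, "First paragraph.\nSecond paragraph.")
```

Deep-copying the property elements, rather than reusing them, keeps the reference tree untouched so the same reference paragraph can style any number of output paragraphs.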
pip install -e .[dev]

docx-template-analyze --spec examples/minimal_demo/template_spec.json --output analysis.json
docx-template-fill examples/minimal_demo/template_spec.json examples/minimal_demo/fill_input.json --output output.docx --report render-report.json
docx-template-verify examples/minimal_demo/template_spec.json --output-docx output.docx --render-report render-report.json

docx-template-fill examples/minimal_demo/template_spec.json examples/minimal_demo/fill_input.json --output examples/minimal_demo/output.docx --report examples/minimal_demo/render-report.json
docx-template-verify examples/minimal_demo/template_spec.json --output-docx examples/minimal_demo/output.docx --render-report examples/minimal_demo/render-report.json

python examples/huashang_demo/resolve_and_fill.py HV-2026-017 "Digital Vital Sign Observation" "Clinical Practice Design" --output examples/huashang_demo/output.docx

Or fill from explicit JSON:

docx-template-fill examples/huashang_demo/template_spec.json examples/huashang_demo/fill_input.json --output examples/huashang_demo/output.docx --report examples/huashang_demo/render-report.json

The main Python entry points are:
- docx_template_fill.analyze_template
- docx_template_fill.fill_template
- docx_template_fill.verify_spec
- docx_template_fill.verify_rendered_output
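
The signatures of these functions are not documented in this README, so the sketch below does not call them. It drives the documented fill and verify commands from a script instead, which may be useful for batch jobs. The helper names `build_pipeline` and `run_pipeline` are hypothetical; the command names and flags are the ones shown in the commands above.

```python
import subprocess


def build_pipeline(spec, fill_input, output_docx, report):
    """Return the fill and verify commands as argv lists, in run order."""
    return [
        ["docx-template-fill", spec, fill_input,
         "--output", output_docx, "--report", report],
        ["docx-template-verify", spec,
         "--output-docx", output_docx, "--render-report", report],
    ]


def run_pipeline(commands):
    """Run each command in turn; stop with an exception if one fails."""
    for argv in commands:
        subprocess.run(argv, check=True)


commands = build_pipeline(
    "examples/minimal_demo/template_spec.json",
    "examples/minimal_demo/fill_input.json",
    "output.docx",
    "render-report.json",
)
# run_pipeline(commands)  # uncomment once the package is installed
```

Passing `check=True` means verification only runs if filling exited successfully, which mirrors the separation between filling and verification described earlier.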
See:
src/docx_template_fill/ # core package
tests/ # pytest suite
docs/ # public docs
examples/minimal_demo/ # smallest runnable demo
examples/huashang_demo/ # sanitized school-style demo
scripts/generate_demo_assets.py
- no real school templates
- no real student or teacher data
- no real internal spreadsheets
- no claim of full DOCX editing coverage
This repository is meant to open-source the framework, not any private institution materials.
v0.1.0 intentionally focuses on:
- spec-driven field mapping
- style-preserving paragraph rendering
- output verification
- reusable demos
It does not promise full support for every DOCX feature or every possible Word control pattern yet.