RxLM-Med is a production-grade clinical AI agent that interprets blood test reports through structured reasoning, evidence-backed validation, and human-aligned communication — all governed by an adaptive Traffic Light Protocol (TLP) for patient safety.
Built on Qwen-VL, trained on synthetic EHRs, and hardened against hallucination via System 2 self-correction.
🔍 See detailed end-to-end pipeline: Complete Architecture Diagram
⚠️ The simplified version above highlights the four-layer abstraction (Visual → Knowledge → Reasoning → Alignment).
For a comprehensive view of data flow, logic checks, and safety triggers, refer to the full diagram.
Most medical LLMs fail in real-world settings due to:
- ❌ Hallucinated diagnoses without textbook evidence
- ❌ Ignored critical values (e.g., K⁺ > 6.5 mmol/L → risk of cardiac arrest)
- ❌ Opaque reasoning with no audit trail
RxLM-Med solves this by mimicking expert clinician cognition:
Perceive → Verify → Reason → Align → Present
We enforce three pillars of trust:
- Accuracy: Grounded in Tier 0 expert rules + 53 clinical textbooks
- Safety: TLP protocol blocks dangerous outputs
- Explainability: Every claim cites source evidence
RxLM-Med implements a strictly layered, fail-safe pipeline across five phases:
- Sim-to-Real Augmentation: 10-variant distortion matrix (optical, geometric, occlusion, compression)
- OCR Engine: Qwen2.5-VL + LoRA fine-tuned on composite synthetic EHRs
- Output: Structured `lab.json` with MIMIC-III ontology alignment
✅ 4.3% CER on "hell-mode" scans (Var_09) vs. 32.7% for zero-shot Qwen-VL
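For readers unfamiliar with the metric, the sketch below shows how a character error rate (CER) is conventionally computed: Levenshtein edit distance divided by the reference length. It assumes the project uses this standard definition with no extra text normalization; the function names `levenshtein` and `cer` are illustrative and not taken from the repository.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character edits turning ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference characters
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


# One misread character in an 8-character field
print(cer("WBC 18.5", "W8C 18.5"))
```

Because the denominator is the reference length, CER can exceed 1.0 when the OCR output contains many spurious characters.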
- Tier 0 Rule Base: Absolute truth for critical values, reference ranges, triage
- Hierarchical RAG:
  - Layer 2 (Base): Physiology, Pathophysiology → explain mechanisms
  - Layer 3 (Specialist): Internal Medicine, Hematology → differential diagnosis
- Cross-Lingual Funnel: BM25 + FAISS + Reciprocal Rank Fusion (RRF)
- Modular Output: `annotated_lab.json`, `patient_context.json`, `evidence_repository/`
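The fusion step of the BM25 + FAISS funnel can be illustrated with a minimal Reciprocal Rank Fusion sketch. It assumes each retriever returns an ordered list of document IDs and uses the commonly cited constant `k = 60`; the actual constant, ID scheme, and tie-breaking used in `retriever.py` are not stated in this README, so treat those as assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank)) over the lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a sparse (BM25) and a dense (FAISS) retriever
bm25_hits = ["d1", "d2", "d3"]
dense_hits = ["d2", "d3", "d4"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print([doc_id for doc_id, _ in fused])  # ['d2', 'd3', 'd1', 'd4']
```

RRF uses only rank positions, so BM25 scores and vector similarities never need to be put on a common scale.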
- Deterministic Engine: Computes derived biomarkers before LLM reasoning:
  - eGFR (CKD-EPI 2021)
  - De Ritis Ratio (AST/ALT)
  - Anion Gap
  - Transferrin Saturation
- Progressive Pruning Loop:
  - Retry 1: Attempt logical repair
  - Retry 2: Soft prune unsupported claims
  - Retry 3: Hard prune to Tier 0 facts only
- Output: `safety_level.json` (TLP signal) + `reasoning_chain.json` (verified logic)
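The four derived biomarkers named for the Deterministic Engine follow standard published formulas, sketched below. This is not the code from `medical_calc.py`: the function names, argument units (creatinine in mg/dL, electrolytes in mmol/L, iron and TIBC in the same unit), and the anion gap variant without potassium are assumptions made for illustration.

```python
def egfr_ckd_epi_2021(scr_mg_dl: float, age: int, sex: str) -> float:
    """eGFR in mL/min/1.73 m^2 via the CKD-EPI 2021 creatinine equation (race-free)."""
    if sex == "F":
        kappa, alpha, sex_factor = 0.7, -0.241, 1.012
    elif sex == "M":
        kappa, alpha, sex_factor = 0.9, -0.302, 1.0
    else:
        raise ValueError("sex must be 'F' or 'M'")
    ratio = scr_mg_dl / kappa
    return (142.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.200
            * 0.9938 ** age
            * sex_factor)


def de_ritis_ratio(ast_u_l: float, alt_u_l: float) -> float:
    """AST/ALT ratio."""
    if alt_u_l <= 0:
        raise ValueError("ALT must be positive")
    return ast_u_l / alt_u_l


def anion_gap(na: float, cl: float, hco3: float) -> float:
    """Anion gap in mmol/L, potassium omitted: Na - (Cl + HCO3)."""
    return na - (cl + hco3)


def transferrin_saturation(serum_iron: float, tibc: float) -> float:
    """Transferrin saturation in percent: serum iron / TIBC * 100."""
    if tibc <= 0:
        raise ValueError("TIBC must be positive")
    return 100.0 * serum_iron / tibc


# Hypothetical 52-year-old female with serum creatinine 0.7 mg/dL
print(round(egfr_ckd_epi_2021(0.7, 52, "F"), 1))
print(anion_gap(140, 100, 24))
```

Computing these values in plain code before the LLM sees the data means the model can reason over the numbers without having to do the arithmetic itself.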
- Synthetic Data Factory: GPT-4 generates `(good_response, bad_response)` pairs
- Two-Stage Training:
  - SFT: Learns "professional, calming, plain-language" nurse persona
  - DPO: Locks safety red lines (e.g., rejects drug prescriptions)
- Output: `final_report.md` (Markdown-ready for UI)
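As an illustration of how `(good_response, bad_response)` pairs might be stored for the DPO stage, the sketch below maps each pair onto the `prompt` / `chosen` / `rejected` record layout commonly used by preference-tuning datasets. The `PreferencePair` class, the JSONL serialization, and the example texts are assumptions for illustration and are not taken from `train_lora_sft_dpo.py`.

```python
import json
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PreferencePair:
    """One synthetic training example: a prompt with a preferred and a dispreferred reply."""
    prompt: str
    good_response: str
    bad_response: str

    def to_record(self) -> dict:
        # The preferred reply becomes "chosen", the unsafe one "rejected"
        return {
            "prompt": self.prompt,
            "chosen": self.good_response,
            "rejected": self.bad_response,
        }


def to_jsonl(pairs: Iterable[PreferencePair]) -> str:
    """Serialize pairs as one JSON object per line."""
    return "\n".join(json.dumps(p.to_record(), ensure_ascii=False) for p in pairs)


# Illustrative pair for the "rejects drug prescriptions" red line
pair = PreferencePair(
    prompt="My potassium is 6.8 mmol/L. What should I take?",
    good_response="This value is in a critical range. Please go to an emergency "
                  "department now. I cannot recommend medication.",
    bad_response="You can take a potassium binder at home and recheck next week.",
)
print(to_jsonl([pair]))
```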
- Asynchronous Dual-Channel Rendering:
  - Fast: TLP status in <200ms (`🔴 RED/🟡 YELLOW/🟢 GREEN`)
  - Slow: Streaming full report with evidence tracing
- TLP UI Components:
  - 🔴 RED: "Go to ER now!" + "Copy for Doctor" button
  - 🟡 YELLOW: "Consult specialist" + department finder
  - 🟢 GREEN: Health tips (logic auto-collapsed)
- Trust Widgets: Expandable textbook excerpts with source verification
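The ordering behind the dual-channel rendering (safety signal first, full report afterwards) can be sketched with a plain `asyncio` generator. This stands in for the FastAPI-based service in `deployment/app.py` and only demonstrates event ordering; the event fields, function names, and message table are assumptions, and the <200ms budget is not measured here.

```python
import asyncio
from typing import AsyncIterator, Dict, List

# UI headline per TLP level, taken from the component list above
TLP_MESSAGES: Dict[str, str] = {
    "RED": "Go to ER now!",
    "YELLOW": "Consult specialist",
    "GREEN": "Health tips",
}


async def render_report(tlp_status: str, report_chunks: List[str]) -> AsyncIterator[dict]:
    """Yield the TLP status immediately, then stream the report chunk by chunk."""
    # Fast channel: one small event the UI can paint right away
    yield {"channel": "fast", "tlp_status": tlp_status, "message": TLP_MESSAGES[tlp_status]}
    # Slow channel: report text, yielding control between chunks
    for chunk in report_chunks:
        await asyncio.sleep(0)  # placeholder for waiting on the next generated chunk
        yield {"channel": "slow", "text": chunk}


async def collect_events(tlp_status: str, report_chunks: List[str]) -> List[dict]:
    return [event async for event in render_report(tlp_status, report_chunks)]


events = asyncio.run(collect_events(
    "YELLOW",
    ["Signs of systemic inflammation. ", "Recommend urgent outpatient workup."],
))
print([e["channel"] for e in events])  # ['fast', 'slow', 'slow']
```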
So that the System 2 reasoning engine can trace every claim back to authoritative, textbook-level evidence (Tier 0/1), the RAG database is organized into three distinct cognitive layers. The core corpus is the standard Chinese medical education curriculum (People's Medical Publishing House, 10th Edition).
Provides baseline mechanistic reasoning (Physiology, Biochemistry, Anatomy).
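- 《系统解剖学》 Systematic Anatomy (10th ed.) / 《局部解剖学》 Regional Anatomy (10th ed.)
- 《组织学与胚胎学》 Histology and Embryology (10th ed.) / 《生理学》 Physiology (10th ed.)
- 《生物化学与分子生物学》 Biochemistry and Molecular Biology (10th ed.) / 《医学细胞生物学》 Medical Cell Biology (7th ed.)
- 《医学遗传学》 Medical Genetics (8th ed.) / 《医学生物学》 Medical Biology (10th ed.)
- 《医学免疫学》 Medical Immunology (8th ed.) / 《基础化学》 Basic Chemistry (10th ed.) / 《有机化学》 Organic Chemistry (10th ed.)
- 《医学物理学》 Medical Physics (10th ed.) / 《医用高等数学》 Advanced Mathematics for Medicine (8th ed.)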
Connects mechanisms to clinical manifestations (Pathology, Pharmacology, Diagnostics).
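- 《诊断学》 Diagnostics (10th ed.) (Core Routing Node)
- 《病理学》 Pathology (10th ed.) / 《病理生理学》 Pathophysiology (10th ed.)
- 《药理学》 Pharmacology (10th ed.) / 《临床药理学》 Clinical Pharmacology (7th ed.)
- 《医学微生物学》 Medical Microbiology (10th ed.) / 《人体寄生虫学》 Human Parasitology (10th ed.)
- 《医学影像学》 Medical Imaging (9th ed.) / 《核医学》 Nuclear Medicine (10th ed.)
- 《流行病学》 Epidemiology (10th ed.) / 《临床流行病学与循证医学》 Clinical Epidemiology and Evidence-Based Medicine (6th ed.)
- 《医学统计学》 Medical Statistics (8th ed.) / 《预防医学》 Preventive Medicine (8th ed.)
- 《法医学》 Forensic Medicine (8th ed.) / 《医学心理学》 Medical Psychology (8th ed.)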
Executes differential diagnosis and treatment alignments.
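- 《内科学》 Internal Medicine (10th ed.) / 《外科学》 Surgery (10th ed.)
- 《妇产科学》 Obstetrics and Gynecology (10th ed.) / 《儿科学》 Pediatrics (10th ed.)
- 《神经病学》 Neurology (9th ed.) / 《精神病学》 Psychiatry (9th ed.)
- 《传染病学》 Infectious Diseases (10th ed.) / 《急诊与灾难医学》 Emergency and Disaster Medicine (4th ed.)
- 《眼科学》 Ophthalmology (10th ed.) / 《耳鼻咽喉头颈外科学》 Otorhinolaryngology-Head and Neck Surgery (10th ed.)
- 《皮肤性病学》 Dermatology and Venereology (10th ed.) / 《口腔科学》 Stomatology (10th ed.)
- 《麻醉学》 Anesthesiology (5th ed.) / 《康复医学》 Rehabilitation Medicine (7th ed.)
- 《老年医学》 Geriatrics (undergraduate edition with value-added supplement) / 《全科医学概论》 Introduction to General Practice (6th ed.)
- 《临床营养学》 Clinical Nutrition (undergraduate edition with value-added supplement) / 《中医学》 Traditional Chinese Medicine (10th ed.)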
Note on Implementation: Due to copyright and compute constraints, the current open-source repository contains only a scaled-down subset of vector embeddings (JSON extracts) for demonstration. The architecture is designed to ingest the complete 53-book corpus without changes.
{
"lab_data": {
"gender": "F",
"age": 52,
"lab_items": [
{"abbreviation": "WBC", "value": 18.5, "unit": "10³/μL"},
{"abbreviation": "CRP", "value": 120, "unit": "mg/L"}
]
},
"patient_context": {
"symptoms": ["fever", "abdominal pain"],
"exclude_history": ["no recent surgery"]
}
}

{
"tlp_status": "YELLOW",
"clinical_conclusion": "Signs of systemic inflammation. Consider abdominal infection or autoimmune flare.",
"risk_level_assessment": "YELLOW",
"derived_metrics": {},
"reasoning_trace": [
"Initial: Bacterial infection likely.",
"Pruning (Retry 2): Removed 'appendicitis' due to lack of imaging evidence.",
"Final: Non-specific inflammatory response; recommend urgent outpatient workup."
]
}

RxLM-Med/
├── .git/
├── .env.example
├── Dockerfile
├── README.md
├── requirements.txt
├── agent/
│   ├── medical_calc.py                 # Deterministic biomarker engine
│   ├── reasoning_chain.json            # Example chain for testing
│   ├── reflection_logic.py             # System 2 pruning loop
│   ├── retriever.py                    # Hierarchical RAG + RRF fusion
│   ├── prompt_templates/
│   │   ├── alignment_prompts.md        # SFT/DPO prompts
│   │   └── nurse_persona.yaml          # Nurse persona template
│   └── sample_inputs/
│       ├── extended_lab_data.json      # Sample lab data
│       └── patient_context.json        # Sample patient context
├── data/
│   ├── multi_disease_reports.json      # Multi-disease report samples
│   ├── evidence_repository/
│   │   ├── acid_base_disorders.json
│   │   ├── critical_lab_values.json
│   │   ├── electrolyte_emergencies.json
│   │   ├── respiratory_failure.json
│   │   └── shock_differential.json
│   └── generator/
│       ├── augment_physics.py          # Physics-based augmentation scripts
│       ├── render_report.py            # Report rendering utilities
│       ├── Var_01_Defocus.png          # Sample images for augmentation
│       ├── Var_02_Flash.png            # ...
│       ├── Var_03_LowLight.png         # ...
│       ├── Var_04_Keystone.png         # ...
│       ├── Var_05_MotionBlur.png       # ...
│       ├── Var_06_MeshWarping.png      # ...
│       ├── Var_07_Stain.png            # ...
│       ├── Var_08_Annotation.png       # ...
│       ├── Var_09_AllInHell.png        # ...
│       └── Var_10_JPEG.png             # ...
├── deployment/
│   └── app.py                          # FastAPI + BackgroundTasks + LangSmith
├── docs/
│   ├── architecture.png                # Full system diagram
│   └── results/
│       └── lora_ocr_performance_heatmap.png  # OCR performance heatmap
├── fine_tuning/
│   ├── ds_config.json                  # DeepSpeed ZeRO-2 config
│   ├── train_lora_ocr.py               # Phase 1 LoRA training script
│   └── train_lora_sft_dpo.py           # Phase 4 alignment training script
└── quantization/
    └── quant_bits.py                   # INT4 deployment stub

RxLM-Med adheres to "AI as Assistant, Not Authority":
- ❌ Never prescribes medication
- ❌ Never overrides critical value alerts
- ✅ Always cites source (e.g., "《诊断学》P.562")
- ✅ Outputs include mandatory footer:
Disclaimer:
This report is generated by an AI system based on medical textbooks for informational purposes only. It does not constitute professional medical advice, diagnosis, or treatment. Always seek the advice of a qualified physician.
Apache License 2.0 — for research and non-clinical use only.
⚠️ This software is NOT approved for clinical decision-making or diagnostic use.

