Run CV experiments reproducibly across Colab, Kaggle, browser workflows, and GPU VMs, then decide whether a result is strong enough to promote.
CV Repro Lab Skills packages two public OpenClaw/Codex skills:
- data-science-cv-repro-lab: run experiments, capture browser and notebook evidence, and bundle results for review
- sota-agent: define the benchmark, rank candidates, and decide whether a claimed gain is real
Use both together when you want one planning lane and one execution lane:
- sota-agent freezes the benchmark, candidate list, rerun policy, and claim rules before more compute gets spent
- data-science-cv-repro-lab executes runs across Colab, Kaggle, browser-heavy workflows, or VMs and captures the evidence needed to promote or reject the result
Use data-science-cv-repro-lab when you need:
- Colab, Kaggle, or VM execution discipline
- browser evidence and validation scorecards
- dataset manifests, run cards, and promotion bundles
- reproducible artifact capture for a real training or export lane
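The run cards and dataset manifests above can be sketched as a small machine-readable record. This is a minimal illustration, not the skill's actual schema; every field name here is an assumption:

```python
import hashlib
import json

def make_run_card(run_id, platform, dataset_files, metrics):
    """Build a minimal, hypothetical run card: enough to re-identify
    the run, its execution platform, and the data it saw."""
    manifest = {
        # Real tooling would fingerprint file *contents*; hashing the
        # path string here is a stand-in so the sketch is self-contained.
        path: hashlib.sha256(path.encode()).hexdigest()[:12]
        for path in dataset_files
    }
    return {
        "run_id": run_id,
        "platform": platform,          # e.g. "colab", "kaggle", "vm"
        "dataset_manifest": manifest,  # path -> content fingerprint
        "metrics": metrics,            # e.g. {"dice": 0.87}
    }

card = make_run_card("demo-001", "colab", ["data/train.csv"], {"dice": 0.87})
print(json.dumps(card, indent=2))
```

A promotion bundle is then just a collection of such cards plus the evidence files they point at.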
Use sota-agent when you need:
- a fixed benchmark before spending more compute
- literature triage and candidate ranking
- ablation discipline and rerun policy for small deltas
- an honest claim decision instead of benchmark theater
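The rerun policy and claim decision above can be sketched as a small gate: promote only when the mean gain over reruns clears both a minimum delta and the rerun noise band. The thresholds and function shape are illustrative assumptions, not the skill's actual logic:

```python
from statistics import mean, stdev

def promote(baseline, reruns, min_delta=0.005, noise_sigmas=2.0):
    """Return True only if the candidate's mean score beats the
    baseline by more than min_delta AND by more than
    noise_sigmas times the rerun standard deviation."""
    gain = mean(reruns) - baseline
    # With a single run there is no noise estimate, so never promote.
    noise = stdev(reruns) if len(reruns) > 1 else float("inf")
    return gain > max(min_delta, noise_sigmas * noise)

print(promote(0.850, [0.861, 0.860, 0.862]))  # tight reruns: promoted
print(promote(0.850, [0.900, 0.820, 0.855]))  # noisy reruns: rejected
```

The point of the gate is exactly the "benchmark theater" failure mode: a single lucky rerun with a large delta still fails, because the noise band scales with rerun variance.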
Install from ClawHub or copy the skill folders into $CODEX_HOME/skills/.
```shell
mkdir -p "$CODEX_HOME/skills"
rsync -a skill/data-science-cv-repro-lab/ "$CODEX_HOME/skills/data-science-cv-repro-lab/"
rsync -a skill/sota-agent/ "$CODEX_HOME/skills/sota-agent/"
```

- CV Repro Lab on ClawHub (v1.9.1)
- SOTA Agent on ClawHub (v1.4.1)
- Portfolio entry
- added an explicit improvement harness for plateau recovery and score-improvement work
- added a review-dashboard manifest for synced QA runs, benchmark panels, runtime sweeps, and audit surfaces
- expanded run cards and candidate/program records with reruns, slices, agent threads, and auth policy
- added explicit dashboard, source-audit, and leakage-audit references to the SOTA claim surface
- added redacted public summary rendering for the richer machine-readable records
- made OAuth-backed ChatGPT/Codex paths the default public story instead of API-key-first tooling
These skills are strongest when the user already has a real CV or DS workflow and wants a drop-in research harness around it. Good fits include:
- derm or segmentation plateau recovery
- browser-heavy notebook workflows
- benchmark campaigns that need stronger promotion gates
- public or reusable experiment-management patterns across repos
ClawHub is public. Keep the published skill bundles free of:
- absolute local paths
- browser profile names
- private notebook URLs
- secrets, tokens, and customer identifiers
- internal hostnames or VM labels
Private specializations should stay in local override skills, not in the public package.
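The hygiene rules above can be enforced mechanically before publishing. This is a minimal sketch; the regex patterns and file walk are assumptions to illustrate the idea, not part of the skill:

```python
import re
from pathlib import Path

# Illustrative leak patterns; extend for your own hostnames and tokens.
LEAK_PATTERNS = {
    "absolute local path": re.compile(r"(?:/Users/|/home/|C:\\\\)"),
    "token-like secret": re.compile(r"(?:sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{20,})"),
    "private notebook URL": re.compile(r"colab\.research\.google\.com/drive/"),
}

def audit_bundle(root):
    """Scan every file under root; return (file, rule) pairs that hit."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for rule, pattern in LEAK_PATTERNS.items():
            if pattern.search(text):
                hits.append((str(path), rule))
    return hits
```

Running `audit_bundle("skill/")` and failing the publish step when the list is non-empty keeps the check honest without relying on manual review.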
```shell
python3 -m py_compile \
  skill/data-science-cv-repro-lab/scripts/*.py \
  skill/sota-agent/scripts/*.py

python3 skill/data-science-cv-repro-lab/scripts/init_cv_improvement_harness.py \
  --out /tmp/cv-harness.json \
  --task-id demo \
  --candidate-family baseline-recovery

python3 skill/data-science-cv-repro-lab/scripts/init_cv_review_dashboard_manifest.py \
  --out /tmp/cv-dashboard.json \
  --dashboard-id demo-dashboard \
  --title "Demo review dashboard"

python3 skill/sota-agent/scripts/init_sota_program.py \
  --out /tmp/sota-program.json \
  --campaign-id demo \
  --task demo \
  --metric score
```

The public bundle is informed by: