| 1 | +--- |
| 2 | +name: verify |
| 3 | +description: >- |
| 4 | + Orchestrate repo-level verification of a PR by pushing changes, then polling |
| 5 | + CI checks, PR review, and QA workflows until all pass — or until issues are |
| 6 | + found that need fixing. The agent reads feedback, fixes code, pushes again, |
| 7 | + and repeats. Uses only standard `gh` CLI commands that work on any GitHub repo. |
| 8 | +triggers: |
| 9 | +- /verify |
| 10 | +--- |
| 11 | + |
| 12 | +# /verify — Repo-Level Verification via Polling |
| 13 | + |
| 14 | +Orchestrate the push → poll → fix → push loop for a pull request. |
| 15 | +You poll the repo's verifiers with `gh` CLI, read feedback, fix issues, and iterate. |
| 16 | +No scripts — you are the orchestration loop. |
| 17 | + |
| 18 | +Requires: an authenticated `gh` CLI with repo access and a checked-out PR branch.
| 19 | + |
| 20 | +## Discover what the repo has |
| 21 | + |
| 22 | +Not every repo has all three verification layers. Before entering the loop, |
| 23 | +check which ones exist. Only poll layers that are actually set up. |
| 24 | + |
| 25 | +```bash |
| 26 | +gh workflow list --json name --jq '.[].name' |
| 27 | +``` |
| 28 | + |
| 29 | +- **CI checks** — almost every repo has these. If `gh pr checks` returns results, CI is present. |
| 30 | +- **PR review bot** — look for a workflow named like "PR Review" or "pr-review" in the output above, or check for `.github/workflows/pr-review*.yml` in the repo. If it's not there, the repo doesn't have automated PR review. Skip step 3 entirely. |
| 31 | +- **QA bot** — look for a workflow named like "QA" or "qa-changes". If it's not there, the repo doesn't have automated QA. Skip step 4 entirely. |
| 32 | + |
| 33 | +A repo might have only CI. Or CI + review. Or all three. Your "all passed" |
| 34 | +condition is: every *present* layer is green. Don't block waiting for layers |
| 35 | +that don't exist. |
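
The discovery rule above can be sketched as a small filter over the workflow names. This is only an illustration: `detect_layers` is a hypothetical helper, and the name patterns are assumptions based on the examples given ("PR Review", "pr-review", "QA", "qa-changes"); a repo may name its workflows differently.

```bash
# Hypothetical helper: read workflow names (one per line) on stdin and
# report which optional layers appear to be present.
# The patterns are assumptions; adjust them to the repo's naming.
detect_layers() {
  local names review=absent qa=absent
  names=$(cat)
  # "PR Review", "pr-review", "pr_review", "PRReview"
  if printf '%s\n' "$names" | grep -Eiq 'pr[ _-]?review'; then
    review=present
  fi
  # "QA" or "qa-changes", but not words that merely contain "qa"
  if printf '%s\n' "$names" | grep -Eiq '(^|[^[:alnum:]])qa([^[:alnum:]]|$)'; then
    qa=present
  fi
  printf 'review=%s qa=%s\n' "$review" "$qa"
}

# Usage (needs gh and a repo checkout):
#   gh workflow list --json name --jq '.[].name' | detect_layers
```

CI is deliberately left out of the helper, since its presence is determined by whether `gh pr checks` returns results rather than by workflow names.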
| 36 | + |
| 37 | +## The loop |
| 38 | + |
| 39 | +1. Push and ensure PR exists. |
| 40 | +2. Poll each present verification layer. |
| 41 | +3. Decide: all passed? fix needed? wait? |
| 42 | +4. If fix needed — fix, commit, push, re-request review from bots, go to 2. |
| 43 | +5. If waiting — sleep 30-60s, go to 2. |
| 44 | +6. If all present layers passed on the *current* SHA — done. |
| 45 | + |
| 46 | +IMPORTANT: pushing a fix is NOT the end. After every fix+push you MUST |
| 47 | +re-request review from the review bot (if present) and go back to step 2. |
| 48 | +The loop only ends when the verifiers pass on your latest SHA. Addressing |
| 49 | +feedback and pushing a commit is just one iteration — the bot needs to |
| 50 | +review the new code too. |
| 51 | + |
| 52 | +## Step 1 — Push and ensure PR exists |
| 53 | + |
| 54 | +```bash |
| 55 | +git push origin HEAD |
| 56 | +gh pr view >/dev/null 2>&1 || gh pr create --fill   # create only if no PR exists; real errors stay visible
| 57 | +gh pr view --json number,url,headRefOid --jq '"\(.number) \(.url) \(.headRefOid)"' |
| 58 | +``` |
| 59 | + |
| 60 | +## Step 2 — Poll CI checks |
| 61 | + |
| 62 | +```bash |
| 63 | +gh pr checks --json name,state,bucket --jq ' |
| 64 | + { passed: [.[] | select(.bucket=="pass")] | length, |
| 65 | + failed: [.[] | select(.bucket=="fail" or .bucket=="cancel")] | length,
| 66 | + pending: [.[] | select(.bucket=="pending")] | length }' |
| 67 | +``` |
| 68 | + |
| 69 | +- Zero failed, zero pending → CI green. |
| 70 | +- Any pending → wait and re-poll. |
| 71 | +- Any failed → diagnose (see "CI failure classification" below). |
| 72 | + |
| 73 | +To inspect a failure: |
| 74 | + |
| 75 | +```bash |
| 76 | +SHA=$(gh pr view --json headRefOid --jq .headRefOid) |
| 77 | +gh run list --commit "$SHA" --status failure --json databaseId,name,conclusion \ |
| 78 | + --jq '.[] | "\(.databaseId)\t\(.name)\t\(.conclusion)"' |
| 79 | +gh run view <run-id> --log-failed |
| 80 | +``` |
| 81 | + |
| 82 | +## Step 3 — Poll PR review (if present) |
| 83 | + |
| 84 | +Skip this step if the repo has no review bot. |
| 85 | + |
| 86 | +```bash |
| 87 | +gh pr view --json reviews --jq ' |
| 88 | + [.reviews[] | select( |
| 89 | + .authorAssociation == "OWNER" or |
| 90 | + .authorAssociation == "MEMBER" or |
| 91 | + .authorAssociation == "COLLABORATOR" or |
| 92 | + (.author.login | test("openhands|all-hands-bot"; "i")) |
| 93 | + )] | last | { state: .state, reviewer: .author.login, body: .body[0:300] }' |
| 94 | +``` |
| 95 | + |
| 96 | +- `APPROVED` → review passed. |
| 97 | +- `CHANGES_REQUESTED` → read the body and inline comments, fix code. |
| 98 | +- `COMMENTED` → may have actionable suggestions; read and decide. |
| 99 | +- No matching review yet → bot may still be running; wait and re-poll. |
| 100 | + |
| 101 | +Inline review comments (when changes requested): |
| 102 | + |
| 103 | +```bash |
| 104 | +gh api "repos/{owner}/{repo}/pulls/$(gh pr view --json number --jq .number)/comments" \
| 105 | + --jq '.[] | select(.user.login | test("openhands|all-hands-bot"; "i")) |
| 106 | + | { path: .path, line: .line, body: .body[0:200] }' |
| 107 | +``` |
| 108 | + |
| 109 | +## Step 4 — Poll QA report (if present) |
| 110 | + |
| 111 | +Skip this step if the repo has no QA bot. |
| 112 | + |
| 113 | +QA reports are PR issue comments with a status line like `Status: PASS`. |
| 114 | + |
| 115 | +```bash |
| 116 | +gh api "repos/{owner}/{repo}/issues/$(gh pr view --json number --jq .number)/comments" --paginate \
| 117 | + --jq '[.[] | select( |
| 118 | + (.user.login | test("openhands|all-hands-bot"; "i")) and |
| 119 | + (.body | test("Status:\\s*(PASS|FAIL|PARTIAL)"; "i")) |
| 120 | + )] | last | { author: .user.login, body: .body[0:500], url: .html_url }' |
| 121 | +``` |
| 122 | + |
| 123 | +- `PASS` → QA passed. |
| 124 | +- `FAIL` → read details, fix code. |
| 125 | +- `PARTIAL` → some passed, some failed; read details. |
| 126 | +- No QA comment yet → bot may still be running; wait and re-poll. |
| 127 | + |
| 128 | +## Step 5 — Decide and act |
| 129 | + |
| 130 | +For each present layer, check its status. If a layer is not present in the |
| 131 | +repo, treat it as passing. |
| 132 | + |
| 133 | +- All present layers green on current SHA → done. |
| 134 | +- CI failed → fix code, or rerun if flaky (see below). |
| 135 | +- Review requested changes → read comments, fix, push. |
| 136 | +- QA failed/partial → read report, fix, push. |
| 137 | +- Anything still pending → sleep 30-60s, re-poll. |
| 138 | +- PR closed/merged → stop. |
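
The decision list above can be condensed into one function over per-layer statuses. This is a minimal sketch, not part of the skill: `decide` and its status vocabulary (`pass`, `fail`, `pending`, `absent`) are hypothetical names chosen for illustration, and the closed/merged check is assumed to happen separately.

```bash
# Hypothetical helper: reduce per-layer statuses to a single action.
# Each argument is one of: pass, fail, pending, absent.
# An absent layer counts as passing, matching the rule above.
decide() {
  local status action=done
  for status in "$@"; do
    case "$status" in
      fail)        echo fix; return 0 ;;   # any failure wins: fix (or rerun if flaky)
      pending)     action=wait ;;          # keep scanning in case a later layer failed
      pass|absent) ;;
      *) echo "unknown status: $status" >&2; return 2 ;;
    esac
  done
  echo "$action"
}

# Example: CI green, review still running, no QA bot.
decide pass pending absent   # prints "wait"
```

Failures take priority over pending layers so that a known problem is fixed right away instead of waiting for the remaining bots to finish.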
| 139 | + |
| 140 | +After fixing, commit, push, AND re-request review: |
| 141 | + |
| 142 | +```bash |
| 143 | +git add -A |
| 144 | +git commit -m "fix: address <CI failure | review feedback | QA failure>" |
| 145 | +git push origin HEAD |
| 146 | + |
| 147 | +# Re-request review from the bot so it reviews the new SHA: |
| 148 | +gh pr comment --body "Addressed feedback in $(git rev-parse --short HEAD). Ready for another look." |
| 149 | +gh api -X POST "repos/{owner}/{repo}/pulls/$(gh pr view --json number --jq .number)/requested_reviewers" \
| 150 | + -f 'reviewers[]=all-hands-bot' |
| 151 | +``` |
| 152 | + |
| 153 | +Then go back to step 2. You are not done until the bot reviews the new |
| 154 | +SHA and all present layers pass. |
| 155 | + |
| 156 | +## CI failure classification |
| 157 | + |
| 158 | +Branch-related (fix the code): |
| 159 | +- Compile/lint/typecheck failures in files you touched |
| 160 | +- Deterministic test failures in changed areas |
| 161 | +- Snapshot or static-analysis violations from your changes |
| 162 | + |
| 163 | +Flaky / unrelated (rerun the jobs): |
| 164 | +- Network/DNS/registry timeouts |
| 165 | +- Runner provisioning or startup failures |
| 166 | +- GitHub Actions infrastructure errors |
| 167 | +- Non-deterministic failures in code you didn't touch |
| 168 | + |
| 169 | +Rerun: `gh run rerun <run-id> --failed` |
| 170 | + |
| 171 | +Retry budget: at most 3 reruns per SHA. After that, treat as real. |
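
As a rough illustration of the classification and retry budget above, the sketch below guesses "flaky" versus "branch" from failed-job log text and then applies the 3-rerun limit. Both function names are hypothetical, and the log patterns are assumptions that only cover the network and runner cases; they are not an exhaustive or authoritative heuristic.

```bash
# Hypothetical helper: read failed-job log text on stdin and guess
# whether the failure looks like infrastructure noise.
# The patterns are assumptions; tune them to the repo's real CI logs.
classify_failure() {
  if grep -Eiq 'timed? ?out|ETIMEDOUT|ECONNRESET|could not resolve host|temporary failure in name resolution|runner.*(lost|failed to start)'; then
    echo flaky
  else
    echo branch
  fi
}

# Hypothetical helper: next_action <flaky|branch> <reruns-used-for-this-sha>
# Allows at most 3 reruns per SHA, then treats the failure as real.
next_action() {
  if [ "$1" = flaky ] && [ "$2" -lt 3 ]; then
    echo rerun
  else
    echo fix
  fi
}

# Usage (needs gh):
#   kind=$(gh run view <run-id> --log-failed | classify_failure)
#   next_action "$kind" "$reruns_used"
```

Anything the patterns do not recognize falls through to `branch`, so an unknown failure is investigated rather than silently rerun.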
| 172 | + |
| 173 | +## Stop conditions |
| 174 | + |
| 175 | +Stop ONLY when: |
| 176 | +- All present verification layers passed on the current SHA. |
| 177 | +- PR closed or merged (`gh pr view --json state --jq .state`). |
| 178 | +- Retry budget exhausted — CI still failing after 3 reruns of the same SHA. |
| 179 | +- Blocked on something requiring user input. |
| 180 | + |
| 181 | +NOT a stop condition: |
| 182 | +- You pushed a fix commit. That's just one iteration — re-request review and keep going. |
| 183 | +- You replied to review comments. The bot still needs to review the new code. |
| 184 | +- CI is green but review bot hasn't re-reviewed your fix yet. Wait for it. |
| 185 | + |
| 186 | +Keep going when: |
| 187 | +- Checks still pending. |
| 188 | +- Bots haven't posted yet (allow a few minutes after a push).
| 189 | +- Just pushed a fix and CI hasn't started. |
| 190 | + |
| 191 | +## Polling cadence |
| 192 | + |
| 193 | +- CI pending/failing: every 30-60s. |
| 194 | +- CI green: back off (60s, 2m, 4m), reset on any state change. |
| 195 | +- Just pushed a fix: re-poll immediately. |
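
The back-off schedule for the "CI green" case can be expressed as a tiny function. This is a sketch under stated assumptions: `next_delay` is a hypothetical helper, and capping the delay at 4 minutes is an assumption, since the cadence above only lists 60s, 2m, 4m.

```bash
# Hypothetical helper: next_delay <current-delay-seconds> <state-changed: yes|no>
# Doubles the delay (60 -> 120 -> 240), caps at 240s (assumed cap),
# and resets to 60s whenever any verifier state changed.
next_delay() {
  local current=$1 changed=$2 max=240 next
  if [ "$changed" = yes ]; then
    echo 60
    return 0
  fi
  next=$((current * 2))
  if [ "$next" -gt "$max" ]; then
    next=$max
  fi
  echo "$next"
}

# Example: waiting on the review bot while CI is already green.
#   delay=60
#   sleep "$delay"; delay=$(next_delay "$delay" no)   # delay is now 120
```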
| 196 | + |
| 197 | +## vs babysit-pr |
| 198 | + |
| 199 | +`/verify` is an active orchestrator — you write code, push, poll, fix, repeat. |
| 200 | +`/babysit-pr` is a passive monitor — watches someone else's PR via a Python script. |
| 201 | +Use `/verify` when you're the coding agent. Use `/babysit-pr` when you just need to watch. |
| 202 | + |
| 203 | +## References |
| 204 | + |
| 205 | +- Verification signal details: `references/workflow-signals.md` |
| 206 | +- CI failure heuristics: same as `babysit-pr/references/heuristics.md` |