[copilot-cli-research] Copilot CLI Deep Research - 2026-04-11 #25844

2026-04-11T21:10:24Z

github-actions[bot]
bot Apr 11, 2026

Analysis Date: 2026-04-11 · Run: §24291521035 · Previous Run: §24264054881

This is the third run of the Copilot CLI Deep Research agent, which tracks trends across runs.

📊 Executive Summary

Scope: 187 total workflows, 90 using the Copilot engine (48%) | Copilot CLI version: 1.0.21

The Copilot engine remains the dominant choice at 48% of all workflows (90/187). The most significant finding this cycle is a security gap: 52% of Copilot workflows (47/90) run with no network restrictions at all. Meanwhile, usage of the newly available block-domains feature (AWF sandbox domain blocking) has regressed from 1 workflow to 0, a missed opportunity to harden sandbox security. On the positive side, version pinning doubled (11%→22%) and engine.args usage rose from 1% to 10%, which indicates growing adoption of advanced configuration.

Three high-impact opportunities stand out: (1) adding network restrictions to analysis-only workflows, (2) leveraging the 8 unused custom agent files for specialized behaviors, and (3) enabling max-continuations for complex multi-step workflows that currently run single-pass.

🔴 Critical Findings

High Priority

52% of Copilot workflows have zero network restrictions — these workflows can reach any internet endpoint during execution. Read-only analysis/reporting workflows should lock down network access.
block-domains at 0% — the AWF sandbox domain-blocking feature (block-domains:) was previously used in 1 workflow but has now dropped to 0. For sandboxed workflows, explicitly blocking known-dangerous domains is a security best practice.

Medium Priority

8 of 10 custom agent files are never referenced via engine.agent — specialized agents like grumpy-reviewer, contribution-checker, adr-writer, and interactive-agent-designer are available but unused in workflow automation.
max-continuations nearly absent (2%) — only 2 of 90 Copilot workflows use autopilot/continuation mode, even though many complex daily workflows would benefit from multiple consecutive passes.
17 workflows use overly broad [default] toolset — these workflows grant 5+ GitHub API toolsets when more targeted access would suffice.

1️⃣ Current State Analysis

Copilot CLI Capabilities Inventory

Version: 1.0.21 (default; version pinning supported via engine.version)

Core CLI Flags (auto-applied by compiler):

--add-dir /tmp/gh-aw/ — workspace access
--disable-builtin-mcps — disables default MCPs (compiler always sets this)
--no-ask-user — fully autonomous mode (v1.0.19+, always set)
--log-level all --log-dir <path> — full logging

Configurable via Frontmatter:

Config Key	CLI Equivalent	Usage
`engine.agent`	`--agent <id>`	5/90 (6%)
`engine.bare: true`	`--no-custom-instructions`	2/90 (2%)
`engine.max-continuations`	`--autopilot --max-autopilot-continues N`	2/90 (2%)
`engine.model`	`COPILOT_MODEL` env var	7/90 (8%)
`engine.version`	installer pin	20/90 (22%)
`engine.args`	appended to CLI	9/90 (10%)
`engine.command`	replaces `copilot` binary	~5 (custom engines)
`engine.api-target`	GHEC/GHES endpoint	0/90 (0%)
`engine.env`	shell env vars	10/90 (11%)
`sandbox.agent: awf`	AWF firewall wrapping	15/90 (17%)
`sandbox.agent: srt`	SRT process isolation	1/90 (1%)
`network.allowed`	AWF domain allowlist	43/90 (48%)
`network.blocked` / `block-domains`	AWF domain blocklist	0/90 (0%)
`mcp-scripts`	dynamic MCP script injection	0/90 (0%)
`tools.web-fetch`	`--allow-tool web_fetch`	17/90 (19%)
`tools.web-search`	Brave/Tavily MCP	3/90 (3%)
`tools.playwright`	Playwright MCP container	12/90 (13%)
`tools.github`	GitHub MCP server	~80/90 (89%)
`tools.bash`	shell tool allowlist	~70/90 (78%)
`cache-memory`	`/tmp/gh-aw/cache-memory/`	53/187 (28%)
`strict: true`	strict validation mode	108/187 (58%)

Usage Statistics

Metric	Count	%
Total workflows	187	—
Copilot engine	90	48%
With network restrictions	43	48% of Copilot
Without network restrictions	47	52% of Copilot
With safe-outputs	75	83% of Copilot
With timeout-minutes	89	99% of Copilot
With AWF/SRT sandbox	16	18% of Copilot
With cache-memory	53	28% of all workflows
With tracker-id	62	33% of all workflows
With strict: true	108	58% of all workflows

Most common GitHub toolsets: default, default + actions, default + discussions, pull_requests, repos

Most common timeout: 30 min (44 workflows), 10 min (35), 20 min (34), 15 min (32)

2️⃣ Feature Usage Matrix

Feature Category	Available Features	Used	Not Used	Usage Rate
CLI Flags	agent, bare, max-continuations, args, api-target	agent(6%), bare(2%), max-cont(2%), args(10%)	api-target(0%)	4% avg
Engine Config	version, model, command, env	version(22%), model(8%), env(11%), command(~5 workflows)	none	14% avg
MCP / Tools	github, playwright, web-fetch, web-search, mcp-scripts	github(89%), playwright(13%), web-fetch(19%)	mcp-scripts(0%)	40% avg
Network Config	allowed, blocked, block-domains	allowed(48%)	blocked, block-domains	16% avg
Sandbox Options	AWF, SRT	AWF(17%), SRT(1%)	—	18%
Custom Agents	10 agent files	3 distinct agents used	8/10 files unused	20%

3️⃣ Missed Opportunities

🔴 View High Priority Opportunities

Opportunity 1: Network Restrictions Missing from Analysis Workflows

What: 47 Copilot workflows run with zero network restrictions.

Why It Matters: Unrestricted network access means the agent can call any external service — a significant security risk for workflows that only need GitHub API access.

Affected Workflows (sample): breaking-change-checker, copilot-token-audit, daily-architecture-diagram, daily-assign-issue-to-user, daily-cli-performance, daily-compiler-quality, daily-file-diet, daily-integrity-analysis, daily-issues-report, daily-malicious-code-scan, daily-mcp-concurrency-analysis, and 36 more.

How to Implement: For GitHub-only workflows, add:

```yaml
network:
  allowed:
    - defaults
    - github
```

For pure analysis workflows with no external deps:

```yaml
network:
  allowed:
    - defaults
```

Expected Benefits: Reduced attack surface, prevents accidental/malicious exfiltration.

Opportunity 2: `block-domains` — Sandbox Security Hardening at Zero

What: The block-domains: / network.blocked: feature explicitly denies specific domains even when the surrounding network is allowed. It is supported in AWF sandbox mode. One workflow used it in the previous run; none use it now.

Why It Matters: Allowlists alone don't prevent all exfiltration vectors. Explicit domain blocklists prevent known-risky services (e.g., pastebin, transfer.sh, requestbin) from being called.

How to Implement (for AWF-sandboxed workflows):

```yaml
network:
  allowed:
    - defaults
    - github
  blocked:
    - pastebin.com
    - transfer.sh
    - requestbin.com
```

🟡 View Medium Priority Opportunities

Opportunity 3: Underused Custom Agent Files

What: 10 .agent.md files exist in .github/agents/ but only 3 distinct agents are referenced via engine.agent (technical-doc-writer ×2, ci-cleaner ×1, awf ×several). The following agents are never used via workflows:

Agent File	Purpose	Candidate Workflows
`grumpy-reviewer.agent.md`	Critical code reviewer	`pr-nitpick-reviewer`, `code-simplifier`
`contribution-checker.agent.md`	PR compliance checker	`contribution-check`
`adr-writer.agent.md`	Architecture decision records	`architecture-guardian`
`interactive-agent-designer.agent.md`	Agent design assistant	`workflow-generator`
`w3c-specification-writer.agent.md`	Spec writing	`docs-noob-tester`

How to Implement:

```yaml
engine:
  id: copilot
  agent: grumpy-reviewer  # .github/agents/grumpy-reviewer.agent.md
```

Opportunity 4: `max-continuations` for Complex Multi-Step Workflows

What: Only 2 workflows use max-continuations (smoke-copilot: 2, test-quality-sentinel: 40). Many complex daily workflows run single-pass and may fail or produce incomplete output.

Why It Matters: Workflows like daily-safe-output-integrator (740 lines), daily-cli-performance (694 lines), release (657 lines) perform multi-phase analysis with large scopes. Enabling autopilot allows the agent to restart when it reaches context limits.

How to Implement:

```yaml
engine:
  id: copilot
  max-continuations: 5  # Allow up to 5 consecutive continuation passes
```

Candidate Workflows: daily-safe-output-integrator, daily-cli-performance, daily-compiler-quality, portfolio-analyst, repository-quality-improver

Opportunity 5: Over-Broad GitHub Toolsets (`[default]` Only)

What: 17 Copilot workflows use only toolsets: [default], which grants access to context, repos, issues, pull requests, and more. Many of these workflows only need 1-2 specific toolsets.

Affected Workflows (sample): ai-moderator, bot-detection, ci-coach, code-simplifier, commit-changes-analyzer, contribution-check, daily-session-insights

How to Implement: Replace [default] with the minimum required:

```yaml
# For PR-only workflows:
toolsets: [pull_requests]

# For issue-only workflows:
toolsets: [issues]

# For repo browsing only:
toolsets: [repos]
```

🟢 View Low Priority Opportunities

Opportunity 6: No Workflows Use Conversation Sharing (`--share`)

What: The Copilot CLI --share flag generates a shareable conversation URL for debugging. No production workflows currently enable this (accessible via engine.args).

How to Implement:

```yaml
engine:
  id: copilot
  args: ["--share"]
```

Expected Benefits: Easier debugging of agent failures, shareable conversation traces for team review.

Opportunity 7: `mcp-scripts` Feature Completely Unused in Production

What: mcp-scripts: allows dynamically injecting custom MCP server scripts into workflows. Despite being a production-ready feature, 0 Copilot workflows use it (the only result in grep was a documentation mention in security-review.md).

Why It Matters: Enables powerful custom tool creation without requiring a separate hosted MCP server. Useful for workflow-specific tooling (custom APIs, data transforms, specialized queries).

Example Use Case: A workflow needing custom Slack notification formatting could define a local MCP script instead of relying on a remote MCP server.

Opportunity 8: `engine.api-target` Completely Unused

What: api-target allows routing Copilot API calls to GHEC/GHES instances. 0 workflows use it.

Note: This is only relevant if the team uses GitHub Enterprise Cloud with data residency or GitHub Enterprise Server. For workflows running on github.com, zero usage is expected.
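
As a rough sketch, routing a workflow to an enterprise endpoint might look like the fragment below. The hostname is a hypothetical placeholder, and the assumption that api-target takes a bare API hostname should be verified against the compiler before use.

```yaml
engine:
  id: copilot
  # Hypothetical enterprise API host (assumption); replace with the real endpoint.
  api-target: api.example.ghe.com
```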

Opportunity 9: `bare: true` Only in Smoke Tests

What: engine.bare: true adds --no-custom-instructions to suppress AGENTS.md loading. Only smoke-claude.md and smoke-copilot.md use it.

Opportunity: Clean-slate workflows (security reviews, unbiased analysis) could benefit from bare mode to prevent repository-specific instructions from skewing results.

Candidate Workflows: security-review, daily-malicious-code-scan, bot-detection

```yaml
engine:
  id: copilot
  bare: true  # Ignores .github/AGENTS.md - pure prompting
```

4️⃣ Specific Workflow Recommendations

View Workflow-Specific Recommendations

`daily-safe-output-integrator.md` (740 lines, most complex)

Current: Single-pass, no max-continuations
Recommend: Add max-continuations: 5 — this workflow has 5 phases and may time out
Recommend: Use engine.agent: developer (developer instructions) for codebase awareness
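
Taken together, the two recommendations for this workflow could be sketched as the frontmatter below. It assumes that a developer agent file exists under .github/agents/ and that 5 continuations is a sensible starting budget. Neither assumption has been validated against this workflow.

```yaml
engine:
  id: copilot
  agent: developer        # assumes .github/agents/developer.agent.md exists
  max-continuations: 5    # starting value; tune after observing real runs
```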

`copilot-token-audit.md` + `copilot-token-optimizer.md`

Current: No network restrictions
Recommend: Add network: allowed: [defaults, github] — only needs GitHub API

`daily-malicious-code-scan.md`

Current: No network restrictions, no bare mode
Recommend: Add bare: true to prevent AGENTS.md from influencing security judgments
Recommend: Add network restrictions since this is a pure code analysis workflow
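
A minimal sketch of both recommendations combined is shown below. It assumes the scan needs no external services beyond the defaults set. If the workflow also calls the GitHub API through the network layer, github would need to be added to the allowlist.

```yaml
engine:
  id: copilot
  bare: true        # skip AGENTS.md so repository instructions cannot bias the scan
network:
  allowed:
    - defaults      # assumption: no other domains are required
```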

`contribution-check.md`

Current: Uses engine: copilot but does NOT reference contribution-checker.agent.md
Recommend: Add agent: contribution-checker to leverage the specialized agent:

```yaml
engine:
  id: copilot
  agent: contribution-checker
```

`copilot-pr-merged-report.md` + `claude-code-user-docs-review.md`

Current: No safe-outputs configured
Clarification needed: If these are read-only analysis workflows, they should add safe-outputs: noop: as a minimum to signal completion intent explicitly.
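
If the workflows are confirmed to be read-only, the minimal declaration might look like this sketch. The empty noop entry follows the form named above. Any further options under it are unknown here and are intentionally left out.

```yaml
safe-outputs:
  noop:    # declares that the workflow intentionally produces no write-back output
```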

`smoke-copilot-arm.md` + `copilot-pr-merged-report.md`

Current: tools: github: with no toolsets: specified
Recommend: Add explicit toolsets to avoid overly broad permissions:

```yaml
tools:
  github:
    toolsets: [repos, pull_requests]
```

5️⃣ Trends vs Previous Analysis (2026-04-10)

View Historical Trends

Metric	Previous (2026-04-10)	Current (2026-04-11)	Trend
Copilot workflows	93	90	-3
Version pinning	11%	22%	✅ +11pp
`engine.args` usage	1%	10%	✅ +9pp
`engine.agent` usage	3%	6%	✅ +3pp
`engine.env` usage	11%	11%	→ stable
`web-fetch` usage	18%	19%	→ stable
`playwright` usage	13%	13%	→ stable
AWF sandbox	16%	18%	✅ +2pp
`block-domains`	1%	0%	⚠️ -1pp (regression)
`max-continuations`	3%	2%	⚠️ slight decrease
`mcp-scripts` production	unclear	0%	⚠️

Key Improvements Since Last Run: Version pinning and engine.args have both increased significantly, suggesting teams are more comfortable with advanced engine configuration.

Key Regressions: block-domains usage dropped to 0. This specific security feature should be re-enabled in appropriate AWF-sandboxed workflows.

6️⃣ Best Practice Guidelines

Based on this analysis, here are recommended best practices for Copilot workflows:

Always restrict network access for read-only/analysis workflows using network: allowed: [defaults, github]. Unrestricted access should be the exception, not the rule.
Match agent files to workflow specialization — if a .github/agents/*.agent.md file matches your workflow's domain (e.g., contribution-checker for PR compliance), always reference it via engine.agent.
Pin versions for critical workflows — daily/weekly automations should pin a specific Copilot CLI version to avoid unexpected breakage from CLI updates. Update pins monthly.
Use max-continuations for multi-phase workflows — any workflow with 3+ distinct phases or large code scopes should use max-continuations: 3-5 to handle context limit rollovers gracefully.
Specify minimal GitHub toolsets — use toolsets: [issues] not [default] unless you genuinely need all default toolsets. Reduces permissions and improves security posture.
Add block-domains to AWF-sandboxed workflows — even with an allowlist, adding explicit domain blocklists for known exfiltration endpoints (pastebin, transfer.sh) adds defense-in-depth.
Use bare: true for security-sensitive analysis — workflows performing security reviews, malicious code scanning, or unbiased quality analysis should use bare: true to prevent AGENTS.md from influencing results.
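
The guidelines above can be combined into a single frontmatter sketch for a hypothetical AWF-sandboxed, issue-focused workflow. The version string is the default reported in this analysis, the continuation count and toolset are illustrative, and bare: true is omitted because it applies only to security-sensitive analysis. Treat this as a starting point rather than a validated configuration.

```yaml
engine:
  id: copilot
  version: "1.0.21"        # pinned; revisit the pin on a regular schedule
  max-continuations: 3     # only worthwhile for multi-phase workflows
sandbox:
  agent: awf               # domain blocking is described as an AWF sandbox feature
network:
  allowed:
    - defaults
    - github
  blocked:                 # defense-in-depth against known exfiltration endpoints
    - pastebin.com
    - transfer.sh
tools:
  github:
    toolsets: [issues]     # replace with the toolsets the workflow actually needs
```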

7️⃣ Action Items

Immediate (this week):

Re-enable block-domains for AWF-sandboxed workflows (regression from previous run)
Add network: allowed: [defaults, github] to top 10 read-only Copilot workflows without network restrictions
Wire contribution-check.md to use engine.agent: contribution-checker

Short-term (this month):

Audit all 47 network-unrestricted Copilot workflows and add appropriate network: configs
Add max-continuations: 5 to the top 5 most complex daily workflows
Update smoke-copilot-arm.md and copilot-pr-merged-report.md to specify explicit GitHub toolsets
Add bare: true to daily-malicious-code-scan.md and security-review.md

Long-term (this quarter):

Evaluate mcp-scripts for 2-3 pilot workflows to demonstrate the capability
Create a shared network-github-only.md import to standardize network restrictions
Review if any workflows could benefit from engine.api-target for GHE scenarios
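
One possible shape for the shared network import is sketched below. The file location, the use of an imports: key, and the assumption that network settings merge from an imported file are all unverified here. They should be checked against the compiler before any rollout.

```yaml
# Frontmatter of the shared file, e.g. shared/network-github-only.md (hypothetical path)
network:
  allowed:
    - defaults
    - github
```

```yaml
# Frontmatter of a consuming workflow (assumes imports: merges network settings)
imports:
  - shared/network-github-only.md
engine:
  id: copilot
```

Centralizing the allowlist this way would let the audit of the 47 unrestricted workflows be applied as a one-line change per workflow, provided the import mechanism behaves as assumed.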

View Supporting Evidence & Methodology

Research Methodology

This analysis was conducted by:

Inventory: Scanning all 28 Copilot-related Go files in pkg/cli/ and pkg/workflow/
Usage Analysis: grep-based survey of 90 Copilot workflows in .github/workflows/
Feature Gap Analysis: Comparing available compiler features (from execution code) with actual frontmatter usage
Trend Analysis: Comparing with previous run stored in repo-memory

Data Sources

pkg/workflow/copilot_engine_execution.go — CLI flags and execution logic
pkg/workflow/copilot_engine_tools.go — tool permission mapping
pkg/workflow/copilot_engine.go — engine feature flags
pkg/constants/version_constants.go — default version (1.0.21)
.github/workflows/*.md — 187 workflow definitions
.github/agents/*.md — 10 custom agent files
/tmp/gh-aw/repo-memory/default/copilot-research-latest.json — historical data

Previous Research

Run 2: §24264054881 (2026-04-10)
Run 1: §24051154005 (2026-04-06)

References:

§24291521035 — this run
§24264054881 — previous run
§24051154005 — run 1

Generated by Copilot CLI Deep Research Agent · ● 2.4M · ◷

expires on Apr 12, 2026, 9:10 PM UTC

2026-04-12T21:11:53Z

github-actions[bot]
bot Apr 12, 2026
Author

This discussion has been marked as outdated by Copilot CLI Deep Research Agent.

A newer discussion is available at Discussion #25936.
