[copilot-cli-research] Copilot CLI Deep Research - 2026-04-11 #25844
This discussion has been marked as outdated by Copilot CLI Deep Research Agent. A newer discussion is available at Discussion #25936.
Analysis Date: 2026-04-11 · Run: §24291521035 · Previous Run: §24264054881
This is the 3rd run of the Copilot CLI Deep Research agent. Tracking trends over time.
📊 Executive Summary
Scope: 187 total workflows, 90 using the Copilot engine (48%) | Copilot CLI version: `1.0.21`
The Copilot engine remains the dominant choice at 48% of all workflows. The most significant finding this cycle is a security gap: 52% of Copilot workflows (47/90) run with no network restrictions at all. Meanwhile, the newly available `block-domains` feature (for AWF sandbox domain blocking) has regressed from 1 workflow to 0, a missed opportunity to harden sandbox security. On the positive side, version pinning doubled (11%→22%) and `engine.args` usage jumped from 1%→10%, showing good adoption of advanced configuration.
Three high-impact opportunities stand out: (1) adding network restrictions to analysis-only workflows, (2) leveraging the 8 unused custom agent files for specialized behaviors, and (3) enabling `max-continuations` for complex multi-step workflows that currently run single-pass.
🔴 Critical Findings
High Priority
- `block-domains` at 0%: the AWF sandbox domain-blocking feature (`block-domains:`) was previously used in 1 workflow but has now dropped to 0. For sandboxed workflows, explicitly blocking known-dangerous domains is a security best practice.
Medium Priority
- Custom agents unused via `engine.agent`: specialized agents like `grumpy-reviewer`, `contribution-checker`, `adr-writer`, and `interactive-agent-designer` are available but unused in workflow automation.
- `max-continuations` nearly absent (2%): only 2 of 90 Copilot workflows use autopilot/continuation mode, even though many complex daily workflows would benefit from multiple consecutive passes.
- Workflows using only the `[default]` toolset: these workflows grant 5+ GitHub API toolsets when more targeted access would suffice.
1️⃣ Current State Analysis
Copilot CLI Capabilities Inventory
Version: `1.0.21` (default; version pinning supported via `engine.version`)
Core CLI Flags (auto-applied by compiler):
- `--add-dir /tmp/gh-aw/`: workspace access
- `--disable-builtin-mcps`: disables default MCPs (compiler always sets this)
- `--no-ask-user`: fully autonomous mode (v1.0.19+, always set)
- `--log-level all --log-dir <path>`: full logging
Configurable via Frontmatter:
- `engine.agent`: `--agent <id>`
- `engine.bare: true`: `--no-custom-instructions`
- `engine.max-continuations`: `--autopilot --max-autopilot-continues N`
- `engine.model`: `COPILOT_MODEL` env var
- `engine.version`
- `engine.args`
- `engine.command`: `copilot` binary
- `engine.api-target`
- `engine.env`
- `sandbox.agent: awf`
- `sandbox.agent: srt`
- `network.allowed`
- `network.blocked` / `block-domains`
- `mcp-scripts`
- `tools.web-fetch`: `--allow-tool web_fetch`
- `tools.web-search`
- `tools.playwright`
- `tools.github`
- `tools.bash`
- `cache-memory`: `/tmp/gh-aw/cache-memory/`
- `strict: true`
Usage Statistics
Most common GitHub toolsets: `default`, `default + actions`, `default + discussions`, `pull_requests`, `repos`
Most common timeout: 30 min (44 workflows), 10 min (35), 20 min (34), 15 min (32)
2️⃣ Feature Usage Matrix
3️⃣ Missed Opportunities
🔴 View High Priority Opportunities
Opportunity 1: Network Restrictions Missing from Analysis Workflows
What: 47 Copilot workflows run with zero network restrictions.
Why It Matters: Unrestricted network access means the agent can call any external service — a significant security risk for workflows that only need GitHub API access.
Affected Workflows (sample):
`breaking-change-checker`, `copilot-token-audit`, `daily-architecture-diagram`, `daily-assign-issue-to-user`, `daily-cli-performance`, `daily-compiler-quality`, `daily-file-diet`, `daily-integrity-analysis`, `daily-issues-report`, `daily-malicious-code-scan`, `daily-mcp-concurrency-analysis`, and 36 more.
How to Implement: For GitHub-only workflows, add:
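A minimal sketch of the GitHub-only restriction in block style. It assumes the `network.allowed` key and the `defaults` and `github` identifiers behave as this report describes (`network: allowed: [defaults, github]`); it is worth checking against the workflow compiler before adopting it.

```yaml
# Sketch only: limit outbound access to the default and GitHub domain sets.
# `defaults` and `github` are the identifiers cited elsewhere in this report.
network:
  allowed:
    - defaults
    - github
```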
For pure analysis workflows with no external deps:
Expected Benefits: Reduced attack surface, prevents accidental/malicious exfiltration.
Opportunity 2: `block-domains` (Sandbox Security Hardening at Zero)
What: The `block-domains:` / `network.blocked:` feature explicitly denies specific domains even within an allowed network. It is supported in AWF sandbox mode and was previously used in 1 workflow, but is now used in 0.
Why It Matters: Allowlists alone don't prevent all exfiltration vectors. Explicit domain blocklists prevent known-risky services (e.g., pastebin, transfer.sh, requestbin) from being called.
How to Implement (for AWF-sandboxed workflows):
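A hedged sketch of how a blocklist might sit next to an allowlist. The report names both `block-domains:` and `network.blocked:` for this feature; the example assumes the `network.blocked` form, and the `.com` hostnames are illustrative guesses for the services named above, not values taken from any existing workflow.

```yaml
# Sketch only: allowlist plus explicit blocklist under the AWF sandbox.
sandbox:
  agent: awf
network:
  allowed:
    - defaults
    - github
  blocked:            # assumed key; the report also refers to `block-domains:`
    - pastebin.com    # assumed hostname for "pastebin"
    - transfer.sh
    - requestbin.com  # assumed hostname for "requestbin"
```

Confirm the exact key name against the compiler's frontmatter schema before rolling this out.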
🟡 View Medium Priority Opportunities
Opportunity 3: Underused Custom Agent Files
What: 10 `.agent.md` files exist in `.github/agents/` but only 3 distinct agents are referenced via `engine.agent` (`technical-doc-writer` ×2, `ci-cleaner` ×1, `awf` ×several). The following agents are never used via workflows:
| Agent file | Candidate workflows |
| --- | --- |
| `grumpy-reviewer.agent.md` | `pr-nitpick-reviewer`, `code-simplifier` |
| `contribution-checker.agent.md` | `contribution-check` |
| `adr-writer.agent.md` | `architecture-guardian` |
| `interactive-agent-designer.agent.md` | `workflow-generator` |
| `w3c-specification-writer.agent.md` | `docs-noob-tester` |
How to Implement:
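A minimal sketch of referencing a custom agent from a workflow's frontmatter, using the `contribution-checker` agent this report recommends for `contribution-check.md`. The `id: copilot` key for the object form of `engine` is an assumption; the report only shows the short form `engine: copilot`.

```yaml
# Sketch only: point the Copilot engine at a custom agent file.
engine:
  id: copilot                  # assumed key for the object form of `engine: copilot`
  agent: contribution-checker  # intended to resolve to .github/agents/contribution-checker.agent.md
```

Per the capabilities inventory, `engine.agent` is passed to the CLI as `--agent <id>`.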
Opportunity 4: `max-continuations` for Complex Multi-Step Workflows
What: Only 2 workflows use `max-continuations` (`smoke-copilot`: 2, `test-quality-sentinel`: 40). Many complex daily workflows run single-pass and may fail or produce incomplete output.
Why It Matters: Workflows like `daily-safe-output-integrator` (740 lines), `daily-cli-performance` (694 lines), and `release` (657 lines) perform multi-phase analysis with large scopes. Enabling autopilot allows the agent to restart when it reaches context limits.
How to Implement:
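A minimal sketch of enabling continuation mode, using the value of 5 that this report suggests for its most complex workflows. The `id: copilot` key is an assumption about the object form of `engine`; the right limit for any given workflow is a judgment call.

```yaml
# Sketch only: allow up to 5 autopilot continuations.
engine:
  id: copilot           # assumed key for the object form of `engine: copilot`
  max-continuations: 5  # value suggested in this report for 5-phase workflows
```

According to the capabilities inventory, this setting maps to `--autopilot --max-autopilot-continues N` on the CLI.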
Candidate Workflows:
`daily-safe-output-integrator`, `daily-cli-performance`, `daily-compiler-quality`, `portfolio-analyst`, `repository-quality-improver`
Opportunity 5: Over-Broad GitHub Toolsets (`[default]` Only)
What: 17 Copilot workflows use only `toolsets: [default]`, which grants access to context, repos, issues, pull requests, and more. Many of these workflows only need 1-2 specific toolsets.
Affected Workflows (sample):
`ai-moderator`, `bot-detection`, `ci-coach`, `code-simplifier`, `commit-changes-analyzer`, `contribution-check`, `daily-session-insights`
How to Implement: Replace `[default]` with the minimum required toolsets, for example `toolsets: [issues]`.
🟢 View Low Priority Opportunities
Opportunity 6: No Workflows Use Conversation Sharing (`--share`)
What: The Copilot CLI `--share` flag generates a shareable conversation URL for debugging. No production workflows currently enable this (accessible via `engine.args`).
How to Implement:
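A hedged sketch of passing the flag through `engine.args`, as the report describes. It assumes `engine.args` takes a list of raw CLI arguments and that `id: copilot` is the key for the object form of `engine`; the behaviour of `--share` should be checked against the pinned CLI version.

```yaml
# Sketch only: forward an extra flag to the Copilot CLI.
engine:
  id: copilot    # assumed key for the object form of `engine: copilot`
  args:
    - "--share"  # quoted so YAML treats it as a plain string
```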
Expected Benefits: Easier debugging of agent failures, shareable conversation traces for team review.
Opportunity 7: `mcp-scripts` Feature Completely Unused in Production
What: `mcp-scripts:` allows dynamically injecting custom MCP server scripts into workflows. Despite being a production-ready feature, 0 Copilot workflows use it (the only grep result was a documentation mention in `security-review.md`).
Why It Matters: Enables powerful custom tool creation without requiring a separate hosted MCP server. Useful for workflow-specific tooling (custom APIs, data transforms, specialized queries).
Example Use Case: A workflow needing custom Slack notification formatting could define a local MCP script instead of relying on a remote MCP server.
Opportunity 8: `engine.api-target` Completely Unused
What: `api-target` allows routing Copilot API calls to GHEC/GHES instances. 0 workflows use it.
Note: This is only relevant if the team uses GitHub Enterprise Cloud with data residency. If running on github.com, this is expected to be unused.
Opportunity 9: `bare: true` Only in Smoke Tests
What: `engine.bare: true` adds `--no-custom-instructions` to suppress AGENTS.md loading. Only `smoke-claude.md` and `smoke-copilot.md` use it.
Opportunity: Clean-slate workflows (security reviews, unbiased analysis) could benefit from bare mode to prevent repository-specific instructions from skewing results.
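A minimal sketch of bare mode in frontmatter. As above, `id: copilot` is an assumed key for the object form of `engine`; `bare: true` is the setting named in this report.

```yaml
# Sketch only: run the agent without repository custom instructions.
engine:
  id: copilot  # assumed key for the object form of `engine: copilot`
  bare: true   # per this report, adds --no-custom-instructions
```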
Candidate Workflows:
`security-review`, `daily-malicious-code-scan`, `bot-detection`
4️⃣ Specific Workflow Recommendations
View Workflow-Specific Recommendations
`daily-safe-output-integrator.md` (740 lines, most complex)
- Add `max-continuations: 5`: this workflow has 5 phases and may time out
- Consider `engine.agent: developer` (developer instructions) for codebase awareness
`copilot-token-audit.md` + `copilot-token-optimizer.md`
- Add `network: allowed: [defaults, github]`: only needs GitHub API
`daily-malicious-code-scan.md`
- Add `bare: true` to prevent AGENTS.md from influencing security judgments
`contribution-check.md`
- Uses `engine: copilot` but does NOT reference `contribution-checker.agent.md`
- Add `agent: contribution-checker` to leverage the specialized agent
`copilot-pr-merged-report.md` + `claude-code-user-docs-review.md`
- Add `safe-outputs: noop:` as a minimum to signal completion intent explicitly.
`smoke-copilot-arm.md` + `copilot-pr-merged-report.md`
- Uses `tools: github:` with no `toolsets:` specified
5️⃣ Trends vs Previous Analysis (2026-04-10)
View Historical Trends
Tracked metrics: `engine.args` usage, `engine.agent` usage, `engine.env` usage, `web-fetch` usage, `playwright` usage, `block-domains`, `max-continuations`, `mcp-scripts` production
Key Improvements Since Last Run: Version pinning (11%→22%) and `engine.args` usage (1%→10%) have both increased significantly, suggesting teams are more comfortable with advanced engine configuration.
Key Regressions: `block-domains` usage dropped to 0. This specific security feature should be re-enabled in appropriate AWF-sandboxed workflows.
6️⃣ Best Practice Guidelines
Based on this analysis, here are recommended best practices for Copilot workflows:
1. Always restrict network access for read-only/analysis workflows using `network: allowed: [defaults, github]`. Unrestricted access should be the exception, not the rule.
2. Match agent files to workflow specialization: if a `.github/agents/*.agent.md` file matches your workflow's domain (e.g., `contribution-checker` for PR compliance), always reference it via `engine.agent`.
3. Pin versions for critical workflows: daily/weekly automations should pin a specific Copilot CLI version to avoid unexpected breakage from CLI updates. Update pins monthly.
4. Use `max-continuations` for multi-phase workflows: any workflow with 3+ distinct phases or large code scopes should set `max-continuations` to 3-5 to handle context limit rollovers gracefully.
5. Specify minimal GitHub toolsets: use `toolsets: [issues]`, not `[default]`, unless you genuinely need all default toolsets. This reduces permissions and improves security posture.
6. Add `block-domains` to AWF-sandboxed workflows: even with an allowlist, explicit domain blocklists for known exfiltration endpoints (pastebin, transfer.sh) add defense-in-depth.
7. Use `bare: true` for security-sensitive analysis: workflows performing security reviews, malicious code scanning, or unbiased quality analysis should use `bare: true` to prevent AGENTS.md from influencing results.
7️⃣ Action Items
Immediate (this week):
- Re-enable `block-domains` for AWF-sandboxed workflows (regression from previous run)
- Add `network: allowed: [defaults, github]` to top 10 read-only Copilot workflows without network restrictions
- Update `contribution-check.md` to use `engine.agent: contribution-checker`
Short-term (this month):
- `network:` configs
- Add `max-continuations: 5` to the top 5 most complex daily workflows
- Update `smoke-copilot-arm.md` and `copilot-pr-merged-report.md` to specify explicit GitHub toolsets
- Add `bare: true` to `daily-malicious-code-scan.md` and `security-review.md`
Long-term (this quarter):
- `mcp-scripts` for 2-3 pilot workflows to demonstrate the capability
- `network-github-only.md` import to standardize network restrictions
- `engine.api-target` for GHE scenarios
View Supporting Evidence & Methodology
Research Methodology
This analysis was conducted by reviewing:
- `pkg/cli/` and `pkg/workflow/`
- `.github/workflows/`
Data Sources
- `pkg/workflow/copilot_engine_execution.go`: CLI flags and execution logic
- `pkg/workflow/copilot_engine_tools.go`: tool permission mapping
- `pkg/workflow/copilot_engine.go`: engine feature flags
- `pkg/constants/version_constants.go`: default version (1.0.21)
- `.github/workflows/*.md`: 187 workflow definitions
- `.github/agents/*.md`: 10 custom agent files
- `/tmp/gh-aw/repo-memory/default/copilot-research-latest.json`: historical data
Previous Research
References: