A multi-agent system that researches, generates, validates, and commits production-ready Terraform modules for AWS — through conversation.
Quick Start · Docs · Report Bug · Request Feature
Generating a VPC module from a single prompt — planning, skill creation, code generation, validation, and GitHub commit — all through conversation.
The problem. Writing Terraform modules isn't the hard part. Writing them well is. Every new AWS service means reading through provider docs, figuring out CIS benchmarks, remembering your org's tagging conventions, setting up state backends, and wiring up outputs that downstream consumers actually need. Multiply that by every module your team maintains, and you've got a full-time job that isn't making your infrastructure better — it's just keeping it alive.
And when you do get it right, the knowledge lives in one person's head. The rest of the team copies from another module and hopes it works. When the provider updates, the same senior engineer rewrites everything from scratch.
What AWS Orchestrator brings to the table.
AWS Orchestrator doesn't just generate Terraform — it researches the service first. It queries the Terraform registry for the latest provider docs, analyzes security best practices, writes a skill blueprint, and then generates the code. The result is modules that are current, security-hardened, and follow your org's conventions — not stale copies from training data.
Three things make this different:
- **It works like a senior infrastructure engineer, available to everyone.** A junior DevOps engineer using AWS Orchestrator gets access to the same depth — CIS benchmarks, provider-specific nuances, multi-AZ patterns, state backend design — that would normally take years to build up. It levels the playing field.
- **It's a pipeline, not a prompt.** This isn't "paste a prompt, get some HCL back." It's a multi-stage pipeline: research → skill creation → code generation → sandbox validation → human approval → GitHub commit. Each stage is a separate agent that can fail, retry, or ask for help independently.
- **Nothing ships without your sign-off.** Human-in-the-loop approval gates are mandatory — after validation and before every commit. The agent can't push to your repo without you explicitly saying "yes, this looks right." Governance is baked in, not bolted on.
Available now:
- MCP-powered research — queries Terraform Registry for latest provider versions, module patterns, and policy details instead of relying on stale training data
- Skill-based generation — creates per-service skill blueprints (SKILL.md + references) that guide code generation, preventing hallucinated provider configs
- Complete module output — generates `main.tf`, `variables.tf`, `outputs.tf`, `versions.tf`, `locals.tf`, and `README.md`, plus service-specific files (`iam.tf`, `policies.tf`, `security_groups.tf`) as needed
- Sandbox validation — runs `terraform init`, `fmt -check`, and `validate` in a local sandbox before presenting results
- Human-in-the-loop governance — mandatory approval gates after validation and before GitHub commits; optional gates for ambiguous requirements and cost-sensitive architecture decisions
- GitHub commit via MCP — commits directly to your repo using GitHub MCP tools, no shell `git` commands
- Module updates — fetches existing modules from GitHub and applies targeted, surgical edits — not full rewrites
- Persistent memory — remembers your org's conventions, past failures, and module locations across sessions
- Multi-provider LLM support — Google Gemini, OpenAI, Anthropic, AWS Bedrock, Azure OpenAI
- A2A + A2UI — speaks the Google A2A protocol with rich interactive UI components for approval cards and deployment gates
Coming soon:
- Multi-cloud support — Azure and GCP module generation alongside AWS
- Drift detection — compare generated modules against deployed infrastructure
- Cost estimation — estimate infrastructure cost before commit
- Custom policy engine — plug in your org's compliance rules as validation gates
The system is a hierarchy of agents built with LangGraph, communicating via the A2A protocol.
graph TD
User([You]) -->|"Create a VPC module"| S[Supervisor Agent]
S -->|transfer_to_terraform| C[TF Coordinator — Deep Agent]
S -->|request_human_input| H([HITL: greetings · out-of-scope])
C --> P[tf-planner]
C --> SB[tf-skill-builder]
C --> G[tf-generator]
C --> V[tf-validator]
C --> UP[update-planner]
C --> TU[tf-updater]
C --> GA[github-agent]
P --> RA[Req Analyzer]
P --> SEC[Security BP]
P --> EP[Exec Planner]
P -.->|provider docs| TF_MCP[(Terraform MCP)]
GA -.->|commit files| GH_MCP[(GitHub MCP)]
TU -.->|fetch modules| GH_MCP
UP -.->|read structure| GH_MCP
C -.->|approval gate| User
style S fill:#4A90D9,stroke:#2E6BA6,color:#fff
style C fill:#7B68EE,stroke:#5A4FCF,color:#fff
style P fill:#50C878,stroke:#3BA366,color:#fff
style SB fill:#50C878,stroke:#3BA366,color:#fff
style G fill:#50C878,stroke:#3BA366,color:#fff
style V fill:#50C878,stroke:#3BA366,color:#fff
style UP fill:#50C878,stroke:#3BA366,color:#fff
style TU fill:#50C878,stroke:#3BA366,color:#fff
style GA fill:#50C878,stroke:#3BA366,color:#fff
style RA fill:#FFB347,stroke:#E09530,color:#fff
style SEC fill:#FFB347,stroke:#E09530,color:#fff
style EP fill:#FFB347,stroke:#E09530,color:#fff
style TF_MCP fill:#FF6B6B,stroke:#E04A4A,color:#fff
style GH_MCP fill:#FF6B6B,stroke:#E04A4A,color:#fff
The flow in practice:
- You describe what you need (e.g. "Create a VPC module with public and private subnets across 3 AZs")
- Supervisor parses intent and delegates to the TF Coordinator deep agent
- tf-planner researches the service via Terraform MCP → requirements analysis → security best practices → execution planning → skill writing
- tf-generator reads the skill blueprint and writes every `.tf` file to a virtual filesystem
- tf-validator runs `terraform init`, `fmt -check`, and `validate` in a sandbox
- You get an approval card: push to GitHub, or keep local?
- On approval, github-agent commits everything via GitHub MCP
- Why AWS Orchestrator?
- Key Features
- Architecture
- Tech Stack
- Getting Started
- Usage
- Agent Details
- Human-in-the-Loop
- Skills and Memory
- Project Structure
- Roadmap
- Contributing
- FAQ
- License
- Contact
- Acknowledgments
| Category | Technologies |
|---|---|
| Agent Framework | LangChain + LangGraph + deepagents |
| Language | Python 3.12+ |
| IaC Platform | Terraform (AWS provider) |
| LLM Providers | Google Gemini · OpenAI · Anthropic · AWS Bedrock · Azure OpenAI |
| Protocol | A2A · A2UI |
| MCP Servers | Terraform Registry MCP · GitHub MCP |
| Validation | Terraform CLI (init, fmt, validate) in sandbox |
| Infrastructure | Docker · uv · Uvicorn · Starlette |
- An LLM API key (Google Gemini, OpenAI, or Anthropic)
- Docker + Docker Compose (recommended)
- A GitHub Personal Access Token with `repo` scope (optional — for GitHub commit support)
No cloning required. You just need two files: `docker-compose.yml` and `.env`.
1. Create a `docker-compose.yml` — copy from this repo's `docker-compose.yml`, or use:
services:
aws-orchestrator-agent:
image: talkopsai/aws-orchestrator-agent:latest
container_name: aws-orchestrator-agent
ports:
- "10104:10104"
environment:
- GOOGLE_API_KEY=${GOOGLE_API_KEY}
- GITHUB_PERSONAL_ACCESS_TOKEN=${GITHUB_PERSONAL_ACCESS_TOKEN}
- GITHUB_MCP_URL=https://api.githubcopilot.com/mcp
- TERRAFORM_WORKSPACE=./workspace/terraform_modules
- ENVIRONMENT=production
# ── LLM: Standard tier (fast, cheap — validator + routing) ──
- LLM_PROVIDER=google_genai
- LLM_MODEL=gemini-3.1-flash-lite-preview
- LLM_TEMPERATURE=0.0
- LLM_MAX_TOKENS=15000
# ── LLM: Higher tier (planner + supervisor) ──
- LLM_HIGHER_PROVIDER=google_genai
- LLM_HIGHER_MODEL=gemini-3.1-pro-preview
# ── LLM: Deep Agent tier (coordinator + generator) ──
- LLM_DEEPAGENT_PROVIDER=google_genai
- LLM_DEEPAGENT_MODEL=gemini-3.1-pro-preview
- LLM_DEEPAGENT_TEMPERATURE=1.0
- LLM_DEEPAGENT_MAX_TOKENS=25000
- LOG_LEVEL=INFO
restart: unless-stopped
networks:
- aws-orchestrator-net
talkops-ui:
image: talkopsai/talkops:0.2.0
container_name: talkops-ui
ports:
- "8080:80"
depends_on:
- aws-orchestrator-agent
restart: unless-stopped
networks:
- aws-orchestrator-net
networks:
aws-orchestrator-net:
    driver: bridge

2. Create a `.env` file in the same directory with your API keys:
GOOGLE_API_KEY=your_google_api_key_here
GITHUB_PERSONAL_ACCESS_TOKEN=your_github_pat_here

Using OpenAI or Anthropic instead? Change the `LLM_PROVIDER` and `LLM_MODEL` values in the compose file. Replace `GOOGLE_API_KEY` with `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` in your `.env`. See `.env.example` for all supported providers.
3. Start everything:
docker compose up -d
# AWS Orchestrator running at http://localhost:10104
# TalkOps UI running at http://localhost:8080

That's it. Open http://localhost:8080 and start talking to the orchestrator.
If you want to run it directly (for development or customization):
- Install uv for dependency management.
- Clone the repo and create a virtual environment with Python 3.12:
git clone https://github.com/talkops-ai/aws-orchestrator-agent.git
cd aws-orchestrator-agent
uv venv --python=3.12
source .venv/bin/activate # On Unix/macOS
# or
.venv\Scripts\activate  # On Windows

- Install dependencies from `pyproject.toml`:

uv pip install -e .

- Create a `.env` file and add your environment variables:
cp .env.example .env
# Edit .env — at minimum set your LLM API key

All available configuration options can be found in `aws_orchestrator_agent/config/default.py`. You can set any of these via your `.env` file.
- Start the A2A server:
aws-orchestrator --host localhost --port 10104

To interact with the agent, we recommend using the TalkOps UI client. Pull and run it with Docker:
docker run -d \
--name talkops-ui \
-p 8080:80 \
talkopsai/talkops:0.2.0

Then open http://localhost:8080 in your browser.
Create a VPC module with public and private subnets across 3 AZs
The system handles research, planning, skill creation, code generation, validation, and (optionally) GitHub commit. You approve the plan and the final output.
Generate Terraform for an S3 bucket with server-side encryption,
versioning, lifecycle rules, and cross-region replication to us-west-2.
Create an EKS cluster module with managed node groups and IRSA
Update the VPC module on my-org/infra-modules to add a NAT gateway per AZ
Generate a Lambda function module with API Gateway trigger and DynamoDB access
Each agent has a defined scope and its own set of tools.
Entry point. Routes requests — Terraform tasks go to the TF Coordinator, everything else gets handled directly (greetings, out-of-scope guidance).
Tools: transfer_to_terraform, request_human_input
The brain. Orchestrates the full module lifecycle — decides which sub-agents to invoke and in what order. Manages virtual filesystem, skills, and memory. Reads HITL policies at session start.
Tools: sync_workspace, request_user_input
Deep research pipeline. Runs a 3-phase flow: requirements analysis → security & best practices → execution planning. Queries the Terraform MCP server for latest provider docs and writes service-specific skill files so downstream agents don't hallucinate.
| Phase | What it does |
|---|---|
| Requirements Analyzer | Extracts infrastructure requirements from the user's request |
| Security & Best Practices | Evaluates CIS benchmarks, encryption, access controls, tagging |
| Execution Planner | Creates detailed module specification with file set, variables, outputs |
Reads the skill blueprint and writes every .tf file to the virtual filesystem. Follows the skill's declared file set exactly — no bundling everything into main.tf.
Runs terraform init -input=false, terraform fmt -check, and terraform validate in a local sandbox. Returns VALID or INVALID with structured errors.
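The three checks above amount to running the Terraform CLI in sequence and stopping at the first failure. A minimal sketch of that pattern (this is an illustration, not the actual tf-validator code, whose interface this README doesn't expose):

```python
import subprocess

# The actual commands the validator runs, per the description above.
CHECKS = [
    ["terraform", "init", "-input=false", "-backend=false"],
    ["terraform", "fmt", "-check"],
    ["terraform", "validate"],
]

def validate_module(module_dir: str, checks=CHECKS) -> dict:
    """Run each check in order inside the module directory; stop at the
    first failure and return a structured VALID/INVALID result."""
    for cmd in checks:
        result = subprocess.run(cmd, cwd=module_dir, capture_output=True, text=True)
        if result.returncode != 0:
            return {
                "status": "INVALID",
                "failed_command": " ".join(cmd),
                "errors": result.stderr.strip(),
            }
    return {"status": "VALID"}
```

Returning structured errors rather than raw CLI output is what lets the coordinator feed failures back into a regeneration attempt.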
Fetches existing modules from GitHub via MCP, applies targeted edits. Surgical changes, not full rewrites. Preserves existing formatting and conventions.
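A "surgical" edit can be pictured as replacing only the matched block and refusing to touch anything else. A toy sketch of that idea (not the updater's real diffing logic):

```python
def apply_targeted_edit(source: str, old_block: str, new_block: str) -> str:
    """Replace exactly one occurrence of old_block; refuse ambiguous or
    missing matches rather than falling back to a full rewrite."""
    count = source.count(old_block)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return source.replace(old_block, new_block, 1)
```

Failing loudly on zero or multiple matches is the point: it preserves the rest of the file's formatting and forces the agent to re-plan instead of guessing.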
Reads an existing module on GitHub and produces a structured update plan. Does not modify files — analysis only. Flags breaking changes and dependency impacts.
Commits generated files to GitHub using MCP tools. Never uses shell git commands. For new files, commits directly; for existing files, fetches SHA first.
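The "fetches SHA first" rule mirrors how GitHub's contents API works: updating an existing file requires its current blob SHA, while creating a new file does not. A minimal sketch of that flow (the `mcp` object and its two method names are stand-ins; the README doesn't specify the actual MCP tool names):

```python
def commit_file(mcp, repo: str, path: str, content: str, message: str) -> None:
    """Commit one file: new files go straight in; existing files must
    supply their current blob SHA or the update is rejected."""
    existing = mcp.get_file(repo, path)  # assumed to return None for new files
    sha = existing["sha"] if existing else None
    mcp.create_or_update_file(repo, path, content, message, sha=sha)
```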
Infrastructure changes are irreversible. A bad terraform apply can take down production. So the agent doesn't just generate and commit — it pauses at critical gates and asks for your input.
| Gate | When | What it asks |
|---|---|---|
| Commit gate | After validation passes | Push to GitHub or keep local? Which repo and branch? |
| Next steps | After task completion | Generate another module? Update an existing one? Done? |
| Destructive ops | Before deleting modules or force-pushing | Explicit human approval — never skipped |
- Ambiguous requirements — "Create a VPC" → which region? How many AZs? Public/private?
- Cost-sensitive decisions — NAT gateway per AZ vs. shared, dedicated vs. shared tenancy
The full HITL policy lives in memory/hitl-policies.md and the coordinator reads it at the start of every session. If it discovers a new situation that should require human input, it updates the file — so the system gets smarter over time.
Each AWS service gets its own skill directory:
skills/
├── tf-module-generator/ # General generation patterns
├── tf-module-updater/ # Update workflow rules
├── tf-module-validator/ # Validation workflow + error rules
├── tf-skill-builder/ # How to create new skills
├── github-committer/ # Commit workflow via MCP
└── update-planner/ # Module analysis patterns
When the planner runs, it queries the Terraform MCP server for the latest provider docs and writes service-specific skill files (SKILL.md + references). The generator reads these files and follows them exactly — which means it doesn't hallucinate provider version constraints or resource arguments.
If a skill already exists and its provider version is current, the planner is skipped entirely. This makes repeated generations for the same service significantly faster.
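That skip logic is essentially a freshness check on the skill directory. A hypothetical sketch; the `provider_version:` line format shown here is illustrative, not the repo's actual SKILL.md schema:

```python
from pathlib import Path

def needs_planning(skills_dir: str, service: str, latest_version: str) -> bool:
    """Re-run the planner only when no skill exists for the service,
    or when its recorded provider version is stale."""
    skill = Path(skills_dir) / f"tf-{service}" / "SKILL.md"
    if not skill.exists():
        return True
    for line in skill.read_text().splitlines():
        if line.startswith("provider_version:"):
            return line.split(":", 1)[1].strip() != latest_version
    return True  # no version recorded, so re-plan to be safe
```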
The coordinator maintains persistent memory across sessions:
| File | Purpose |
|---|---|
| `AGENTS.md` | Memory index — what files exist and reading rules |
| `hitl-policies.md` | When to pause and ask the human |
| `org-standards.md` | Your org's Terraform conventions (tags, naming, providers) |
| `module-index.md` | Where modules live in the GitHub repo |
| `failure-log.md` | Past validation failures — so it doesn't repeat mistakes |
| `learned-patterns.md` | Patterns to reuse across sessions |
When the agent generates a module, you get a complete, production-ready directory:
workspace/terraform_modules/s3/
├── main.tf # Core resources with security best practices
├── variables.tf # Typed, validated, documented variables
├── outputs.tf # All useful outputs with try() for conditionals
├── versions.tf # Provider and Terraform version constraints
├── locals.tf # Computed values and tag merging
└── README.md # Usage example, inputs table, outputs table
Depending on the service, it might also generate iam.tf, policies.tf, security_groups.tf, data.tf, or templates.tf — the skill blueprint decides based on what the service actually needs.
Every module follows these conventions:
- No hardcoded values — regions, account IDs, and credentials are always variables
- Provider version locking — `>= x.y.z` constraints in `versions.tf`
- Tag merging — every resource gets `merge({"Name" = ...}, var.tags, var.<resource>_tags)`
- Conditional resources — `count` for simple on/off, `for_each` for collections
- Safe outputs — `try(resource[0].id, null)` for conditional resources
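Put together, those conventions look like this in a module file. This is an illustrative fragment, not actual output from the agent:

```hcl
variable "bucket_name" { type = string }
variable "create_bucket" {
  type    = bool
  default = true
}
variable "tags" {
  type    = map(string)
  default = {}
}
variable "bucket_tags" {
  type    = map(string)
  default = {}
}

resource "aws_s3_bucket" "this" {
  count  = var.create_bucket ? 1 : 0 # conditional resource via count
  bucket = var.bucket_name           # no hardcoded values
  tags   = merge({ "Name" = var.bucket_name }, var.tags, var.bucket_tags)
}

output "bucket_id" {
  value = try(aws_s3_bucket.this[0].id, null) # safe when count = 0
}
```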
The agent uses a three-tier LLM configuration — different models for different jobs:
| Tier | Used by | Default | Why |
|---|---|---|---|
| Standard | Validator, routing | `gemini-3.1-flash-lite-preview` | Fast and cheap for yes/no decisions |
| Higher | Planner, supervisor | `gemini-3.1-pro-preview` | Better reasoning for research and planning |
| Deep Agent | Coordinator, generator | `gemini-3.1-pro-preview` | Full capability for multi-step code generation |
Switching LLM providers: Set `LLM_PROVIDER` (or `LLM_HIGHER_PROVIDER`, `LLM_DEEPAGENT_PROVIDER`) to `openai`, `anthropic`, `google_genai`, or `azure`. The system supports all of them out of the box.
For the full list of configuration options, see aws_orchestrator_agent/config/default.py.
aws-orchestrator-agent/
├── aws_orchestrator_agent/
│ ├── server.py # A2A server entry point (Uvicorn + Starlette)
│ ├── card/
│ │ └── aws_orchestrator_agent.json # A2A agent card
│ ├── config/
│ │ ├── config.py # Config management (env → defaults → overrides)
│ │ └── default.py # Default values
│ ├── core/
│ │ ├── a2a_executor.py # A2A task executor
│ │ └── agents/
│ │ ├── aws_orchestrator_supervisor.py # Supervisor agent (router)
│ │ ├── types.py # BaseAgent, BaseDeepAgent, AgentResponse
│ │ └── tf_operator/
│ │ ├── tf_cordinator.py # TF Coordinator (deep agent)
│ │ ├── subagents.py # Sub-agent specs + JIT MCP wrappers
│ │ ├── middleware.py # Deep agent middleware chain
│ │ ├── backends/ # Terraform-specific backends
│ │ ├── tools/ # Coordinator-level tools (sync, HITL)
│ │ ├── tf_planner/ # Planner supervisor sub-graph
│ │ ├── tf_generator/ # Generator utilities
│ │ ├── tf_validator/ # Validation utilities
│ │ └── tf_updater/ # Update utilities
│ └── utils/ # Logger, LLM factory, MCP client
├── skills/ # Service-specific skill directories
├── memory/ # Persistent agent memory files
├── workspace/ # Generated Terraform modules (output)
├── a2ui_extenstion/ # TalkOps UI agent extension (A2UI)
├── docker-compose.yml # Full stack: Orchestrator + TalkOps UI
├── Dockerfile # Multi-stage build (Python 3.12 + Terraform CLI + MCP server)
├── pyproject.toml # Metadata, dependencies, build config
└── uv.lock # Locked dependencies
Phase 1 — Module Generation (shipped):
- Multi-agent Terraform module generation pipeline
- MCP-powered research (Terraform Registry + GitHub)
- Skill-based code generation with per-service blueprints
- Sandbox validation (terraform init, fmt, validate)
- Human-in-the-loop approval gates (commit gate, next steps, destructive ops)
- GitHub commit via MCP tools
- Module update workflow (fetch → plan → edit → validate → commit)
- Persistent agent memory across sessions
- A2A protocol + A2UI interactive components
- Multi-provider LLM support (Gemini, OpenAI, Anthropic, Bedrock, Azure)
- Docker deployment with TalkOps UI
Phase 2 — Extended AWS Coverage (in progress):
- Parallel module generation (multiple services in one request)
- Modify and update existing modules through conversation
- Expanded service skill library (EKS, RDS, Lambda, ALB, CloudFront, etc.)
- Terraform state backend auto-configuration
- Module dependency graph (outputs → inputs wiring across modules)
Phase 3 — Operations & Governance:
- Drift detection against deployed infrastructure
- Cost estimation before commit
- Custom compliance policy engine
- Terraform plan preview (dry-run before apply)
Phase 4 — Infrastructure Intelligence:
- Infrastructure knowledge base (best practices, common pitfalls, optimization)
- Module monitoring and version management
- Self-healing modules (auto-fix on provider updates)
- Custom agent plugin system
See open issues for the full list.
Contributions are welcome. The process is straightforward:
- Fork the repo
- Create a branch (`git checkout -b feature/your-feature`)
- Make your changes and commit
- Push and open a PR
If you're considering something bigger, open an issue first so we can align on the approach.
Please follow the existing code style (enforced by Ruff), add tests for new features, and make sure pytest passes before submitting.
Which AWS services does it support?
Any service supported by the AWS Terraform provider. It doesn't have a hardcoded list — the planner researches each service dynamically via the Terraform MCP server. VPC, S3, EC2, RDS, EKS, Lambda, IAM, CloudFront, etc. all work.
Which LLMs work?
Google Gemini, OpenAI, Anthropic, AWS Bedrock, and Azure OpenAI. Set LLM_PROVIDER and LLM_MODEL in your .env. The default config uses Gemini.
Will it commit code without asking?
No. Two mandatory approval gates: (1) after validation passes — push to GitHub or keep local, (2) after completion — generate another module or done. Destructive operations always require explicit approval, no exceptions.
Does it work with private GitHub repos?
Yes — your GITHUB_PERSONAL_ACCESS_TOKEN needs repo scope.
How does it avoid hallucinating provider configs?
The planner queries the Terraform Registry MCP server for the latest provider documentation before generating any code. It writes a skill blueprint with the exact resource arguments, variable types, and provider version constraints — the generator follows this blueprint, not its training data.
What if the generated module fails validation?
The coordinator re-dispatches the generator with the error details and re-runs validation. If it fails again after retry, it reports the errors and stops — it doesn't loop forever.
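That policy is roughly a bounded generate-validate loop. A sketch with illustrative function names, not the coordinator's actual API:

```python
def generate_with_retry(generate, validate, request, max_retries: int = 1):
    """Run generate then validate; on failure, feed the structured errors
    back into one regeneration attempt, then give up rather than loop."""
    errors = None
    for _ in range(max_retries + 1):
        files = generate(request, previous_errors=errors)
        report = validate(files)
        if report["status"] == "VALID":
            return files
        errors = report["errors"]
    raise RuntimeError(f"Validation still failing after retry: {errors}")
```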
Does it need Terraform CLI installed?
For Docker: no. The Docker image includes Terraform CLI and the Terraform MCP server. For local development: yes, you need Terraform CLI installed on your machine.
How do I connect a client?
AWS Orchestrator speaks the A2A protocol. Any A2A client works. The included docker-compose.yml ships with TalkOps UI at localhost:8080. You can also use the CLI client in aws_orchestrator_client/.
Apache 2.0 — see LICENSE.
TalkOps AI — github.com/talkops-ai
Project: github.com/talkops-ai/aws-orchestrator-agent
Discord: Join the community
- LangChain + LangGraph — agent orchestration
- deepagents — deep agent framework and middleware
- Google A2A Protocol — agent-to-agent communication
- A2UI — interactive UI protocol
- Terraform MCP Server — real-time registry queries
- GitHub MCP Server — GitHub operations via MCP
- uv — Python package management

