agent-cost-monitor

Stop guessing what your AI agents cost. Start knowing.

Why? · Features · Quick Start · SDK Integration · Sessions · Budget Alerts · Anomaly Detection · Persistence · Export & Reporting · Rate Monitoring · CLI · Supported Models · API Reference

Why?

AI agents don't make one API call -- they make dozens. Across different models, different providers, different tasks. Costs spiral silently until the invoice arrives.

No visibility -- you have no idea which agent task burned through your budget until the bill comes
No guardrails -- a single runaway loop can drain your API credits in minutes
No attribution -- when costs spike, you can't pinpoint which model, session, or task is responsible

agent-cost-monitor solves all three. Drop it into any Python agent and get real-time cost tracking, budget enforcement, anomaly detection, and per-task attribution -- across Claude, GPT, and Gemini.

Features

	Feature	What it does
📊	Multi-provider pricing	Built-in rates for 9 models across Anthropic, OpenAI, and Google
🛡️	Budget enforcement	Callback alerts, hard-stop exceptions, or both
🔌	SDK wrappers	`wrap_anthropic()` / `wrap_openai()` auto-track every call (sync + async)
🏷️	Decorator pattern	`@track_usage` / `@async_track_usage` for custom functions
📁	Session tracking	Per-task cost attribution with named sessions and context managers
💾	Persistence	`save()` / `load()` / `auto_save` for durable state across restarts
📄	Export	`to_json()`, `to_csv()`, and `report()` formatted tables
🚨	Anomaly detection	Automatic 3x cost-spike alerts with callback hooks
⏱️	Rate tracking	`cost_per_minute()` and `requests_per_minute()` in real time
💻	CLI demo	`python -m agent_cost_monitor demo` for instant visualization
⚙️	History cap	Bounded memory via configurable `max_history` (default 10,000)

Quick Start

Install

# Run from the root of a cloned copy of this repository
pip install -e .

5-Line Usage

from agent_cost_monitor import CostTracker

tracker = CostTracker(budget=1.00)
tracker.record("claude-sonnet-4-6", input_tokens=2000, output_tokens=800)
tracker.record("gpt-4o", input_tokens=1000, output_tokens=400)
print(f"Total: ${tracker.total_cost:.4f} | Over budget: {tracker.is_over_budget}")

SDK Integration

Anthropic (sync)

import anthropic
from agent_cost_monitor import CostTracker, wrap_anthropic

client = anthropic.Anthropic()
tracker = CostTracker(budget=5.00)
wrap_anthropic(client, tracker)

# Every call is now automatically tracked -- no other changes needed
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(f"Running total: ${tracker.total_cost:.6f}")

Anthropic (async)

import asyncio

import anthropic
from agent_cost_monitor import CostTracker, wrap_anthropic_async

client = anthropic.AsyncAnthropic()
tracker = CostTracker(budget=5.00)
wrap_anthropic_async(client, tracker)

async def main():
    # await is only valid inside a coroutine
    response = await client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(f"Running total: ${tracker.total_cost:.6f}")

asyncio.run(main())

OpenAI (sync)

from openai import OpenAI
from agent_cost_monitor import CostTracker, wrap_openai

client = OpenAI()
tracker = CostTracker(budget=5.00)
wrap_openai(client, tracker)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(f"Running total: ${tracker.total_cost:.6f}")

OpenAI (async)

import asyncio

from openai import AsyncOpenAI
from agent_cost_monitor import CostTracker, wrap_openai_async

client = AsyncOpenAI()
tracker = CostTracker(budget=5.00)
wrap_openai_async(client, tracker)

async def main():
    # await is only valid inside a coroutine
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(f"Running total: ${tracker.total_cost:.6f}")

asyncio.run(main())

Decorator Pattern

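import anthropic
from openai import AsyncOpenAI
from agent_cost_monitor import CostTracker, track_usage, async_track_usage

# Clients used by the decorated functions below
anthropic_client = anthropic.Anthropic()
openai_client = AsyncOpenAI()
tracker = CostTracker()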

@track_usage(tracker, model="claude-sonnet-4-6")
def call_claude(prompt):
    return anthropic_client.messages.create(
        model="claude-sonnet-4-6", max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )

# Async version
@async_track_usage(tracker, model="gpt-4o")
async def call_gpt(prompt):
    return await openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )

response = call_claude("Summarize this document")
print(tracker.summary())

Note: The model parameter is optional. If omitted, the decorator reads response.model automatically.
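Reading the model from the response implies the decorator inspects the returned object. Anthropic message responses report `usage.input_tokens` and `usage.output_tokens`, while OpenAI chat completions report `usage.prompt_tokens` and `usage.completion_tokens`. The sketch below shows one way to normalize both shapes. `extract_usage` is a hypothetical helper, not part of this package, and the rule that an explicit `model` argument takes precedence is an assumption.

```python
from types import SimpleNamespace

def extract_usage(response, model=None):
    """Return (model, input_tokens, output_tokens) from an SDK-style response."""
    usage = response.usage
    # Try Anthropic-style field names first, then OpenAI-style ones.
    input_tokens = getattr(usage, "input_tokens", None)
    if input_tokens is None:
        input_tokens = getattr(usage, "prompt_tokens", 0)
    output_tokens = getattr(usage, "output_tokens", None)
    if output_tokens is None:
        output_tokens = getattr(usage, "completion_tokens", 0)
    # Assumption: an explicit model wins; otherwise use response.model.
    return (model or response.model, input_tokens, output_tokens)

anthropic_like = SimpleNamespace(
    model="claude-sonnet-4-6",
    usage=SimpleNamespace(input_tokens=2000, output_tokens=800),
)
openai_like = SimpleNamespace(
    model="gpt-4o",
    usage=SimpleNamespace(prompt_tokens=1000, completion_tokens=400),
)
print(extract_usage(anthropic_like))
print(extract_usage(openai_like, model="gpt-4o-mini"))
```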

Sessions

Track costs per-task with named sessions. Sessions act as scoped windows into the same tracker -- every session record also rolls up into the global total.

from agent_cost_monitor import CostTracker

tracker = CostTracker(budget=10.00)

# Use as a context manager
with tracker.session("research") as s:
    s.record("claude-sonnet-4-6", input_tokens=5000, output_tokens=2000)
    s.record("gemini-2.5-pro", input_tokens=3000, output_tokens=1500)
    print(f"Research cost: ${s.total_cost:.4f}")

with tracker.session("writing") as s:
    s.record("gpt-4o", input_tokens=2000, output_tokens=4000)
    print(f"Writing cost: ${s.total_cost:.4f}")

# See cost breakdown by session
print(tracker.cost_by_session())
# {'research': 0.06375, 'writing': 0.045}

# Global total includes everything
print(f"Total across all sessions: ${tracker.total_cost:.4f}")

Sessions expose total_cost, total_input_tokens, total_output_tokens, and summary().
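The roll-up behaviour can be pictured with a toy model in which a session keeps its own total and forwards every record to its parent. This is a sketch only: `MiniTracker` and `MiniSession` are hypothetical, and `record()` takes a cost directly here, unlike the package's `record(model, input_tokens, output_tokens)`.

```python
class MiniTracker:
    """Toy tracker: keeps a global total plus per-session totals."""
    def __init__(self):
        self.total_cost = 0.0
        self.sessions = {}

    def session(self, name):
        # Create the session on first use, reuse it afterwards.
        if name not in self.sessions:
            self.sessions[name] = MiniSession(name, self)
        return self.sessions[name]

    def cost_by_session(self):
        return {name: s.total_cost for name, s in self.sessions.items()}

class MiniSession:
    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        self.total_cost = 0.0

    def record(self, cost):
        self.total_cost += cost         # scoped to this session
        self.parent.total_cost += cost  # rolled up into the global total

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # never swallow exceptions

mini = MiniTracker()
with mini.session("research") as s:
    s.record(0.25)
    s.record(0.5)
with mini.session("writing") as s:
    s.record(0.125)
print(mini.cost_by_session())
print(mini.total_cost)
```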

Budget Alerts

Callback

Get notified when spending crosses the threshold:

from agent_cost_monitor import CostTracker

def alert(usage, tracker):
    print(f"WARNING: Budget exceeded! Spend: ${tracker.total_cost:.4f}")

tracker = CostTracker(budget=0.50, on_budget_exceeded=alert)

Exception

Hard-stop to prevent runaway costs:

from agent_cost_monitor import CostTracker, BudgetExceededError

tracker = CostTracker(budget=0.50, raise_on_budget=True)

try:
    tracker.record("claude-opus-4-6", input_tokens=100_000, output_tokens=50_000)
except BudgetExceededError as e:
    print(f"Stopped: {e}")
    # Stopped: Budget of 0.5 exceeded: total cost is 5.250000

Both

Use a callback for logging and an exception for enforcement:

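import logging

from agent_cost_monitor import CostTracker

log = logging.getLogger(__name__)

tracker = CostTracker(
    budget=1.00,
    # Lazy %-formatting: the message is only built if the warning is emitted
    on_budget_exceeded=lambda u, t: log.warning("Over budget: $%.4f", t.total_cost),
    raise_on_budget=True,
)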
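For intuition about combining the two mechanisms, the sketch below checks the budget after each recorded cost, calls the callback, and then raises. It is a stand-in, not the package's code: `MiniBudget` and `BudgetExceeded` are hypothetical names, and the order (notify first, then raise) is an assumption.

```python
class BudgetExceeded(Exception):
    """Hypothetical stand-in for BudgetExceededError."""

class MiniBudget:
    def __init__(self, budget, on_exceeded=None, raise_on_budget=False):
        self.budget = budget
        self.on_exceeded = on_exceeded
        self.raise_on_budget = raise_on_budget
        self.total_cost = 0.0

    def add(self, cost):
        self.total_cost += cost
        if self.budget is not None and self.total_cost > self.budget:
            # Assumed order: notify first, then enforce.
            if self.on_exceeded is not None:
                self.on_exceeded(cost, self)
            if self.raise_on_budget:
                raise BudgetExceeded(
                    f"Budget of {self.budget} exceeded: "
                    f"total cost is {self.total_cost:.6f}"
                )

events = []
guard = MiniBudget(
    1.00,
    on_exceeded=lambda c, g: events.append(g.total_cost),
    raise_on_budget=True,
)
guard.add(0.75)      # under budget: nothing happens
try:
    guard.add(0.5)   # 1.25 > 1.00: callback runs, then the exception is raised
except BudgetExceeded as e:
    print(f"Stopped: {e}")
print(events)
```

Note that in this sketch the cost is added before the check, so the call that crosses the limit is still counted in the total.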

Anomaly Detection

Automatically detect cost spikes. When any single request costs more than 3x the running average (after at least 5 prior records), the on_anomaly callback fires.

from agent_cost_monitor import CostTracker

def spike_alert(anomaly, usage, tracker):
    print(f"ANOMALY: {anomaly['type']} detected!")
    print(f"  Cost: ${anomaly['cost']:.4f} (avg: ${anomaly['avg_cost']:.4f})")
    print(f"  Ratio: {anomaly['ratio']:.1f}x the average")

tracker = CostTracker(on_anomaly=spike_alert)

# Build up a baseline of cheap calls
for _ in range(6):
    tracker.record("gpt-4o-mini", input_tokens=100, output_tokens=50)

# This expensive call triggers the anomaly alert
tracker.record("claude-opus-4-6", input_tokens=50_000, output_tokens=20_000)
# ANOMALY: spike detected!
#   Cost: $2.2500 (avg: $0.0000)
#   Ratio: 50000.0x the average
# (each baseline call costs $0.000045 at the built-in rates, which rounds to $0.0000)
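The 3x rule described above can be sketched in a few lines. `check_spike` is a hypothetical function, not the package's `check_anomaly`. It assumes the average is taken over prior records only (excluding the one being checked) and reuses the dictionary keys that the callback example reads.

```python
def check_spike(prior_costs, cost, min_history=5, threshold=3.0):
    """Return an anomaly dict if cost exceeds threshold x the prior average."""
    if len(prior_costs) < min_history:
        return None  # not enough history for a meaningful baseline
    avg = sum(prior_costs) / len(prior_costs)
    if avg <= 0 or cost <= threshold * avg:
        return None
    return {"type": "spike", "cost": cost, "avg_cost": avg, "ratio": cost / avg}

baseline = [0.01] * 6
print(check_spike(baseline, 0.02))      # None: only about 2x the average
print(check_spike(baseline, 0.05))      # a spike dict: about 5x the average
print(check_spike(baseline[:3], 0.05))  # None: fewer than 5 prior records
```

Requiring a minimum history keeps the first few calls of a run from being flagged simply because no baseline exists yet.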

Persistence

Save and Load

from agent_cost_monitor import CostTracker

tracker = CostTracker(budget=5.00)
tracker.record("claude-sonnet-4-6", input_tokens=1000, output_tokens=500)

# Save state to disk
tracker.save("costs.json")

# Load it back later -- budget and history are restored
restored = CostTracker.load("costs.json")
print(f"Restored cost: ${restored.total_cost:.6f}")

Auto-save

Automatically persist after every record() call:

tracker = CostTracker(budget=5.00, auto_save="costs.json")

# Every record() call now writes state to disk automatically
tracker.record("gpt-4o", input_tokens=1000, output_tokens=500)
# costs.json is updated immediately

Note: load() returns a fresh empty tracker if the file is missing or corrupted -- no exceptions to handle.
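A load that never raises can be built by catching file and parse errors and returning an empty state instead. The sketch below illustrates that pattern with plain dictionaries. `load_state`, `save_state`, and the `{"budget", "history"}` layout are assumptions for illustration, not the package's file format. The temp-file-then-rename write is a common precaution when saving after every record, not something this README claims the package does.

```python
import json
import os
import tempfile

def load_state(path):
    """Return saved state, or a fresh empty state if the file is unusable."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except (OSError, json.JSONDecodeError):
        # Missing, unreadable, or corrupted file: start from scratch.
        return {"budget": None, "history": []}

def save_state(path, state):
    """Write to a temp file, then rename, so a crash cannot leave a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp, path)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "costs.json")
    print(load_state(path))            # missing file: empty state
    save_state(path, {"budget": 5.0, "history": []})
    print(load_state(path)["budget"])  # value restored from disk
    with open(path, "w", encoding="utf-8") as f:
        f.write("{not json")
    print(load_state(path))            # corrupted file: empty state
```

One trade-off of a silent fallback is that a corrupted file looks the same as a first run, so logging the fallback may be worthwhile in production.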

Export & Reporting

Formatted Report

print(tracker.report())

+======================================+
|     Agent Cost Monitor Report        |
+======================================+
| Total Cost:        $0.031950         |
| Total Requests:    5                 |
| Budget:            $1.00 (3.2% used) |
+--------------------------------------+
| Cost by Model:                       |
|   claude-sonnet-4-6   $0.021000      |
|   gpt-4o-mini         $0.001950      |
|   gemini-2.5-flash    $0.001170      |
|   gpt-4o              $0.006500      |
+======================================+

JSON Export

json_str = tracker.to_json()
with open("costs.json", "w") as f:
    f.write(json_str)

[
  {
    "timestamp": "2026-03-25T12:00:00+00:00",
    "model": "claude-sonnet-4-6",
    "input_tokens": 2000,
    "output_tokens": 800,
    "cost": 0.018
  }
]

CSV Export

csv_str = tracker.to_csv()
with open("costs.csv", "w") as f:
    f.write(csv_str)

timestamp,model,input_tokens,output_tokens,cost
2026-03-25T12:00:00+00:00,claude-sonnet-4-6,2000,800,0.018
2026-03-25T12:00:00+00:00,gpt-4o,1000,400,0.0065
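The CSV output can be consumed with the standard library alone. As a small illustration, this sketch parses text in the format shown above and totals the `cost` column per model. `cost_by_model` here is a hypothetical helper that works on exported text, separate from the tracker's own `cost_by_model()` method.

```python
import csv
import io
from collections import defaultdict

csv_text = """timestamp,model,input_tokens,output_tokens,cost
2026-03-25T12:00:00+00:00,claude-sonnet-4-6,2000,800,0.018
2026-03-25T12:00:00+00:00,gpt-4o,1000,400,0.0065
"""

def cost_by_model(text):
    """Sum the cost column per model from CSV text in the exported format."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text)):
        totals[row["model"]] += float(row["cost"])
    return dict(totals)

print(cost_by_model(csv_text))
```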

Rate Monitoring

Track how fast you're spending:

tracker = CostTracker()

# ... after some API calls ...

print(f"Burn rate: ${tracker.cost_per_minute():.4f}/min")
print(f"Request rate: {tracker.requests_per_minute():.1f} req/min")

Both methods compute averages over the interval between the first and last recorded usage timestamps. Both return 0.0 if fewer than 2 records exist.
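As a worked example of that calculation, the sketch below derives both rates from a list of `(timestamp, cost)` pairs. `per_minute_rates` is hypothetical. Dividing the full request count by the elapsed minutes and returning zeros when no time has elapsed are assumptions, since the package may handle those details differently.

```python
from datetime import datetime

def per_minute_rates(records):
    """records: list of (iso_timestamp, cost). Returns (cost/min, requests/min)."""
    if len(records) < 2:
        return 0.0, 0.0
    first = datetime.fromisoformat(records[0][0])
    last = datetime.fromisoformat(records[-1][0])
    minutes = (last - first).total_seconds() / 60
    if minutes <= 0:
        return 0.0, 0.0  # all records share one timestamp; avoid dividing by zero
    total_cost = sum(cost for _, cost in records)
    return total_cost / minutes, len(records) / minutes

records = [
    ("2026-03-25T12:00:00+00:00", 0.25),
    ("2026-03-25T12:01:00+00:00", 0.25),
    ("2026-03-25T12:02:00+00:00", 0.5),
]
print(per_minute_rates(records))  # (0.5, 1.5): $1.00 and 3 requests over 2 minutes
```

Because the window spans the whole history, a long idle gap lowers the reported rate even if recent calls were rapid.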

CLI

Run the built-in demo to see the tracker in action:

python -m agent_cost_monitor demo

Sample output:

+======================================+
|     Agent Cost Monitor Report        |
+======================================+
| Total Cost:        $0.031950         |
| Total Requests:    5                 |
| Budget:            $1.00 (3.2% used) |
+--------------------------------------+
| Cost by Model:                       |
|   claude-sonnet-4-6   $0.021000      |
|   gpt-4o-mini         $0.001950      |
|   gemini-2.5-flash    $0.001170      |
|   gpt-4o              $0.006500      |
+======================================+

--- JSON export (first 3 lines) ---
[
  {
    "timestamp": "2026-03-25T...",
...

--- CSV export ---
timestamp,model,input_tokens,output_tokens,cost
...

Supported Models

All pricing is built-in. No configuration required.

Provider	Model	Input (per 1M tokens)	Output (per 1M tokens)
Anthropic	`claude-opus-4-6`	$15.00	$75.00
Anthropic	`claude-sonnet-4-6`	$3.00	$15.00
Anthropic	`claude-haiku-4-5`	$0.80	$4.00
OpenAI	`gpt-4o`	$2.50	$10.00
OpenAI	`gpt-4o-mini`	$0.15	$0.60
OpenAI	`gpt-4.1`	$2.00	$8.00
OpenAI	`gpt-4.1-mini`	$0.40	$1.60
Google	`gemini-2.5-pro`	$1.25	$10.00
Google	`gemini-2.5-flash`	$0.15	$0.60

Unknown models automatically fall back to default pricing ($3.00 / $15.00 per 1M tokens). You never need to configure pricing manually.
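The arithmetic behind these rates is simple: tokens divided by one million, multiplied by the per-1M rate, summed over input and output. The sketch below reproduces it for a few rows of the table. `PRICING`, `DEFAULT_PRICING`, and `estimate_cost` are hypothetical names, not the package's internals.

```python
# (input, output) USD per 1M tokens, copied from the table above.
PRICING = {
    "claude-opus-4-6": (15.00, 75.00),
    "claude-sonnet-4-6": (3.00, 15.00),
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}
DEFAULT_PRICING = (3.00, 15.00)  # fallback for unknown models, per the note above

def estimate_cost(model, input_tokens, output_tokens):
    """Cost in USD for one call at the per-1M-token rates."""
    input_rate, output_rate = PRICING.get(model, DEFAULT_PRICING)
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

print(round(estimate_cost("claude-sonnet-4-6", 2000, 800), 6))   # 0.018
print(round(estimate_cost("gpt-4o", 1000, 400), 6))              # 0.0065
print(round(estimate_cost("some-unknown-model", 1000, 1000), 6)) # 0.018 (fallback)
```

A consequence of the fallback is that a misspelled or unlisted model name is still costed, at the default rates, so the estimate can differ from the provider's actual bill.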

API Reference

`CostTracker`

CostTracker(
    budget=None,              # Optional spending limit in USD
    max_history=10_000,       # Max records kept in memory (oldest evicted)
    on_budget_exceeded=None,  # Callback: fn(usage, tracker)
    raise_on_budget=False,    # Raise BudgetExceededError when over budget
    auto_save=None,           # File path for auto-saving after every record()
    on_anomaly=None,          # Callback: fn(anomaly_dict, usage, tracker)
)
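The "oldest evicted" behaviour of `max_history` matches what a bounded `collections.deque` provides. The sketch below shows that eviction pattern with a tiny cap. Whether the package uses a deque internally is an assumption, and this README does not say whether running totals still include evicted records.

```python
from collections import deque

history = deque(maxlen=3)  # tiny cap for illustration; the documented default is 10,000
for i in range(5):
    history.append({"request": i})

# The two oldest entries (0 and 1) were dropped automatically.
print(list(history))
```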

Methods

Method	Returns	Description
`record(model, input_tokens, output_tokens)`	`Usage`	Record a single API call
`summary()`	`dict`	Cost, tokens, request count, and budget status
`cost_by_model()`	`dict`	Map of model name to total cost
`session(name)`	`Session`	Create or retrieve a named session
`cost_by_session()`	`dict`	Map of session name to total cost
`check_anomaly(usage)`	`dict \| None`	Check if a usage record is anomalous
`cost_per_minute()`	`float`	Average cost per minute
`requests_per_minute()`	`float`	Average requests per minute
`report()`	`str`	Formatted ASCII table report
`to_json()`	`str`	Usage history as JSON string
`to_csv()`	`str`	Usage history as CSV string
`save(path)`	`None`	Save full state to a JSON file
`reset()`	`None`	Clear all recorded usage data

Class Methods

Method	Returns	Description
`CostTracker.load(path)`	`CostTracker`	Load state from file (returns empty tracker if file missing/corrupt)

Properties

Property	Type	Description
`total_cost`	`float`	Running total cost in USD
`total_input_tokens`	`int`	Total input tokens across all calls
`total_output_tokens`	`int`	Total output tokens across all calls
`is_over_budget`	`bool`	`True` if total cost exceeds budget

`Session`

Returned by tracker.session(name). Supports use as a context manager.

Member	Type	Description
`name`	`str`	Session name
`record(model, input_tokens, output_tokens)`	`Usage`	Record usage (also recorded on parent tracker)
`total_cost`	`float`	Session cost in USD
`total_input_tokens`	`int`	Session input tokens
`total_output_tokens`	`int`	Session output tokens
`summary()`	`dict`	Session name, cost, tokens, and request count

`Usage`

Dataclass returned by record().

Field	Type	Description
`model`	`str`	Model name
`input_tokens`	`int`	Input token count
`output_tokens`	`int`	Output token count
`timestamp`	`str`	ISO 8601 UTC timestamp
`cost`	`float`	Computed cost in USD (property)

`BudgetExceededError`

Exception raised when raise_on_budget=True and total cost exceeds the budget. Inherits from Exception.

Functions

Function	Description
`wrap_anthropic(client, tracker)`	Auto-track `client.messages.create()` calls
`wrap_openai(client, tracker)`	Auto-track `client.chat.completions.create()` calls
`wrap_anthropic_async(client, tracker)`	Auto-track async Anthropic calls
`wrap_openai_async(client, tracker)`	Auto-track async OpenAI calls
`track_usage(tracker, model=None)`	Sync decorator for functions returning SDK-style responses
`async_track_usage(tracker, model=None)`	Async decorator for functions returning SDK-style responses

License

MIT
