Each HTTP request to the metadata service triggers one or more database queries. Currently there is no per-request visibility into how many queries are executed, how many rows are returned, or how long queries take. For agent-facing workloads — where a single logical client operation fans out into many HTTP requests, each triggering its own DB activity — this hidden multiplier is the key unknown.
All database access in the metadata service flows through the `execute_sql()` method in `postgres_async_db.py`. Adding opt-in query-level logging here would make the full DB cost of any API operation measurable, forming the server-side counterpart to the client-side instrumentation in issue #4.
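As a rough illustration of the shape this could take (all names here, including the `METADATA_QUERY_LOGGING` flag and the `_run_query` stand-in, are hypothetical and not existing service code), an opt-in wrapper inside `execute_sql()` might look like:

```python
import asyncio
import logging
import os
import time

logger = logging.getLogger("query_audit")

# Hypothetical opt-in switch; off by default so the instrumentation is a no-op.
QUERY_LOGGING_ENABLED = os.environ.get("METADATA_QUERY_LOGGING", "0") == "1"


async def _run_query(sql):
    """Stand-in for the real cursor logic, so this sketch is runnable."""
    await asyncio.sleep(0)
    return [("row",)] * 3


async def execute_sql(sql):
    # Fast path: when logging is disabled, behave exactly as before.
    if not QUERY_LOGGING_ENABLED:
        return await _run_query(sql)
    start = time.perf_counter()
    rows = await _run_query(sql)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    logger.info("query=%r rows=%d elapsed_ms=%.2f", sql, len(rows), elapsed_ms)
    return rows
```

With the flag unset, the only added cost per call is a single boolean check, which keeps the disabled path a practical no-op.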
Goals
- Add opt-in per-request query logging to `execute_sql()` that records: query count, rows returned, and execution time
- Verify this captures DB activity for at least two different API endpoints (e.g. `GET /flows/{id}/runs` and `GET /flows/{id}/runs/{id}/steps/{name}/tasks`)
- Confirm the instrumentation is a no-op when disabled
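To make the counts per-request rather than global, a task-local accumulator is one option; since the service is async, `contextvars` keeps concurrent requests from mixing counters. A minimal sketch, with all names hypothetical:

```python
import contextvars
from dataclasses import dataclass


@dataclass
class QueryStats:
    queries: int = 0
    rows: int = 0
    elapsed_ms: float = 0.0


# A ContextVar is task-local under asyncio, so each in-flight HTTP request
# sees its own accumulator; None means logging is not active for this request.
_request_stats = contextvars.ContextVar("request_stats", default=None)


def start_request():
    """Call at the top of a request handler when logging is enabled."""
    stats = QueryStats()
    _request_stats.set(stats)
    return stats


def record_query(row_count, elapsed_ms):
    """Called from execute_sql(); a no-op when no accumulator is active."""
    stats = _request_stats.get()
    if stats is None:
        return
    stats.queries += 1
    stats.rows += row_count
    stats.elapsed_ms += elapsed_ms
```

At the end of the request, the handler would log the accumulated totals; when `start_request()` is never called, `record_query()` returns immediately, satisfying the disabled-is-a-no-op goal.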
Instructions
This issue is a venue for questions and design discussion. Keep the instrumentation orthogonal to any changes in API behaviour or schema — this is observability only.
Use the dev stack (see Resources) to run the service locally. The Swagger UI at http://localhost:8080/api/doc is useful for manually triggering individual endpoints while observing query logs.
When you have met the goals, show your work in a fork, with an issue or PRs explaining what you did. Add comments describing what you achieved on this issue thread, and reference your fork/PR/issue here.
Resources
Dev stack:
- Setting Up the Dev Stack — run `metaflow-dev up` to get a full local environment; use `metaflow-dev shell` for a pre-configured shell pointing at it
- Swagger UI: http://localhost:8080/api/doc (once the stack is running) — fire individual endpoints and observe query behaviour
Key code:
- `services/data/postgres_async_db.py#L248` — `execute_sql()`, the single path for all DB queries
- `services/data/postgres_async_db.py#L149` — `AsyncPostgresDB` singleton, manages the connection pool
- `services/data/db_utils.py#L10` — `DBResponse` and `DBPagination` named tuples returned by all queries
- `services/metadata_service/api/task.py` — a good endpoint to trace end-to-end as a starting point

Docs: