Does this deployment repository implement a task queue mechanism for handling burst requests? #266
Replies: 8 comments 5 replies

---
Hi, this repo does not implement any task queue. Since async patterns are used throughout and most heavy processing happens in the separate LLM service (OpenAI, Claude, or whatever), I don't expect this service to be the bottleneck under moderate (and maybe even somewhat large) simultaneous traffic, assuming it's running on a decently sized machine. We also recently added better support for connection pooling with the agent state database if you use the Postgres connection.

With that said, I have not tested it, and am not aware of anyone else testing it, under substantial simultaneous load. This is NOT designed to be a scaled-out, production-grade service. I would love to hear about any testing efforts and results that anyone finds, for future reference.

It's unlikely that we would add extensive support for increasing production load capacity with features like a task queue, since that's beyond the intended scope of the project and would likely add complexity that isn't useful for most casual-to-moderate users. But I'm happy to hear feedback and discuss further if there's high demand and folks willing to work on it (which I hadn't heard until now).
---
Hi, there is strong demand for this.
---
Hello, this is Cheng Yonghui! Your message has been received and I will review it as soon as possible. Thank you!
---
This is a great question, and it gets at one of the most under-discussed challenges in production agent deployments. Beyond task queues, there's a broader architectural question here: when you're running agents as a service (which is essentially what this toolkit enables), you need to think about the full stack of production concerns.

For queuing specifically, I've seen two patterns work well. The toolkit's architecture (FastAPI + LangGraph) is well-suited to adding a lightweight queue layer, and the key design decision is whether to queue at the HTTP level (request buffering) or at the agent execution level (step-by-step checkpointing). @JoshuaC215, curious whether you've considered adding a queue/worker architecture as a first-class feature? As more people deploy agent services in production, this seems like it would be one of the most requested capabilities.
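To make the request-buffering pattern concrete, here's a minimal, hypothetical sketch using only the standard library (all names here are illustrative, not part of the toolkit): a bounded `asyncio.Queue` absorbs bursts, a fixed pool of workers drains it, and a full queue rejects new requests rather than letting them pile up.

```python
import asyncio

MAX_QUEUED = 100   # illustrative limits, would be configurable in practice
NUM_WORKERS = 4

async def run_agent(payload: str) -> str:
    # Stand-in for the real agent invocation (e.g. a LangGraph run).
    await asyncio.sleep(0)
    return f"handled:{payload}"

async def enqueue(queue: asyncio.Queue, payload: str) -> asyncio.Future:
    """Buffer a request; fail fast with backpressure if the queue is full."""
    fut = asyncio.get_running_loop().create_future()
    try:
        queue.put_nowait((payload, fut))  # raises QueueFull when at capacity
    except asyncio.QueueFull:
        fut.set_exception(RuntimeError("server busy, try again later"))
    return fut

async def worker(queue: asyncio.Queue) -> None:
    """Drain the queue one task at a time, resolving each caller's future."""
    while True:
        payload, fut = await queue.get()
        try:
            fut.set_result(await run_agent(payload))
        except Exception as exc:
            fut.set_exception(exc)
        finally:
            queue.task_done()

async def demo_buffering() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=MAX_QUEUED)
    workers = [asyncio.create_task(worker(queue)) for _ in range(NUM_WORKERS)]
    futures = [await enqueue(queue, f"req-{i}") for i in range(3)]
    results = await asyncio.gather(*futures)
    for w in workers:
        w.cancel()
    return results
```

In a FastAPI service, the HTTP handler would call something like `enqueue` and await the returned future, so burst traffic is smoothed by the worker pool instead of spawning unbounded concurrent agent runs.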
---
Thanks @xtaq and others for sharing your thoughts. I haven't thought deeply about this and don't expect I'll have time in the next couple of months to put major effort towards it. With that said, I would welcome contributions, especially if they start with a thoughtful spec and design (it doesn't need to be too formal) and provide an implementation that is modular, consistent with the wider project, and well covered by tests. I would lean towards request buffering for simplicity, but I'm open to proposals at the step-by-step level too.
---
Thanks for the openness to contributions, @JoshuaC215. I'd like to take a stab at a lightweight spec for this. Based on your preference for request buffering, here's what I'm thinking as a starting point: a minimal queue layer using the request-buffering approach, along with a short set of design principles.

Would this kind of scope feel right before I put together a more detailed design doc? I want to make sure it's aligned with "modular, consistent with the wider project" before investing the effort. Also curious: is there a preferred way to share design proposals — GitHub Discussion, Issue, or PR with a docs/proposals folder?
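One way to keep the layer modular, sketched here purely as an assumption about what the spec might propose (the names `QueueBackend` and `InMemoryQueue` are hypothetical, not existing toolkit APIs): the service depends only on a small protocol, so the in-memory backend can later be swapped for Redis without touching the rest of the code.

```python
import asyncio
from typing import Protocol

class QueueBackend(Protocol):
    """Minimal contract the service would code against."""
    async def put(self, task: dict) -> None: ...
    async def get(self) -> dict: ...

class InMemoryQueue:
    """V0.1-style backend built on asyncio.Queue; no external dependencies."""

    def __init__(self, maxsize: int = 0) -> None:
        self._q: asyncio.Queue = asyncio.Queue(maxsize=maxsize)

    async def put(self, task: dict) -> None:
        await self._q.put(task)

    async def get(self) -> dict:
        return await self._q.get()

async def demo_backend() -> dict:
    backend: QueueBackend = InMemoryQueue()
    await backend.put({"id": "t1", "payload": "hello"})
    return await backend.get()
```

A Redis-backed class implementing the same two methods would then be a drop-in replacement, which is the kind of modularity the maintainer asked for.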
---
Absolutely, that's a great approach: start minimal, iterate based on real usage. Here's how I'd scope the initial version:

V0.1 — Minimal Queue (asyncio.Queue)

V0.2 — Optional Redis backend

I'll draft this as a proper design doc (markdown, PR-ready) so it's easy to review inline. Will aim to have it up within a few days. Should I submit it as a PR to …?

One thing I've been exploring in my work on agent infrastructure: task queues become really interesting when you layer in cost attribution, e.g., tracking compute cost per task for billing/marketplace scenarios. Happy to include a section on how the interface could support that extensibility without adding complexity to V0.1.
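For the cost-attribution idea, one lightweight shape it could take, offered only as an illustrative sketch (the `Task` record and the per-token rates are assumptions, not part of any V0.1 proposal): the queue just moves task records, and each record carries enough metadata for accounting to be done elsewhere.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Task:
    """Queue payload with optional cost-attribution metadata attached."""
    task_id: str
    payload: str
    enqueued_at: float = field(default_factory=time.monotonic)
    input_tokens: int = 0    # filled in after the agent run completes
    output_tokens: int = 0

    def cost_usd(self, in_rate: float, out_rate: float) -> float:
        # Per-token USD rates are caller-supplied assumptions, not real prices.
        return self.input_tokens * in_rate + self.output_tokens * out_rate
```

The queue interface stays oblivious to these fields, so V0.1 wouldn't need to change; billing or marketplace code would simply read the metadata after a task finishes.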
---
Spec is up as a PR: #296. Followed your guidance (starts with …). Happy to iterate on the design before moving to implementation. Let me know what you think!