bug: cache hits bypass the RAG and memory injection pipeline #1500

## Summary

Cache reads happen before RAG and memory injection in the request processing pipeline. When there's a cache hit, the response is returned immediately and RAG/memory retrieval never runs — even if the decision has RAG enabled or the user has conversation memory.

**Severity: Low** — this is a correctness issue, not a security issue. The user gets a valid (generic) answer, just not their personalized one.

## Pipeline order (current)

```
runRequestPreRoutingStages():
  1. applyRateLimitAndCacheChecks()  →  handleCaching() — cache READ here
  2. executeRAGPlugin()              →  sets ctx.RAGRetrievedContext
  3. prepareRequestForModelRouting() →  handleMemoryRetrieval() — sets ctx.MemoryContext
```

If step 1 returns a cache hit, steps 2-3 never execute.
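
The early return can be reproduced with a minimal sketch. Everything here is a hypothetical stand-in rather than the router's actual code: `RequestContext`, `runPreRouting`, and the exact-match map are illustrative only, and the real cache matches on semantic similarity, not exact strings.

```go
package main

import "fmt"

// RequestContext is a hypothetical stand-in for the router's request context.
type RequestContext struct {
	Query               string
	RAGRetrievedContext string
	MemoryContext       string
}

// runPreRouting mirrors the ordering described above: cache read first,
// then RAG, then memory. It returns the cached response and true on a hit.
// The map is a toy exact-match cache (assumption; the real one is semantic).
func runPreRouting(cache map[string]string, ctx *RequestContext) (string, bool) {
	// Step 1: cache read.
	if resp, ok := cache[ctx.Query]; ok {
		return resp, true // early return: steps 2 and 3 never run
	}
	// Step 2: RAG retrieval (placeholder value).
	ctx.RAGRetrievedContext = "docs for: " + ctx.Query
	// Step 3: memory retrieval (placeholder value).
	ctx.MemoryContext = "history for: " + ctx.Query
	return "", false
}

func main() {
	cache := map[string]string{"what is our refund policy?": "generic answer"}
	ctx := &RequestContext{Query: "what is our refund policy?"}
	resp, hit := runPreRouting(cache, ctx)
	fmt.Printf("hit=%v resp=%q rag=%q memory=%q\n",
		hit, resp, ctx.RAGRetrievedContext, ctx.MemoryContext)
}
```

On a hit, both context fields stay empty, which is the behavior this issue describes.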

## Impact

- A user whose decision has RAG enabled gets a generic cached response instead of a document-augmented one
- A user with memory enabled gets a cached response that ignores their conversation history
- This only affects requests where a semantically similar query was previously cached from a non-personalized request

## Possible fixes

1. **Move cache read after RAG/memory** — check cache only after all context augmentation, using the full augmented query as the cache key
2. **Skip cache reads for decisions with RAG or memory enabled** — simpler, but reduces cache effectiveness for those decisions
3. **Include a "personalization hash" in the cache key** — hash of (has_rag, has_memory, user_id) so personalized and generic queries don't collide
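
As a rough illustration of option 3, the sketch below derives a key prefix from (has_rag, has_memory, user_id). The `Personalization` type and `cacheKey` function are hypothetical names, and the choice to ignore the user ID for non-personalized requests is an assumption made here so that generic requests can still share entries. With a semantic cache the prefix would more likely act as a partition or namespace than as an exact-match key.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Personalization holds the three inputs named in option 3 (hypothetical type).
type Personalization struct {
	HasRAG    bool
	HasMemory bool
	UserID    string
}

// cacheKey prefixes the query with a short hash of the personalization inputs,
// so personalized and generic requests land in different key spaces.
func cacheKey(query string, p Personalization) string {
	userID := p.UserID
	if !p.HasRAG && !p.HasMemory {
		// Assumption: generic requests drop the user ID so they share entries.
		userID = ""
	}
	raw := fmt.Sprintf("%t|%t|%s", p.HasRAG, p.HasMemory, userID)
	sum := sha256.Sum256([]byte(raw))
	// Eight bytes of the digest keep the prefix short for this sketch.
	return hex.EncodeToString(sum[:8]) + ":" + query
}

func main() {
	q := "what is our refund policy?"
	fmt.Println(cacheKey(q, Personalization{UserID: "alice"}))
	fmt.Println(cacheKey(q, Personalization{HasRAG: true, UserID: "alice"}))
	fmt.Println(cacheKey(q, Personalization{HasRAG: true, UserID: "bob"}))
}
```

The three printed keys differ in their prefix, so a generic entry cannot satisfy a RAG-enabled request, and one user's personalized entry cannot satisfy another's.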

Option 2 is probably the right tradeoff — decisions with RAG/memory enabled should not serve cached responses since the response depends on user-specific context.
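
A minimal sketch of what option 2 could look like is a guard evaluated before the cache read. The `Decision` struct and its field names are assumptions for illustration, not the project's actual configuration types.

```go
package main

import "fmt"

// Decision is a hypothetical view of a routing decision's configuration.
type Decision struct {
	Name          string
	RAGEnabled    bool
	MemoryEnabled bool
}

// cacheReadAllowed reports whether a cached response may be served.
// Decisions that depend on user-specific context always skip the cache read.
func cacheReadAllowed(d Decision) bool {
	return !d.RAGEnabled && !d.MemoryEnabled
}

func main() {
	decisions := []Decision{
		{Name: "generic"},
		{Name: "docs-qa", RAGEnabled: true},
		{Name: "assistant", MemoryEnabled: true},
	}
	for _, d := range decisions {
		fmt.Printf("%s: cache read allowed = %v\n", d.Name, cacheReadAllowed(d))
	}
}
```

A guard like this would have to run before the cache lookup in step 1, which assumes the decision is already known at that point in the pipeline.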

Found during work on #1448.
