feat: adaptive replanning when task results deviate from plan #5392
Ricardo-M-L wants to merge 2 commits into crewAIInc:main
Conversation
When `planning=True`, the plan is currently static and never updated during execution, causing compounding errors when early tasks return unexpected results. This adds an optional `replan_on_failure` flag that enables adaptive re-planning: after each task, a lightweight `ReplanningEvaluator` checks whether the result deviates from the plan's assumptions and, if so, triggers `CrewPlanner.replan()` to generate revised plans for the remaining tasks.

New API (fully backwards compatible):

- `replan_on_failure=True` on `Crew` enables the feature
- `max_replans=N` prevents infinite replanning loops
- `replanning_evaluator` allows plugging in a custom evaluator
- `evaluation_criteria` configures quality threshold, completeness, relevance

Fixes crewAIInc#4983

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Move `_replan_count` increment after successful replanning so failed attempts don't consume the replan budget
- Remove unused `ReplanDecision` import

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 782f96a.
```python
self._logger.log(
    "warning",
    f"Replanning failed: {e}. Continuing with original plan.",
)
```
Replan count not incremented on replanning failure
High Severity
`_replan_count += 1` is only reached on the success path inside the `try` block (after `planner.replan()` returns). When `planner.replan()` raises an exception, the increment is skipped, and the `except` block doesn't increment it either. This means persistent replanning failures never consume the `max_replans` budget, so `_should_evaluate_for_replan()` keeps returning `True` and the system retries on every subsequent task, defeating the runaway-loop protection. The corresponding test also asserts `_replan_count == 1` after a failure, which will fail against this code.
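One way to close the loophole described in this finding is to charge the budget per *attempt* rather than per *success*. The sketch below is illustrative only (`ReplanBudget` and `attempt_replan` are made-up names, not the PR's actual code); it shows that incrementing the counter before calling the replanner guarantees the retry loop terminates even when every replan raises:

```python
class ReplanBudget:
    """Illustrative stand-in for the crew's replan bookkeeping (not the PR's code)."""

    def __init__(self, max_replans: int) -> None:
        self.max_replans = max_replans
        self._replan_count = 0

    def should_evaluate(self) -> bool:
        return self._replan_count < self.max_replans

    def attempt_replan(self, replan_fn):
        # Increment up front so a raising replan_fn still consumes budget,
        # closing the runaway-retry loophole described in the review comment.
        self._replan_count += 1
        try:
            return replan_fn()
        except Exception as e:
            print(f"Replanning failed: {e}. Continuing with original plan.")
            return None


def always_fails():
    raise RuntimeError("LLM unavailable")


budget = ReplanBudget(max_replans=2)
while budget.should_evaluate():
    budget.attempt_replan(always_fails)
print(budget._replan_count)  # 2 -- the loop stops after exactly two failed attempts
```

The tradeoff is the one the second commit's message hints at: counting failures means transient errors eat into the budget, while counting only successes (the PR's current behavior) risks the unbounded retry loop flagged here.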
Closing — branch has diverged significantly from upstream, and large features should be discussed first. Will resubmit properly if needed.


Summary
- `planning=True` crews encounter task results that deviate from the original plan's assumptions
- A `ReplanningEvaluator` that runs a lightweight LLM check after each task to detect deviations (missing data, unexpected results, infeasible approaches)
- `CrewPlanner.replan()` generates revised plans for remaining tasks only, using actual results as context
- `replan_on_failure` defaults to `False`, so existing crews are completely unaffected

New API
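A hedged sketch of the new surface: the option names (`planning`, `replan_on_failure`, `max_replans`) come from this PR, but `FakeCrew` below is a self-contained stand-in so the gating logic can run without crewai installed — it is not crewai's actual implementation:

```python
from dataclasses import dataclass


@dataclass
class FakeCrew:
    """Stand-in mirroring the new Crew options from this PR.

    Field names follow the PR description; the gating behavior is
    sketched from it, not copied from crewai's source.
    """
    planning: bool = False
    replan_on_failure: bool = False  # new: opt-in adaptive replanning
    max_replans: int = 3             # new: cap on total replans
    _replan_count: int = 0

    def _should_evaluate_for_replan(self) -> bool:
        # Replanning only runs when planning is on, the flag is set,
        # and the replan budget has not been exhausted.
        return (
            self.planning
            and self.replan_on_failure
            and self._replan_count < self.max_replans
        )


crew = FakeCrew(planning=True, replan_on_failure=True, max_replans=2)
print(crew._should_evaluate_for_replan())  # True

crew._replan_count = 2  # budget spent
print(crew._should_evaluate_for_replan())  # False
```

Because every new option defaults to off, a crew constructed without these arguments behaves exactly as before, which is the backwards-compatibility claim above.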
Files Changed
- `lib/crewai/src/crewai/utilities/replanning_evaluator.py`: new `ReplanningEvaluator`, `ReplanDecision`, `EvaluationCriteria`
- `lib/crewai/src/crewai/utilities/planning_handler.py`: adds `replan()` method to `CrewPlanner`
- `lib/crewai/src/crewai/crew.py`: `_evaluate_and_replan()` hook in `_execute_tasks()`
- `lib/crewai/tests/utilities/test_replanning_evaluator.py`: new test suite

How It Works
1. After each task, `_should_evaluate_for_replan()` checks if replanning is enabled and budget remains
2. `ReplanningEvaluator.evaluate()` makes a structured LLM call: "Does this result deviate significantly from what the plan assumed?"
3. If `ReplanDecision.should_replan=True`, `CrewPlanner.replan()` generates revised plans for remaining tasks using completed results as context
4. Revised plans are injected into remaining task descriptions as `[REVISED PLAN]` sections (old revisions are replaced, not stacked)
5. `_replan_count` prevents runaway loops (capped at `max_replans`)

Test Plan
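The replace-not-stack injection described above can be sketched as a small string transform. `inject_revised_plan` is an illustrative helper, not the PR's actual function; the point is that splitting on the marker before appending guarantees at most one `[REVISED PLAN]` section survives repeated replans:

```python
REVISED_MARKER = "[REVISED PLAN]"


def inject_revised_plan(description: str, revised_plan: str) -> str:
    """Append a [REVISED PLAN] section, dropping any earlier revision so
    repeated replans replace rather than stack sections."""
    base = description.split(REVISED_MARKER, 1)[0].rstrip()
    return f"{base}\n\n{REVISED_MARKER}\n{revised_plan}"


desc = "Research competitor pricing."
desc = inject_revised_plan(desc, "Use cached data; API is down.")
desc = inject_revised_plan(desc, "API restored; fetch live prices.")
print(desc.count(REVISED_MARKER))  # 1 -- the second revision replaced the first
```

Without the replace step, each replan would append another section and the task description would grow (and confuse the agent) on every iteration.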
- `ReplanDecision` model validation (bounds, defaults, all fields)
- `EvaluationCriteria` model validation (bounds, custom criteria)
- `ReplanningEvaluator` (init, no-remaining-tasks, replan/no-replan decisions, parse-failure fallback, criteria text building)
- `CrewPlanner.replan()` (returns revised plans, raises on failure, remaining-tasks summary)
- Existing `test_planning_handler.py` tests still pass (12/12)

Fixes #4983
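The parse-failure fallback exercised by the tests above can be illustrated with a minimal decision parser (a hypothetical helper, not the evaluator's real parsing code): when the LLM's reply can't be parsed, the safe default is to keep the original plan rather than raise or replan.

```python
import json


def parse_replan_decision(raw: str) -> bool:
    """Illustrative fallback: malformed LLM output means 'do not replan',
    so a flaky model can never trigger an unwanted replan mid-run."""
    try:
        data = json.loads(raw)
        return bool(data.get("should_replan", False))
    except (json.JSONDecodeError, AttributeError):
        # AttributeError covers valid JSON that isn't an object (e.g. a list).
        return False


print(parse_replan_decision('{"should_replan": true}'))  # True
print(parse_replan_decision("Sorry, I cannot comply."))  # False
```

Defaulting to "no replan" on parse failure is the conservative choice here, since a spurious replan would rewrite downstream task descriptions.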
🤖 Generated with Claude Code
Note
Medium Risk
Adds new LLM-driven evaluation and dynamic replanning into the core sequential execution loop, which can change task descriptions mid-run and affect determinism/cost. Guardrails exist (`replan_on_failure` defaults to off, `max_replans` cap), but the behavior is complex and touches planning/execution paths.

Overview
Introduces adaptive replanning for `planning=True` crews via new `Crew` options (`replan_on_failure`, `max_replans`, `evaluation_criteria`, and a pluggable `replanning_evaluator`). After each synchronous task, an LLM-based `ReplanningEvaluator` can decide whether outputs deviate from the original plan and, if so, `CrewPlanner.replan()` regenerates plans for remaining tasks and injects them into task descriptions as `[REVISED PLAN]`.

Adds `CrewPlanner.replan()` to produce revised plans using completed `TaskOutput`s as context, plus a new `utilities/replanning_evaluator.py` module with structured decision/criteria models and robust fallback behavior. Includes a comprehensive new test suite covering evaluator behavior, replanning generation, and `Crew` integration/backwards compatibility.

Reviewed by Cursor Bugbot for commit 782f96a.