[Improvement](unlock)optimize the place of unlock which deadlock occurs when database tables are frequently created or deleted by LuGuangming · Pull Request #62391 · apache/doris

LuGuangming · 2026-04-11T07:40:55Z

…rs when database tables are frequently created or deleted

What problem does this PR solve?

Issue Number: close #62390

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

Thearas · 2026-04-11T07:41:05Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

- What problem was fixed (it's best to include specific error reporting information). How it was fixed.
- Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
- What features were added. Why was this function added?
- Which code was refactored and why was this part of the code refactored?
- Which functions were optimized and what is the difference before and after the optimization?

morrySnow · 2026-04-13T03:18:21Z

run buildall

github-actions · 2026-04-13T03:19:03Z

PR approved by at least one committer and no changes requested.

github-actions · 2026-04-13T03:19:05Z

PR approved by anyone and no changes requested.

yiguolei · 2026-04-16T22:49:09Z

/review

github-actions

Found 1 blocking issue.

InsertIntoTableCommand.initPlan() now calls insertExecutor.onFail(e) before releasing newestTargetTableIf.readLock(). For the OLAP insert path, OlapInsertExecutor.onFail() aborts the transaction through GlobalTransactionMgr.abortTransaction(), which acquires the target table write lock via MetaLockUtils.tryWriteLockTablesOrMetaException(...). Because the current thread still holds the read lock, the abort path can only time out after 5s instead of cleaning the transaction up immediately. This regresses the failure-path locking behavior that the existing comment is explicitly protecting.
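The lock behavior described above can be reproduced in isolation. The following is a minimal sketch, assuming the table lock behaves like the JDK's `ReentrantReadWriteLock` (which does not support upgrading a read lock to a write lock). The class and method names are hypothetical stand-ins, not Doris code, and the timeout is shortened from the 5s mentioned above.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; TABLE_LOCK stands in for the target table's lock.
class ReadLockUpgradeDemo {
    static final ReentrantReadWriteLock TABLE_LOCK = new ReentrantReadWriteLock();

    // Mimics an abort path that needs the write lock and gives up after a timeout.
    static boolean tryWriteLock(long timeoutMs) {
        try {
            if (TABLE_LOCK.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                TABLE_LOCK.writeLock().unlock();
                return true;
            }
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // New ordering from the PR: the failure handler runs while the read lock is still held.
    static boolean abortWhileHoldingReadLock(long timeoutMs) {
        TABLE_LOCK.readLock().lock();
        try {
            return tryWriteLock(timeoutMs);
        } finally {
            TABLE_LOCK.readLock().unlock();
        }
    }

    // Old ordering: the read lock is released first, then the failure handler runs.
    static boolean abortAfterReleasingReadLock(long timeoutMs) {
        TABLE_LOCK.readLock().lock();
        TABLE_LOCK.readLock().unlock();
        return tryWriteLock(timeoutMs);
    }

    public static void main(String[] args) {
        System.out.println("holding read lock: " + abortWhileHoldingReadLock(100));   // false
        System.out.println("after unlock: " + abortAfterReleasingReadLock(100));      // true
    }
}
```

In this sketch the write-lock attempt made by the thread that still holds the read lock always waits out the full timeout and fails, which matches the delayed cleanup reported in the review.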

Critical checkpoints

- Goal of the task: simplify unlock handling around initPlan(). The current code does not preserve the failure-path behavior, and there is no test proving the intended deadlock fix.
- Scope/minimality: the diff is small and focused, but the new finally changes lock ordering on the error path.
- Concurrency: not safe. The insert thread now holds a table read lock while invoking a failure handler that upgrades to table write locking.
- Lifecycle: no additional lifecycle issues beyond the lock-ownership regression above.
- Configuration: not applicable.
- Compatibility: not applicable.
- Parallel code paths: the generic initPlan() path serves all insert executors, but the regression is observable on OLAP insert executors whose onFail() aborts a transaction.
- Conditional checks: the schema/id recheck logic is still fine.
- Test coverage: no regression or unit test was added for beginTransaction()/finalizeSink() failures under concurrent metadata changes.
- Observability: existing WARN logs are enough to diagnose once reproduced.
- Transaction/persistence: not safe. The abort path may fail to take the required table write lock, delaying cleanup of a PREPARE transaction.
- Data writes/modifications: the failure path no longer guarantees prompt transactional cleanup.
- FE/BE variable passing: not applicable.
- Performance: the regression adds a 5s write-lock timeout on the failure path.
- Other issues: none beyond the blocking lock-order regression above.

github-actions · 2026-04-16T23:00:12Z

                }
                Throwables.throwIfInstanceOf(e, RuntimeException.class);
                throw new IllegalStateException(e.getMessage(), e);
+            } finally {


Calling insertExecutor.onFail(e) before this readUnlock() reintroduces the lock-order problem the old code was explicitly avoiding. For the normal OLAP path, onFail() goes through OlapInsertExecutor.abortTransactionOnFail() -> GlobalTransactionMgr.abortTransaction() -> MetaLockUtils.tryWriteLockTablesOrMetaException(...), which needs the same table's write lock. Because the current thread is still holding newestTargetTableIf.readLock(), that write-lock acquisition times out instead of aborting the txn immediately.

Please keep the unlock ahead of onFail() on the exception path, and use finally only for the success / retry branches if you still want to deduplicate the non-error unlocks.
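One possible shape for the requested ordering is sketched below. It is an illustration under assumptions, not the actual InsertIntoTableCommand code: the `Executor` interface, the `initPlan` signature, and the `locked` flag are hypothetical, and a plain `ReentrantReadWriteLock` stands in for the table lock. The exception path unlocks before calling onFail(), while the finally block only covers the branches that have not unlocked yet.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of "unlock first, then onFail()" with a deduplicated non-error unlock.
class UnlockBeforeOnFailSketch {
    // Stand-in for the insert executor's failure hook.
    interface Executor {
        void onFail(Throwable t);
    }

    static final ReentrantReadWriteLock TABLE_LOCK = new ReentrantReadWriteLock();

    static void initPlan(Runnable body, Executor executor) {
        TABLE_LOCK.readLock().lock();
        boolean locked = true;
        try {
            body.run();
        } catch (RuntimeException e) {
            // Release the read lock first so the failure handler can take the write lock.
            TABLE_LOCK.readLock().unlock();
            locked = false;
            executor.onFail(e);
            throw e;
        } finally {
            // Only the success path reaches here with the lock still held.
            if (locked) {
                TABLE_LOCK.readLock().unlock();
            }
        }
    }

    public static void main(String[] args) {
        try {
            initPlan(
                () -> { throw new IllegalStateException("simulated beginTransaction failure"); },
                t -> System.out.println("read holds seen by onFail: " + TABLE_LOCK.getReadHoldCount()));
        } catch (IllegalStateException expected) {
            System.out.println("rethrown: " + expected.getMessage());
        }
    }
}
```

Running `main` shows that onFail() observes a read hold count of 0, so a write-lock acquisition inside the handler would not be blocked by the calling thread itself.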

[Improvement](unlock)optimize the place of unlock which deadlock occu…

1415a52

…rs when database tables are frequently created or deleted

morrySnow approved these changes Apr 13, 2026

github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 13, 2026

github-actions bot added the reviewed label Apr 13, 2026

yiguolei approved these changes Apr 16, 2026

yiguolei added the dev/4.1.x label Apr 16, 2026

github-actions bot requested changes Apr 16, 2026

LuGuangming wants to merge 1 commit into apache:master from LuGuangming:master_unlock

4 participants
