kernel: bypass k_heap_free() in k_free() to avoid scheduler locking #107146

Open
npitre wants to merge 3 commits into zephyrproject-rtos:main from npitre:fix/k-free-bypass-heap

npitre commented Apr 11, 2026

Simpler alternative to the approach taken in #106792.

k_free() now bypasses k_heap_free() entirely, going directly to
sys_heap_free() under heap->lock. This is symmetric with
z_alloc_helper() which already bypasses k_heap_alloc() to go
directly to sys_heap_*().

This eliminates scheduler lock involvement in the k_free() path,
avoiding the recursive _sched_spinlock issue when k_free() is
called from halt_thread() during thread abort with
CONFIG_USERSPACE and CONFIG_DYNAMIC_OBJECTS, without requiring
any _sched_locked function variants or extra parameters.

Fixes #106659

npitre force-pushed the fix/k-free-bypass-heap branch 3 times, most recently from d924960 to 99dcc5b April 12, 2026 17:58

npitre commented Apr 12, 2026

This PR has been significantly reworked. The previous approach introduced struct k_mpool to cleanly separate the k_malloc/k_free path from the synchronized k_heap API. While it provided a clear API separation, it turned into a rabbit hole: the tracing subsystem has typed function signatures for heap operations (including in external modules like Percepio), k_thread_heap_assign() needed renaming, all callers across tests and samples had to migrate, and we uncovered cases where tests used both K_HEAP_DEFINE and k_thread_heap_assign on the same pool.

The new approach is much simpler: k_free() bypasses k_heap_free() entirely, going directly to sys_heap_free() under heap->lock — symmetric with how z_alloc_helper() already bypasses k_heap_alloc(). No scheduler lock involvement, no new types, no API changes.
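
The locking difference can be illustrated with a small host-side model. This is only a sketch and not the kernel code: `toy_lock` and every `model_*` function are hypothetical stand-ins, and the model assumes (as the description above implies) that halt_thread() runs with _sched_spinlock held and that the z_unpend_all() call inside k_heap_free() needs that same lock.

```c
#include <assert.h>

/* Toy non-recursive lock. Taking it while already held is counted
 * rather than deadlocking, so the model can report the problem. */
struct toy_lock {
	int depth;
	int recursive_takes;
};

static void toy_lock_take(struct toy_lock *l)
{
	if (l->depth > 0) {
		l->recursive_takes++; /* a real spinlock would hang or assert */
	}
	l->depth++;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->depth--;
}

static struct toy_lock sched_lock; /* stands in for _sched_spinlock */
static struct toy_lock heap_lock;  /* stands in for heap->lock */

/* Stand-in for z_unpend_all(): waking waiters needs the scheduler lock. */
static void model_unpend_all(void)
{
	toy_lock_take(&sched_lock);
	toy_lock_release(&sched_lock);
}

/* Old shape: k_free() -> k_heap_free() -> z_unpend_all(). */
static void model_free_via_k_heap_free(void)
{
	toy_lock_take(&heap_lock);
	/* sys_heap_free() would run here */
	model_unpend_all();
	toy_lock_release(&heap_lock);
}

/* New shape: k_free() -> sys_heap_free() under heap->lock only. */
static void model_free_direct(void)
{
	toy_lock_take(&heap_lock);
	/* sys_heap_free() would run here */
	toy_lock_release(&heap_lock);
}

/* Stand-in for halt_thread(): frees memory while holding the scheduler
 * lock. Returns how many times that lock was taken recursively. */
static int model_halt_thread(void (*free_fn)(void))
{
	sched_lock.recursive_takes = 0;
	toy_lock_take(&sched_lock);
	free_fn();
	toy_lock_release(&sched_lock);
	return sched_lock.recursive_takes;
}
```

Calling `model_halt_thread(model_free_via_k_heap_free)` reports one recursive take of the scheduler lock, while `model_halt_thread(model_free_direct)` reports none, because the direct path only ever touches the heap lock.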

The one subtlety: since k_free() no longer references k_heap_free(), it is possible that nothing in kheap.c has any callers left. In that case the linker (with --gc-sections and --no-whole-archive on libkernel.a) can discard kheap.o entirely, taking the SYS_INIT handler that initializes K_HEAP_DEFINE heaps with it. This is solved with a __used reference to k_heap_init in mempool.c that forces kheap.o into the link.
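
For readers unfamiliar with the idiom, here is a minimal single-file sketch of such a reference. It is not the code from this PR: `demo_heap`, `demo_heap_init` and `demo_force_kheap_link` are hypothetical names, and `__attribute__((used))` is written out as the GCC/Clang attribute that Zephyr's `__used` macro maps to on those toolchains. In the real tree the function and the pointer would live in different object files, so the pointer leaves an undefined symbol in mempool.o that makes the linker extract kheap.o from the archive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct k_heap. */
struct demo_heap {
	void *mem;
	size_t bytes;
};

/* Hypothetical stand-in for k_heap_init(), which would live in kheap.c. */
void demo_heap_init(struct demo_heap *heap, void *mem, size_t bytes)
{
	heap->mem = mem;
	heap->bytes = bytes;
}

/* What the referencing file (mempool.c in the PR) might contain: a
 * pointer that is never read by the code. The "used" attribute keeps
 * the compiler from dropping it, so the symbol reference to the init
 * function survives into the object file. */
static void (*const demo_force_kheap_link)(struct demo_heap *, void *, size_t)
	__attribute__((used)) = demo_heap_init;
```

The pointer exists only for its side effect on symbol resolution. Nothing needs to call through it, which is why unused functions in the pulled-in object can still be garbage collected afterwards.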

The k_mpool split remains a worthwhile future direction for a cleaner architecture, but it's a larger effort that should be pursued separately.

npitre added this to the v4.5.0 milestone Apr 12, 2026
npitre changed the title from "RFC: kernel: introduce k_mpool to decouple k_free from scheduler locking" to "kernel: bypass k_heap_free() in k_free() to avoid scheduler locking" Apr 12, 2026
npitre marked this pull request as ready for review April 12, 2026 18:07
peter-mitsis previously approved these changes Apr 13, 2026
teburd previously approved these changes Apr 16, 2026
Nicolas Pitre added 3 commits April 18, 2026 21:11
This partially reverts commit 9cef0da ("kernel: avoid recursive
scheduler lock in k_heap_free path"), keeping only the sched.c changes
(z_unpend_all_locked / z_unpend_all refactoring).

The _sched_locked variants of k_free, k_heap_free, k_msgq_cleanup,
k_stack_cleanup and the sched_locked parameter plumbing through
unref_check were an overcomplicated approach. A simpler fix follows.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>

k_free() now goes directly to sys_heap_free() under heap->lock,
bypassing k_heap_free() and its z_unpend_all() call. This is
symmetric with z_alloc_helper() which already bypasses k_heap_alloc()
to go directly to sys_heap_*().

This avoids any scheduler lock involvement in the k_free() path,
eliminating the recursive _sched_spinlock issue when k_free() is
called from halt_thread() during thread abort with CONFIG_USERSPACE
and CONFIG_DYNAMIC_OBJECTS.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>

k_free() bypasses k_heap_free() to avoid scheduler lock involvement,
going directly to sys_heap_free() instead. As a result, kheap.c may
end up with no callers at all, and since it is in a library linked
without --whole-archive, the linker may discard it entirely.

However, kheap.c contains a SYS_INIT handler that initializes all
statically defined k_heap objects (those created with K_HEAP_DEFINE).
Without it, heaps such as the system heap or those used as thread
resource pools via k_thread_heap_assign() are never initialized:
their internal sys_heap pointer remains NULL, causing a crash on
the first allocation.

This can be reproduced without this commit with e.g.:

  west build -b qemu_cortex_a53 tests/kernel/poll

Force kheap.o into the link by adding a __used reference to
k_heap_init in mempool.c. This is enough to pull in kheap.o and
its SYS_INIT registration. Unused functions from kheap.o are
still garbage collected.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>

npitre dismissed stale reviews from teburd and peter-mitsis via efefad5 April 19, 2026 01:32
npitre force-pushed the fix/k-free-bypass-heap branch from 99dcc5b to efefad5 April 19, 2026 01:32
Development

Successfully merging this pull request may close these issues.

kernel: Recursive locking of _sched_spinlock in k_thread_abort() with user-space
