Skip to content

QuicConnFlushRecv: recvQ corrupted when we could not flush all datagrams arrived in one shot #5932

@jianye-chen

Description

@jianye-chen

Describe the bug

Some rare memory pool corruption inside msquic was noticed and it appears QuicConnFlushRecv's implementation may be a problem.

in QuicConnFlushRecv, we noticed this branch of code,

msquic/src/core/connection.c

Lines 5954 to 5968 in 37d6673

ReceiveQueue = Connection->ReceiveQueue;
if (Connection->ReceiveQueueCount > QUIC_MAX_RECEIVE_FLUSH_COUNT) {
FlushedAll = FALSE;
Connection->ReceiveQueueCount -= QUIC_MAX_RECEIVE_FLUSH_COUNT;
QUIC_RX_PACKET* Tail = Connection->ReceiveQueue;
ReceiveQueueCount = 0;
ReceiveQueueByteCount = 0;
while (++ReceiveQueueCount < QUIC_MAX_RECEIVE_FLUSH_COUNT) {
ReceiveQueueByteCount += Tail->BufferLength;
Tail = Connection->ReceiveQueue;
}
Connection->ReceiveQueueByteCount -= ReceiveQueueByteCount;
Connection->ReceiveQueue = (QUIC_RX_PACKET*)Tail->Next;
Tail->Next = NULL;
} else {

We believe it is supposed to,

  1. Take the first QUIC_MAX_RECEIVE_FLUSH_COUNT packets from recvQ
  2. Count the number of bytes we took from recvQ
  3. Update Connection->ReceiveQueue to the remaining packets, update Connection->ReceiveQueueByteCount to the total bytes in remaining packets
  4. Call QuicConnRecvDatagrams with the portion of the list we took from recvQ

However, setting Tail to Connection->ReceiveQueue in each while loop body means we are always looking at the first packet,

Tail = Connection->ReceiveQueue;

  1. We will always detach the head node in recvQ, and miscalculate new Connection->ReceiveQueueByteCount
  2. We will always call QuicConnRecvDatagrams with the single head packet in recvQ, but ReceiveQueueCount is the size of first-packet-length multiplied by 100 (QUIC_MAX_RECEIVE_FLUSH_COUNT).

This line is a no-op today.

Affected OS

  • Windows
  • Linux
  • macOS
  • Other (specify below)

Additional OS information

No response

MsQuic version

main

Steps taken to reproduce bug

memory pool corruption resulted in crashes later down the line (an internal msquic queue has an invalid head node)

Expected behavior

QuicConnFlushRecv should not malfunction when more than QUIC_MAX_RECEIVE_FLUSH_COUNT number of datagrams are queued up in recvQ

Actual outcome

memory pool corruption

Additional details

No response

Metadata

Metadata

Assignees

Labels

Area: CoreRelated to the shared, core protocol logic

Type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions