The HTTP stats extension for DuckDB transparently intercepts all HTTP traffic
and collects request statistics (request counts, bytes sent and received and
request time) on a per-host basis. The collected statistics are surfaced
through DuckDB's EXPLAIN ANALYZE output:
┌──────────────────────────────────────┐
│┌────────────────────────────────────┐│
││ HTTP Stats ││
││ ││
││ community-extensions.duckdb.org ││
││ ─────────────────────────────── ││
││ Total Requests: 2 ││
││ HEAD: 1 GET: 1 ││
││ in: 3.0 KiB out: 0 bytes ││
││ Total Time: 0.45s ││
││ ││
││ s3.eu-central-1.amazonaws.com ││
││ ───────────────────────────── ││
││ Total Requests: 2 ││
││ HEAD: 1 GET: 1 ││
││ in: 1.8 KiB out: 0 bytes ││
││ Total Time: 0.37s ││
│└────────────────────────────────────┘│
└──────────────────────────────────────┘
The extension works alongside the HTTPFS extension, which provides DuckDB with HTTP(S) and S3 file system support using a curl-based HTTP backend. When both extensions are loaded, HTTP stats wraps HTTPFS's HTTP backend to measure traffic while HTTPFS handles the actual transport.
The motivation for building a separate stats extension rather than relying on HTTPFS's built-in profiling output is twofold:
- HTTPFS shows empty counters when querying DuckDB databases stored on S3.
- HTTPFS does not provide per-host breakdown for queries that fetch data from multiple remote servers.
DuckDB core defines an abstract HTTP backend (HTTPUtil) with a pluggable
replacement mechanism. By default, core ships with a minimal httplib-based
implementation that only supports basic GET requests (enough to download and
install extensions). The HTTPFS extension upgrades this by replacing the
built-in backend with a fully-featured curl-based implementation that adds
support for all HTTP methods, HTTPS, S3 and cloud storage.
The HTTP stats extension uses the decorator pattern to wrap whatever HTTPUtil
is currently active, whether that's core's built-in implementation or HTTPFS's
curl-based one. On load, it captures the active HTTP backend, creates a wrapper
around it and installs the wrapper as the new backend. The wrapper delegates
all operations to the original backend but intercepts requests to measure
statistics (count, bytes, time) per host.
To handle extension load order (e.g. HTTPFS loading after HTTP stats and
replacing the HTTP backend), the extension registers an OnExtensionLoaded
callback that detects when the active backend has been replaced and re-wraps
the new one. This ensures HTTP stats always remains the outermost wrapper.
When both extensions are loaded, DuckDB's profiler would normally render stats from both HTTP stats and HTTPFS, leading to duplicate output. To prevent this, HTTP stats replaces HTTPFS's profiling state with a silent subclass that suppresses its output while preserving all other HTTPFS functionality.
┌─────────────────────────────────────────────────────┐
│ DuckDB core │
│ │
│ DBConfig │
│ └─ shared_ptr<HTTPUtil> http_util ───────┐ │
│ │ │
│ HTTPUtil │ │
│ (abstract base + built-in implementation) │ │
│ • httplib-based, GET only │ │
│ • Pluggable via Set/GetHTTPUtil │ │
│ │ │
│ RegisteredStateManager │ │
│ ├─ http_stats → HTTPStatsState │ │
│ │ └─ WriteProfilingInformation → our │ │
│ │ stats box (or nothing if idle) │ │
│ └─ http_state → SilentHTTPState │ │
│ └─ WriteProfilingInformation → noop │ │
│ └─ Proper subclass of HTTPState │ │
└─────────────────────────────────────────────┼───────┘
│
┌───────────────────────────────┘
▼
┌───────────────────────────────────────────┐
│ HTTP stats extension (this project) │
│ │
│ HTTPStatsHTTPUtil : public HTTPUtil │
│ • Wraps the active HTTPUtil │
│ • Intercepts SendRequest │
│ • Measures stats (count, bytes, time) │
│ per host │
│ • Re-wraps on OnExtensionLoaded │
└──────────────┬────────────────────────────┘
│ delegates to
▼
┌──────────────────────────────────────────┐
│ Wrapped HTTPUtil │
│ │
│ Either: │
│ • Core's built-in (httplib) │
│ • HTTPFS's curl-based implementation │
│ • Any other HTTPUtil subclass │
└──────────────────────────────────────────┘
The easiest way to install the HTTP stats extension is from the DuckDB community extensions repository:
INSTALL http_stats FROM community;
LOAD http_stats;- C++11 compatible compiler
- CMake 3.5 or higher
- vcpkg (for dependency management)
- Ninja (recommended for parallelizing the build process)
- ccache (recommended for caching compilation results and faster rebuilds)
git clone --recurse-submodules https://github.com/tlinhart/duckdb-http-stats.git
cd duckdb-http-statsThe --recurse-submodules flag is required to pull the DuckDB core, HTTPFS
extension and extension CI tools submodules. If you already cloned without
submodules:
git submodule update --init --recursiveThe HTTP stats extension itself has no external dependencies but it builds the HTTPFS extension alongside it for testing. The transitive dependencies of our build are managed via vcpkg. Set it up as follows:
git clone https://github.com/microsoft/vcpkg.git
cd vcpkg
git checkout 84bab45d415d22042bd0b9081aea57f362da3f35
./bootstrap-vcpkg.sh -disableMetrics
export VCPKG_TOOLCHAIN_PATH=$(pwd)/scripts/buildsystems/vcpkg.cmakeThe build system will automatically use vcpkg when VCPKG_TOOLCHAIN_PATH is
set. Dependencies are declared in vcpkg.json.
The simplest way to build the extension is using Make:
makeThis will create a release build of both the static and loadable extension.
DuckDB extensions build DuckDB itself as part of the process to provide easy testing and distributing. The build process also compiles the HTTPFS extension as a dependency. To speed up rebuilds significantly it's highly recommended to install Ninja and ccache. The build system automatically detects and uses ccache to cache build artifacts. To parallelize builds using Ninja:
GEN=ninja makeTo limit the number of parallel jobs (if running low on memory):
CMAKE_BUILD_PARALLEL_LEVEL=4 GEN=ninja makeThe main binaries produced by the build are:
build/release/duckdb– DuckDB shell with extension pre-loaded.build/release/test/unittest– test runner with extension linked into the binary.build/release/extension/http_stats/http_stats.duckdb_extension– loadable extension binary as it would be distributed.
The HTTP stats extension is equipped with a test suite under the test
directory. To run tests after the build:
make testTo run the extension code, simply start the built shell with pre-loaded extension:
./build/release/duckdbAlternatively, start the DuckDB shell with -unsigned flag and load the
extension manually:
LOAD 'build/release/extension/http_stats/http_stats.duckdb_extension';See the LICENSE file for details.
Contributions are welcome! This extension tries to follow the conventions and guidelines used by the DuckDB project and its extension ecosystem.
The general process is as follows:
- Fork the repository.
- Create a feature branch.
- Make your changes, including tests and documentation updates.
- Build the extension.
- Run tests.
- Run the linter (
make tidy-check) and formatter (make format-fix). - Commit the changes.
- Push to your fork and submit a pull request.
When submitting a pull request:
- Keep PRs focused and reasonably sized; large PRs are harder to review.
- Clearly describe the problem and solution in the PR description.
- Reference any related issues.
- Ensure all CI checks pass.
- Avoid draft PRs; use issues or discussions for work-in-progress ideas.
For major changes, please open an issue first to discuss the proposed changes.