gpu_execution is the recommended execution path: out-of-core execution with tiered memory management across GPU, host, and disk, automatic data partitioning, and spilling. It currently supports the Parquet data format.
Clone the Sirius repository:
git clone --recurse-submodules https://github.com/sirius-db/sirius.git
cd sirius
Set up the environment with Pixi and build:
pixi shell
CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) make
If the build consumes too much memory, reduce the CMAKE_BUILD_PARALLEL_LEVEL value.
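Rather than hard-coding a lower value, the job count can be derived from the core count. A minimal sketch; the halving heuristic is an assumption here, not a project recommendation:

```shell
# Use half the available cores for compilation to cap peak memory,
# but always at least one job (heuristic, not a Sirius recommendation).
JOBS=$(( $(nproc) / 2 ))
if [ "$JOBS" -lt 1 ]; then JOBS=1; fi
echo "building with $JOBS parallel jobs"
# Then build with: CMAKE_BUILD_PARALLEL_LEVEL=$JOBS make
```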
gpu_execution requires a config file in YAML format. See the Configuration documentation for the full reference, including config file resolution order, all available options, and byte suffixes. An example config file is provided at test/cpp/integration/integration.yaml.
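The overall shape of such a file might look like the sketch below. The key names are purely illustrative placeholders, not the real Sirius option names; copy test/cpp/integration/integration.yaml and consult the Configuration documentation for the actual options and byte-suffix rules.

```yaml
# Hypothetical sketch only -- these keys are placeholders, not the real
# Sirius option names; adapt integration.yaml instead.
gpu_memory_limit: 16GB             # placeholder: GPU-tier working-set cap
host_memory_limit: 64GB            # placeholder: host-tier cap before spilling
spill_directory: /tmp/sirius_spill # placeholder: disk-tier spill location
```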
export SIRIUS_CONFIG_FILE=/path/to/sirius.yaml
./build/release/duckdb
From the DuckDB shell, create views pointing to your Parquet files and run queries with gpu_execution:
-- Create views for parquet data
CREATE VIEW lineitem AS SELECT * FROM read_parquet('/data/lineitem/*.parquet');
CREATE VIEW orders AS SELECT * FROM read_parquet('/data/orders/*.parquet');
CREATE VIEW customer AS SELECT * FROM read_parquet('/data/customer/*.parquet');
-- Run a query on GPU
CALL gpu_execution('SELECT
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price
FROM lineitem
WHERE l_shipdate <= date ''1998-09-02''
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus');
For TPC-H benchmarking, use the provided data generation script:
cd test/tpch_performance
pixi run bash generate_tpch_data.sh 100  # generates SF100 parquet data
This produces partitioned Parquet files under test_datasets/tpch_parquet_sf100/. Then create views from the DuckDB shell:
CREATE VIEW lineitem AS SELECT * FROM read_parquet('test_datasets/tpch_parquet_sf100/lineitem/*.parquet');
-- repeat for other tables...
For your own data, point read_parquet() at any Parquet file or glob:
CREATE VIEW my_table AS SELECT * FROM read_parquet('/path/to/my_data/*.parquet');
The gpu_execution engine is covered by C++ unit tests built with Catch2. Test files live in test/cpp/.
Run all unit tests:
CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) make
build/release/extension/sirius/test/cpp/sirius_unittest
Run the tests matching a specific tag, or a single test by name:
build/release/extension/sirius/test/cpp/sirius_unittest "[cpu_cache]"
build/release/extension/sirius/test/cpp/sirius_unittest "test_cpu_cache_basic_string_single_col"
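Catch2 test binaries also support standard discovery flags, which help when choosing a tag or test name. A sketch, guarded so it is a no-op until the binary has been built (the flags are standard Catch2 options, assumed to apply here):

```shell
# List available tests and tags (standard Catch2 CLI flags).
# Guarded: skip gracefully if the test binary has not been built yet.
BIN=build/release/extension/sirius/test/cpp/sirius_unittest
if [ -x "$BIN" ]; then
  "$BIN" --list-tests
  "$BIN" --list-tags
else
  echo "sirius_unittest not built; run make first"
fi
```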
Test logs are saved in:
build/release/extension/sirius/test/cpp/log
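To glance at the most recent log after a run, plain POSIX tools suffice; note the directory only exists once the tests have been run at least once:

```shell
# Show the newest file in the test log directory, if any runs happened.
LOG_DIR=build/release/extension/sirius/test/cpp/log
if [ -d "$LOG_DIR" ]; then
  ls -t "$LOG_DIR" | head -n 1   # newest log first
else
  echo "no logs yet; run sirius_unittest first"
fi
```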
For in-depth documentation on the gpu_execution engine, see the Super Sirius Documentation.