tree-sitter-wfl

Tree-sitter grammar for WFL (WarpLabs Language) — a domain-specific language for stream processing rules, windowed aggregations, and anomaly detection.

Overview

WFL defines event-driven processing rules that match patterns across data streams within time windows. It supports multi-stage pipelines, complex event joins, scoring, entity tracking, and a built-in test framework.

Language Structure

A WFL file consists of use declarations, rule definitions, and test blocks:

use "path/to/module"

rule my_rule {
    meta { ... }
    events { ... }
    match <...> { ... } -> score(...)
    entity(...)
    yield target@v1(...)
}

test my_test for my_rule {
    input { ... }
    expect { ... }
}

Language Features

Use Declarations

Import external modules:

use "rules/detection.wfl"
use "lib/common.wfl"

Rule Declaration

The core construct defining a stream processing rule:

rule brute_force {
    meta {
        author = "security-team";
        severity = "high";
    }

    events {
        login: auth_stream && status == "failed"
    }

    match <login.src_ip : 5m> {
        on event {
            login.src_ip | count >= 10;
        }
        on close {
            login.src_ip | distinct | count >= 3;
        }
        derive {
            fail_rate = login.src_ip | count / login.src_ip | distinct | count;
        }
    } -> score {
        frequency = login.src_ip | count @0.6;
        diversity = login.username | distinct | count @0.4;
    }

    entity("account", login.username)
    yield(reason = "brute_force", count = login.src_ip | count)
}

Events Block

Declare event sources with optional filter conditions:

events {
    login: auth_stream && status == "failed"
    dns: dns_stream
    http: web_stream && method == "POST"
}

Match Clause

Define windowed processing with match parameters, event/close handlers, and derived fields:

match <login.src_ip : 5m : tumble> {
    on event {
        login.src_ip | count >= 10;
    }
    on close {
        login.src_ip | distinct | count >= 5;
    }
}

Window types:

5m - sliding window (default)
5m : tumble - tumbling window (fixed interval, non-overlapping), as used in the examples in this document
session(30m) - session window (gap-based)
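
The difference between a fixed-interval window and a gap-based session window can be sketched outside WFL. The Python below is only an illustration of the two grouping ideas, not the WFL runtime: the function names are hypothetical, timestamps are plain seconds, and treating a gap exactly equal to the limit as "same session" is an assumption.

```python
def fixed_window_start(ts: int, size: int) -> int:
    """Start of the non-overlapping window of `size` seconds that contains ts."""
    return ts - ts % size


def session_groups(timestamps, gap):
    """Split timestamps into sessions; a new session starts when the
    time since the previous event exceeds `gap` seconds (assumed rule)."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # still inside the current session
        else:
            sessions.append([ts])     # gap too large: open a new session
    return sessions


print(fixed_window_start(725, 300))                 # 600
print(session_groups([0, 600, 3000, 3100], 1800))   # [[0, 600], [3000, 3100]]
```

With a 5-minute (300 s) fixed window, an event at second 725 falls into the window starting at 600; with a 30-minute gap, the four events split into two sessions.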

Duration units: s (seconds), m (minutes), h (hours), d (days).
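
As an illustration of the unit table, the following Python sketch converts a duration literal into seconds. It assumes a literal is a whole number followed by exactly one unit letter; the grammar may accept other forms, and `parse_duration` is a hypothetical helper, not part of this package.

```python
import re

# Seconds per unit, taken from the unit list above.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(literal: str) -> int:
    """Convert a duration literal such as "5m" into seconds."""
    m = re.fullmatch(r"(\d+)([smhd])", literal)
    if m is None:
        raise ValueError(f"not a duration literal: {literal!r}")
    return int(m.group(1)) * UNIT_SECONDS[m.group(2)]


print(parse_duration("5m"))   # 300
print(parse_duration("24h"))  # 86400
```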

Pipe chain: Apply transforms and measures to event fields:

// field | transform | measure  comparison  value
login.src_ip | distinct | count >= 5;
login.bytes | sum > 1000000;
login.latency | avg > 500;

Transforms: distinct. Measures: count, sum, avg, min, max.
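
One way to read a pipe chain is "apply each transform in order, then reduce with the final measure". The Python sketch below models that reading for the transform and measures listed above; it is an assumption about semantics for illustration only, and `run_pipe` is a hypothetical name.

```python
from statistics import mean

# The single transform and the five measures named above.
TRANSFORMS = {"distinct": lambda values: list(dict.fromkeys(values))}
MEASURES = {"count": len, "sum": sum, "avg": mean, "min": min, "max": max}


def run_pipe(values, *stages):
    """Apply the leading transforms in order, then the final measure."""
    *transforms, measure = stages
    for name in transforms:
        values = TRANSFORMS[name](values)
    return MEASURES[measure](values)


src_ips = ["10.0.0.1", "10.0.0.2", "10.0.0.1"]
print(run_pipe(src_ips, "distinct", "count"))   # 2
print(run_pipe([200, 400, 900], "sum"))         # 1500
```

Under this reading, `login.src_ip | distinct | count >= 5` compares the number of unique source addresses in the window against 5.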

OR Branches

Match steps support alternative branches with ||:

on event {
    fast: login.src_ip && latency < 100 | count >= 20
    || slow: login.src_ip && latency >= 100 | count >= 5;
}

Derive Block

Create computed values for use in scoring and yield:

derive {
    fail_rate = @total_attempts / @unique_users;
    risk_level = if @fail_rate > 0.8 then 1.0 else 0.5;
}

Derived values are referenced with @name syntax.

Score Output

Single score:

-> score(login.src_ip | count * 10)

Weighted multi-factor scoring:

-> score {
    frequency = login.src_ip | count @0.6;
    spread = login.username | distinct | count @0.4;
}
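
The `@0.6` and `@0.4` suffixes read like per-factor weights. Assuming the final score is the weighted sum of the factor values (the README does not state the combination rule, so this is a guess), the arithmetic would look like this in Python:

```python
def weighted_score(factors):
    """factors maps a name to (value, weight); returns the weighted sum."""
    return sum(value * weight for value, weight in factors.values())


score = weighted_score({
    "frequency": (12, 0.6),  # e.g. login.src_ip | count = 12, weight 0.6
    "spread": (3, 0.4),      # e.g. login.username | distinct | count = 3, weight 0.4
})
print(round(score, 2))  # 8.4
```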

Multi-Stage Pipelines

Chain processing stages with |>:

match <login.src_ip : 5m : tumble> {
    on event { login.src_ip | count >= 10; }
} -> score(login.src_ip | count)
|>
match <: session(30m)> {
    on event { login.src_ip | count >= 3; }
} -> score(login.src_ip | sum)
entity("account", login.username)
yield(reason = "sustained_brute_force")

Join Clause

Enrich events with data from other windows:

match <login.src_ip : 5m : tumble> {
    on event { login.src_ip | count >= 10; }
}
join geo_db snapshot on login.src_ip == geo_db.ip
join threat_intel asof within 24h on login.src_ip == threat_intel.indicator
    && login.dst_ip == threat_intel.target

Join modes: snapshot (point-in-time lookup), asof [within dur] (temporal lookup).
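
A temporal (asof) lookup is commonly defined as "take the most recent record at or before the event time, and discard it if it is older than the allowed duration". The Python sketch below implements that common definition; whether WFL's `asof within` behaves exactly this way is an assumption, and `asof_lookup` is a hypothetical helper.

```python
from bisect import bisect_right


def asof_lookup(records, event_ts, within=None):
    """records: list of (timestamp, value) sorted by timestamp.

    Returns the value of the latest record at or before event_ts,
    or None if there is no such record or it is older than `within` seconds.
    """
    timestamps = [ts for ts, _ in records]
    i = bisect_right(timestamps, event_ts)
    if i == 0:
        return None                      # nothing recorded before the event
    ts, value = records[i - 1]
    if within is not None and event_ts - ts > within:
        return None                      # latest record is too stale
    return value


intel = [(1000, "low"), (50000, "high")]
print(asof_lookup(intel, 60000, within=86400))    # high
print(asof_lookup(intel, 200000, within=86400))   # None
```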

Entity Clause

Track per-entity state for anomaly detection:

entity("account", login.username)
entity("ip_address", fmt("{}", login.src_ip))

Yield Clause

Emit output with named fields:

yield(reason = "brute_force", count = login.src_ip | count, score = @risk_level)
yield alert_stream@v1(severity = "high", source = login.src_ip)

Key Block

Map differently named fields from multiple event sources onto a single window key:

match <sip : 5m> {
    key {
        sip = fail.src_ip;
        sip = scan.src_addr;
    }
    on event { ... }
} -> score(...)
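
The effect of the key block can be pictured as normalizing each source's own field name to one shared grouping key. The Python sketch below mirrors the `fail.src_ip` / `scan.src_addr` mapping from the example; the event dictionaries and the `window_key` helper are hypothetical and only illustrate the idea.

```python
# Source alias -> name of the field that supplies the shared key `sip`.
KEY_MAP = {"fail": "src_ip", "scan": "src_addr"}


def window_key(alias, event):
    """Return the shared key value for an event from the given source."""
    return event[KEY_MAP[alias]]


events = [
    ("fail", {"src_ip": "10.0.0.1"}),
    ("scan", {"src_addr": "10.0.0.1"}),
    ("scan", {"src_addr": "10.0.0.9"}),
]

groups = {}
for alias, event in events:
    groups.setdefault(window_key(alias, event), []).append(alias)

print(groups)  # {'10.0.0.1': ['fail', 'scan'], '10.0.0.9': ['scan']}
```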

Limits Clause

Optional resource budget declaration per rule:

rule example {
    ...
    limits {
        max_memory = "128MB";
        max_instances = 10000;
        max_throttle = "1000/m";
        on_exceed = "throttle";
    }
}

Conv Clause (Post-processing)

Optional post-processing of results:

conv {
    sort(score) | top(10);
    dedup(entity_id);
    where(score > 0.5);
}

Operations: sort, top, dedup, where.
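
To make the four operations concrete, the Python sketch below applies the example conv block to a list of hits. It assumes the statements run in the order written, that `sort(score)` orders from highest to lowest, and that `dedup` keeps the first hit per entity; none of these details are stated above, so treat this as one plausible reading.

```python
def conv(hits, top_n=10, min_score=0.5):
    """Mimic: sort(score) | top(10); dedup(entity_id); where(score > 0.5)."""
    # sort(score) | top(10): highest scores first (assumed), keep the first top_n
    ranked = sorted(hits, key=lambda h: h["score"], reverse=True)[:top_n]

    # dedup(entity_id): keep the first hit seen for each entity
    seen, unique = set(), []
    for h in ranked:
        if h["entity_id"] not in seen:
            seen.add(h["entity_id"])
            unique.append(h)

    # where(score > 0.5): drop low-scoring hits
    return [h for h in unique if h["score"] > min_score]


hits = [
    {"entity_id": "a", "score": 0.9},
    {"entity_id": "b", "score": 0.3},
    {"entity_id": "a", "score": 0.7},
]
print(conv(hits))  # [{'entity_id': 'a', 'score': 0.9}]
```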

Test Block

Built-in testing framework for rule validation:

test test_brute_force for brute_force {
    input {
        row(login, src_ip = "10.0.0.1", username = "admin", status = "failed");
        row(login, src_ip = "10.0.0.1", username = "root", status = "failed");
        tick(1m);
        row(login, src_ip = "10.0.0.1", username = "admin", status = "failed");
    }
    expect {
        hits >= 1;
        hit[0].score >= 50;
        hit[0].entity_type == "account";
        hit[0].entity_id == "admin";
        hit[0].field("reason") == "brute_force";
        hit[0].close_reason == "timeout";
    }
    options {
        close_trigger = timeout;
        eval_mode = strict;
    }
}

Expressions

Expressions follow the operator precedence below, listed from lowest precedence (1, binds loosest) to highest (8, binds tightest):

Precedence	Operators	Description
1	`||`	Logical OR
2	`&&`	Logical AND
3	`==` `!=` `<` `>` `<=` `>=` `in` `not in`	Comparison / set membership
4	`+` `-`	Addition / subtraction
5	`*` `/` `%`	Multiplication / division / modulo
6	`-` (unary)	Negation
7	`|`	Pipe
8	`.` `[]`	Member access

Ternary: if expr then expr else expr

Variables: $VAR or ${VAR:default_value} for runtime substitution.
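
The two variable forms can be illustrated with a small substitution routine. This Python sketch assumes variable names consist of word characters and that an undefined variable without a default is an error; how the WFL runtime actually resolves variables is not described here, and `substitute` is a hypothetical helper.

```python
import re

# Matches ${NAME}, ${NAME:default}, or $NAME.
VAR_PATTERN = re.compile(r"\$\{(\w+)(?::([^}]*))?\}|\$(\w+)")


def substitute(text, env):
    """Replace $VAR and ${VAR:default} in text using values from env."""
    def repl(m):
        name = m.group(1) or m.group(3)
        default = m.group(2)
        if name in env:
            return str(env[name])
        if default is not None:
            return default
        raise KeyError(f"undefined variable: {name}")
    return VAR_PATTERN.sub(repl, text)


print(substitute("count >= ${THRESHOLD:10}", {}))             # count >= 10
print(substitute("count >= $THRESHOLD", {"THRESHOLD": 25}))   # count >= 25
```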

Built-in functions: count, sum, avg, min, max, distinct, fmt, baseline, window.has, hit, contains, regex_match, len, lower, upper, time_diff, time_bucket, coalesce, try, collect_set, collect_list, first, last, stddev, percentile.

Usage

Rust

Add to your Cargo.toml:

[dependencies]
tree-sitter = ">=0.22.6"
tree-sitter-wfl = "0.0.1"

let language = tree_sitter_wfl::language();
let mut parser = tree_sitter::Parser::new();
parser.set_language(&language).unwrap();

let source = r#"rule example {
    events { e: stream }
    match <e.id : 5m> {
        on event { e.id | count >= 1; }
    } -> score(1)
    entity("test", e.id)
    yield(reason = "test")
}"#;
let tree = parser.parse(source, None).unwrap();
println!("{}", tree.root_node().to_sexp());

Node.js

const Parser = require("tree-sitter");
const WFL = require("tree-sitter-wfl");

const parser = new Parser();
parser.setLanguage(WFL);

const tree = parser.parse(`rule example {
    events { e: stream }
    match <e.id : 5m> {
        on event { e.id | count >= 1; }
    } -> score(1)
    entity("test", e.id)
    yield(reason = "test")
}`);
console.log(tree.rootNode.toString());

Python

import tree_sitter_wfl

language = tree_sitter_wfl.language()

Go

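import (
    // Assumed: the official Go bindings, which provide NewLanguage.
    tree_sitter "github.com/tree-sitter/go-tree-sitter"
    tree_sitter_wfl "github.com/tree-sitter/tree-sitter-wfl"
)

language := tree_sitter.NewLanguage(tree_sitter_wfl.Language())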

Swift

Add via Swift Package Manager using Package.swift.

Development

Prerequisites

Node.js (for tree-sitter-cli)
Rust toolchain (for building the Rust binding)

Building

# Install dependencies
npm install

# Generate the parser from grammar.js
npx tree-sitter generate

# Run tests
npx tree-sitter test

# Build the Rust binding
cargo build

# Run Rust tests
cargo test

# Build C library
make

Project Structure

tree-sitter-wfl/
├── grammar.js              # Grammar definition
├── queries/
│   └── highlights.scm      # Syntax highlighting queries
├── bindings/
│   ├── rust/                # Rust binding
│   ├── node/                # Node.js binding
│   ├── python/              # Python binding
│   ├── go/                  # Go binding
│   ├── c/                   # C header and pkg-config
│   └── swift/               # Swift binding
├── src/
│   ├── parser.c             # Generated parser
│   ├── grammar.json         # Generated grammar schema
│   └── node-types.json      # AST node type definitions
├── Cargo.toml               # Rust package manifest
├── package.json             # Node.js package manifest
├── pyproject.toml           # Python package manifest
├── Package.swift            # Swift package manifest
└── Makefile                 # C library build rules

Editor Support

Zed

The queries/highlights.scm file provides syntax highlighting for the Zed editor. See the companion Zed extension for integration.

License

Apache License 2.0 — see LICENSE for details.
