# tree-sitter-wfl

Tree-sitter grammar for WFL (WarpLabs Language), a domain-specific language for stream-processing rules, windowed aggregations, and anomaly detection.

WFL defines event-driven processing rules that match patterns across data streams within time windows. It supports multi-stage pipelines, complex event joins, scoring, entity tracking, and a built-in test framework.
## File structure

A WFL file consists of `use` declarations, rule definitions, and test blocks:

```wfl
use "path/to/module"

rule my_rule {
    meta { ... }
    events { ... }
    match <...> { ... } -> score(...)
    entity(...)
    yield target@v1(...)
}

test my_test for my_rule {
    input { ... }
    expect { ... }
}
```
## Imports

Import external modules:

```wfl
use "rules/detection.wfl"
use "lib/common.wfl"
```
## Rules

The core construct, defining a stream-processing rule:

```wfl
rule brute_force {
    meta {
        author = "security-team";
        severity = "high";
    }
    events {
        login: auth_stream && status == "failed"
    }
    match <login.src_ip : 5m> {
        on event {
            login.src_ip | count >= 10;
        }
        on close {
            login.src_ip | distinct | count >= 3;
        }
        derive {
            fail_rate = login.src_ip | count / login.src_ip | distinct | count;
        }
    } -> score {
        frequency = login.src_ip | count @0.6;
        diversity = login.username | distinct | count @0.4;
    }
    entity("account", login.username)
    yield(reason = "brute_force", count = login.src_ip | count)
}
```
## Events

Declare event sources with optional filter conditions:

```wfl
events {
    login: auth_stream && status == "failed"
    dns: dns_stream
    http: web_stream && method == "POST"
}
```
## Match blocks

Define windowed processing with match parameters, event/close handlers, and derived fields:

```wfl
match <login.src_ip : 5m : tumble> {
    on event {
        login.src_ip | count >= 10;
    }
    on close {
        login.src_ip | distinct | count >= 5;
    }
}
```
Window types:

- `5m` — sliding window (default)
- `5m : tumble` — tumbling window (fixed, non-overlapping intervals)
- `session(30m)` — session window (gap-based)

Duration units: `s` (seconds), `m` (minutes), `h` (hours), `d` (days).
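As a rough mental model, tumbling and session windows group events like this (a Python sketch under assumed semantics; the actual runtime's bucketing rules are not specified here):

```python
def tumble(timestamps, width):
    """Group event timestamps (in seconds) into fixed, non-overlapping buckets."""
    buckets = {}
    for t in timestamps:
        buckets.setdefault(t // width * width, []).append(t)
    return buckets

def session(timestamps, gap):
    """Group sorted timestamps into sessions, splitting where the gap exceeds `gap`."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions

events = [10, 70, 290, 310, 2000]
print(tumble(events, 300))    # 5m = 300s tumbling buckets
print(session(events, 1800))  # 30m = 1800s session gap
```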
## Pipe chains

Apply transforms and measures to event fields:

```wfl
// transform | measure comparison value
login.src_ip | distinct | count >= 5;
login.bytes | sum > 1000000;
login.latency | avg > 500;
```

- Transforms: `distinct`
- Measures: `count`, `sum`, `avg`, `min`, `max`
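The chain reads left to right; a toy Python model of this evaluation (helper names here are illustrative, not any real WFL API):

```python
# Toy model of WFL pipe chains: transforms and measures applied to a field's values.
MEASURES = {
    "count": len,
    "sum": sum,
    "avg": lambda xs: sum(xs) / len(xs),
    "min": min,
    "max": max,
}

def pipe(values, *stages):
    """Apply transform/measure stages left to right, as in `field | distinct | count`."""
    out = values
    for stage in stages:
        if stage == "distinct":
            out = list(dict.fromkeys(out))  # order-preserving distinct
        else:
            out = MEASURES[stage](out)
    return out

src_ips = ["10.0.0.1", "10.0.0.2", "10.0.0.1"]
print(pipe(src_ips, "distinct", "count"))  # unique-source count
print(pipe([100, 250, 400], "avg"))
```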
Match steps support alternative branches with `||`:

```wfl
on event {
    fast: login.src_ip && latency < 100 | count >= 20
    || slow: login.src_ip && latency >= 100 | count >= 5;
}
```
## Derived fields

Create computed values for use in scoring and yield:

```wfl
derive {
    fail_rate = @total_attempts / @unique_users;
    risk_level = if @fail_rate > 0.8 then 1.0 else 0.5;
}
```

Derived values are referenced with `@name` syntax.
## Scoring

Single score:

```wfl
-> score(login.src_ip | count * 10)
```

Weighted multi-factor scoring:

```wfl
-> score {
    frequency = login.src_ip | count @0.6;
    spread = login.username | distinct | count @0.4;
}
```
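One plausible reading, assuming the `@weight` annotations combine the factor values linearly (the text above does not state the combining rule), is:

```python
def weighted_score(factors):
    """Combine (value, weight) pairs into a single score, assuming a linear sum."""
    return sum(value * weight for value, weight in factors)

# e.g. frequency = 12 failed logins @0.6, spread = 5 distinct usernames @0.4
print(weighted_score([(12, 0.6), (5, 0.4)]))
```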
## Pipelines

Chain processing stages with `|>`:

```wfl
match <login.src_ip : 5m : tumble> {
    on event { login.src_ip | count >= 10; }
} -> score(login.src_ip | count)
|>
match <: session(30m)> {
    on event { login.src_ip | count >= 3; }
} -> score(login.src_ip | sum)
entity("account", login.username)
yield(reason = "sustained_brute_force")
```
## Joins

Enrich events with data from other windows:

```wfl
match <login.src_ip : 5m : tumble> {
    on event { login.src_ip | count >= 10; }
}
join geo_db snapshot on login.src_ip == geo_db.ip
join threat_intel asof within 24h on login.src_ip == threat_intel.indicator
    && login.dst_ip == threat_intel.target
```

Join modes: `snapshot` (point-in-time lookup), `asof [within dur]` (temporal lookup).
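An `asof` join can be pictured as a latest-match lookup within a time tolerance; a small Python sketch (the row shape and field names are assumptions for illustration):

```python
def asof_lookup(table, key, event_ts, within):
    """Return the newest row for `key` whose ts is at or before event_ts
    and no more than `within` seconds older (None if nothing qualifies)."""
    candidates = [r for r in table
                  if r["key"] == key and 0 <= event_ts - r["ts"] <= within]
    return max(candidates, key=lambda r: r["ts"], default=None)

threat_intel = [
    {"key": "10.0.0.1", "ts": 100, "label": "scanner"},
    {"key": "10.0.0.1", "ts": 500, "label": "botnet"},
]
print(asof_lookup(threat_intel, "10.0.0.1", event_ts=600, within=86400))
```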
## Entities

Track per-entity state for anomaly detection:

```wfl
entity("account", login.username)
entity("ip_address", fmt("{}", login.src_ip))
```
## Yield

Emit output with named fields:

```wfl
yield(reason = "brute_force", count = login.src_ip | count, score = @risk_level)
yield alert_stream@v1(severity = "high", source = login.src_ip)
```
## Key mapping

Explicit key mapping for multi-source rules with different field names:

```wfl
match <sip : 5m> {
    key {
        sip = fail.src_ip;
        sip = scan.src_addr;
    }
    on event { ... }
} -> score(...)
```
## Limits

Optional resource budget declaration per rule:

```wfl
rule example {
    ...
    limits {
        max_memory = "128MB";
        max_instances = 10000;
        max_throttle = "1000/m";
        on_exceed = "throttle";
    }
}
```
## Result post-processing

Optional post-processing of results:

```wfl
conv {
    sort(score) | top(10);
    dedup(entity_id);
    where(score > 0.5);
}
```

Operations: `sort`, `top`, `dedup`, `where`.
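Read top to bottom, the block behaves like a small result pipeline; an equivalent Python sketch (operation semantics are assumed from the names):

```python
def conv(hits):
    """Mimic: sort(score) | top(10); dedup(entity_id); where(score > 0.5)."""
    hits = sorted(hits, key=lambda h: h["score"], reverse=True)[:10]
    seen, deduped = set(), []
    for h in hits:
        if h["entity_id"] not in seen:
            seen.add(h["entity_id"])
            deduped.append(h)
    return [h for h in deduped if h["score"] > 0.5]

results = [
    {"entity_id": "admin", "score": 0.9},
    {"entity_id": "admin", "score": 0.7},
    {"entity_id": "root", "score": 0.4},
]
print(conv(results))
```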
## Tests

Built-in testing framework for rule validation:

```wfl
test test_brute_force for brute_force {
    input {
        row(login, src_ip = "10.0.0.1", username = "admin", status = "failed");
        row(login, src_ip = "10.0.0.1", username = "root", status = "failed");
        tick(1m);
        row(login, src_ip = "10.0.0.1", username = "admin", status = "failed");
    }
    expect {
        hits >= 1;
        hit[0].score >= 50;
        hit[0].entity_type == "account";
        hit[0].entity_id == "admin";
        hit[0].field("reason") == "brute_force";
        hit[0].close_reason == "timeout";
    }
    options {
        close_trigger = timeout;
        eval_mode = strict;
    }
}
```
## Expressions

Full expression system with operator precedence:

| Precedence | Operators | Description |
|---|---|---|
| 1 | `\|\|` | Logical OR |
| 2 | `&&` | Logical AND |
| 3 | `==` `!=` `<` `>` `<=` `>=` `in` `not in` | Comparison / set membership |
| 4 | `+` `-` | Addition / subtraction |
| 5 | `*` `/` `%` | Multiplication / division / modulo |
| 6 | `-` (unary) | Negation |
| 7 | `\|` | Pipe |
| 8 | `.` `[]` | Member access |
Ternary: `if expr then expr else expr`

Variables: `$VAR` or `${VAR:default_value}` for runtime substitution.

Built-in functions: `count`, `sum`, `avg`, `min`, `max`, `distinct`, `fmt`, `baseline`, `window.has`, `hit`, `contains`, `regex_match`, `len`, `lower`, `upper`, `time_diff`, `time_bucket`, `coalesce`, `try`, `collect_set`, `collect_list`, `first`, `last`, `stddev`, `percentile`.
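For instance, a rule can leave its threshold to runtime substitution (an illustrative fragment; any expansion rules beyond the syntax above are an assumption):

```wfl
on event {
    // THRESHOLD is supplied at deploy time, defaulting to 10
    login.src_ip | count >= ${THRESHOLD:10};
}
```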
## Usage

### Rust

Add to your `Cargo.toml`:

```toml
[dependencies]
tree-sitter = ">=0.22.6"
tree-sitter-wfl = "0.0.1"
```

```rust
let language = tree_sitter_wfl::language();
let mut parser = tree_sitter::Parser::new();
parser.set_language(&language).unwrap();

let source = r#"rule example {
    events { e: stream }
    match <e.id : 5m> {
        on event { e.id | count >= 1; }
    } -> score(1)
    entity("test", e.id)
    yield(reason = "test")
}"#;

let tree = parser.parse(source, None).unwrap();
println!("{}", tree.root_node().to_sexp());
```

### Node.js

```javascript
const Parser = require("tree-sitter");
const WFL = require("tree-sitter-wfl");

const parser = new Parser();
parser.setLanguage(WFL);

const tree = parser.parse(`rule example {
    events { e: stream }
    match <e.id : 5m> {
        on event { e.id | count >= 1; }
    } -> score(1)
    entity("test", e.id)
    yield(reason = "test")
}`);

console.log(tree.rootNode.toString());
```

### Python

```python
import tree_sitter_wfl

language = tree_sitter_wfl.language()
```

### Go

```go
import (
    tree_sitter "github.com/tree-sitter/go-tree-sitter"
    tree_sitter_wfl "github.com/tree-sitter/tree-sitter-wfl"
)

language := tree_sitter.NewLanguage(tree_sitter_wfl.Language())
```

### Swift

Add via the Swift Package Manager using `Package.swift`.
## Development

Prerequisites:

- Node.js (for `tree-sitter-cli`)
- Rust toolchain (for building the Rust binding)
```shell
# Install dependencies
npm install

# Generate the parser from grammar.js
npx tree-sitter generate

# Run tests
npx tree-sitter test

# Build the Rust binding
cargo build

# Run Rust tests
cargo test

# Build C library
make
```

## Repository layout

```
tree-sitter-wfl/
├── grammar.js          # Grammar definition
├── queries/
│   └── highlights.scm  # Syntax highlighting queries
├── bindings/
│   ├── rust/           # Rust binding
│   ├── node/           # Node.js binding
│   ├── python/         # Python binding
│   ├── go/             # Go binding
│   ├── c/              # C header and pkg-config
│   └── swift/          # Swift binding
├── src/
│   ├── parser.c        # Generated parser
│   ├── grammar.json    # Generated grammar schema
│   └── node-types.json # AST node type definitions
├── Cargo.toml          # Rust package manifest
├── package.json        # Node.js package manifest
├── pyproject.toml      # Python package manifest
├── Package.swift       # Swift package manifest
└── Makefile            # C library build rules
```
## Editor support

The `queries/highlights.scm` file provides syntax highlighting for the Zed editor. See the companion Zed extension for integration.

## License

Apache License 2.0 — see LICENSE for details.