Delphi (Object Pascal) remains one of the most widely used languages for Windows desktop and enterprise applications. With an estimated 1.5–3 million active developers and a strong presence in industries like healthcare, finance, logistics, and government, Delphi projects often involve large, long-lived codebases that benefit significantly from semantic code intelligence.
Many Delphi codebases have grown over decades, so structural understanding, impact analysis, and cross-file navigation are exactly the kind of tooling gap CodeGraph is designed to fill.
Adding Delphi support positions CodeGraph as a uniquely valuable tool for a community that has historically been underserved by modern static analysis and AI-assisted development tools.
Full extraction support for .pas, .dpr, .dpk, and .lpr files using the tree-sitter-pascal grammar:
| Feature | NodeKind | Details |
|---|---|---|
| Units / Programs | module | unit, program, package, library |
| Classes | class | Including inheritance and interface implementation |
| Records | class | Treated as classes (consistent with AST structure) |
| Interfaces | interface | With GUID support |
| Methods | method | Constructor, destructor, procedures, functions |
| Functions / Procedures | function | Top-level (non-class) routines |
| Properties | property | With read/write accessors |
| Fields | field | Class and record fields |
| Constants | constant | const declarations |
| Enums | enum | With enum members |
| Type Aliases | type_alias | type TFoo = ... |
| Uses / Imports | import | uses clause extraction |
| Function Calls | — | calls edges for call graph |
| Visibility | — | public, private, protected on methods/fields |
| Static Methods | — | class function / class procedure |
| Containment | — | contains edges (class → method, unit → type, etc.) |
| Inheritance | — | extends / implements edges |
Support for Delphi form files (.dfm for VCL, .fmx for FireMonkey) using a custom regex-based extractor, since no tree-sitter grammar exists for this format:
| Feature | NodeKind / EdgeKind | Details |
|---|---|---|
| Components | component | object Button1: TButton |
| Nested hierarchy | contains | Panel1 → Button1 |
| Event handlers | references (unresolved) | OnClick = Button1Click → links UI to Pascal methods |
| inherited keyword | component | Inherited form components |
| Multi-line properties | — | Correctly skipped during parsing |
| Item collections | — | <item>...</end> blocks correctly handled |
The DFM ↔ PAS linkage via event handlers enables cross-file impact analysis: renaming a method in .pas immediately reveals which UI components reference it.
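To make the line-based approach concrete, here is a minimal sketch of how component declarations, nesting, and event handler references could be pulled out of form text with regular expressions. It is not the actual DfmExtractor: the `parseDfm` function, the `DfmComponent` and `DfmEventRef` shapes, and the three regexes are assumptions made for illustration. Multi-line property values are skipped simply because they match none of the patterns.

```typescript
// Hypothetical shapes for illustration; the actual DfmExtractor output types
// are not shown in this description.
interface DfmComponent {
  name: string;
  type: string;
  parent: string | null; // enclosing component, or null for the form itself
  inherited: boolean;    // declared with `inherited` instead of `object`
}

interface DfmEventRef {
  component: string;
  event: string;   // e.g. "OnClick"
  handler: string; // Pascal method name, to be resolved against the .pas unit
}

// `inline` is the keyword DFM uses for embedded frames.
const OBJECT_RE = /^(object|inherited|inline)\s+(\w+)\s*:\s*(\w+)/i;
const EVENT_RE = /^(On[A-Z]\w*)\s*=\s*(\w+)$/;
const END_RE = /^end\s*>?$/i;

function parseDfm(source: string): { components: DfmComponent[]; events: DfmEventRef[] } {
  const components: DfmComponent[] = [];
  const events: DfmEventRef[] = [];
  const stack: string[] = []; // names of the currently open components
  let itemDepth = 0;          // > 0 while inside an item collection

  for (const rawLine of source.split(/\r?\n/)) {
    const line = rawLine.trim();

    if (/^item$/i.test(line)) {
      itemDepth++;
      continue;
    }
    if (END_RE.test(line)) {
      // An `end` closes the innermost open item first, otherwise a component.
      if (itemDepth > 0) itemDepth--;
      else stack.pop();
      continue;
    }
    if (itemDepth > 0) continue; // ignore properties of collection items

    const obj = OBJECT_RE.exec(line);
    if (obj) {
      const [, keyword = "", name = "", type = ""] = obj;
      components.push({
        name,
        type,
        parent: stack[stack.length - 1] ?? null,
        inherited: keyword.toLowerCase() === "inherited",
      });
      stack.push(name);
      continue;
    }

    const ev = EVENT_RE.exec(line);
    const owner = stack[stack.length - 1];
    if (ev && owner !== undefined) {
      const [, event = "", handler = ""] = ev;
      events.push({ component: owner, event, handler });
    }
  }
  return { components, events };
}
```

Each `DfmEventRef` corresponds to what the table calls an unresolved reference: the handler name is only a string until the resolver matches it to a method in the companion .pas unit.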
The implementation follows CodeGraph's established patterns:
- Pascal extraction uses the standard TreeSitterExtractor with a Pascal-specific LanguageExtractor configuration and a visitPascalNode() hook for AST nodes that require special handling (e.g., declType wrappers, defProc implementation bodies)
- DFM/FMX extraction uses a DfmExtractor class, analogous to LiquidExtractor and SvelteExtractor, that parses the line-based format with regex
- Routing in extractFromSource() dispatches .dfm/.fmx files to DfmExtractor before reaching the tree-sitter path
- tree-sitter-pascal is declared as an optionalDependency (consistent with all other grammars), pinned to a specific commit for reproducible builds
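The routing step can be pictured roughly as follows. This is a sketch only: `routeByExtension`, `ExtractorRoute`, and the two extension sets are hypothetical names, the real logic lives inside extractFromSource(), and case-insensitive extension matching is an assumption rather than something stated in this description.

```typescript
// Hypothetical names; the real routing lives in extractFromSource(), whose
// signature is not shown in this description.
type ExtractorRoute = "dfm" | "tree-sitter" | "unsupported";

const FORM_EXTENSIONS = new Set([".dfm", ".fmx"]);
const PASCAL_EXTENSIONS = new Set([".pas", ".dpr", ".dpk", ".lpr"]);

function routeByExtension(filePath: string): ExtractorRoute {
  const dot = filePath.lastIndexOf(".");
  // Assumption: extensions are compared case-insensitively.
  const ext = dot >= 0 ? filePath.slice(dot).toLowerCase() : "";

  // Form files are checked first so they never reach the tree-sitter path.
  if (FORM_EXTENSIONS.has(ext)) return "dfm";
  if (PASCAL_EXTENSIONS.has(ext)) return "tree-sitter";
  return "unsupported";
}
```

Checking the form extensions first mirrors the ordering described above, where .dfm/.fmx files are dispatched before the tree-sitter path is reached.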
Testing with a large Delphi codebase (~3,400 files, ~244k nodes) uncovered performance bottlenecks in the reference resolution pipeline. The following fixes benefit all languages, not just Pascal:
| Fix | Scope | Impact |
|---|---|---|
| Fuzzy match index: replaced O(n) linear scan with lazily-built case-insensitive Map index | name-matcher.ts (all languages) | O(1) lookup per ref instead of iterating all nodes |
| Import mapping cache: cached per-file import mappings instead of re-reading/re-parsing for every ref | import-resolver.ts (all languages) | Eliminated redundant file I/O during resolution |
| Kind cache: pre-populated getNodesByKind results during warm-up | resolution/index.ts (all languages) | Avoided repeated DB queries for the same node kinds |
| Pascal built-in filtering: skip known RTL/VCL/FMX identifiers before resolution | resolution/index.ts (Pascal-specific) | ~60 built-in identifiers filtered out early |
| Method index for defProc: replaced O(n) find() with Map lookup when linking implementation bodies to declarations | tree-sitter.ts (Pascal-specific) | O(1) per implementation body |
| Delphi-specific excludes: __history/**, __recovery/**, *.dcu added to default excludes | types.ts (Pascal-specific) | Skips Delphi IDE temp files during indexing |
Result: Reference resolution on a large Delphi project dropped from ~30 minutes to ~15 seconds (120x speedup). The general improvements (fuzzy index, import cache, kind cache) will benefit all CodeGraph users.
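The fuzzy match index is the easiest of these fixes to illustrate. The sketch below shows the general technique (a case-insensitive Map built lazily on first lookup), not the code in name-matcher.ts; the `NameIndex` class, the `GraphNode` shape, and the `loadNodes` callback are hypothetical.

```typescript
// Hypothetical node shape; the real CodeGraph node type has more fields.
interface GraphNode {
  id: string;
  name: string;
  kind: string;
}

class NameIndex {
  private byLowerName: Map<string, GraphNode[]> | null = null;
  private readonly loadNodes: () => GraphNode[];

  constructor(loadNodes: () => GraphNode[]) {
    this.loadNodes = loadNodes;
  }

  // Average O(1) per lookup once the index exists; the first call pays O(n).
  lookup(name: string): GraphNode[] {
    if (this.byLowerName === null) {
      this.byLowerName = this.build();
    }
    return this.byLowerName.get(name.toLowerCase()) ?? [];
  }

  // Drop the index so it is rebuilt on the next lookup, e.g. after re-indexing.
  invalidate(): void {
    this.byLowerName = null;
  }

  private build(): Map<string, GraphNode[]> {
    const index = new Map<string, GraphNode[]>();
    for (const node of this.loadNodes()) {
      const key = node.name.toLowerCase();
      const bucket = index.get(key);
      if (bucket) bucket.push(node);
      else index.set(key, [node]);
    }
    return index;
  }
}
```

Lowercasing the key suits Pascal in particular, where identifiers are case-insensitive, so `tformmain` and `TFormMain` land in the same bucket. Building lazily means projects that never trigger a fuzzy lookup do not pay for the index.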
| File | Change |
|---|---|
| src/types.ts | Added 'pascal' to Language type, file patterns to DEFAULT_CONFIG.include |
| src/extraction/grammars.ts | Grammar loader, extension mappings (.pas, .dpr, .dpk, .lpr, .dfm, .fmx), display name |
| src/extraction/tree-sitter.ts | Pascal LanguageExtractor, visitPascalNode() with 7 helper methods, DfmExtractor class, routing in extractFromSource(), method index |
| src/resolution/index.ts | Pascal built-in filtering, kind cache, cache clearing |
| src/resolution/import-resolver.ts | Import mapping cache |
| src/resolution/name-matcher.ts | Fuzzy match index (case-insensitive Map) |
| package.json | tree-sitter-pascal in optionalDependencies (pinned commit) |
| __tests__/extraction.test.ts | 37 new tests covering all Pascal and DFM extraction features |
- 36 new tests, all passing
- 0 regressions — the same 28 pre-existing failures (unrelated: missing Swift/Dart grammars, database path issues, MCP truncation test) are unchanged
- Tests cover: language detection, modules, imports, classes, records, interfaces, methods, visibility, static methods, enums, properties, constants, type aliases, calls, containment, full fixture files (UAuth.pas, UTypes.pas, MainForm.dfm)
The npm package tree-sitter-pascal@0.0.1 is outdated (uses NAN bindings, incompatible with Node.js v24+). The implementation uses the actively maintained GitHub repository (Isopod/tree-sitter-pascal, v0.10.2) with a pinned commit hash for deterministic builds. This is consistent with how @sengac/tree-sitter-dart handles a similar situation.
- Node.js >= 18
- npm
- Git
git clone -b delphi-support https://github.com/omonien/codegraph.git
cd codegraph
npm install
npm run build
npm link

Verify with:

codegraph --version

cd /path/to/your/delphi-project
codegraph init -i
codegraph index

codegraph status # Show index statistics
codegraph query "TFormMain" # Search for a symbol
codegraph context "What does TCustomer do?" # Build AI context

codegraph install

This configures the MCP server, tool permissions, auto-sync hooks, and CLAUDE.md in one step. After that, start Claude Code in the project; CodeGraph tools will be available immediately.
npm unlink -g @colbymchenry/codegraph # Remove global link
rm -rf /path/to/delphi-project/.codegraph # Remove project index