add Pascal/Lazarus support and stabilize LLM semantic extraction by JClarQ · Pull Request #682 · safishamsi/graphify

JClarQ · 2026-05-03T11:32:37Z

Summary

This pull request introduces comprehensive support for FreePascal and Lazarus projects while significantly improving the stability and resilience of the semantic extraction pipeline. It integrates robust AST extraction for Pascal files and implements a JSON repair mechanism to handle truncated or malformed LLM responses effectively.

Key Changes

1. FreePascal & Lazarus Support

Language Detection: Updated detect.py to recognize .pas, .pp, .inc, .lpr, .lfm, and .lpi extensions.
Tree-sitter Integration: Integrated tree-sitter-language-pack via pyproject.toml for dynamic Pascal grammar support.
Pascal Extraction: Implemented unit dependency resolution by extracting module names from uses clauses.
Lazarus Integration: Added support for Lazarus project (.lpi) and form (.lfm) files.
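To illustrate the `uses`-clause extraction described above, here is a minimal sketch. It is not the PR's code: the PR walks a tree-sitter AST, while this stand-in uses regular expressions, and the helper name `extract_uses_units` is hypothetical. It ignores edge cases such as the word `uses` appearing inside a string literal.

```python
from __future__ import annotations

import re

# Assumption: regex stand-in for the tree-sitter based extraction in the PR.
# Strips { ... }, (* ... *) and // comments before looking for uses clauses.
_COMMENT_RE = re.compile(r"\{.*?\}|\(\*.*?\*\)|//[^\n]*", re.DOTALL)
_USES_RE = re.compile(r"\buses\b(.*?);", re.IGNORECASE | re.DOTALL)
_IN_RE = re.compile(r"\s+in\s+", re.IGNORECASE)


def extract_uses_units(source: str) -> list[str]:
    """Return unit names from every `uses` clause, in order of appearance.

    Pascal identifiers are case-insensitive, so duplicates are detected
    by comparing lowercased names; the first spelling seen is kept.
    """
    stripped = _COMMENT_RE.sub(" ", source)
    units: list[str] = []
    seen: set[str] = set()
    for clause in _USES_RE.findall(stripped):
        for entry in clause.split(","):
            # `Foo in 'foo.pas'` (program files): keep only the unit name.
            name = _IN_RE.split(entry.strip(), maxsplit=1)[0].strip()
            if name and name.lower() not in seen:
                seen.add(name.lower())
                units.append(name)
    return units
```

Each returned name would then become the target of an import edge from the file that contains the `uses` clause.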

2. Semantic Extraction Stabilization

Adaptive JSON Repair: Added a _repair_json utility that recovers partial data from truncated LLM outputs (e.g., from gpt-5.4-mini) by closing unclosed strings and brackets.
Robust Parsing: Updated _parse_llm_json to attempt repairs before discarding data, increasing resilience against token limits.
Refined Retry Logic: Modified _extract_with_adaptive_retry to distinguish between recoverable truncation and logic errors, reducing log noise and preventing unnecessary retries.
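The repair idea (close an unterminated string, then close any open brackets) can be sketched as follows. This is an assumption-laden illustration, not the PR's `_repair_json`: the function name `repair_truncated_json` is hypothetical, and the sketch gives up (returns `None`) when the cut falls in a place that closing delimiters cannot fix, such as right after a key's colon.

```python
from __future__ import annotations

import json
from typing import Any, Optional

_CLOSERS = {"{": "}", "[": "]"}


def repair_truncated_json(text: str) -> Optional[Any]:
    """Best-effort parse of JSON that was cut off mid-stream."""
    stack: list[str] = []   # closing delimiters still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in _CLOSERS:
            stack.append(_CLOSERS[ch])
        elif ch in "}]" and stack and stack[-1] == ch:
            stack.pop()

    repaired = text
    if in_string:
        if escaped:
            repaired = repaired[:-1]  # drop a dangling backslash
        repaired += '"'
    else:
        # A trailing comma before the closers would be invalid JSON.
        repaired = repaired.rstrip().rstrip(",")
    repaired += "".join(reversed(stack))

    try:
        return json.loads(repaired)
    except json.JSONDecodeError:
        return None  # not recoverable by closing delimiters alone
```

A caller in the spirit of the retry logic above could treat `None` as "truncated beyond repair, split the input and retry" and any other result as recovered data.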

3. CLI & Backend Improvements

OpenAI Support: Added openai as a supported backend with gpt-5.4-mini as the default model.
Model Overrides: Added support for the GRAPHIFY_MODEL environment variable to allow easy model switching without code changes.
Semantic Update Flag: Implemented the --semantic flag for the update command to allow manual triggering of LLM-based enrichment during project updates.
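A rough sketch of how the model override and the `--semantic` flag might be wired. Only `GRAPHIFY_MODEL`, the `openai` backend, the `gpt-5.4-mini` default, the `update` command, and `--semantic` come from the description above; the function names, the `DEFAULT_MODELS` table, and the parser layout are assumptions made for illustration.

```python
from __future__ import annotations

import argparse
import os
from typing import Mapping, Optional

# Default taken from the PR description; the table itself is hypothetical.
DEFAULT_MODELS = {"openai": "gpt-5.4-mini"}


def resolve_model(backend: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return GRAPHIFY_MODEL if set and non-empty, else the backend default."""
    env = os.environ if env is None else env
    override = env.get("GRAPHIFY_MODEL", "").strip()
    return override or DEFAULT_MODELS[backend]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI layout with an `update` subcommand."""
    parser = argparse.ArgumentParser(prog="graphify")
    sub = parser.add_subparsers(dest="command", required=True)
    update = sub.add_parser("update")
    update.add_argument(
        "--semantic",
        action="store_true",
        help="also run LLM-based enrichment during the update",
    )
    return parser
```

Passing `env` explicitly keeps the override testable without touching the real process environment.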

Verification

Validated AST extraction and dependency mapping for complex Pascal projects.
Verified that the JSON repair utility successfully recovers data from intentionally truncated LLM streams.
Confirmed that the --semantic flag correctly triggers enrichment and merges findings into the knowledge graph.

- Implement _repair_json to recover partial data from truncated LLM responses
- Update _parse_llm_json to attempt repair before failing
- Refine _extract_with_adaptive_retry to silence parse errors during recursive splitting
- Ensure errors are only reported when recursion depth is exhausted or for logic errors

Qodo-Free-For-OSS · 2026-05-04T10:50:25Z

Hi, Pascal uses imports are emitted as edges whose targets are _make_id(unitName), but no nodes with those IDs exist because file node IDs are derived from paths. As a result, build_from_json() drops these edges, and the new unit dependency extraction never surfaces in the final graph.

Severity: action required | Category: correctness

How to fix: Resolve units to file IDs

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

Pascal uses imports are currently dropped because edge targets don’t match any existing node IDs.

Issue Context

File node IDs are path-based, and build_from_json() only keeps edges whose endpoints exist as nodes.
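The mismatch can be reproduced in miniature. The sketch below is not `build_from_json()` itself: the function name, the `source`/`target` keys, and the ID strings are assumptions chosen only to show the filtering behavior described above.

```python
from __future__ import annotations


def keep_resolvable_edges(nodes: list[dict], edges: list[dict]) -> list[dict]:
    """Keep only edges whose source and target both match an existing node ID."""
    known_ids = {node["id"] for node in nodes}
    return [
        edge
        for edge in edges
        if edge["source"] in known_ids and edge["target"] in known_ids
    ]


# Made-up IDs: file nodes are path-derived, the first import target is name-derived.
nodes = [{"id": "src_main_pas"}, {"id": "src_foo_pas"}]
edges = [
    {"source": "src_main_pas", "target": "foo"},          # name-derived: dropped
    {"source": "src_main_pas", "target": "src_foo_pas"},  # path-derived: kept
]
surviving = keep_resolvable_edges(nodes, edges)
```

Only the edge whose target matches a path-derived node ID survives the filter.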

Fix Focus Areas

graphify/extract.py[814-846]

graphify/extract.py[958-969]

graphify/build.py[75-103]

Implementation notes

Update _import_pascal() to resolve unit names to likely file paths (e.g., sibling UnitName.pas / unitname.pp / unitname.inc), and use _make_id(str(resolved_path)) so the target matches the imported unit’s file node ID.

If resolution fails, consider emitting a lightweight placeholder node for the unit name so the import edge survives (optional, but must conform to node schema).

Add a small test fixture covering uses Foo; connecting to foo.pas in the same directory.
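One possible shape for the resolution step, as a sketch only: `resolve_unit` and `UNIT_SUFFIXES` are hypothetical names, and the lookup is limited to siblings of the importing file. Matching is case-insensitive because Pascal unit names are, and `.pas` is preferred over `.pp` and `.inc` when several candidates exist.

```python
from __future__ import annotations

from pathlib import Path
from typing import Optional

# Suffixes in priority order, following the implementation notes above.
UNIT_SUFFIXES = (".pas", ".pp", ".inc")


def resolve_unit(unit_name: str, importing_file: Path) -> Optional[Path]:
    """Find a sibling file whose stem matches `unit_name`, ignoring case."""
    wanted = unit_name.lower()
    matches: dict[str, Path] = {}
    for candidate in sorted(importing_file.parent.iterdir()):
        suffix = candidate.suffix.lower()
        if (
            candidate.is_file()
            and suffix in UNIT_SUFFIXES
            and candidate.stem.lower() == wanted
        ):
            matches.setdefault(suffix, candidate)
    for suffix in UNIT_SUFFIXES:
        if suffix in matches:
            return matches[suffix]
    return None  # caller may fall back to a placeholder node for the unit
```

The returned path would then be passed through `_make_id(str(resolved_path))`, as suggested above, so the edge target matches the imported unit's file node ID.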

Spotted by Qodo code review - free for open-source projects.

JClarQ added 4 commits May 3, 2026 12:50

feat: add FreePascal and Lazarus support with tree-sitter-language-pack integration

cb8719a

feat: add OpenAI backend with gpt-5.4-mini support and refine semantic extraction settings

1dc1d03

feat: implement --semantic flag for the update command

955e91d

JClarQ wants to merge 4 commits into safishamsi:v6 from JClarQ:v6

2 participants
