feat: Expose use_structured_output in SimpleKGPipeline constructor #478

@arthurfantaci

Problem

LLMEntityRelationExtractor already supports use_structured_output=True (as of v1.13.0), which provides guaranteed JSON schema conformance for entity/relationship extraction. However, this parameter is not exposed through SimpleKGPipeline's constructor.

Users of SimpleKGPipeline must either:

  1. Drop down to the low-level Pipeline API to set use_structured_output=True on the extractor
  2. Accept prompt-based JSON extraction (which generally works but provides no schema guarantees)
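
To make the trade-off in option 2 concrete, here is a minimal sketch using only the standard library. The key names (`nodes`, `relationships`) and the helper `conforms` are assumptions for illustration, not the library's actual schema or API. It shows that a prompt-based response can be valid JSON and still miss the expected shape, which is the gap structured output is meant to close.

```python
import json

# Hypothetical required top-level keys; the real extraction schema may differ.
REQUIRED_KEYS = {"nodes", "relationships"}


def conforms(raw: str) -> bool:
    """Return True if raw parses as a JSON object containing all required keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and REQUIRED_KEYS.issubset(data)


# Valid JSON, but the "relationships" key is missing.
partial = '{"nodes": [{"id": "0", "label": "Person"}]}'
# Valid JSON with both expected keys present.
complete = '{"nodes": [], "relationships": []}'
```

With prompt-based extraction, a check like this (or a retry on failure) is left to the caller.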

Proposed Solution

Add use_structured_output: bool = False as a constructor parameter to SimpleKGPipeline, and pass it through to _get_extractor():

# In SimpleKGPipeline.__init__:
self.use_structured_output = use_structured_output

# In SimpleKGPipeline._get_extractor():
return LLMEntityRelationExtractor(
    llm=self.llm,
    prompt_template=...,
    use_structured_output=self.use_structured_output,  # <-- new
)

This is a ~1-line change in _get_extractor() plus the constructor parameter.
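
For illustration, the pass-through can be sketched in a self-contained form. `StubExtractor` and `StubKGPipeline` below are hypothetical stand-ins, not the real `LLMEntityRelationExtractor` or `SimpleKGPipeline`, and they omit every other constructor argument. The sketch only shows the flag being stored in the constructor and forwarded by `_get_extractor()`, with a default of `False` so existing callers are unaffected.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class StubExtractor:
    """Hypothetical stand-in for LLMEntityRelationExtractor."""

    llm: Any
    use_structured_output: bool = False


class StubKGPipeline:
    """Hypothetical stand-in for SimpleKGPipeline showing only the pass-through."""

    def __init__(self, llm: Any, use_structured_output: bool = False) -> None:
        self.llm = llm
        # Defaulting to False keeps current behavior for existing callers.
        self.use_structured_output = use_structured_output

    def _get_extractor(self) -> StubExtractor:
        # Forward the flag to the extractor unchanged.
        return StubExtractor(
            llm=self.llm,
            use_structured_output=self.use_structured_output,
        )
```

Under these assumptions, `StubKGPipeline(llm=my_llm, use_structured_output=True)` would build an extractor with the flag enabled, while omitting the argument leaves it disabled.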

Context

We're building a production GraphRAG pipeline using SimpleKGPipeline and would benefit from structured output for more reliable extraction. The current workaround (low-level Pipeline API) adds significant complexity.

Versions

  • neo4j-graphrag: 1.13.0+
  • Python: 3.13
