Skip to content

Bug: LLM filter prompt missing 'json' keyword causes OpenAI 400 error #11

@smodee

Description

@smodee

Summary

The filtering stage's LLM filter prompt (bioscancast/filtering/llm_filter.py) does not contain the word "json" anywhere in the message text. When using OpenAI's response_format: {"type": "json_object"} mode, the API requires the word "json" to appear in the messages, and returns a 400 error without it:

openai.BadRequestError: 'messages' must contain the word 'json' in some form,
to use 'response_format' of type 'json_object'.

This means any real OpenAI call through the LLM filter has always been broken. It was not caught earlier because:

  • Offline tests use FakeLLMClient which doesn't enforce this constraint.
  • The filtering pipeline defaults to llm_client=None (fail-closed mode), so the LLM filter path was never exercised in integration tests.
  • The search stage's OpenAIClient works fine because its prompts include "Return JSON: ..." in the text.

How to reproduce

from bioscancast.llm.client import OpenAIClient
from bioscancast.filtering.llm_filter import build_filter_prompt
from bioscancast.filtering.models import ForecastQuestion
from datetime import datetime, timezone

question = ForecastQuestion(
    id="test", text="Will H5N1 spread?",
    created_at=datetime.now(timezone.utc),
)
prompt = build_filter_prompt(question, [{"result_id": "r1", "url": "...", "title": "Test"}])
client = OpenAIClient()
client.generate_json(prompt)  # Raises BadRequestError

Fix

Add "Return your response as JSON matching the output_schema below." to the task description in build_filter_prompt() in bioscancast/filtering/llm_filter.py (line 22).

Already fixed on temp/integration branch — needs to be applied to main.

Files

  • bioscancast/filtering/llm_filter.pybuild_filter_prompt(), line 22

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions