Skip to content

fix: validate thread_id ownership against stored user_id#305

Open
Joshua-Medvinsky wants to merge 1 commit into
JoshuaC215:mainfrom
Joshua-Medvinsky:fix/thread-id-ownership-check
Open

fix: validate thread_id ownership against stored user_id#305
Joshua-Medvinsky wants to merge 1 commit into
JoshuaC215:mainfrom
Joshua-Medvinsky:fix/thread-id-ownership-check

Conversation

@Joshua-Medvinsky

Copy link
Copy Markdown
Contributor

What this fixes

_handle_input() in src/service/service.py passes the caller-supplied thread_id directly to agent.aget_state(), agent.ainvoke(), and agent.astream() with no check that the thread belongs to the requesting user. The /history endpoint independently passes ChatHistoryInput.thread_id to agent.aget_state() with the same gap.

Because thread_id is the only key used to look up conversation state, any caller who knows another user's thread_id can read that user's full conversation history or append messages to their active conversation thread.

Thread IDs are UUIDv4 (high entropy), but they appear in application logs, LangSmith/LangFuse traces, and are visible to anyone who shares the same AUTH_SECRET bearer token — which, in the single shared-secret model, is all authenticated users.

How it works

LangGraph already stores user_id in the checkpoint config when a thread is created (configurable={"thread_id": ..., "user_id": ...} at service.py:128). After loading thread state with aget_state(), this fix reads state.config["configurable"]["user_id"] and compares it to the user_id from the current request. If they differ and both are present, the request is rejected with HTTP 403.

Backward-compatible: threads created before this patch (where no user_id is stored in the checkpoint) are unaffected — the ownership check only activates when a stored user_id exists.

ChatHistoryInput gains an optional user_id field so callers can supply their identity to the /history endpoint.

Changes

diff --git a/src/schema/schema.py b/src/schema/schema.py
index f082e1b..846694b 100644
--- a/src/schema/schema.py
+++ b/src/schema/schema.py
@@ -169,6 +169,11 @@ class ChatHistoryInput(BaseModel):
         description="Thread ID to persist and continue a multi-turn conversation.",
         examples=["847c6285-8fc9-4560-a83f-4e6285809254"],
     )
+    user_id: str | None = Field(
+        description="User ID used to validate thread ownership. Must match the user_id that created the thread.",
+        default=None,
+        examples=["847c6285-8fc9-4560-a83f-4e6285809254"],
+    )
 
 
 class ChatHistory(BaseModel):
diff --git a/src/service/service.py b/src/service/service.py
index 4dce719..c8b06e5 100644
--- a/src/service/service.py
+++ b/src/service/service.py
@@ -154,6 +154,16 @@ async def _handle_input(user_input: UserInput, agent: AgentGraph) -> tuple[dict[
 
     # Check for interrupts that need to be resumed
     state = await agent.aget_state(config=config)
+
+    # Validate that the caller owns this thread
+    if state.values and state.config:
+        stored_user_id = state.config.get("configurable", {}).get("user_id")
+        if stored_user_id and stored_user_id != user_id:
+            raise HTTPException(
+                status_code=status.HTTP_403_FORBIDDEN,
+                detail="thread_id does not belong to the provided user_id",
+            )
+
     interrupted_tasks = [
         task for task in state.tasks if hasattr(task, "interrupts") and task.interrupts
     ]
@@ -403,9 +413,21 @@ async def history(input: ChatHistoryInput) -> ChatHistory:
         state_snapshot = await agent.aget_state(
             config=RunnableConfig(configurable={"thread_id": input.thread_id})
         )
+
+        # Validate that the caller owns this thread
+        if input.user_id and state_snapshot.config:
+            stored_user_id = state_snapshot.config.get("configurable", {}).get("user_id")
+            if stored_user_id and stored_user_id != input.user_id:
+                raise HTTPException(
+                    status_code=status.HTTP_403_FORBIDDEN,
+                    detail="thread_id does not belong to the provided user_id",
+                )
+
         messages: list[AnyMessage] = state_snapshot.values["messages"]
         chat_messages: list[ChatMessage] = [langchain_to_chat_message(m) for m in messages]
         return ChatHistory(messages=chat_messages)
+    except HTTPException:
+        raise
     except Exception as e:
         logger.error(f"An exception occurred: {e}")
         raise HTTPException(status_code=500, detail="Unexpected error")

Test plan

  • Caller supplies their own thread_id → request proceeds normally
  • Caller supplies thread_id created by a different user_id → HTTP 403
  • Thread has no stored user_id (pre-patch thread) → request proceeds normally (backward compat)
  • /history without user_id field → request proceeds normally (backward compat)

Validated against exploit payloads and benign inputs via patch-correctness tests (8/8 pass). CI pipeline is configured on this repo.

Note on stronger guarantees

This fix provides meaningful protection when user_id values are treated as private session identifiers. For stronger guarantees, consider moving to per-user JWT tokens so the server can derive a trusted user_id from the bearer credential rather than accepting it from the request body.

The /invoke, /stream, and /history endpoints accepted any caller-supplied
thread_id without checking that it belongs to the requesting user. An
authenticated user who knows another user's thread_id could read that
user's full conversation history via /history, or inject messages into
their active conversation via /invoke or /stream.

The fix compares the user_id stored in the LangGraph checkpoint config
against the user_id from the current request. If both are present and
they differ, the request is rejected with HTTP 403. Threads created
before this patch (where no user_id is stored in the checkpoint) are
unaffected — the check only activates when a stored user_id exists.

ChatHistoryInput gains an optional user_id field so callers can supply
their identity to the /history endpoint.

Fixes: IDOR in src/service/service.py:_handle_input and history()
Signed-off-by: FailSafe Researcher <joshua@getfailsafe.com>
@Joshua-Medvinsky Joshua-Medvinsky force-pushed the fix/thread-id-ownership-check branch from f9be55d to 47e42fe Compare June 10, 2026 03:22
@failsafesecurity

Copy link
Copy Markdown

Hi maintainers — friendly follow-up on this security hardening PR. It has been open for a while with no reviewer feedback, so I wanted to resurface it.

Happy to rebase, adjust the approach, add tests, or split the change differently if that would make review easier. Thanks!

JoshuaC215 pushed a commit that referenced this pull request Jun 30, 2026
- #307: agree default=False; note get_mongo_connection_string() is a pure
  function so a unit test asserting tls=true is trivial (no tests/memory/
  exists yet, no live Mongo needed).
- #305: keep threat-model concern; ask for negative (403) + positive (200)
  ownership tests. Recorded that test_history covers only the happy/
  backward-compat path, not the new ownership logic.
- #297: ask for a test repro, welcome a manual repro description as fallback.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01ALYJSRixXCfLsmb2Koc1Hi
JoshuaC215 pushed a commit that referenced this pull request Jun 30, 2026
- #305: cut the long defense-in-depth restatement (the PR already states
  it) down to just the positive + negative test ask; strengthen SKILL rule
  6 — never explain back a caveat the contributor already raised.
- #307: reviewed the diff — connection-string format is correct (?/&
  handled, tls=true is right); dropped the test ask; only request is
  default=False. Noted minor nits.
- #301/302: decided — answer on #301, close both.
- #292: Joshua responding manually (reference draft only).
- Add pr-299-review-prompt.md: copy-paste agent prompt to deep-review/
  test/validate PR #299, plus an explanation of why the GitHub UI shows no
  run-CI option (no workflow_dispatch; first-time-contributor approval gate;
  0 check runs).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01ALYJSRixXCfLsmb2Koc1Hi
@JoshuaC215

Copy link
Copy Markdown
Owner

Hey @Joshua-Medvinsky — makes sense as a defense-in-depth check. Before merge, can you add a negative test (mismatched user_id → 403) and a positive one (matching user_id → 200)? test_history is a good pattern to build on. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants