[Bug]: Potential sqlite3.OperationalError database is locked during highly concurrent aput operations #8136

Description

@Jacopos311

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangGraph rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangGraph (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Related Issues / PRs

None

Reproduction Steps / Example Code (Python)

import asyncio
from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.base import Checkpoint, CheckpointMetadata
from langgraph.checkpoint.sqlite.aio import AsyncSqliteSaver

async def simulate_concurrent_writes():
    # Initialize the checkpointer in memory
    async with AsyncSqliteSaver.from_conn_string(":memory:") as saver:
        await saver.setup()
        
        # Prepare mock checkpoint data
        config: RunnableConfig = {"configurable": {"thread_id": "test_thread", "checkpoint_ns": ""}}
        checkpoint: Checkpoint = {"v": 1, "id": "1", "ts": "2026-01-01T00:00:00Z", "channel_values": {}, "channel_versions": {}, "versions_seen": {}}
        metadata: CheckpointMetadata = {"source": "input", "step": 1, "writes": {}}
        new_versions = {}

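        # Worker simulating concurrent agent writes
        async def worker(worker_id):
            cfg = config.copy()
            cfg["configurable"] = config["configurable"].copy()
            cfg["configurable"]["checkpoint_id"] = f"check_{worker_id}"
            # aput keys the stored row on checkpoint["id"]; the config's
            # checkpoint_id is recorded as the parent. Give each worker its own
            # checkpoint id so the 50 writes do not all replace the same row.
            worker_checkpoint: Checkpoint = {**checkpoint, "id": str(worker_id)}
            await saver.aput(cfg, worker_checkpoint, metadata, new_versions)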

        # Schedule 50 aput calls concurrently on the same event loop.
        # Without BEGIN IMMEDIATE, this highly concurrent scenario may expose
        # the database to "sqlite3.OperationalError: database is locked"
        tasks = [worker(i) for i in range(50)]
        await asyncio.gather(*tasks)

if __name__ == "__main__":
    asyncio.run(simulate_concurrent_writes())

Error Message and Stack Trace (if applicable)

Description

In high-concurrency environments, when multiple async agents attempt to write checkpoints simultaneously using AsyncSqliteSaver.aput, SQLite can occasionally fail with a sqlite3.OperationalError: database is locked.
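For context, the error itself can be provoked with the standard-library sqlite3 module alone. The sketch below is an illustration, not the LangGraph code path: it assumes two separate connections to the same database file with the busy timeout disabled, and the helper name provoke_locked_error is hypothetical. Whether the same condition is reachable through a single AsyncSqliteSaver connection is the claim of this report and is not shown here.

```python
import os
import sqlite3
import tempfile


def provoke_locked_error() -> str:
    """Return the error message raised when a second connection writes
    while the first connection holds the write lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # timeout=0 disables the busy timeout, so a blocked write fails at once.
        # isolation_level=None puts the module in autocommit mode, leaving
        # transaction control to the explicit BEGIN / ROLLBACK below.
        writer_a = sqlite3.connect(path, timeout=0, isolation_level=None)
        writer_b = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            writer_a.execute("CREATE TABLE t (k INTEGER PRIMARY KEY, v TEXT)")
            # writer_a opens a write transaction and keeps it open.
            writer_a.execute("BEGIN IMMEDIATE")
            writer_a.execute("INSERT INTO t VALUES (1, 'a')")
            try:
                # writer_b cannot obtain the write lock while writer_a holds it.
                writer_b.execute("INSERT INTO t VALUES (2, 'b')")
            except sqlite3.OperationalError as exc:
                return str(exc)
            finally:
                writer_a.execute("ROLLBACK")
            return ""
        finally:
            # Close both connections before the temporary directory is removed.
            writer_a.close()
            writer_b.close()


if __name__ == "__main__":
    print(provoke_locked_error())
```

Raising the timeout argument makes the blocked connection wait for the lock instead of failing immediately.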

Proposed Fix

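Starting each write with an explicit BEGIN IMMEDIATE inside the async with self.lock section of langgraph/checkpoint/sqlite/aio.py would make SQLite acquire the write lock when the transaction begins, rather than on the first write statement. Any contention then surfaces at BEGIN, where the connection's busy timeout can wait for the lock, instead of partway through the transaction. This is intended to prevent concurrent write collisions.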
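A minimal sketch of the proposed pattern, using only the standard library so it runs as is. It is not the code in aio.py: the class ImmediateWriter, its table layout, and write_many are hypothetical, and the synchronous sqlite3 calls stand in for the async driver the saver actually uses.

```python
import asyncio
import sqlite3


class ImmediateWriter:
    """Hypothetical helper illustrating BEGIN IMMEDIATE under an asyncio lock."""

    def __init__(self, path: str) -> None:
        # isolation_level=None disables the module's implicit BEGIN, so the
        # explicit BEGIN IMMEDIATE below is the only transaction start.
        # timeout is the busy timeout applied when another connection holds
        # the lock.
        self.conn = sqlite3.connect(path, isolation_level=None, timeout=5.0)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints (id TEXT PRIMARY KEY, body TEXT)"
        )
        self.lock = asyncio.Lock()

    async def put(self, checkpoint_id: str, body: str) -> None:
        # The asyncio lock serializes tasks that share this one connection.
        async with self.lock:
            # Take SQLite's write lock up front instead of on the first write.
            self.conn.execute("BEGIN IMMEDIATE")
            try:
                # Yield to the event loop, standing in for the awaits an
                # async driver would perform between statements.
                await asyncio.sleep(0)
                self.conn.execute(
                    "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
                    (checkpoint_id, body),
                )
                self.conn.execute("COMMIT")
            except BaseException:
                self.conn.execute("ROLLBACK")
                raise


async def write_many(n: int) -> int:
    """Run n concurrent puts and return the number of rows stored."""
    writer = ImmediateWriter(":memory:")
    try:
        await asyncio.gather(*(writer.put(f"check_{i}", "{}") for i in range(n)))
        return writer.conn.execute("SELECT COUNT(*) FROM checkpoints").fetchone()[0]
    finally:
        writer.conn.close()


if __name__ == "__main__":
    print(asyncio.run(write_many(50)))
```

In this sketch the asyncio lock keeps tasks on the shared connection from interleaving their transactions, while BEGIN IMMEDIATE only matters against other connections to the same database. The two mechanisms cover different sources of contention.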

System Info

System Information

OS: Windows
OS Version: 10.0.26200
Python Version: 3.11.9 (tags/v3.11.9:de54cf5, Apr 2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)]

Package Information

langchain_core: 1.4.7
langsmith: 0.8.16
langchain_protocol: 0.0.17

Optional packages not installed

deepagents
deepagents-cli

Other Dependencies

httpx: 0.28.1
jsonpatch: 1.33
orjson: 3.11.9
packaging: 26.2
pydantic: 2.13.4
pyyaml: 6.0.3
requests: 2.32.3
requests-toolbelt: 1.0.0
rich: 15.0.0
tenacity: 9.1.4
typing-extensions: 4.15.0
uuid-utils: 0.16.2
websockets: 15.0
wrapt: 2.2.1
xxhash: 3.7.0
zstandard: 0.25.0

Metadata

Labels

bug (Something isn't working), external
