[Bug]: The mooncake.store module must be imported before mooncake.transferengine; otherwise, a segmentation fault occurs. #1426

@chenhao-stick-to

Bug Report

mooncake version: v0.3.8
system: AlmaLinux 9

As shown in the figure below, in a multithreaded environment, if we comment out `from mooncake.store import MooncakeDistributedStore` inside `init_store` and instead import `mooncake.engine` first, followed by `mooncake.store`, a segmentation fault occurs (see the figure). If we import `mooncake.store` first, the issue disappears.
This problem arises when using SGLang: when its new rfork feature is enabled, SGLang uses TransferEngine for fast model loading and later initializes MooncakeStore (via HiCache), at which point the segmentation fault appears. The code snippet in the figure is a minimal reproducible example of this issue.
The root cause seems to be a race condition or initialization conflict between the two modules and coro_rpc.
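To help narrow down where the crash happens, Python's standard `faulthandler` module can dump the Python-level tracebacks of all threads when the process receives SIGSEGV. The sketch below is only a diagnostic aid, not a fix; the helper name `enable_crash_log` and the log path are hypothetical, and it would need to run before any Mooncake module is imported.

```python
import faulthandler


def enable_crash_log(path):
    """Write Python tracebacks of all threads to `path` on fatal signals such as SIGSEGV.

    `enable_crash_log` is a hypothetical helper name, not part of Mooncake or SGLang.
    """
    # The file object must stay open for as long as faulthandler is enabled,
    # so it is returned to the caller instead of being closed here.
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `enable_crash_log("mooncake_crash.log")` at the very top of the reproduction script should record which Python frame each thread was in when the segmentation fault hit, which may show whether the crash happens during import or during initialization.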
I have verified that this crash is 100% reproducible, so there must be some underlying low-level issue here. If there is already a known fix or workaround, please let me know.
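Until the root cause is found, the only mitigation observed so far is to force the import order. A minimal sketch of that, assuming it runs at process start before any other code imports `mooncake.engine`; the helper `import_in_order` and the constant `MOONCAKE_IMPORT_ORDER` are hypothetical names used for illustration:

```python
import importlib

# Order that avoids the crash in this report: store first, then engine.
MOONCAKE_IMPORT_ORDER = ("mooncake.store", "mooncake.engine")


def import_in_order(module_names):
    """Import modules in the given order and return the names that loaded.

    Modules that are not installed are skipped, so callers should check the
    returned list rather than assume every import succeeded.
    """
    loaded = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            continue
        loaded.append(name)
    return loaded


if __name__ == "__main__":
    # Run this before anything else in the process touches mooncake.engine.
    print(import_in_order(MOONCAKE_IMPORT_ORDER))
```

This only hides the symptom: it relies on `mooncake.store` being loaded before `mooncake.engine` everywhere in the process, which is hard to guarantee when a framework such as SGLang controls the import sequence.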

Image

Image

Before submitting...

  • Ensure you searched for relevant issues and read the [documentation]
