[None][fix] fix mooncake dynamic load in transfer_agent_binding#12181
chuangz0 wants to merge 1 commit into NVIDIA:main
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
/bot run
PR_Github #38822 [ run ] triggered by Bot. Commit:
📝 Walkthrough
Mooncake support is transitioned from static compile-time linking to lazy/runtime loading.
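The walkthrough does not show which mechanism performs the runtime loading. As a rough sketch only, assuming a POSIX platform and `dlopen`/`dlsym` (an assumption, not something this PR confirms), resolving an optional backend at run time instead of link time could look like this. `loadSymbol` is a hypothetical helper, not a TensorRT-LLM function.

```cpp
#include <dlfcn.h>

#include <stdexcept>
#include <string>

// Hypothetical helper: resolves `symbol` from `library` at run time rather than link time.
// Throws std::runtime_error if the library or the symbol cannot be found, so the caller
// can report a missing optional backend instead of failing at process start-up.
template <typename Fn>
Fn loadSymbol(char const* library, char const* symbol)
{
    // RTLD_LAZY defers resolution of the library's own function references until first use.
    void* handle = dlopen(library, RTLD_LAZY | RTLD_LOCAL);
    if (handle == nullptr)
    {
        char const* error = dlerror();
        throw std::runtime_error(std::string("dlopen failed: ") + (error ? error : "unknown error"));
    }

    dlerror(); // Clear any stale error state before calling dlsym.
    void* address = dlsym(handle, symbol);
    if (address == nullptr)
    {
        char const* error = dlerror();
        std::string message = std::string("dlsym failed: ") + (error ? error : "symbol resolved to null");
        dlclose(handle);
        throw std::runtime_error(message);
    }

    // The handle is deliberately left open so the returned pointer stays valid
    // for the lifetime of the process.
    return reinterpret_cast<Fn>(address);
}
```

A caller might write `loadSymbol<int (*)(char const*)>(nullptr, "atoi")`; passing `nullptr` as the library asks `dlopen` for the main program's handle. The practical benefit of this style is that a binary built this way has no link-time dependency on the optional library, so it can still start on machines where that library is absent.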
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
🧹 Nitpick comments (1)
cpp/tensorrt_llm/executor/cache_transmission/nixl_utils/agentBindings.cpp (1)
202-206: Factory-created agents still use the GIL-holding submit binding.
After removing the concrete Mooncake binding, `make_transfer_agent("mooncake", ...)` returns `BaseTransferAgent`, so `submit_transfer_requests()` now goes through the base binding at Line 159 through Line 162. That overload still holds the GIL, unlike the `NixlTransferAgent` binding below, so factory-created agents lose that concurrency behavior. If `submitTransferRequests()` can block on engine work, it should probably mirror the base `TransferStatus` treatment here as well.
♻️ Suggested follow-up
     .def(
         "submit_transfer_requests",
         [](kvc::BaseTransferAgent& self, kvc::TransferRequest const& request)
         { return self.submitTransferRequests(request).release(); },
    -    nb::arg("request"), nb::rv_policy::take_ownership)
    +    nb::arg("request"), nb::rv_policy::take_ownership,
    +    nb::call_guard<nb::gil_scoped_release>())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tensorrt_llm/executor/cache_transmission/nixl_utils/agentBindings.cpp` around lines 202 - 206, factory-created agents returned by make_transfer_agent("mooncake", ...) end up as BaseTransferAgent, so calls to submit_transfer_requests() use the base binding (submitTransferRequests), which still holds the GIL. Add a GIL-releasing binding for the BaseTransferAgent submit_transfer_requests (the same pattern used for NixlTransferAgent's submitTransferRequests) so factory-created agents get the non-blocking concurrency behavior. Locate the submit_transfer_requests/submitTransferRequests binding and mirror the NixlTransferAgent GIL-release wrapper for BaseTransferAgent (and ensure TransferStatus/TransferAgent overloads match this treatment).
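To illustrate why the review cares about holding a lock across a blocking call, here is a minimal standalone analogy. It uses a plain `std::mutex` as a stand-in for the interpreter lock; it is not nanobind and not the real Python GIL. `ScopedRelease`, `blockingSubmit`, and `otherThreadProgressed` are hypothetical names invented for this sketch.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Stand-in for the interpreter lock. This is an ordinary mutex, NOT the Python GIL.
std::mutex gInterpreterLock;

// RAII guard that releases an already-held lock for the duration of a scope and
// re-acquires it on exit, mimicking the shape of a scoped GIL release.
class ScopedRelease
{
public:
    explicit ScopedRelease(std::mutex& lock)
        : mLock(lock)
    {
        mLock.unlock();
    }

    ~ScopedRelease()
    {
        mLock.lock();
    }

    ScopedRelease(ScopedRelease const&) = delete;
    ScopedRelease& operator=(ScopedRelease const&) = delete;

private:
    std::mutex& mLock;
};

// Hypothetical blocking call; the sleep stands in for engine work.
// The caller must hold gInterpreterLock on entry and holds it again on return.
void blockingSubmit(bool releaseLock, std::chrono::milliseconds work)
{
    if (releaseLock)
    {
        ScopedRelease release(gInterpreterLock);
        std::this_thread::sleep_for(work);
    }
    else
    {
        std::this_thread::sleep_for(work);
    }
}

// Returns true if a second thread managed to take the lock while blockingSubmit
// was still in progress on the calling thread.
bool otherThreadProgressed(bool releaseLock)
{
    std::atomic<bool> submitDone{false};
    std::atomic<bool> progressed{false};

    gInterpreterLock.lock();
    std::thread other(
        [&]
        {
            std::lock_guard<std::mutex> guard(gInterpreterLock);
            progressed = !submitDone.load();
        });

    blockingSubmit(releaseLock, std::chrono::milliseconds(200));
    submitDone = true;
    gInterpreterLock.unlock();

    other.join();
    return progressed.load();
}
```

With `releaseLock` set to `false`, the second thread can only run after the blocking call has finished, which is the behavior the review attributes to the base binding. With `releaseLock` set to `true`, the second thread is free to run during the simulated engine work.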
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 30d08416-6c43-428a-9638-c8f60a246f2f
📒 Files selected for processing (4)
- cpp/tensorrt_llm/executor/cache_transmission/mooncake_utils/transferAgent.cpp
- cpp/tensorrt_llm/executor/cache_transmission/nixl_utils/CMakeLists.txt
- cpp/tensorrt_llm/executor/cache_transmission/nixl_utils/agentBindings.cpp
- tests/unittest/bindings/test_transfer_agent_bindings.py
💤 Files with no reviewable changes (1)
- cpp/tensorrt_llm/executor/cache_transmission/mooncake_utils/transferAgent.cpp
PR_Github #38822 [ run ] completed with state
Summary by CodeRabbit
Performance Improvements
New Features
Tests
Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions).
- Any new dependencies have been scanned for license and vulnerabilities.
- CODEOWNERS updated if ownership changes.
- Documentation updated as needed.
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment `/bot help`.