Optimize ROCm for send_recv and model_update#424

Open
aaab8b wants to merge 1 commit into alibaba:main from aaab8b:rocm-optimizations

aaab8b (Contributor) commented Apr 15, 2026

This PR introduces ROCm optimizations that are applied conditionally:

  • Makes the double-buffering logic for tensor buckets in send_recv_utils.py conditional on current_platform.is_rocm().
  • Adds a dist.barrier() call to model_update.py so that data is not overwritten before the receiver has finished processing it, which improves stability on ROCm.
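
For readers unfamiliar with the pattern, the sketch below illustrates why a staging buffer must not be reused until the receiver is done with it, and how a second buffer lets the sender run one bucket ahead. It is not the code from this PR: it uses only the Python standard library, the function name `transfer` and the `use_double_buffer` flag are hypothetical, and per-slot semaphores stand in for the synchronization role that the PR gives to dist.barrier(). In the PR the choice of buffering mode depends on current_platform.is_rocm(); here the caller passes it explicitly.

```python
import threading


def transfer(buckets, use_double_buffer):
    """Move each bucket from a sender thread to a receiver thread.

    Hypothetical illustration only: semaphores play the part of the
    synchronization point so a slot is never overwritten while in use.
    """
    num_slots = 2 if use_double_buffer else 1
    slots = [None] * num_slots
    # filled[s] is released by the sender once slot s holds fresh data.
    filled = [threading.Semaphore(0) for _ in range(num_slots)]
    # free[s] is released by the receiver once it has finished with slot s.
    free = [threading.Semaphore(1) for _ in range(num_slots)]
    received = []

    def sender():
        for i, bucket in enumerate(buckets):
            s = i % num_slots
            free[s].acquire()        # wait until the receiver is done with slot s
            slots[s] = list(bucket)  # safe to overwrite the staging buffer now
            filled[s].release()

    def receiver():
        for i in range(len(buckets)):
            s = i % num_slots
            filled[s].acquire()      # wait for the sender to publish slot s
            received.append(list(slots[s]))
            free[s].release()        # allow the sender to reuse slot s

    threads = [threading.Thread(target=sender), threading.Thread(target=receiver)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

With one slot the sender blocks after every bucket until the receiver has copied it out. With two slots the sender can fill the second slot while the first is still being processed, and only blocks when it would otherwise overwrite a slot that is still in use.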
