[hipblaslt] Fix fails with dtl.yaml and xfp32.yaml on gfx950_mx_rebase #4906

Merged: nakajee merged 2 commits into gfx950_mx_rebase from users/nakajee/gfx950mxfp4_dtl_fix_3 on Feb 26, 2026

nakajee commented Feb 26, 2026

Motivation

Fix failures with dtl.yaml and xfp32.yaml on the gfx950_mx_rebase branch.

Technical Details

  • Fixed a merge issue involving the if kernel["ScheduleIterAlg"] == 3 check
  • Added an int cast for a constant that was emitted as a float value in the asm
  • Fixed an incorrect parameter passed to calcLdsBlockSizePerPad()
  • Fixed an incorrect local read calculation caused by MX logic being wrongly applied to TF32 emulation
  • Fixed an incorrect ShiftK code vreg caused by a missing if condition for TF32 emulation
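
As a rough illustration of the int-cast item above: in a Python kernel generator, a constant produced by true division is a float, so formatting it straight into an asm string yields text such as 2048.0 rather than the intended integer literal. The sketch below is not taken from this PR. The helper name emit_scalar_move, the register s10, and the value 4096 / 2 are all hypothetical and only show the general pattern.

```python
def emit_scalar_move(dst: str, value: float) -> str:
    """Return one asm line that loads an integer constant into dst.

    Hypothetical helper for illustration; not part of the PR or of Tensile.
    """
    as_int = int(value)
    # Refuse to silently truncate a value that is not a whole number.
    if as_int != value:
        raise ValueError(f"constant {value!r} is not a whole number")
    return f"s_mov_b32 {dst}, {as_int}"


# True division always produces a float in Python 3, even for exact results.
lds_bytes = 4096 / 2

# Without the cast, the float representation leaks into the asm text.
uncast_line = f"s_mov_b32 s10, {lds_bytes}"

# With the cast, the operand is emitted as an integer literal.
cast_line = emit_scalar_move("s10", lds_bytes)
```

Checking that the cast is lossless before emitting is a design choice in this sketch: it turns an unexpected fractional constant into an error instead of a silently wrong operand.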

Test Plan

Ran dtl.yaml and xfp32.yaml on the gfx950_mx_rebase branch.

Test Result

All tests passed except for one known issue, which should be fixed in the latest develop branch.

Submission Checklist

nakajee requested review from b-shi and msujon-AMD February 26, 2026 01:37
nakajee requested a review from a team as a code owner February 26, 2026 01:37
nakajee changed the title from "Fix fails with dtl.yaml" to "[hipblaslt] Fix fails with dtl.yaml" Feb 26, 2026
nakajee changed the title from "[hipblaslt] Fix fails with dtl.yaml" to "[hipblaslt] Fix fails with dtl.yaml and xfp32.yaml on gfx950_mx_rebase" Feb 26, 2026

b-shi left a comment

LGTM, thanks for the fix!

nakajee merged commit 1c2fe0e into gfx950_mx_rebase Feb 26, 2026
5 of 6 checks passed
nakajee deleted the users/nakajee/gfx950mxfp4_dtl_fix_3 branch February 26, 2026 20:54