[hipBLASlt][CMS] CMS for TF32 128x128x32 NT #4900

Open
emezh wants to merge 4 commits into develop from users/emezh/cms_tf32_128x128x32_nt

@emezh (Contributor) commented Feb 25, 2026

Motivation

Add a custom mainloop schedule (CMS) for the TF32 128x128x32 NT configuration.
Also, fix custom_mainloop_scheduling_tf32.yaml to use BiasTypeArgs: ['s'] for all configs.

Tensile, no CMS vs CMS

MNK = 2048,2048,8192

  • Time: 12% improvement
  • Efficiency: 44.7% --> 66%

Bench, Baseline vs CMS

MNK = 2048,2048,8192

  • Time: no improvement (-15.4%, i.e. a regression)
  • Efficiency: n/a - baseline and new CMS use different tiles/kernels

MNK = 2048,2048,4096

  • Time: 5% improvement
  • Efficiency: n/a - baseline and new CMS use different tiles/kernels
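To put the two benchmarked shapes in perspective, the sketch below counts the floating-point operations of each GEMM with the standard 2·M·N·K estimate and converts a measured time into throughput. `gemm_flops` and `achieved_tflops` are hypothetical helper names, and the 1 ms timing in the last line is a placeholder, not a measurement from this PR.

```python
def gemm_flops(m, n, k, batch=1):
    """FLOP estimate for C = A @ B: one multiply and one add per (m, n, k) triple."""
    return 2 * m * n * k * batch


def achieved_tflops(m, n, k, seconds, batch=1):
    """Throughput in TFLOP/s for a GEMM that took `seconds` to run."""
    return gemm_flops(m, n, k, batch) / seconds / 1e12


# The two shapes benchmarked in this PR (M, N, K).
flops_k8192 = gemm_flops(2048, 2048, 8192)
flops_k4096 = gemm_flops(2048, 2048, 4096)
print(flops_k8192)  # 68719476736 (2**36)
print(flops_k4096)  # 34359738368 (2**35)

# Placeholder timing (1 ms), only to show how the helper is used.
example_tflops = achieved_tflops(2048, 2048, 8192, seconds=1e-3)
print(example_tflops)
```

Dividing the achieved throughput by the device's peak TF32 throughput would give an efficiency figure; the peak value is hardware-specific and is not stated in this PR, so it is left out here.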

Test Result

Local tests:

  • custom_mainloop_scheduling_tf32.yaml - pass
  • test_CustomSchedule.py - pass
  • test_CustomSchedule_LayoutAutoDetection.py - pass
  • hipblaslt-test - pass
[==========] 22050 tests from 12 test suites ran. (1645862 ms total)
[  PASSED  ] 22050 tests.
  • tensile with the ranges below - pass
      - Exact: [2048, 2048, 1, 8192]
      - Exact: [2048, 2048, 1, 4096]
      - Range: [[128], [128], [1], [64, 64, 256]]
      - Range: [[128], [128], [1], [1, 1, 64]]
      - Range: [[128], [128], [1], [32, 64, 256]]
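For readers less familiar with these size lists, the following Python sketch expands them into concrete problem sizes. It assumes each entry is ordered [M, N, batch, K], that a one-element range is a single value, and that a three-element range reads as [min, step, max] with the maximum included. `expand_dim` and `expand_range` are hypothetical helpers written for illustration; they are not part of Tensile.

```python
from itertools import product


def expand_dim(spec):
    """Expand one dimension spec into a list of sizes.

    Assumed conventions (not taken from this PR):
      [n]              -> the single value n
      [min, step, max] -> min, min + step, ... up to and including max
    """
    if len(spec) == 1:
        return [spec[0]]
    if len(spec) == 3:
        lo, step, hi = spec
        return list(range(lo, hi + 1, step))
    raise ValueError(f"unsupported range spec: {spec}")


def expand_range(dims):
    """Cartesian product of the expanded dimensions, as (M, N, batch, K) tuples."""
    return [tuple(p) for p in product(*(expand_dim(d) for d in dims))]


exact = [(2048, 2048, 1, 8192), (2048, 2048, 1, 4096)]
ranges = [
    [[128], [128], [1], [64, 64, 256]],
    [[128], [128], [1], [1, 1, 64]],
    [[128], [128], [1], [32, 64, 256]],
]

sizes = list(exact)
for r in ranges:
    sizes.extend(expand_range(r))

# K values swept by each Range entry.
k_values = [[s[3] for s in expand_range(r)] for r in ranges]
print(k_values[0])  # [64, 128, 192, 256]
print(k_values[2])  # [32, 96, 160, 224]
print(len(sizes))   # 74 (some K values appear in more than one Range)
```

Under these assumptions the three ranges hold M and N at 128 and sweep only K, covering every K from 1 to 64 as well as multiples of 64 and the offsets between them (32, 96, 160, 224).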

Submission Checklist

AIGECORE-69

@emezh emezh marked this pull request as ready for review February 25, 2026 23:31
@emezh emezh requested a review from a team as a code owner February 25, 2026 23:31
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.

❌ Your project status has failed because the head coverage (76.83%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4900   +/-   ##
========================================
  Coverage    65.94%   65.94%           
========================================
  Files         1718     1718           
  Lines       267197   267197           
  Branches     37045    37045           
========================================
  Hits        176194   176194           
  Misses       75448    75448           
  Partials     15555    15555           
Flag        Coverage Δ        *Carryforward flag
hipBLAS     90.67% <ø> (ø)    Carriedforward from aae6a45
hipBLASLt   43.55% <ø> (ø)
hipCUB      82.38% <ø> (ø)    Carriedforward from aae6a45
hipDNN      80.82% <ø> (ø)    Carriedforward from aae6a45
hipFFT      55.93% <ø> (ø)    Carriedforward from aae6a45
hipRAND     76.12% <ø> (ø)    Carriedforward from aae6a45
hipSOLVER   68.81% <ø> (ø)    Carriedforward from aae6a45
hipSPARSE   84.70% <ø> (ø)    Carriedforward from aae6a45
rocBLAS     47.97% <ø> (ø)    Carriedforward from aae6a45
rocFFT      52.91% <ø> (ø)    Carriedforward from aae6a45
rocRAND     57.06% <ø> (ø)    Carriedforward from aae6a45
rocSOLVER   76.83% <ø> (ø)    Carriedforward from aae6a45
rocSPARSE   71.53% <ø> (ø)    Carriedforward from aae6a45

*This pull request uses carry forward flags.

@math-ci-webhook

perfci run on commit 66fa26f

math-ci run
