Skip to content

[Hipblaslt] Buffer c++ timing instrumentations to avoid excess overhead#4884

Open
Alex-Vasile wants to merge 1 commit intodevelopfrom
users/alvasile/buffer_timing
Open

[Hipblaslt] Buffer c++ timing instrumentations to avoid excess overhead#4884
Alex-Vasile wants to merge 1 commit intodevelopfrom
users/alvasile/buffer_timing

Conversation

@Alex-Vasile
Copy link
Contributor

Motivation

Current implementation added substantial overhead to timings because of flushes to stderr on each call.

Technical Details

Buffer calls to 1MB and flush when passing that amount, or when exiting.

Test Plan

Test Result

Submission Checklist

Signed-off-by: Alex Vasile <48962821+Alex-Vasile@users.noreply.github.com>
@astrelsky
Copy link
Contributor

Why not just use std::clog instead of reinventing the wheel?

It's std::cerr except buffered. As long as you're not misusing std::endl, which flushes the buffer, it should be fine.

@math-ci-webhook
Copy link

perfci run on commit 7a0a87c

math-ci run

@codecov-commenter
Copy link

Codecov Report

✅ All modified and coverable lines are covered by tests.

❌ Your project status has failed because the head coverage (76.83%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4884   +/-   ##
========================================
  Coverage    65.35%   65.35%           
========================================
  Files         1718     1718           
  Lines       267217   267217           
  Branches     37047    37047           
========================================
  Hits        174637   174637           
  Misses       77072    77072           
  Partials     15508    15508           
Flag Coverage Δ *Carryforward flag
hipBLAS 90.67% <ø> (ø) Carriedforward from 8e4652b
hipBLASLt 43.55% <ø> (ø)
hipCUB 81.98% <ø> (ø) Carriedforward from 8e4652b
hipDNN 80.74% <ø> (ø) Carriedforward from 8e4652b
hipFFT 55.93% <ø> (ø) Carriedforward from 8e4652b
hipRAND 76.12% <ø> (ø) Carriedforward from 8e4652b
hipSOLVER 68.81% <ø> (ø) Carriedforward from 8e4652b
hipSPARSE 84.70% <ø> (ø) Carriedforward from 8e4652b
rocBLAS 47.97% <ø> (ø) Carriedforward from 8e4652b
rocFFT 47.75% <ø> (ø) Carriedforward from 8e4652b
rocRAND 57.06% <ø> (ø) Carriedforward from 8e4652b
rocSOLVER 76.83% <ø> (ø) Carriedforward from 8e4652b
rocSPARSE 71.53% <ø> (ø) Carriedforward from 8e4652b

*This pull request uses carry forward flags. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Alex-Vasile
Copy link
Contributor Author

Why not just use std::clog instead of reinventing the wheel?

It's std::cerr except buffered. As long as you're not misusing std::endl, which flushes the buffer, it should be fine.

Thanks for the message. I hadn't tried it since I hadn't heard of it. (I do very little C++ work)

I have tried using it by replacing:

std::cerr << "TIMING:" << m_category << ":" << duration.count() << std::endl;

with

std::clog << "TIMING:" << m_category << ':' << duration.count() << '\n';

But do not actually see a speedup. Both add the same ~12s overhead.

@astrelsky
Copy link
Contributor

astrelsky commented Feb 25, 2026

Why not just use std::clog instead of reinventing the wheel?

It's std::cerr except buffered. As long as you're not misusing std::endl, which flushes the buffer, it should be fine.

Thanks for the message. I hadn't tried it since I hadn't heard of it. (I do very little C++ work)

I have tried using it by replacing:

std::cerr << "TIMING:" << m_category << ":" << duration.count() << std::endl;

with

std::clog << "TIMING:" << m_category << ':' << duration.count() << '\n';

But do not actually see a speedup. Both add the same ~12s overhead.

That's weird, it might be due to some implementation detail of std::clog causing it to be flushed too often. Buffer size and the underlying type is implementation defined except that it must be derived from std::streambuf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants