IBM Feature Improvements and Speedup #1157
Open
danieljvickers wants to merge 80 commits into MFlowCode:master from danieljvickers:gpu-optimizations
+2,749 −1,791
80 commits
0a304a2 Moved all of the IB marker calculation to the GPU without copy (danieljvickers)
e8c778e Added profiling and increased maximum num IBs to 1000 (danieljvickers)
5d885bb Performance tuning completed for marker generation (danieljvickers)
85889c0 Binary search for IB index region beginning for reduced IB marker co… (danieljvickers)
f0085c9 ghost points are now computed on the GPU (danieljvickers)
bfcc593 image points computed on the GPU for x4 performance in that subroutine (danieljvickers)
3c4b6dd Merge branch 'master' into gpu-optimizations (danieljvickers)
5201ee7 Need WAY more parameters in the case file... We should probably do so… (danieljvickers)
622edb0 Extended the binary search reduction to all 3D IB geometries (danieljvickers)
0a089ce Extended area reduction to all non-model IBs (danieljvickers)
804a286 Intermittent commit for GPU STLs (danieljvickers)
64bc348 Ib markers computed on GPU working (danieljvickers)
bc972ca Passes STL tests with GPU compute for IB markers (not added levelset … (danieljvickers)
a1769d0 Moved model-specific code to the model file for cleanliness (danieljvickers)
5e58655 STLs appear to be working on the GPU with NVHPC! (danieljvickers)
6cc7acc STLs ran on GPU in 3D! (danieljvickers)
5fc31be Merge branch 'master' into gpu-optimizations (danieljvickers)
4dc9072 Missed on comparison that NVHPC allows, but GNU does not (danieljvickers)
0fd5492 Resolved issues with GPU arrays allocation for 3D STLs (danieljvickers)
6edad1b Finished GPU implementation of all subroutines. Final check before re… (danieljvickers)
1e5326a Refactored interpolation out of the code and replaced it with a proje… (danieljvickers)
c542668 Merge branch 'master' into gpu-optimizations (danieljvickers)
f42982e The code really wanted me to run formatting (danieljvickers)
0099975 Some more line deletions for the model normal vector (danieljvickers)
8709609 Renamed model subroutines to use the s_ prefix for consistency with t… (danieljvickers)
04e82b6 Deleted some old constants that don't need to be here anymore (danieljvickers)
87d40f8 Fixed STL IBM tests having models smaller than the grid resolution (danieljvickers)
4e83c95 Regenerated golden files after fixing failure to interpolate bug (danieljvickers)
040884d Adding decimal points to make lint happy (danieljvickers)
3071dbd Fixed concave shapes (danieljvickers)
1deb65a Moved ray tracing to only include cell centers, making it consistent … (danieljvickers)
6b73840 Did not rerun after finding bug. Committing now (danieljvickers)
231a65c Better marker generation after swapping to cell centered approach req… (danieljvickers)
175eacb Merge branch 'gpu-optimizations' of github.com:danieljvickers/MFC int… (danieljvickers)
b9b6749 Testing if this causes the seg fault (danieljvickers)
8f371b5 Formatting (danieljvickers)
fdc3839 Found out that along edges/vertices I had the normal vector backwards (danieljvickers)
9e4c7ad Merge branch 'gpu-optimizations' of github.com:danieljvickers/MFC int… (danieljvickers)
d23d5f1 Now that we are accurately computing the Ib markers, it is clean that… (danieljvickers)
9418565 Forgot to remove print (danieljvickers)
96b8e46 Never saved change to skipped case (danieljvickers)
faaa0a7 Merge branch 'master' into gpu-optimizations (sbryngelson)
8681b59 Further model space reduction to improve search (danieljvickers)
5989738 Changes for multi-node compute (danieljvickers)
2c43100 Fixed compiler seg fault (danieljvickers)
5b65847 Fixed model not compiling (danieljvickers)
e88cb81 fixed error in OpenMP test for 3D distances (danieljvickers)
88ad74e Fixed 3D STL test on Cray
7821bf9 Remove additional test variable
81a7897 Merge branch 'master' into gpu-optimizations (danieljvickers)
51ca379 Merge branch 'master' into gpu-optimizations (danieljvickers)
277d634 Merge branch 'master' into gpu-optimizations (sbryngelson)
93970fb Merge branch 'master' into gpu-optimizations (danieljvickers)
a3dbbff Remove global constant from copyin (danieljvickers)
cd3c0fe Merge branch 'gpu-optimizations' of github.com:danieljvickers/MFC int… (danieljvickers)
f23d93e Merge branch 'master' into gpu-optimizations (danieljvickers)
84cc40e Fixed issue with openMP build not having access to the correctly allo… (danieljvickers)
ae00ed6 Spelling change (danieljvickers)
7abaffc need to protect public statement on GNU
76e72dc Some initial periodic framework (danieljvickers)
42f35e8 Applying ib markers periodically (danieljvickers)
c208699 Finished periodicity and added another optimization (danieljvickers)
82f1372 Added parallel sequential to the decope periodicity subroutine (danieljvickers)
370aa4c Merge branch 'master' into gpu-optimizations (sbryngelson)
3c6e196 Added periodic boundaries (danieljvickers)
6c3a125 Merge branch 'gpu-optimizations' of github.com:danieljvickers/MFC-Dan… (danieljvickers)
c8ee2d0 Formatting. Yes, spencer, I actually ran formatting last time. The pr… (danieljvickers)
f49c405 Fixed 2D IB seg faults (danieljvickers)
4c1d933 Fixed moving 2D segfault (danieljvickers)
2e0b039 Removed a debugging sequential GPU parallelism slowing down the code (danieljvickers)
193b0cd Test to see what happens if we disable the build cache on frontier
2601fc8 Added caching back in and fixed problem with calling GPU build enviro…
224a7b6 Resolve merge conflict with master: combine compiler_flag + dynamic mode (sbryngelson)
03a5ce1 Fix frontier_amd/build.sh to use dynamic module mode like frontier/ (sbryngelson)
3d6dea4 Addressing PR comments (danieljvickers)
1883b9b NVHPC compilation errors (danieljvickers)
5b1aa96 Swapped IB marker output to match the original instead of the project… (danieljvickers)
53ebfb7 Added a 2 MPI rank sphere and a periodic circle case (danieljvickers)
03e65ed Removed accidental extra cases and generated data for cases I missed (danieljvickers)
2ace4c0 Merge branch 'master' into gpu-optimizations (sbryngelson)
Decouple IB scaling from global patch limit.
Line 26 increases `num_patches_max` to 1000, which expands all patch-sized static structures globally, not just IB capacity. This creates avoidable memory pressure and can impact startup/runtime stability. Consider keeping `num_patches_max` at its prior scope and introducing a dedicated IB limit constant (e.g., `num_ibs_max`).
Suggested constant split