80 commits
0a304a2
Moved all of the IB marker calculation to the GPU without copy
danieljvickers Feb 16, 2026
e8c778e
Added profiling and increased maximum num IBs to 1000
danieljvickers Feb 16, 2026
5d885bb
Performance tuning complted for marker generation
danieljvickers Feb 16, 2026
85889c0
Bindary search for IB index region beginning for reduced IB marker co…
danieljvickers Feb 16, 2026
f0085c9
ghost points are now computed on the GPU
danieljvickers Feb 17, 2026
bfcc593
image points computed on the GPU for x4 performance in that subroutine
danieljvickers Feb 17, 2026
3c4b6dd
Merge branch 'master' into gpu-optimizations
danieljvickers Feb 17, 2026
5201ee7
Need WAY more parameters in the case file... We should probably do so…
danieljvickers Feb 17, 2026
622edb0
Extended the binary search reduction to all 3D IB geometries
danieljvickers Feb 17, 2026
0a089ce
Extended area reduction to all non-model IBs
danieljvickers Feb 17, 2026
804a286
Intermittent commit for GPU STLs
danieljvickers Feb 18, 2026
64bc348
Ib markers computed on GPU working
danieljvickers Feb 18, 2026
bc972ca
Passes STL tests with GPU compute for IB markers (not added levelset …
danieljvickers Feb 18, 2026
a1769d0
Moved mdoel-specific code to the model file for cleanliness
danieljvickers Feb 19, 2026
5e58655
STLs appear to be working on the GPU with NVHPC!
danieljvickers Feb 19, 2026
6cc7acc
STLs ran on GPU in 3D!
danieljvickers Feb 19, 2026
5fc31be
Merge branch 'master' into gpu-optimizations
danieljvickers Feb 20, 2026
4dc9072
Missed on comparison that NVHPC allows, but GNU does not
danieljvickers Feb 20, 2026
0fd5492
Resolved issues with GPU arrays allocation for 3D STLs
danieljvickers Feb 20, 2026
6edad1b
Finished GPU implementation of all subroutines. Final check before re…
danieljvickers Feb 20, 2026
1e5326a
Refactored interpolation out of the code and replaced it with a proje…
danieljvickers Feb 21, 2026
c542668
Merge branch 'master' into gpu-optimizations
danieljvickers Feb 21, 2026
f42982e
The code really wanted me to run formatting
danieljvickers Feb 21, 2026
0099975
Some more line deletions for the model normal vector
danieljvickers Feb 21, 2026
8709609
Renamed model subroutines to use the s_ prefix for consistency with t…
danieljvickers Feb 21, 2026
04e82b6
Deleted some old constants taht don't need to be here anymore
danieljvickers Feb 21, 2026
87d40f8
Fixed STL IBM tests having models smaller than the grid resolution
danieljvickers Feb 21, 2026
4e83c95
Regenerated golden files after fixing failure to interpolate bug
danieljvickers Feb 21, 2026
040884d
Adding decimal points to make lint happy
danieljvickers Feb 21, 2026
3071dbd
Fixed concave shapes
danieljvickers Feb 21, 2026
1deb65a
Moved ray tracing to only include cell centers, making it consistent …
danieljvickers Feb 21, 2026
6b73840
Did not rerun after finding bug. Commiting now
danieljvickers Feb 21, 2026
231a65c
Better marker generation after swapping to cell centered approach req…
danieljvickers Feb 21, 2026
175eacb
Merge branch 'gpu-optimizations' of github.com:danieljvickers/MFC int…
danieljvickers Feb 21, 2026
b9b6749
Testing if this causes the seg fault
danieljvickers Feb 21, 2026
8f371b5
Formatting
danieljvickers Feb 21, 2026
fdc3839
Found out that along edges/vertices I had the normal vector backwards
danieljvickers Feb 22, 2026
9e4c7ad
Merge branch 'gpu-optimizations' of github.com:danieljvickers/MFC int…
danieljvickers Feb 22, 2026
d23d5f1
Now that we are accurately computing the Ib markers, it is clean that…
danieljvickers Feb 22, 2026
9418565
Forgot to remove print
danieljvickers Feb 22, 2026
96b8e46
Never saved change to skipped case
danieljvickers Feb 22, 2026
faaa0a7
Merge branch 'master' into gpu-optimizations
sbryngelson Feb 22, 2026
8681b59
Further model space reduction to improve search
danieljvickers Feb 23, 2026
5989738
Changes for multi-node compute
danieljvickers Feb 23, 2026
2c43100
Fixed compiler seg fault
danieljvickers Feb 23, 2026
5b65847
Fixed model not compiling
danieljvickers Feb 23, 2026
e88cb81
fixed error in OpenMP test for 3D distances
danieljvickers Feb 23, 2026
88ad74e
Fixed 3D STL test on Cray
Feb 23, 2026
7821bf9
Remove additional test variable
Feb 23, 2026
81a7897
Merge branch 'master' into gpu-optimizations
danieljvickers Feb 24, 2026
51ca379
Merge branch 'master' into gpu-optimizations
danieljvickers Feb 24, 2026
277d634
Merge branch 'master' into gpu-optimizations
sbryngelson Feb 24, 2026
93970fb
Merge branch 'master' into gpu-optimizations
danieljvickers Feb 24, 2026
a3dbbff
Remove global constant from copyin
danieljvickers Feb 25, 2026
cd3c0fe
Merge branch 'gpu-optimizations' of github.com:danieljvickers/MFC int…
danieljvickers Feb 25, 2026
f23d93e
Merge branch 'master' into gpu-optimizations
danieljvickers Feb 25, 2026
84cc40e
Fixed issue with openMP build not having access to the correctly allo…
danieljvickers Feb 25, 2026
ae00ed6
Spelling change
danieljvickers Feb 25, 2026
7abaffc
need to protect public statement on GNU
Feb 25, 2026
76e72dc
Some initial periodic framework
danieljvickers Feb 26, 2026
42f35e8
Applying ib markers periodically
danieljvickers Feb 27, 2026
c208699
Finished periodicity and added another optimization
danieljvickers Feb 27, 2026
82f1372
Added parallel sequential to the decope periodicity subroutine
danieljvickers Feb 27, 2026
370aa4c
Merge branch 'master' into gpu-optimizations
sbryngelson Feb 27, 2026
3c6e196
Added periodic boundaries
danieljvickers Feb 27, 2026
6c3a125
Merge branch 'gpu-optimizations' of github.com:danieljvickers/MFC-Dan…
danieljvickers Feb 27, 2026
c8ee2d0
Formatting. Yes, spencer, I actually ran formatting last time. The pr…
danieljvickers Feb 27, 2026
f49c405
Fixed 2D IB seg faults
danieljvickers Feb 28, 2026
4c1d933
Fixed moving 2D segfault
danieljvickers Feb 28, 2026
2e0b039
Removed a debugging sequential GPU parallelism slowing down the code
danieljvickers Feb 28, 2026
193b0cd
Test to see what happens if we disable the build cache on frontier
Feb 28, 2026
2601fc8
Added caching back in and fixed problem with calling GPU build enviro…
Feb 28, 2026
224a7b6
Resolve merge conflict with master: combine compiler_flag + dynamic mode
sbryngelson Feb 28, 2026
03a5ce1
Fix frontier_amd/build.sh to use dynamic module mode like frontier/
sbryngelson Feb 28, 2026
3d6dea4
Addressing PR comments
danieljvickers Feb 28, 2026
1883b9b
NVHPC compilation errors
danieljvickers Feb 28, 2026
5b1aa96
Swapped IB marker output to match the original instead of the project…
danieljvickers Feb 28, 2026
53ebfb7
Added a 2 MPI rank sphere and a periodic circle case
danieljvickers Feb 28, 2026
03e65ed
Remvoed accidental extra cases and generated data for cases I missed
danieljvickers Mar 1, 2026
2ace4c0
Merge branch 'master' into gpu-optimizations
sbryngelson Mar 1, 2026
2 changes: 1 addition & 1 deletion .github/workflows/frontier/build.sh
@@ -25,7 +25,7 @@ if [ "$job_device" = "gpu" ]; then
fi
fi

- . ./mfc.sh load -c $compiler_flag -m g
+ . ./mfc.sh load -c $compiler_flag -m $([ "$job_device" = "gpu" ] && echo "g" || echo "c")

# Only set up build cache for test suite, not benchmarks
if [ "$run_bench" != "bench" ]; then
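The changed line selects the `-m` argument from `job_device` instead of always passing `g`. A minimal sketch of the same selection is below, assuming `g` and `c` stand for the GPU and CPU modes of `mfc.sh load`. The helper name `select_mode` is hypothetical and is not part of this PR; it is only there to show the explicit `if`/`else` form next to the inline one.

```shell
#!/usr/bin/env bash
# Sketch only: `select_mode` is a hypothetical helper, not part of the PR.
# It maps the job device to the letter passed to `-m`, as the new line does:
# "g" when the device is "gpu", "c" for anything else.
select_mode() {
    if [ "$1" = "gpu" ]; then
        echo "g"
    else
        echo "c"
    fi
}

# The inline form from the diff, shown for comparison. `A && B || C` runs C
# whenever A or B fails; it behaves like if/else here only because `echo "g"`
# is not expected to fail.
job_device="cpu"
inline_mode=$([ "$job_device" = "gpu" ] && echo "g" || echo "c")

echo "inline=$inline_mode helper=$(select_mode "$job_device")"
```

Both forms should agree for any value of `job_device`. The explicit form may be easier to extend if more devices or modes are added later.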
2 changes: 1 addition & 1 deletion .github/workflows/frontier_amd/build.sh
@@ -25,7 +25,7 @@ if [ "$job_device" = "gpu" ]; then
fi
fi

- . ./mfc.sh load -c $compiler_flag -m g
+ . ./mfc.sh load -c $compiler_flag -m $([ "$job_device" = "gpu" ] && echo "g" || echo "c")

# Only set up build cache for test suite, not benchmarks
if [ "$run_bench" != "bench" ]; then
6 changes: 3 additions & 3 deletions docs/documentation/case.md
@@ -347,11 +347,11 @@ Additional details on this specification can be found in [The Naca Airfoil Serie

- Please see [Patch Parameters](#sec-patches) for the descriptions of `model_filepath`, `model_scale`, `model_rotate`, `model_translate`, `model_spc`, and `model_threshold`.

- - `moving_ibm` sets the method by which movement will be applied to the immersed boundary. Using 0 will result in no movement. Using 1 will result 1-way coupling where the boundary moves at a constant rate and applied forces to the fluid based upon it's own motion. In 1-way coupling, the fluid does not apply forces back onto the IB.
+ - `moving_ibm` sets the method by which movement will be applied to the immersed boundary. Using 0 will result in no movement. Using 1 will result 1-way coupling where the boundary moves at a constant rate and applied forces to the fluid based upon it's own motion. In 1-way coupling, the fluid does not apply forces back onto the IB. Using 2 will result in 2-way coupling, where the boundary pushes on the fluid and the fluid pushes back on the boundary via pressure and viscous forces. If external forces are applied, the boundary will also experience those forces.

- - `vel(i)` is the initial linear velocity of the IB in the x, y, z direction for i=1, 2, 3. When `moving_ibm` equals 1, this velocity is constant.
+ - `vel(i)` is the initial linear velocity of the IB in the x, y, z direction for i=1, 2, 3. When `moving_ibm` equals 2, this velocity is just the starting speed of the object, which will then accelerate due to external forces. If `moving_ibm` equals 1, then this is constant if it is a number, or can be described analytically with an expression.

- - `angular_vel(i)` is the initial angular velocity of the IB about the x, y, z axes for i=1, 2, 3 in radians per second. When `moving_ibm` equals 1, this angular velocity is constant.
+ - `angular_vel(i)` is the initial angular velocity of the IB about the x, y, z axes for i=1, 2, 3 in radians per second. When `moving_ibm` equals 2, this rotation rate is just the starting rate of the object, which will then change due to external torques. If `moving_ibm` equals 1, then this is constant if it is a number, or can be described analytically with an expression.

### 5. Fluid Material's {#sec-fluid-materials}

10 changes: 10 additions & 0 deletions src/common/include/parallel_macros.fpp
@@ -174,6 +174,16 @@
#endif
#:enddef

#:def END_GPU_ATOMIC_CAPTURE()
#:set acc_end_directive = '!$acc end atomic'
#:set omp_end_directive = '!$omp end atomic'
#if defined(MFC_OpenACC)
$:acc_end_directive
#elif defined(MFC_OpenMP)
$:omp_end_directive
#endif
#:enddef

#:def GPU_UPDATE(host=None, device=None, extraAccArgs=None, extraOmpArgs=None)
#:set acc_code = ACC_UPDATE(host=host, device=device, extraAccArgs=extraAccArgs)
#:set omp_code = OMP_UPDATE(host=host, device=device, extraOmpArgs=extraOmpArgs)
6 changes: 1 addition & 5 deletions src/common/m_constants.fpp
@@ -23,7 +23,7 @@ module m_constants
integer, parameter :: fourier_rings = 5 !< Fourier filter ring limit
integer, parameter :: num_fluids_max = 10 !< Maximum number of fluids in the simulation
integer, parameter :: num_probes_max = 10 !< Maximum number of flow probes in the simulation
- integer, parameter :: num_patches_max = 10
+ integer, parameter :: num_patches_max = 1000
Review comment (Contributor):

⚠️ Potential issue | 🟠 Major

Decouple IB scaling from global patch limit.

Line 26 increases num_patches_max to 1000, which expands all patch-sized static structures globally, not just IB capacity. This creates avoidable memory pressure and can impact startup/runtime stability. Consider keeping num_patches_max at its prior scope and introducing a dedicated IB limit constant (e.g., num_ibs_max).

Suggested constant split
-    integer, parameter :: num_patches_max = 1000
+    integer, parameter :: num_patches_max = 10
+    integer, parameter :: num_ibs_max = 1000
Based on learnings: Code Review Priorities (in order): (1) Correctness, (2) Precision discipline, (3) Memory management, (4) MPI correctness, (5) GPU code, (6) Physics consistency, (7) Compiler portability.

integer, parameter :: num_bc_patches_max = 10
integer, parameter :: pathlen_max = 400
integer, parameter :: nnode = 4 !< Number of QBMM nodes
@@ -50,14 +50,10 @@ module m_constants
real(wp), parameter :: dflt_T_guess = 1200._wp ! Default guess for temperature (when a previous value is not available)

! IBM+STL interpolation constants
integer, parameter :: Ifactor_2D = 50 !< Multiple factor of the ratio (edge to cell width) for interpolation along edges for 2D models
integer, parameter :: Ifactor_3D = 5 !< Multiple factor of the ratio (edge to cell width) for interpolation along edges for 3D models
integer, parameter :: Ifactor_bary_3D = 20 !< Multiple factor of the ratio (triangle area to cell face area) for interpolation on triangle facets for 3D models
integer, parameter :: num_ray = 20 !< Default number of rays traced per cell
real(wp), parameter :: ray_tracing_threshold = 0.9_wp !< Threshold above which the cell is marked as the model patch
real(wp), parameter :: threshold_vector_zero = 1.e-10_wp !< Threshold to treat the component of a vector to be zero
real(wp), parameter :: threshold_edge_zero = 1.e-10_wp !< Threshold to treat two edges to be overlapped
real(wp), parameter :: threshold_bary = 1.e-1_wp !< Threshold to interpolate a barycentric facet
real(wp), parameter :: initial_distance_buffer = 1.e12_wp !< Initialized levelset distance for the shortest path pair algorithm

! Lagrange bubbles constants
9 changes: 8 additions & 1 deletion src/common/m_derived_types.fpp
@@ -183,12 +183,18 @@ module m_derived_types
end type t_model

type :: t_model_array
+ ! Original CPU-side fields (unchanged)
type(t_model), allocatable :: model
real(wp), allocatable, dimension(:, :, :) :: boundary_v
real(wp), allocatable, dimension(:, :) :: interpolated_boundary_v
integer :: boundary_edge_count
integer :: total_vertices
- logical :: interpolate
+ integer :: interpolate

+ ! GPU-friendly flattened arrays
+ integer :: ntrs ! copy of model%ntrs
+ real(wp), allocatable, dimension(:, :, :) :: trs_v ! (3, 3, ntrs) - triangle vertices
+ real(wp), allocatable, dimension(:, :) :: trs_n ! (3, ntrs) - triangle normals
end type t_model_array

!> Derived type adding initial condition (ic) patch parameters as attributes
@@ -450,6 +456,7 @@ module m_derived_types
real(wp), dimension(1:3) :: levelset_norm
logical :: slip
integer, dimension(3) :: DB
+ integer :: x_periodicity, y_periodicity, z_periodicity
end type ghost_point

!> Species parameters
2 changes: 2 additions & 0 deletions src/common/m_helper.fpp
@@ -333,6 +333,8 @@ contains
!! @return The cross product of the two vectors.
pure function f_cross(a, b) result(c)

$:GPU_ROUTINE(parallelism='[seq]')

real(wp), dimension(3), intent(in) :: a, b
real(wp), dimension(3) :: c
