r.sim: Parallelize dx/dy derivatives computation using OpenMP #7094
petrasovaa merged 7 commits into OSGeo:main
Can you please add the diff with the time-measuring code to your description?

Hi @wenzeslaus!

Another thing, do the current tests actually cover the two cases for dx-dy and for enabled/disabled parallelism?
I don't think they cover external dx/dy in the parallel case, but that case doesn't need dx/dy parallelization, because the derivatives are calculated externally. The internal calculation is where dx/dy is computed, and that is the part that is parallelized now.
That's good ... could you give some more context on this?

I've removed the redundant include and added test_nodxdy_parallel, which specifically exercises the parallel derivatives calculation (nprocs=2) without external input maps.
petrasovaa left a comment
Thanks @Abhi-d-gr8, finally got to it!

Overview
This PR parallelizes the dx/dy slope derivatives computation in simlib/derivatives.c (Horn 3×3 method), addressing #7039.
The main simulation already supports multiple cores, but the dx/dy computation was still serial. This change enables that step to use available CPU cores as well.
What changed
- OpenMP parallel for on the outer row loop.
- default(none) with explicit shared(...) and private(...) scoping.
Performance Evaluation
To isolate the impact of this change, timing was performed inside derivatives() only, excluding raster I/O and the main simulation loop.
This avoids masking the effect, since total r.sim.water runtime is dominated by the simulation phase.
Test Setup
- omp_get_wtime() around derivatives() only
Results (Apple M3 Air)
- OMP_NUM_THREADS=1: ~0.19 s (average)
- OMP_NUM_THREADS=8: ~0.06 s (average)
This confirms that the dx/dy computation scales across cores as intended.
As expected, the overall r.sim.water runtime improvement is smaller since derivatives() is only one stage of the workflow.
Build / OpenMP Notes
To observe multi-threaded behavior in derivatives():
- GRASS must be configured with OpenMP support:
- The compiler must support OpenMP (e.g., libomp on macOS).
- The source includes guarded OpenMP headers:
- Control the number of threads via:
If OpenMP is not available, the code compiles and runs serially without any behavior or numerical changes.
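For readers unfamiliar with the pattern, here is a minimal sketch of a guarded OpenMP include and a row-parallel Horn 3×3 loop with default(none) scoping. It is not the code in simlib/derivatives.c: the function name horn_derivatives, the flat row-major layout, the zeroed border cells, and the sign convention (row 0 is north, dz/dy positive northward) are all assumptions made for illustration.

```c
#include <stddef.h>

/* Guarded include: without OpenMP the header is skipped and the loop
 * below compiles as ordinary serial code. */
#if defined(_OPENMP)
#include <omp.h>
#endif

/* Hypothetical helper (not the simlib function): Horn 3x3 derivatives on a
 * row-major grid z[rows * cols]. Border cells are simply set to 0 here. */
void horn_derivatives(const double *z, double *dzdx, double *dzdy,
                      int rows, int cols, double ew_res, double ns_res)
{
    int r, c;

    /* Each row writes only its own output cells, so rows are independent.
     * default(none) forces every variable to be scoped explicitly. */
#if defined(_OPENMP)
#pragma omp parallel for default(none) \
    shared(z, dzdx, dzdy, rows, cols, ew_res, ns_res) private(c) \
    schedule(static)
#endif
    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            size_t i = (size_t)r * (size_t)cols + (size_t)c;

            if (r == 0 || r == rows - 1 || c == 0 || c == cols - 1) {
                dzdx[i] = 0.0;
                dzdy[i] = 0.0;
                continue;
            }

            const double *n = z + i - cols; /* row above (north, assumed) */
            const double *m = z + i;        /* center row */
            const double *s = z + i + cols; /* row below (south, assumed) */

            /* East column minus west column, weights 1-2-1. */
            dzdx[i] = ((n[1] + 2.0 * m[1] + s[1]) -
                       (n[-1] + 2.0 * m[-1] + s[-1])) / (8.0 * ew_res);
            /* North row minus south row, weights 1-2-1. */
            dzdy[i] = ((n[-1] + 2.0 * n[0] + n[1]) -
                       (s[-1] + 2.0 * s[0] + s[1])) / (8.0 * ns_res);
        }
    }
}
```

Because the pragma is wrapped in the same guard as the include, a build without OpenMP sees a plain nested loop and produces the same values.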
Reproducible Benchmark Command
For reference, this is the exact command used for the interleaved benchmark:
Timing instrumentation (added for evaluation)
To isolate the performance of derivatives() (not masked by the main simulation), I temporarily added the following OpenMP-guarded timing block:
It was used only for benchmarking and does not affect numerical behavior.
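As an illustration of what OpenMP-guarded timing of this kind can look like, here is a small sketch. It is not the block from the PR: wall_seconds and report_elapsed are hypothetical helper names, and the clock() fallback for non-OpenMP builds is an assumption (it measures CPU time, not wall time).

```c
#include <stdio.h>
#include <time.h>

#if defined(_OPENMP)
#include <omp.h>
#endif

/* Hypothetical helper: current time in seconds. Uses omp_get_wtime() when
 * built with OpenMP; otherwise falls back to clock() (CPU time, assumed
 * acceptable for a rough serial measurement). */
double wall_seconds(void)
{
#if defined(_OPENMP)
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical helper: print and return the time elapsed since t0. */
double report_elapsed(const char *label, double t0)
{
    double elapsed = wall_seconds() - t0;

    fprintf(stderr, "%s: %.6f s\n", label, elapsed);
    return elapsed;
}

/* Usage sketch around the call being measured:
 *
 *     double t0 = wall_seconds();
 *     derivatives(...);
 *     report_elapsed("derivatives()", t0);
 */
```

Keeping the timing in separate helpers means the measured function itself is untouched, so the instrumentation can be removed without changing results.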