Conversation
Force-pushed from b5cf68c to 3c4b31b.
Michael, I finished what I wanted.

@amontoison probably this is why the …
A more realistic example: a batched QuadraticModel where we vary only the RHS: JuliaSmoothOptimizers/QuadraticModels.jl@main...klamike:QuadraticModels.jl:mk/rhsbatch

Amazing, Michael!
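The RHS-batch idea above can be sketched in a few lines of base Julia. This is illustrative only — the struct and function names here are hypothetical, not the actual code in the linked `mk/rhsbatch` branch:

```julia
# Sketch: a QP  min ½xᵀQx + cᵀx  s.t.  Ax = b,  where only b varies per instance.
# Q, A, c are stored once; B holds one RHS per column.
struct RHSBatchQP{T}
    Q::Matrix{T}   # shared quadratic term
    A::Matrix{T}   # shared constraint matrix
    c::Vector{T}   # shared linear term
    B::Matrix{T}   # ncon × nbatch: column i is the RHS of instance i
end

nbatch(qp::RHSBatchQP) = size(qp.B, 2)

# Constraint residual of instance i at a common point x (zero-copy view of B):
cons_residual(qp::RHSBatchQP, x, i) = qp.A * x .- view(qp.B, :, i)
```

Since everything except `B` is shared, the per-instance memory cost is just one RHS vector, which is what makes large batches practical.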
Should we hardcode …
I think so, yes, to be consistent with the regular API. Both can be updated at the same time later (maybe only in 0.22).
We probably want to define some more meta functions like …
Force-pushed from 65dcf55 to a20cc30.
@amontoison what do you think about having an API for updating the nbatch? And maybe some optional …

@klamike I don't understand what you mean by updating the …
Yes, I meant updating the … In the ExaModels case, of course, it depends on how you do the batching. When I added parameters to ExaModels, I specifically made the lower-level functions all take the parameter vector as an input, to make it possible to implement the batching the way I did in BatchNLPKernels. It is based on a single ExaModel and does the same number of kernel launches for the batch evaluation as ExaModels would for a regular model, just with 2D grids. Since the base parametric ExaModel is built "unbatched", it is trivial to implement the …
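The design point above — low-level evaluators that take the parameter vector explicitly, so a batch is just a parameter matrix — can be sketched as follows. Names are illustrative, not the actual BatchNLPKernels API:

```julia
# Stand-in for a parametric objective: the evaluator takes θ explicitly,
# so nothing about the model itself needs to know about batching.
obj(x, θ) = sum(abs2, x .- θ)

# Batched evaluation over a parameter matrix Θ (one column per instance).
# On the GPU this loop would be a single kernel launch over a 2D grid
# (variable index × batch index); here it is a plain loop for clarity.
function obj_batch!(out, x, Θ::AbstractMatrix)
    for i in axes(Θ, 2)
        out[i] = obj(x, view(Θ, :, i))
    end
    return out
end

Θ = rand(10, 32)            # 32 parameter instances
out = zeros(32)
obj_batch!(out, zeros(10), Θ)
```

Because the base model stays unbatched, the batch dimension lives entirely in the data (`Θ`) and the launch geometry, not in the model definition.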

Actually, the … I do still think the …

@klamike Feel free to add what you need in the API on …

I think it's good to go; I was overcomplicating things. I got the MadIPM UniformBatch + RHSBatchQuadraticModel working locally, and will clean it up and push to the MadIPM PR soon.

@klamike Do you have any benchmark with …

Sorry about that, I missed your message. The latest result is on a batch of 128 9241_pegase DCOPF instances: ~6.5x faster than sequential, and ~1.85x faster than multi-threading with 8 threads (comparing 1 task/mini-batch vs. 1 task/problem).
Force-pushed from 26093b5 to 5da0386.
Good 🥇 Also, should I add a prefix …? @michel2323 worked on …

For the … Regarding the …

Which GPU did you run the benchmark on, @klamike?

I believe it was an RTX 6000 Pro Blackwell.

I see. For case 9241, practical GPU throughput is about 15 CPU threads' worth? I wonder how optimized the batched solver is — e.g., is the symbolic factorization reused in each solve? Also, fp64 flops have not bottlenecked us so far, but they might in this regime. I wonder how it performs on B100/H100 GPUs.
(replied via email to "[JuliaSmoothOptimizers/NLPModels.jl] Batch API (PR #540)")

We are indeed reusing the symbolic factorization, using cuDSS's uniform-batch feature. But the overall performance of the solver is not quite optimized: currently a lot of time is spent on packing/unpacking between individual and batched buffers, since that made it simpler to batch-ify incrementally. Now that (most of) the solver is batched, I plan to revisit this. I think we have H100 and H200 but no B100; I'll give it a try soon, with more CPU threads too.

@klamike We discussed the batch API with Sungho last week and converged on a storage layout where everything is a multi-dimensional array and the last dimension is the batch size. I already updated CUDSS.jl last weekend for that: …

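The last-dimension-is-batch convention can be illustrated with plain dense arrays (illustrative only — this is not the CUDSS.jl interface):

```julia
# Every batched quantity is one dense array whose LAST dimension indexes the batch.
n, nrhs, nbatch = 5, 2, 8
X = zeros(n, nrhs, nbatch)    # batched solutions
B = rand(n, nrhs, nbatch)     # batched right-hand sides

# Instance i is a zero-copy view, so unbatched kernels still apply per slice:
B₁ = view(B, :, :, 1)

# A per-instance fallback is just a loop over the last dimension:
for i in 1:nbatch
    X[:, :, i] .= B[:, :, i]  # placeholder for a per-instance solve
end
```

A nice property of this layout is that the whole batch is contiguous in memory, so copying it to or from the device is a single transfer.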
I like that approach! But for the KKT system nzval, I think a strided vector is actually nicer (we can reuse the transfer kernels by just building a batched map). A matrix is definitely more natural for a user-facing API, so it makes sense to have both in CUDSS. As I am working on this version, I have come across several kernels that can be written exactly the same (differing only in argument types), or with only one or two words different (e.g. same …
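The strided-vector/batched-map idea mentioned above can be sketched like this (a minimal sketch, assuming a shared sparsity pattern across the batch; `bmap` is a hypothetical helper):

```julia
# Store all batched nzvals contiguously: a matrix view for the user-facing API,
# and a flat vector view over the same memory for the transfer kernels.
nnzK, nbatch = 5, 3
V = zeros(nnzK, nbatch)      # column i is instance i's nzval
vals = vec(V)                # same memory, viewed as one long strided vector

# "Batched map": entry k of instance i lives at offset (i - 1) * nnzK + k,
# so an unbatched index map lifts to the batch by adding the per-instance offset.
bmap(k, i) = (i - 1) * nnzK + k

vals[bmap(2, 3)] = 1.0
@assert V[2, 3] == 1.0       # the matrix and vector views alias the same memory
```

Since `vec` in Julia shares memory with the parent array, an existing unbatched transfer kernel can be reused verbatim on `vals`, just with the offset-shifted map.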
Co-authored-by: Michael Klamkin <klamike@users.noreply.github.com>
Force-pushed from 997a31c to edcec16.

@klamike Do you need anything else before I merge the PR?
cc @klamike