
Fix severe memory leak during HMC thermalization and skip sweeps#1

Open
ThinkOffApp wants to merge 1 commit into main from fix-memory-leak

Conversation

@ThinkOffApp
Owner

This PR fixes a critical memory leak in the su2_2q_signal_scan.py script. The C++ lattice objects allocated via the action staple within the multihit/skip sweeps were not being garbage collected during the long thermalization loops, leading to OOM crashes. The fix explicitly calls clear_gpt_caches() and gc.collect() inside the thermalization and skip-sweep loops to maintain a flat memory profile.
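The loop pattern described above can be sketched as follows. This is a minimal, self-contained illustration of the "clear caches each sweep" fix, not the actual script: `clear_gpt_caches()` is stubbed out here (its real body would empty GPT's internal caches), and the HMC update itself is omitted.

```python
import gc


def clear_gpt_caches():
    """Stand-in for the project's cache-clearing helper (body is hypothetical).

    In the real script this would empty GPT's cached staple/exp objects,
    e.g. default_staple_cache.clear() and default_exp_cache.clear().
    """
    pass


def thermalize(ntherm):
    """Run `ntherm` thermalization sweeps, freeing memory after each one."""
    for i in range(ntherm):
        # one_sweep(U)  # HMC update step, omitted in this sketch

        # Free memory aggressively during thermalization sweeps
        clear_gpt_caches()
        gc.collect()
    return ntherm
```

The same clear-and-collect pair would be repeated inside the skip-sweep loop between measurements.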


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1436606cea

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

    save_checkpoint("thermalization", i + 1, meas_start)

    # Free memory aggressively during thermalization sweeps
    clear_gpt_caches()

P1: Stop invalidating stencil caches on every sweep

Calling clear_gpt_caches() inside the thermalization loop clears default_staple_cache/default_exp_cache before the next one_sweep, so GPT must rebuild cached transport/stencil kernels on every sweep instead of reusing them (gpt/lib/gpt/qcd/gauge/stencil/staple.py:37-56, gpt/lib/gpt/core/foundation/lattice/matrix/exp.py:188-203). In long runs with large ntherm, this turns a one-time setup into per-sweep overhead and can drastically slow experiments; cache clearing should be batched less frequently rather than done every iteration.
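The reviewer's suggestion to batch cache clearing rather than clear on every iteration could be sketched as below. The interval value and the helper name `maybe_clear_caches` are illustrative assumptions, not part of the PR; the project's `clear_gpt_caches()` is stubbed as a comment.

```python
import gc

# Hypothetical batching interval: clear caches once every 50 sweeps
# instead of on every iteration, so GPT can reuse its cached
# transport/stencil kernels between clears.
CACHE_CLEAR_INTERVAL = 50


def maybe_clear_caches(sweep_index, interval=CACHE_CLEAR_INTERVAL):
    """Clear caches only every `interval` sweeps; return True if cleared."""
    if (sweep_index + 1) % interval == 0:
        # clear_gpt_caches()  # project helper, stubbed in this sketch
        gc.collect()
        return True
    return False
```

With this scheme a run with large ntherm pays the kernel-rebuild cost roughly ntherm/interval times instead of ntherm times, while still bounding peak memory between clears.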

