Skip to content

Add Fortran/Fypp static analysis linter#1193

Draft
sbryngelson wants to merge 8 commits intoMFlowCode:masterfrom
sbryngelson:feat/source-linter
Draft

Add Fortran/Fypp static analysis linter#1193
sbryngelson wants to merge 8 commits intoMFlowCode:masterfrom
sbryngelson:feat/source-linter

Conversation

@sbryngelson
Copy link
Member

@sbryngelson sbryngelson commented Feb 21, 2026

Summary

Adds a new Python-based static analysis linter (toolchain/mfc/lint_source.py) that catches patterns known to cause silent bugs in Fortran/Fypp source code. Integrated into both the local precheck hook and CI.

Checks

Check What it catches Example
Fypp list duplicates Duplicate entries in #:for VAR in [...] lists bc_x%ve2 listed twice in MPI broadcast, silently skipping bc_x%ve3
Duplicate consecutive lines Identical adjacent non-trivial lines (≥40 chars) Doubled R3bar accumulation, repeated subroutine argument
Hardcoded byte size int(8._wp, ...) patterns that assume 8-byte reals WP_MOK = int(8._wp, MPI_OFFSET_KIND) breaks in single precision

All three checks are motivated by real bugs found in the codebase.

Integration

  • toolchain/bootstrap/precheck.sh: Added as step 5/6 (bumped doc check to 6/6), runs during ./mfc.sh precheck and the pre-commit hook
  • .github/workflows/lint-source.yml: Added as a new CI step alongside existing grep-based checks
  • CMakeLists.txt: Added -Wconversion to GNU debug builds, tracked in cleanliness CI

Dependencies

This PR will fail precheck/CI until the following PRs merge first:

The source bug fixes were intentionally moved out of this PR and into their respective dedicated fix PRs. This PR now contains only the linter infrastructure. Once #1187 and #1241 are merged, this PR will pass cleanly.

Test plan

🤖 Generated with Claude Code

Closes #1211

Copilot AI review requested due to automatic review settings February 21, 2026 03:54
@codeant-ai codeant-ai bot added the size:L This PR changes 100-499 lines, ignoring generated files label Feb 21, 2026
@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 21, 2026

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • File I/O robustness
    The code calls Path.read_text(encoding="utf-8") without handling decoding errors. A non-UTF-8 file (or an unexpected byte sequence) will raise UnicodeDecodeError and crash the linter (causing precheck/CI to fail). Consider defensively handling read errors or falling back to a safe decoding strategy.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a Python-based static analysis linter (lint_source.py) to detect common Fortran/Fypp code issues that cause silent bugs. The linter checks for duplicate entries in Fypp loop lists, duplicate consecutive source lines, and hardcoded byte size assumptions that break single-precision builds.

Changes:

  • Added toolchain/mfc/lint_source.py with three static analysis checks for Fortran/Fypp code
  • Integrated the linter into the precheck hook as step 5/6
  • Added CI workflow step to run Fortran/Fypp static analysis

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
toolchain/mfc/lint_source.py New static analysis linter with checks for Fypp list duplicates, duplicate consecutive lines, and hardcoded byte sizes
toolchain/bootstrap/precheck.sh Updated step numbering to 6 total steps and added Fortran/Fypp analysis as step 5
.github/workflows/lint-source.yml Added new CI step to run the static analysis linter

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 3 files

Confidence score: 5/5

  • Overall low risk since the only issue is a low-severity linting edge case, not a runtime or user-facing change.
  • Most notable concern: toolchain/mfc/lint_source.py only checks quoted list entries, so duplicate non-string items in #:for ... in [...] lists can slip through, reducing lint coverage.
  • Pay close attention to toolchain/mfc/lint_source.py - duplicate detection ignores non-string list items.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="toolchain/mfc/lint_source.py">

<violation number="1" location="toolchain/mfc/lint_source.py:62">
P2: This only checks quoted list entries, so duplicate numeric/tuple items in `#:for ... in [...]` lists (e.g., `[1, 2, 3]`) are silently ignored. The linter should parse the full list so duplicates in non-string entries are also detected.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
toolchain/mfc/lint_source.py (1)

42-43: _check_single_fypp_list — quote-matching regex allows mixed delimiters.

r"['\"]([^'\"]*)['\"]" matches any opening quote paired with any closing quote, so 'foo" is a valid match. In the MFC Fypp codebase, list entries are uniformly 'single-quoted', so no false positives have been seen. Consider using two separate patterns or a back-reference to enforce consistent quoting:

♻️ Proposed fix using alternation with back-reference
-    entries = re.findall(r"['\"]([^'\"]*)['\"]", list_content)
+    entries = re.findall(r"'([^']*)'|\"([^\"]*)\"", list_content)
+    entries = [a or b for a, b in entries]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@toolchain/mfc/lint_source.py` around lines 42 - 43, The regex in
_check_single_fypp_list that builds entries using re.findall currently allows
mismatched opening/closing quotes (e.g., "'foo\""); update the pattern used to
extract list_content entries to enforce matching delimiters — either restrict to
single quotes (since Fypp uses single-quoted entries) or use a
back-reference/alternation so the opening quote is the same as the closing
quote; modify the entries assignment (the variable entries derived from
list_content in _check_single_fypp_list) to use the corrected regex and ensure
subsequent logic that iterates over entries continues to work with the new
extraction.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@toolchain/mfc/lint_source.py`:
- Around line 126-152: The check_hardcoded_byte_size function is flagging
patterns inside inline Fortran comments because byte_re is run against the whole
stripped line; to fix, before testing byte_re.search, remove the inline comment
portion by taking the substring up to the first unquoted '!' (i.e., split on '!'
and use the left part) so only actual code is scanned; update the loop in
check_hardcoded_byte_size to compute a code_only variable from line (respecting
full-line comments via _is_comment_or_blank) and run byte_re.search on that
code_only string (references: check_hardcoded_byte_size, _is_comment_or_blank,
byte_re, lines loop).

---

Nitpick comments:
In `@toolchain/mfc/lint_source.py`:
- Around line 42-43: The regex in _check_single_fypp_list that builds entries
using re.findall currently allows mismatched opening/closing quotes (e.g.,
"'foo\""); update the pattern used to extract list_content entries to enforce
matching delimiters — either restrict to single quotes (since Fypp uses
single-quoted entries) or use a back-reference/alternation so the opening quote
is the same as the closing quote; modify the entries assignment (the variable
entries derived from list_content in _check_single_fypp_list) to use the
corrected regex and ensure subsequent logic that iterates over entries continues
to work with the new extraction.

@codecov
Copy link

codecov bot commented Feb 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.04%. Comparing base (ab5082e) to head (bd96007).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1193   +/-   ##
=======================================
  Coverage   44.04%   44.04%           
=======================================
  Files          70       70           
  Lines       20499    20499           
  Branches     1993     1993           
=======================================
  Hits         9028     9028           
  Misses      10330    10330           
  Partials     1141     1141           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@codeant-ai codeant-ai bot added size:L This PR changes 100-499 lines, ignoring generated files and removed size:L This PR changes 100-499 lines, ignoring generated files labels Feb 21, 2026
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="toolchain/mfc/lint_source.py">

<violation number="1" location="toolchain/mfc/lint_source.py:145">
P2: The inline comment stripping uses `split("!")`, which also cuts `!` inside string literals. That can hide real `int(8._wp, ...)` usages on lines that include a string with `!`, leading to false negatives. Use a comment-stripping approach that ignores `!` inside quotes.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@sbryngelson sbryngelson marked this pull request as draft February 22, 2026 14:53
@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 23, 2026
@sbryngelson sbryngelson force-pushed the feat/source-linter branch 3 times, most recently from 576996d to 10e0984 Compare February 24, 2026 16:49
@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from coderabbitai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from cubic-dev-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from cubic-dev-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 26, 2026
sbryngelson and others added 2 commits February 28, 2026 14:44
Adds toolchain/mfc/lint_source.py with three checks that catch
copy-paste and precision bugs in Fortran/Fypp source:

- Duplicate entries in Fypp #:for lists (e.g. broadcast lists)
- Identical adjacent source lines (duplicated accumulations, args)
- Hardcoded int(8._wp, ...) that assumes 8-byte reals

Integrated into precheck.sh (step 5/6) and lint-source.yml CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds -Wconversion to the GNU debug compiler flags in CMakeLists.txt.
This catches implicit type coercions (e.g. integer = real truncation)
at compile time.

Adds a Conversion Warnings Diff step and count to cleanliness.yml,
following the same delta-from-master pattern as the existing checks.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sbryngelson and others added 5 commits February 28, 2026 14:44
Extract inner parsing logic of check_fypp_list_duplicates into
_check_single_fypp_list helper to resolve R0914 (too-many-locals),
R1702 (too-many-nested-blocks), and R1716 (chained-comparison).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
byte_re was run against the full stripped line, so a comment like
  ! old code: int(8._wp, kind=i64)
would be flagged even though the pattern is inside a Fortran ! comment.
Strip the inline comment portion before searching.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This PR adds a 6th lint check (Fortran/Fypp analysis). Update
CLAUDE.md and common-pitfalls.md to match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- m_mpi_proxy.fpp: fix duplicate 'bc_x%ve2' → 'bc_x%ve3' in MPI
  broadcast Fypp list; bc_x%ve3 was never broadcast to all ranks
- m_assign_variables.fpp: remove duplicate R3bar accumulation line
  in QBMM path; bubble radius cubed was summed twice per iteration
- m_rhs.fpp: fix cylindrical viscous boundary call passing
  dq_prim_dy_vf twice; 4th arg should be dq_prim_dz_vf (z-gradient)
- m_data_input.f90, m_data_output.fpp, m_start_up.fpp (all targets):
  replace hardcoded WP_MOK = int(8._wp,...) with storage_size(0._stp)/8
  for correct mixed-precision MPI-IO offset computation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The hardcoded byte-size linter should recommend storage_size(0._stp)/8,
not 0._wp, since field I/O uses storage precision. In mixed-precision
builds (wp=double, stp=half) the current message would misdirect.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 28, 2026
The linter will intentionally fail until PRs MFlowCode#1187 (WP_MOK),
MFlowCode#1241 (R3bar duplicate), and this PR's dq_prim_dy fix (now in
MFlowCode#1187) merge first.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link

github-actions bot commented Mar 1, 2026

Claude Code Review

Head SHA: bd96007
Files changed: 7

  • .claude/rules/common-pitfalls.md
  • .github/workflows/cleanliness.yml
  • .github/workflows/lint-source.yml
  • CLAUDE.md
  • CMakeLists.txt
  • toolchain/bootstrap/precheck.sh
  • toolchain/mfc/lint_source.py (new)

Summary

  • Adds a Python static-analysis linter (lint_source.py) that catches three real classes of silent bugs: Fypp list duplicates, duplicate consecutive source lines, and hardcoded int(8._wp, ...) byte-size assumptions.
  • Integrates the linter as step 5/6 in precheck.sh and as a new CI step in lint-source.yml.
  • Adds -Wconversion to GNU debug builds and tracks it in the cleanliness workflow.
  • Counter and documentation updates (CLAUDE.md, common-pitfalls.md) correctly reflect the new 6-check count.

Findings

1. PR description–diff mismatch (toolchain/mfc/lint_source.py)
The PR body states "Companion Fortran fixes included — fixes for the 13 issues the linter finds on master today" (bc_x%ve2→ve3, dq_prim_dy→dz, WP_MOK hardcoded size, duplicate R3bar line). However, the actual diff contains no Fortran source changes — only 7 non-Fortran files. If the linter is supposed to pass precheck on this branch, either (a) those fixes are already on master, or (b) the linter CI step will fail green only because the Fortran files aren't checked out. The description should be updated to accurately reflect what is or isn't in this PR; otherwise reviewers may assume bug-fixes are present when they are not.

2. Comment-stripping heuristic (toolchain/mfc/lint_source.py:155)

if byte_re.search(stripped.split("!")[0]):

Splitting on the first ! to strip inline comments will incorrectly truncate lines where ! appears inside a string literal (e.g., call foo("msg!here")). For the specific pattern being checked (int(8._wp), this is a harmless false-negative rather than a false-positive, but it is a fragile approach. A regex-based comment stripper (skip ! inside quotes) would be more robust if this heuristic is ever reused by future checks.

3. Duplicate-line check: line.strip() vs comment-aware strip (lint_source.py:98–111)
prev_stripped and stripped include inline comments verbatim. Two lines that are logically identical but differ only in a trailing ! comment won't be flagged — this is the correct, safe behavior. However, two lines with the same trailing comment (common in generated/template-style code) will be flagged even if they are intentionally identical. The 40-char floor mitigates most noise, but teams should be aware of this if they hit false positives in repetitive initialisation blocks.

4. precheck.sh step 5 suppresses nothing on success (toolchain/bootstrap/precheck.sh:130–135)
Every other step uses > /dev/null 2>&1 to suppress output on success, then prints only the ok/error banner. The new step 5 does not suppress lint_source.py output:

if python3 toolchain/mfc/lint_source.py 2>&1; then
    ok "Fortran/Fypp analysis passed."

When the check passes the script prints nothing, so this works fine today. If lint_source.py ever gains verbose progress output, the terminal will become noisy on clean runs. Low risk now, worth noting for consistency.


Improvement opportunities

  • Unit tests for the linter: A small pytest fixture with synthetic .fpp snippets would make it easy to verify the three checks without scanning the full src/ tree. This would also guard against regressions if thresholds (e.g., MIN_DUP_LINE_LEN) are changed.
  • _check_single_fypp_list& stripping scope: list_content.replace("&", "") strips all ampersands from the extracted list content. A continuation & at the end of a raw line segment will already be present inside the accumulated full string before the bracket is sliced out, so this is currently correct. A targeted strip (only trailing & per segment) would be more defensive if the function is extended to handle other list shapes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:L This PR changes 100-499 lines, ignoring generated files

Development

Successfully merging this pull request may close these issues.

Add Fortran/Fypp static analysis linter to catch copy-paste bugs

2 participants