Skip to content

afpd: force-close stale forks during FPDelete to unblock deletion#2851

Draft
andylemin wants to merge 1 commit intomainfrom
delete-fix-stale-forks
Draft

afpd: force-close stale forks during FPDelete to unblock deletion#2851
andylemin wants to merge 1 commit intomainfrom
delete-fix-stale-forks

Conversation

@andylemin
Copy link
Copy Markdown
Contributor

@andylemin andylemin commented Apr 4, 2026

Non-compliant AFP clients (e.g. Preview.app) sometimes leave open forks after completing operations, causing FPDelete to return AFPERR_BUSY when trying to delete after viewing.

Add of_close_stale_forks() which iterates all open forks matching the target file (by dev/ino) and force-closes any fork that passes both safety checks:

  • Not DIRTY (no pending AD header metadata flush)
  • No outstanding locks on the shared adouble

AFPFORK_MODIFIED is intentionally not checked since the file is being deleted and any data written via FPWrite is already on disk.

In afp_delete(), when of_findname() detects open forks, attempt stale fork cleanup before returning AFPERR_BUSY. If all forks are closed, re-verify with of_findname() and proceed to deletion. If any fork remains active (dirty or locked), return AFPERR_BUSY as before.

This is safe because:

  • No data loss: file is being deleted, written data is on disk
  • No lock violations: explicit lock check before force-close
  • Per-FD refcounting respected: of_closefork() properly decrements
  • Scope limited to FPDelete: normal fork lifecycle unaffected
  • Post-close behavior identical to session disconnect cleanup

Extensive investigation and debugging (2 weeks of testing with logs and Wireshark - I found it hard to believe Apple are the ones who violate the spec sometimes) was performed to validate that this is NOT a Netatalk issue.
Tracing AFP operations from some clients (such as macOS Ventura) using specific applications, it was proven that the AFP spec is sometimes violated by clients.
The spec states if a client opens a file for Read, and opens it for Writing later, the client must still close both open's - not just the most privileged open.

As there is only a single file descriptor on the client, it is understandable that sometimes this tracking fails and only one close is received for the client's fd (instead of an exact close for each original open), leaving a stale fork on the server - subsequently blocking deletions.

This forced-cleanup does not violate the AFP specification.

@andylemin andylemin requested a review from rdmark as a code owner April 4, 2026 11:16
@andylemin andylemin force-pushed the delete-fix-stale-forks branch from de34a85 to af33cd5 Compare April 4, 2026 11:20
@andylemin andylemin force-pushed the delete-fix-stale-forks branch from af33cd5 to 66bbc2c Compare April 4, 2026 11:26
@andylemin andylemin force-pushed the delete-fix-stale-forks branch from 66bbc2c to 5c07511 Compare April 4, 2026 11:49
Comment thread etc/afpd/ofork.c
Copy link
Copy Markdown
Member

@rdmark rdmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does Preview.app have an actual built-in AFP client?

do you know of other macOS apps that do as well?

@andylemin
Copy link
Copy Markdown
Contributor Author

does Preview.app have an actual built-in AFP client?

do you know of other macOS apps that do as well?

Well this is one of those changes that has affected me personally for a long time.

It seems to be related to the Ventura (and some other versions - not tested all) kernel module itself, with any application that peeks or previews (quick view) data (including Preview, image magick CLI on terminal, and Photoshop) - many tools used by photographers.

I suspect these kinds of apps open read only, and the kernel module upgrades to open an extra Write mode on behalf of the app, but does not close all when the app closes.

Needs some testing before merge

@andylemin andylemin force-pushed the delete-fix-stale-forks branch from 5c07511 to 85bd5e3 Compare April 5, 2026 03:54
@augmentcode
Copy link
Copy Markdown

augmentcode bot commented Apr 5, 2026

augment review

…ling

When a client deletes a file via FPDelete, afpd now automatically
force-closes stale open forks that are not dirty and hold no locks,
allowing the delete to succeed without requiring the client to
explicitly close every fork first.

For forks holding byte-range locks, a three-tier safety check applies:

  1. Dirty forks (pending metadata flush) — always block deletion.
  2. Write-locked forks (F_WRLCK) — always block deletion, as write
     locks indicate a potentially active writer.
  3. Read-locked forks (F_RDLCK only) — controlled by the new
     'close stale rlocks' config option (default: no). When enabled,
     read-locked stale forks are force-closed since the file is being
     destroyed and read locks only guard against concurrent writers.

Implementation:
- Add adf_has_wrlocks() helper to ad_lock.c to distinguish read-only
  locked forks from write-locked forks by scanning the adf_lock array
  for F_WRLCK entries.
- Add of_close_stale_forks() to ofork.c, called from afp_delete() in
  filedir.c when open forks are detected on the target file.
- Add OPTION_CLOSE_STALE_RLOCKS flag and 'close stale rlocks' config
  option (default: no) parsed via getoption_bool() in netatalk_conf.c.

New afp.conf option:
  close stale rlocks = BOOLEAN (default: no) [Global]
Docker env var: AFP_CLOSE_STALE_RLOCKS (default: no)
@andylemin andylemin force-pushed the delete-fix-stale-forks branch from 0e01b31 to 409f413 Compare April 5, 2026 12:31
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud bot commented Apr 5, 2026

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 5, 2026

🔥 Spectest (AFP 3.4) - Flamegraph (AFP_ASSERT active)

Commit: 409f413572efaf708980d3cb10027b4a5f18db1e
Profiling: On-CPU sampling @ 999 Hz, DWARF call-graph, x86_64
Build: debugoptimized (-O2 -g -fno-omit-frame-pointer)
Runtime: 66s, Stacks: 995, SVG size: 492K

🔥 Open interactive Flamegraph (SVG)

Flamegraph preview

📥 Download from artifacts →

🔝 Top 10 leaf functions
Function Samples
_raw_spin_unlock_irqrestore 323323323
do_syscall_64 129129129
[ld-musl-x86_64.so.1] 126126126
dircache_remove_children 81081081
srso_alias_safe_ret 36036036
file_close_fd 27027027
finish_task_switch.isra.0 25025025
x64_sys_call 24024024
hash_scan_next 21021021
find_get_block_common 19019019

@andylemin andylemin marked this pull request as draft April 5, 2026 14:32
@augmentcode
Copy link
Copy Markdown

augmentcode bot commented Apr 6, 2026

🤖 Augment PR Summary

Summary: This PR makes FPDelete more tolerant of non-compliant AFP clients that leave extra forks open, which can otherwise cause AFPERR_BUSY after viewing files.

Changes:

  • Adds of_close_stale_forks() to scan open forks by dev/ino and force-close those considered safe.
  • Updates afp_delete() to attempt stale-fork cleanup before returning AFPERR_BUSY, then re-checks for remaining forks.
  • Improves logging in of_alloc() and adds summary logs when closing forks on volume/session shutdown.

Technical Notes: Stale forks are skipped when flagged DIRTY or when the shared AppleDouble state indicates outstanding locks, aiming to avoid metadata loss or lock violations.

🤖 Was this summary useful? React with 👍 or 👎

Copy link
Copy Markdown

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestion posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread etc/afpd/ofork.c Outdated
* The zero-length resource fork check in of_closefork() may also
* unlink the AppleDouble file early. This is safe because the
* delete path handles ENOENT gracefully via netatalk_unlinkat(). */
of_closefork(obj, of);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of_close_stale_forks() force-closes forks based only on AFPFORK_DIRTY and lockcount, but it can still close a fork that was opened with write access (AFPFORK_ACCWR) if it currently has no locks/dirty flag. That would allow FPDelete to proceed by terminating an active writer, which seems different from the intended “stale read fork” cleanup behavior.

Severity: high

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants