common: fix macOS cache path segfault when HOME is unset#22263

Open
Geramy wants to merge 1 commit into ggml-org:master from Geramy:geramy/macos-daemon-service-nohome-fix

Conversation

@Geramy Geramy commented Apr 22, 2026

On macOS, fs_get_cache_directory crashes with SIGSEGV when
HOME is not set (for example, when llama.cpp is launched by
a LaunchDaemon). std::getenv returns NULL and the following
std::string concatenation dereferences it.

This adds the same guard the Linux branch already has: check
getenv("HOME") first, and fall back to getpwuid(getuid()) if
it is unset. The pwd.h include guard is extended to cover
__APPLE__ so that getpwuid is available.

Closes #22229

std::getenv("HOME") returns NULL in environments such as
LaunchDaemons, which crashes std::string construction. Add
a NULL check and fall back to getpwuid, matching the Linux
branch in fs_get_cache_directory.
@Geramy Geramy requested a review from a team as a code owner April 22, 2026 19:31
Geramy added a commit to lemonade-sdk/lemonade that referenced this pull request Apr 22, 2026
- llamacpp_server.cpp: annotate the HOME-fallback block with a link to
  ggml-org/llama.cpp#22263 ("common: fix macOS cache path segfault when
  HOME is unset"). Remove this workaround once that PR merges and the
  metal version pinned in backend_versions.json includes it.
- server_env_vars.py: restore the IS_MACOS skip on test_flm_args. FLM
  is an NPU-only backend and is genuinely unavailable on macOS, so this
  skip is platform truth, not a CI workaround — it was removed in
  34eb4db alongside the CI-only skips, but it should have stayed.
pull Bot pushed a commit to bhardwajRahul/lemonade that referenced this pull request Apr 22, 2026
…atch HOME so it's not null and doesn't throw a segfault in llama.cpp (lemonade-sdk#1708)

* macOS: use posix_spawn instead of fork+exec for child processes

Problem: lemond spawns llama-server via fork()+execvp(). On macOS, fork()
leaves the child with corrupted Mach-port and XPC-bootstrap state that
execvp() does not reset. llama.cpp b8884+ now runs a ggml-metal probe at
startup that calls [MTLDevice newLibraryWithSource:] — which routes
through MTLCompilerService XPC — and dies on the broken channel before
the model is opened. Direct terminal runs work; only lemond-spawned
children fail (~130ms, exit code -1).

Fix: on __APPLE__, replace fork()+execvp() with posix_spawn. Preserves
pipe/working-dir semantics. Adds POSIX_SPAWN_CLOEXEC_DEFAULT to avoid
leaking lemond FDs into the child, and POSIX_SPAWN_SETSIGDEF to reset
inherited SIG_IGN dispositions. Linux and Windows paths unchanged.

* bump llamacpp metal to b8884; remove macOS skipIfs on env-var tests

- backend_versions.json: metal b8460 -> b8884 (latest llama.cpp release,
  paired with the posix_spawn spawn-path fix so macOS can actually
  tolerate b8884's ggml-metal probe).
- server_env_vars.py: drop the five @unittest.skipIf(IS_MACOS, ...)
  decorators on test_llamacpp_backend / _args, test_whispercpp_backend /
  _args, and test_flm_args so the env-var snapshot checks run on macOS
  too. The setUpClass already guards the matching env vars with
  `if not IS_MACOS` — those tests will now report real failures if the
  env path diverges on macOS instead of silently skipping.

* macOS: set HOME in llama-server child env when unset

llama.cpp b8884+ libllama-common calls getenv("HOME") in
fs_get_cache_directory during CLI arg parsing and feeds the result
straight into std::string without a NULL check, so llama-server segfaults
before the model loads whenever HOME is unset (EXC_BAD_ACCESS /
SIGSEGV at 0x0 in _platform_strlen via std::string::insert, caller
hf_cache::migrate_old_cache_to_hf_cache).

LaunchDaemons installed under /Library/LaunchDaemons/ only inherit the
EnvironmentVariables declared in their plist; the lemond plist sets
HF_HOME and PATH but not HOME, so every child spawned by lemond hit the
crash. Terminal/sudo spawns preserve HOME and were unaffected, which is
why the bug only surfaced under the installed daemon.

Fix: when spawning llama-server on __APPLE__, check if HOME is set in
the parent env; if not, fall back to getpwuid(getuid())->pw_dir (or
/var/root) and pass it through env_vars. No plist change, no launchd
reconfiguration — just a guaranteed HOME in the child.

* note upstream fix for HOME crash; restore FLM macOS skip

- llamacpp_server.cpp: annotate the HOME-fallback block with a link to
  ggml-org/llama.cpp#22263 ("common: fix macOS cache path segfault when
  HOME is unset"). Remove this workaround once that PR merges and the
  metal version pinned in backend_versions.json includes it.
- server_env_vars.py: restore the IS_MACOS skip on test_flm_args. FLM
  is an NPU-only backend and is genuinely unavailable on macOS, so this
  skip is platform truth, not a CI workaround — it was removed in
  34eb4db alongside the CI-only skips, but it should have stayed.

* restore IS_MACOS skipIfs in server_env_vars tests

Revert the removal of @unittest.skipIf(IS_MACOS, ...) on
test_llamacpp_backend / _args and test_whispercpp_backend / _args.
These tests rely on env vars that setUpClass only sets when not IS_MACOS,
so running them on macOS produces noise, not coverage. Put them back the
way they were before this branch touched the file.
@ngxson ngxson requested a review from angt April 23, 2026 08:54
@angt
Member

angt commented Apr 23, 2026

I’m not sure we should handle the missing HOME the same way as Linux and POSIX friends. With LaunchDaemon, it seems the correct place should be simply /Library/Caches/llama.cpp.

@Geramy
Author

Geramy commented Apr 23, 2026

Yeah, I wasn't too sure myself. Apple does everything differently, so that makes sense. I'll go see if I can find any Apple documentation with a concrete statement, if they have any, and make the change. Thanks!

@angt
Member

angt commented Apr 24, 2026

Yeah, I wasn't too sure myself. Apple does everything differently, so that makes sense. I'll go see if I can find any Apple documentation with a concrete statement, if they have any, and make the change. Thanks!

I think the best way would be for lemonade to set up LLAMA_CACHE and HF_HUB_CACHE (or HF_HOME/hub) and not rely on this code.
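
Concretely, this could be a plist-level change rather than a code change. An illustrative launchd plist fragment (standard `EnvironmentVariables` launchd key; the paths are placeholders, not lemonade's actual layout):

```xml
<!-- Illustrative fragment for a LaunchDaemon plist: set the cache
     locations explicitly so llama.cpp never has to derive them from HOME. -->
<key>EnvironmentVariables</key>
<dict>
    <key>LLAMA_CACHE</key>
    <string>/Library/Caches/llama.cpp</string>
    <key>HF_HUB_CACHE</key>
    <string>/Library/Caches/huggingface/hub</string>
</dict>
```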



Development

Successfully merging this pull request may close these issues.

Misc. bug: Installing llama.cpp on macOS through lemonade-idk
