Conversation
| # 4) HuggingFace authentication for the gated uma-s-1p1 model. | ||
| if [ "$SKIP_HF_LOGIN" -eq 0 ]; then | ||
| if $COMMAND_PKG run -n "$ENV_NAME" huggingface-cli whoami &>/dev/null; then | ||
| echo "✔️ Already authenticated to HuggingFace." |
There was a problem hiding this comment.
The workflow is not clear to me. Are we not able to utilize the HF_TOKEN if the user has it, or parse it into the script?
| PYCODE | ||
|
|
||
| # 4) HuggingFace authentication for the gated uma-s-1p1 model. | ||
| if [ "$SKIP_HF_LOGIN" -eq 0 ]; then |
There was a problem hiding this comment.
SKIP_HF_LOGIN is potentially confusing, are we skipping it for testing or since the user is logged in?
| - setuptools | ||
| - pip | ||
| - pip: | ||
| # fairchem-core pulls in a CUDA-enabled PyTorch by default on Linux. |
There was a problem hiding this comment.
We should probably let them know that. Also, can we allow users to choose which device they are using? like uma-cpu and uma-gpu?
| return MOPAC(**kwargs) | ||
|
|
||
|
|
||
| elif name in ('uma', 'fairchem'): |
There was a problem hiding this comment.
Does fairchem has a model that is not uma? Do we want to leave the option to call fairchem in the input?
| # A TS search needs a saddle-point optimizer; UMA ships none, so use Sella. | ||
| from sella import Sella | ||
| opt_class = Sella | ||
| opt = Sella(atoms, order=1 if is_ts else 0, logfile=logfile) |
There was a problem hiding this comment.
We use sella when is_ts == True, why does order != 1 always?
| save_current_geometry(output, atoms, xyz) | ||
|
|
||
| if job_type == 'irc': | ||
| from sella import IRC # VERIFY the Sella IRC API in the installed sella |
There was a problem hiding this comment.
Can you drop the comment? did we verify the IRC API?
| opt_class = engine_dict.get(engine_name, BFGS) | ||
| opt = opt_class(atoms, logfile=os.path.join(os.path.dirname(input_path), 'opt.log')) | ||
|
|
||
| opt = engine_dict.get(engine_name, BFGS)(atoms, logfile=logfile) |
There was a problem hiding this comment.
I think the original implementation was more readable.
| if is_atom or is_triplet_o2: | ||
| label = self.species[0].label if self.species else 'species' | ||
| logger.warning(f'Computing a UMA single point for {label} (an isolated atom or triplet O2). ' | ||
| f'UMA absolute energies are unreliable for these under-represented species; ' |
There was a problem hiding this comment.
Can you add a source for that please? Is that like a known issue?
| """ | ||
| scan_coords = self.data.get('scan_coords') | ||
| if scan_coords: | ||
| return [xyz if isinstance(xyz, dict) else str_to_xyz(xyz) for xyz in scan_coords] |
There was a problem hiding this comment.
Amazing! I missed that we didn't knew how to interact with scans from ase. Can you please add tests to these functions too?
b628acc to
2549a2c
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #895 +/- ##
==========================================
+ Coverage 63.02% 63.14% +0.11%
==========================================
Files 114 114
Lines 38178 38218 +40
Branches 9990 9999 +9
==========================================
+ Hits 24063 24132 +69
+ Misses 11227 11194 -33
- Partials 2888 2892 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Add devtools/uma_environment.yml and devtools/install_uma.sh: a documented, user-driven setup script that wraps every step (create uma_env, verify fairchem/Sella/ASE imports, interactive HuggingFace login for the gated model, and the runtime exports for invoking UMA from arc_env). Supports --test to run the model-dependent unit tests and --skip-hf-login. Wire 'make install-uma' and .PHONY, and document UMA setup in installation.rst. Deliberately NOT added to devtools/install_all.sh / install-ci: the UMA model is gated (Meta license + HuggingFace token) and heavy, so it is a manual setup; CI coverage comes from the env-independent UMA tests.
Build UMA (Meta FAIR fairchem-core, uma-s-1p1, task omol) on top of the generic ASE adapter (PR #836) instead of a standalone adapter: - ase_script.py: add a uma/fairchem branch to get_calculator; set total charge and spin (=multiplicity) on atoms.info (omol conditioning); use Sella order=1 for TS saddle-point searches when is_ts; add an irc job type via Sella IRC. - ase.py: derive the calculator from the level method (so method='uma' works with no args), resolve UMA defaults (latest model, omol, cpu) via determine_settings, pass is_ts and irc_direction to the script, and warn on a UMA single point for an isolated atom or triplet O2 (unreliable absolute energy). - settings.py: UMA_PYTHON=find_executable('uma_env'), ASE_CALCULATORS_ENV['uma'], and UMA_LATEST_MODEL. - level.py: route method 'uma'/'uma-s-1'/'uma-s-1p1' to the 'ase' software. - yaml.py: implement parse_irc_traj and parse_1d_scan_coords so UMA IRC/scan outputs round-trip. Rotor scans run through ARC's directed_scan (constrained opt), already supported by the ASE adapter. fairchem/Sella-IRC API points only confirmable inside uma_env are marked with # VERIFY. Adds env-independent unit tests (routing, calculator/settings resolution, input writing, sp warning, output round-trip) plus skip-guarded model tests.
Made `warn_if_unreliable_uma_sp` a bolean function, check the return value. Doesn't interfere with current implementation
alongd
left a comment
There was a problem hiding this comment.
Looks good, I added a minor comment. Did you check it with UMA? How's the performance?
|
|
||
| def test_yaml_parser(self): | ||
| """Test the YAMLParser adapter for all its parse methods.""" | ||
| import tempfile |
There was a problem hiding this comment.
please put imports at module level, unless we cannot for some reason
There was a problem hiding this comment.
Pull request overview
Adds optional UMA (Meta FAIR fairchem-core) infrastructure on top of the existing ASE job adapter integration, including environment/setup tooling and ARC-side routing/parsing/tests to run UMA via ASE.
Changes:
- Add
make install-umaplusdevtools/install_uma.shanddevtools/uma_environment.ymlfor user-driven UMA setup (gated model; not CI). - Route UMA levels (
method='uma',uma-s-1,uma-s-1p1) through the ASE adapter and add UMA-specific defaults/warnings. - Extend YAML parsing + tests for scan/IRC keys and add UMA-via-ASE adapter tests (with gated model tests behind an opt-in flag).
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| Makefile | Adds install-uma target for manual UMA setup. |
| docs/source/installation.rst | Documents optional UMA installation and usage. |
| devtools/uma_environment.yml | Defines the conda env skeleton for uma_env. |
| devtools/install_uma.sh | Automates UMA env creation, dependency install, HF login, and optional tests. |
| arc/settings/settings.py | Registers UMA env discovery and defines UMA_LATEST_MODEL. |
| arc/parser/parser_test.py | Adds YAML parser tests covering new scan/IRC-related keys. |
| arc/parser/adapters/yaml.py | Implements YAML parsing for 1D scan coords and IRC trajectories. |
| arc/level.py | Adds UMA-to-ASE routing in software deduction. |
| arc/job/adapters/uma_test.py | Adds UMA-via-ASE wiring tests + opt-in model tests. |
| arc/job/adapters/scripts/ase_script.py | Adds fairchem/UMA calculator support and IRC execution via Sella. |
| arc/job/adapters/ase_adapter.py | Adds UMA calculator detection, default settings, and a reliability warning for UMA SPs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # UMA (run via the ASE adapter; 'uma' resolves to the latest model) | ||
| if self.method in ('uma', 'uma-s-1', 'uma-s-1p1'): | ||
| self.software = 'ase' | ||
|
|
| atoms = Atoms(symbols=xyz['symbols'], positions=xyz['coords']) | ||
| atoms.info.update({'charge': charge, 'spin': multiplicity}) # UMA (omol) conditions on these | ||
| calc = get_calculator(settings, charge, multiplicity) |
| ARC_ENV_PY="$($COMMAND_PKG run -n arc_env python -c 'import sys; print(sys.executable)')" | ||
| ARC_ENV_PREFIX="$(dirname "$(dirname "$ARC_ENV_PY")")" | ||
| BABEL_VERSION_DIR="$(ls -d "$ARC_ENV_PREFIX"/lib/openbabel/*/ 2>/dev/null | head -1)" | ||
|
|
||
| export_block() { | ||
| echo "export BABEL_LIBDIR=${BABEL_VERSION_DIR%/}" | ||
| echo "export BABEL_DATADIR=${ARC_ENV_PREFIX}/share/openbabel/$(basename "${BABEL_VERSION_DIR%/}")" | ||
| echo "export PYTHONPATH=${ARC_DIR}:\$PYTHONPATH" | ||
| } |
| if [ "$RUN_TESTS" -eq 1 ]; then | ||
| echo ">>> Running the UMA model-dependent unit tests (first run downloads the model; this is slow)..." | ||
| export BABEL_LIBDIR="${BABEL_VERSION_DIR%/}" | ||
| export BABEL_DATADIR="${ARC_ENV_PREFIX}/share/openbabel/$(basename "${BABEL_VERSION_DIR%/}")" | ||
| export PYTHONPATH="${ARC_DIR}:${PYTHONPATH}" | ||
| UMA_RUN_MODEL=1 "$ARC_ENV_PY" -m pytest "$ARC_DIR/arc/job/adapters/uma_test.py" -v | ||
| fi |
Currently sits on top of #836