fix: small correctness/security cleanups in experimental skill scripts#534
Conversation
|
Hi @calreynolds — could you take a look at this when you have a moment? Small set of fixes from a code review. Companion PR for — this comment was written by Claude |
…ain) Round-3 review surfaced that nothing under databricks-skills/.tests/ was being run by CI on either branch. This wires up a unit-tests job for forward compatibility: when the .tests/ tree lands on main (currently on experimental only — companion PR databricks-solutions#534), CI will start running it automatically without a separate plumbing PR. The job guards on the directory's existence so it no-ops cleanly on branches that don't have .tests/ yet (i.e. main today). Once the tree arrives, the guard becomes a no-op and pytest runs. uvx + pytest + databricks-sdk + requests, deselecting @pytest.mark.integration (those need a real Databricks workspace). Co-authored-by: Isaac
Batch of small fixes surfaced by a parallel GPT 5.4 + Gemini 3.1 Pro
review. None of these are introduced by recent changes; they are
pre-existing issues that ought to land regardless.
databricks-agent-bricks/scripts/mas_manager.py:
- `_delete` now tolerates empty / non-JSON DELETE response bodies
instead of raising JSONDecodeError on success.
- `add_examples_batch` short-circuits on empty input (ThreadPoolExecutor
rejects max_workers=0).
- `_build_agent_list` validates `uc_function_name` has 3 dotted parts
and rejects missing `endpoint_name` with a clear error, instead of
IndexError / `{"name": null}` on the API.
- `main()` uses a `_parse_json_arg` helper that exits cleanly on
malformed JSON CLI args, replacing raw `json.loads` tracebacks.
databricks-execution-compute/scripts/compute.py:
- `manage-cluster --action get` now keys off the SDK's `NotFound`
exception type instead of substring-matching "does not exist" in the
message. Also returns `success: false` for the DELETED state, so
callers gating on `success` don't treat a missing cluster as a
successful lookup.
databricks-app-python/examples/llm_config.py:
- OAuth error no longer interpolates the full token-endpoint payload
(which can contain `id_token` / refresh material). Logs the present
key names instead.
- DATABRICKS_MODEL validation error drops the `response.text[:300]`
echo so server bodies don't end up in operator-visible error text.
databricks-app-python/examples/fm-minimal-chat.py:
- Docstring + `app.yaml` examples reference the actual filename
(`fm-minimal-chat.py`), not `2-minimal-chat-app.py`.
databricks-app-python/examples/fm-parallel-calls.py:
- Guard `Speedup` division on `total_time > 0` to avoid
ZeroDivisionError on fast paths.
- Convert the trailing standalone triple-quoted string (dead code) to
real `#` comments.
databricks-python-sdk/examples/5-serving-and-vector-search.py:
- Replace the `[0.1, 0.2, 0.3, ...]` literal-Ellipsis vector with a
named placeholder + comment explaining it's a stand-in.
This pull request was AI-assisted by Isaac.
Co-authored-by: Isaac
Follow-up to the ACE multi-model self-review on this branch. mas_manager.py - _build_agent_list: reject UC function names with empty / whitespace-only segments (e.g. "catalog..fn"). The previous len==3 check let these through and they reached the API as schema="". - main(): catch ValueError from _build_agent_list so malformed agent configs exit cleanly via stderr instead of dumping a raw traceback — consistent with the JSON-arg handling _parse_json_arg already provides. compute.py - get_cluster_status: add success=True to the happy-path return so the manage-cluster get contract is coherent (success absent on success vs. success=False on missing/error was confusing). - cmd_manage_cluster: drop the redundant DatabricksError clause (identical body to the generic Exception fallback) and the now-unused import. .tests/test_agent_bricks_manager.py - Drop the stale `add_examples_queued` import (function does not exist), which was causing the entire test module to fail collection. - Fix sys.path to point at databricks-agent-bricks/scripts (not the parent), so mas_manager imports actually resolve. - Add TestBuildAgentListValidation, TestAddExamplesBatch, and TestDeleteResponseHandling covering every new error path / contract change introduced in the previous commit on this branch. Co-authored-by: Isaac
Sweep of issues neighboring the round-1/2 fixes that were called out in the round-3 review. mas_manager.py - Introduce typed MASNotFound exception. _handle_response_error now raises it on HTTP 404 so MASManager.get() can branch on existence cleanly instead of substring-matching "does not exist" / "not found" in the error message (the same anti-pattern the round-1 fix removed from compute.py). - Factor empty-body / non-JSON tolerance from _delete into a shared _safe_json helper and apply it symmetrically to _get / _post / _patch. Eliminates a class of latent JSONDecodeError on empty 2xx bodies. compute.py - cmd_list_compute's clusters resource path also called get_cluster_status with no try/except, so a stale --cluster-id raw-tracebacked. Apply the same typed-NotFound handling cmd_manage_cluster already uses. fm-parallel-calls.py - When total_time is below resolution, print "Speedup: N/A (total_time below resolution)" instead of silently omitting the line. 5-serving-and-vector-search.py - Replace the dim-3 [0.1, 0.2, 0.3] placeholder with [0.0] * 768 and a comment listing common embedding dimensions (768 / 1024 / 1536). The earlier value was small enough to mislead copy-pasters into thinking any short list works. .tests/test_agent_bricks_manager.py - Add TestResponseErrorHandling covering the 4 new behaviors: 404 raises MASNotFound, MASManager.get() returns None on 404, non-404 errors still propagate, and _post tolerates empty success bodies. Note: nothing under databricks-skills/.tests/ is currently wired into CI (.github/workflows/ci.yml only lints databricks-tools-core, databricks-mcp-server, .test/src). The tests run via local pytest only. Separate concern, separate PR. Co-authored-by: Isaac
Round-3 review surfaced that nothing under databricks-skills/.tests/ was being run by CI — which is why the stale add_examples_queued import (collection failure) had gone unnoticed in this branch's parent. Add a skills-unit-tests job that runs the .tests/ tree via uvx with pytest + databricks-sdk + requests, deselecting @pytest.mark.integration (those need a real Databricks workspace). Also add experimental to the pull_request branches list so PRs targeting experimental — like this one — trigger CI at all. Previously CI only fired on PRs to main, leaving experimental-branch work entirely unvalidated. Local trial run: 18 unit tests pass, 3 integration tests correctly deselected, ~25s wall-clock including dependency install. Co-authored-by: Isaac
The script this targets — databricks-genie/scripts/conversation.py — was removed in 8fd7f31 ("Replace Genie conversation.py script with pure CLI flow"). The test file's `from conversation import ...` has been unimportable ever since, but nothing was running tests in this tree (see ci.yml until two commits ago), so the breakage went unnoticed. With the new skills-unit-tests CI job pointed at databricks-skills/.tests/, this file's collection error halts the whole run before any real test executes. Removing it. Co-authored-by: Isaac
b64fc16 to
4b0bf26
Compare
|
Hi @QuentinAmbard — Claude here, working with James. Could you review this when you have a moment? Small set of code-review fixes across experimental skill scripts. Companion PR for |
…xes-experimental # Conflicts: # databricks-skills/.tests/test_agent_bricks_manager.py # databricks-skills/databricks-agent-bricks/scripts/mas_manager.py # databricks-skills/databricks-apps-python/examples/fm-minimal-chat.py
… from a-d-k#534 (#97) ## Summary Mirrors the four fixes shipping in [databricks-solutions/ai-dev-kit#534](databricks-solutions/ai-dev-kit#534) to the same files under `experimental/`. Each of these files was carried over verbatim from a-d-k and still had the bugs that PR fixes. ## Files | File | Fix | |------|-----| | `experimental/databricks-apps-python/examples/fm-parallel-calls.py` | guard div-by-zero on speedup calc when `total_time` is below clock resolution; convert trailing module-level docstring (unreachable as docs) into a real comment block | | `experimental/databricks-apps-python/examples/llm_config.py` | stop echoing the raw token-endpoint response and HTTP body in error messages — they can contain credentials. Surface the shape (payload keys / status code) instead | | `experimental/databricks-python-sdk/examples/5-serving-and-vector-search.py` | replace syntactically-broken example `query_vector=[0.1, 0.2, 0.3, ...]` (literal Ellipsis tuple) with a real `list[float]` stand-in + a comment about matching the index's embedding dimension | | `experimental/databricks-apps-python/examples/fm-minimal-chat.py` | fix `streamlit run 2-minimal-chat-app.py` / `command: ["streamlit", "run", "2-minimal-chat-app.py"]` in the docstring — the actual filename is `fm-minimal-chat.py`. Skill-name reference left as `databricks-model-serving` to match DAS naming | The `compute.py` fix from #534 doesn't apply here — that file was removed in #90. ## Test plan - [x] `python3 scripts/skills.py validate` clean - [ ] CI green This pull request and its description were written by Claude.
|
I think the commit description is off, but the file changed look good |
Matches the guard databricks-solutions#535 has. Without it, branches where the only test file (test_genie_conversation.py — deleted on this branch) has been removed cause pytest to exit 5 ("no tests collected"), failing CI. Co-authored-by: Isaac
|
Hi @QuentinAmbard — Claude here, on James's behalf. Cleaned up the PR description (dropped a stale reference to a helper that got reshaped in round 2; flipped the trailer attribution to Claude). The per-commit messages still describe each round's intent at the time, so if you read them sequentially they'll diverge from the final on-disk state — squash-merge will fold them into a single message that matches the new PR body. Let me know if there's a different part of the description that's off. |
Summary
Batch of small, contained fixes surfaced by a parallel GPT 5.4 + Gemini 3.1 Pro code review. None of these are introduced by recent changes — they are pre-existing issues that ought to land regardless. Each fix is local and review-by-eye.
Commit history
mas_manager._build_agent_list: reject empty / whitespace-only UC segments (catalog..fn)mas_manager.main(): catchValueErrorfrom_build_agent_listso malformed configs exit cleanly via stderrcompute.get_cluster_status: addsuccess: Trueto happy-path return for contract coherencecompute.cmd_manage_cluster: drop redundantDatabricksErrorclause + unused import.tests/test_agent_bricks_manager.py: drop staleadd_examples_queuedimport (was failing collection), fixsys.path, addTestBuildAgentListValidation/TestAddExamplesBatch/TestDeleteResponseHandlingmas_manager: typedMASNotFoundexception (replaces"does not exist" in str(e)substring matching inMASManager.get())mas_manager: factor empty-body / non-JSON tolerance into_safe_json, apply symmetrically to_get/_post/_patch(was_delete-only)compute.cmd_list_compute: apply the same NotFound handlingcmd_manage_clustergot — was missingfm-parallel-calls.py: explicit "Speedup: N/A (total_time below resolution)" instead of silently dropping the line5-serving-and-vector-search.py: replace dim-3 placeholder with[0.0] * 768and a comment listing common embedding dimensions.tests/test_agent_bricks_manager.py: addTestResponseErrorHandlingcovering the typed-404 path and symmetric empty-body handlingPer-file summary
databricks-skills/databricks-agent-bricks/scripts/mas_manager.py_deletetolerates empty / non-JSON DELETE response bodies;_get/_post/_patchdo too via shared_safe_jsonhelper.add_examples_batchshort-circuits on empty input (ThreadPoolExecutorrejectsmax_workers=0)._build_agent_listvalidatesuc_function_namehas 3 non-empty dotted parts and rejects missingendpoint_namewith a clear error.main()wraps dispatch intry/except ValueErrorfor clean stderr/exit on bad agent configs.MASManager.get()uses typedMASNotFoundinstead of substring-matching error messages.databricks-skills/databricks-execution-compute/scripts/compute.pymanage-cluster --action getkeys off the SDK's typedNotFoundexception (was substring-matching"does not exist").cmd_list_computeclusters resource path.get_cluster_statusincludessuccess: Trueso the contract is consistent across all return paths.databricks-skills/databricks-app-python/examples/llm_config.py(security)payload(which can containid_token/ refresh material). Logs the present key names instead.DATABRICKS_MODELvalidation error drops theresponse.text[:300]echo so server bodies don't end up in operator-visible error text.databricks-skills/databricks-app-python/examples/fm-minimal-chat.pyapp.yamlexamples reference the actual filename (fm-minimal-chat.py), not2-minimal-chat-app.py.databricks-skills/databricks-app-python/examples/fm-parallel-calls.pySpeedupdivision ontotal_time > 0; prints "Speedup: N/A (total_time below resolution)" when the metric can't be computed.#comments.databricks-skills/databricks-python-sdk/examples/5-serving-and-vector-search.pyEllipsis[0.1, 0.2, 0.3, ...]with[0.0] * 768and a comment naming common embedding dimensions (768 / 1024 / 1536).databricks-skills/.tests/test_agent_bricks_manager.pyadd_examples_queuedimport + wrongsys.path)._build_agent_listValueError branches,add_examples_batch([]),_deleteempty/non-JSON bodies,MASNotFoundtyped-404 path, and_postsymmetric empty-body handling.CI caveat
Nothing under
databricks-skills/.tests/is currently wired into CI —.github/workflows/ci.ymlonly lintsdatabricks-tools-core,databricks-mcp-server, and.test/src. That's why the brokenadd_examples_queuedimport had gone unnoticed. The new tests run via localpytest databricks-skills/.tests/only. Wiring them into CI is a separate concern.Companion PR
Equivalent (smaller) PR for
main—mas_manager.pyandcompute.pydon't exist underdatabricks-skills/on main: #535Test plan
python3 -m py_compileon all modified filesWORKTREE=<branch> python3 /tmp/tdd_verify.py— 12 / 12 pass post-fix on this branchtest_agent_bricks_manager.pyrun — 18 / 18 unit tests passcmd_list_computeNotFound path verified via mocked SDKThis pull request and its description were written by Claude.