fix: support for new docking version #453

Merged
sg-s merged 14 commits into main from bulk-docking-dp-support
Mar 2, 2026
Conversation


@sg-s sg-s commented Feb 25, 2026

No description provided.

Copilot AI left a comment

Pull request overview

This PR updates the SDK and its local mock server/test suite to support the newer docking/pocketfinder tooling and the Data Platform “result-explorer” flow, including new client wrappers and schema changes.

Changes:

  • Introduce new platform wrappers (Results, Projects) and rename/refocus the old “Data” wrapper as Entities (plus add DeepOriginClient.health() and delete_json()).
  • Update docking/pocketfinder integration to newer tool versions and newer docking userInputs schema (ligands array, IDs).
  • Extend the mock server + fixtures and add/adjust tests to cover result-explorer queries and entity deletion.

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/platform/client.py | Swaps data wrapper for entities/results/projects, adds combined health() and delete_json() |
| src/platform/entities.py | Renames Data to Entities, adds generic get() + delete(), reorganizes ligand/protein helpers |
| src/platform/results.py | New result-explorer wrapper with pagination + convenience methods (get_poses, get_pockets, with_ligands) |
| src/platform/projects.py | New projects wrapper (list()) |
| src/platform/job_viz_functions.py | Updates docking viz/name functions to new userInputs ligand schema |
| src/functions/docking.py | Bumps docking function version |
| src/functions/pocket_finder.py | Bumps pocketfinder function version |
| src/drug_discovery/docking.py | Updates docking workflow to sync IDs, new inputs schema, and adds result-explorer retrieval path |
| src/drug_discovery/utils.py | Adjusts docking tool-run outputs setup |
| src/drug_discovery/structures/protein.py | Switches to client.entities, adds register(), updates sync() semantics, adds optional PDB validation flag |
| src/drug_discovery/structures/ligand.py | Adds from_id(), splits register() vs sync(), adds LigandSet.from_docking_result(), switches to client.entities |
| src/drug_discovery/structures/pocket.py | Adds explicit pocket properties, id, JSON key mapping, richer repr, and Pocket.from_result() via result-explorer |
| tests/mock_server/server.py | Adds in-memory result-explorer store and loads result-explorer fixtures |
| tests/mock_server/routers/tools.py | Adds /tools/health and injects function outputs into result-explorer store |
| tests/mock_server/routers/data_platform.py | Adds result-explorer search endpoint, delete entity endpoint, and updates protein creation behavior |
| tests/mock_server/routers/billing.py | Adds /billing/health |
| tests/fixtures/result-explorer-bulk-docking-08BSPN61NYVE3.json | Adds result-explorer fixture data for bulk-docking |
| tests/test_results.py | New tests for client.results.get_for() behavior |
| tests/test_projects.py | New tests for client.projects.list() |
| tests/test_entities.py | Refactors tests from client.data to client.entities, adds delete tests |
| tests/test_functions_with_data_platform.py | Adds fixtures for registered protein/ligand and validates result-explorer-derived pockets/poses |
| tests/test_job_list.py | Updates DTO helpers + userInputs schema for docking visualization/tests |
| tests/test_pocket.py | Updates expectations to new Pocket attributes + id behavior |
| tests/test_ligand.py | Adds lv1 tests for ligand register/from_id |
| tests/test_ligand_set.py | Adds lv1 test for batch ligand creation via sync |
| mkdocs.yaml | Updates nav to entities/results/projects docs |
| docs/platform/ref/entities.md | New Entities API docs |
| docs/platform/ref/results.md | New Results API docs |
| docs/platform/ref/projects.md | New Projects API docs |
| docs/platform/ref/data.md | Removes old Data API docs page |
| docs/dev/mock-server.md | Expands mock server architecture and result-explorer injection documentation |
| .vscode/settings.json | Adds spelling dictionary entry |


Comment on lines +122 to +144
"""Apply ``{"field": {"eq": value}}`` style filters to a list of records.

Each filter key maps to a dict with an ``eq`` operator. The value is
compared against top-level record fields *and* nested ``data`` fields.

Args:
records: List of record dicts to filter.
filter_dict: Mapping of field names to ``{"eq": value}`` dicts.

Returns:
Filtered list of records.
"""
results = list(records)
for key, condition in filter_dict.items():
if not isinstance(condition, dict) or "eq" not in condition:
continue
expected = condition["eq"]
results = [
r
for r in results
if r.get(key) == expected
or (isinstance(r.get("data"), dict) and r["data"].get(key) == expected)
]
Copilot AI Mar 2, 2026

_apply_eq_filters only implements {field: {eq: ...}} and silently ignores other operators. client.results.get_for() uses {in: [...]} when tool_id / ligand_id are lists (e.g. Results.get_poses() defaults to a tool_id list), so the mock server will return unfiltered result-explorer records. With a session-scoped mock server, this can mix pockets + poses across tests and break consumers like LigandSet.from_docking_result() (attempting to load non-SDF file paths). Please add support for at least the in operator (and ideally validate/raise on unknown operators) while keeping the nested-data matching behavior.

Suggested change

```python
"""Apply equality-style filters to a list of records.

Supports ``{"field": {"eq": value}}`` and ``{"field": {"in": [values]}}``
style filters. Each filter key maps to a dict with exactly one supported
operator. Values are compared against top-level record fields *and*
nested ``data`` fields when present.

Args:
    records: List of record dicts to filter.
    filter_dict: Mapping of field names to operator dicts, e.g.
        ``{"field": {"eq": value}}`` or ``{"field": {"in": [values]}}``.

Returns:
    Filtered list of records.

Raises:
    ValueError: If a filter condition uses an unknown operator, provides
        more than one operator, or has an invalid shape.
"""
results = list(records)
allowed_ops = {"eq", "in"}
for key, condition in filter_dict.items():
    if not isinstance(condition, dict):
        raise ValueError(
            f"Invalid filter for field '{key}': expected dict of operators, "
            f"got {type(condition).__name__}"
        )
    op_keys = set(condition.keys()) & allowed_ops
    unknown_ops = set(condition.keys()) - allowed_ops
    if unknown_ops:
        raise ValueError(
            f"Unsupported filter operator(s) for field '{key}': "
            f"{', '.join(sorted(unknown_ops))}"
        )
    if len(op_keys) != 1:
        raise ValueError(
            f"Filter for field '{key}' must specify exactly one of "
            f"{', '.join(sorted(allowed_ops))}, got {', '.join(sorted(op_keys)) or 'none'}"
        )
    op = next(iter(op_keys))
    if op == "eq":
        expected = condition["eq"]

        def _matches(record: dict[str, Any]) -> bool:
            if record.get(key) == expected:
                return True
            data = record.get("data")
            return isinstance(data, dict) and data.get(key) == expected

    else:  # op == "in"
        values = condition["in"]
        try:
            iterator = iter(values)
        except TypeError as exc:
            raise ValueError(
                f"Filter 'in' value for field '{key}' must be iterable"
            ) from exc
        # Materialize to support multiple passes and membership checks.
        value_set = set(iterator)

        def _matches(record: dict[str, Any]) -> bool:
            if record.get(key) in value_set:
                return True
            data = record.get("data")
            return isinstance(data, dict) and data.get(key) in value_set

    results = [r for r in results if _matches(r)]
```
```python
    )
    if inputs and "smiles_list" in inputs:
        total_ligands += len(inputs["smiles_list"])
    total_ligands += len(single_job._attributes["userInputs"]["ligands"])
```
Copilot AI Mar 2, 2026

_viz_func_docking now unconditionally indexes single_job._attributes["userInputs"]["ligands"]. If _attributes is None, userInputs is missing, or the older smiles_list format is present, this will raise (and break JobList rendering). Please restore defensive access here and support both input schemas (smiles_list and ligands).

Suggested change

```python
# Defensive access to input schema: support both "ligands" and legacy "smiles_list"
ligands_count = 0
attrs = getattr(single_job, "_attributes", None) or {}
user_inputs = attrs.get("userInputs") or {}
if isinstance(user_inputs, dict):
    ligands = user_inputs.get("ligands")
    if isinstance(ligands, list):
        ligands_count = len(ligands)
    else:
        smiles_list = user_inputs.get("smiles_list")
        if isinstance(smiles_list, list):
            ligands_count = len(smiles_list)
total_ligands += ligands_count
```

Comment on lines +404 to +409
```python
for job in jobs_list:
    this_smiles = [
        ligand["smiles"] for ligand in job._attributes["userInputs"]["ligands"]
    ]

    unique_smiles.update(this_smiles)
```
Copilot AI Mar 2, 2026

_name_func_docking now assumes job._attributes["userInputs"]["ligands"] always exists. That will raise for jobs missing userInputs / _attributes, and it also drops support for the legacy smiles_list format (which you still handle elsewhere, e.g. Docking.get_jobs_df). Consider handling both schemas and skipping jobs without usable inputs so the name function is robust.
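Since no suggested change accompanies this comment, a schema-tolerant version of the SMILES collection could look like the standalone sketch below. The `collect_unique_smiles` name and the `_Job` stub are illustrative assumptions, not SDK code; only the `userInputs.ligands` / `userInputs.smiles_list` shapes come from this PR:

```python
def collect_unique_smiles(jobs_list):
    """Collect unique SMILES from jobs, tolerating both input schemas.

    Supports the new ``userInputs.ligands`` array and the legacy
    ``userInputs.smiles_list``; jobs without usable inputs are skipped.
    """
    unique_smiles = set()
    for job in jobs_list:
        attrs = getattr(job, "_attributes", None) or {}
        user_inputs = attrs.get("userInputs") or {}
        if not isinstance(user_inputs, dict):
            continue
        ligands = user_inputs.get("ligands")
        if isinstance(ligands, list):
            unique_smiles.update(
                ligand["smiles"]
                for ligand in ligands
                if isinstance(ligand, dict) and "smiles" in ligand
            )
        elif isinstance(user_inputs.get("smiles_list"), list):
            unique_smiles.update(user_inputs["smiles_list"])
    return unique_smiles


class _Job:
    """Minimal stand-in for a job DTO (illustration only)."""

    def __init__(self, attrs):
        self._attributes = attrs


jobs = [
    _Job({"userInputs": {"ligands": [{"smiles": "CCO"}, {"smiles": "C"}]}}),
    _Job({"userInputs": {"smiles_list": ["CCO", "CCN"]}}),  # legacy schema
    _Job(None),  # no attributes: skipped rather than raising
]
```

With this shape, jobs carrying either schema contribute their SMILES and malformed jobs are simply skipped, which is what keeps the name function from crashing.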

Comment on lines +102 to +107
```python
try:
    df = self.get_results_using_id()
    if df is not None:
        remote_paths = df["file_path"].dropna().unique().tolist()
except Exception:
    pass
```
Copilot AI Mar 2, 2026

get_poses() swallows all exceptions from get_results_using_id() (except Exception: pass), which can hide real problems (auth/config errors, schema changes, etc.) and make debugging very difficult. Consider catching only the expected failure modes (e.g., HTTP/status errors / DeepOriginException) and either logging a brief warning or re-raising unexpected exceptions.
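One way to narrow the handling, sketched standalone: the `DeepOriginException` stand-in and the list-of-dicts result shape are assumptions for illustration (the real method returns a pandas DataFrame), but the pattern — catch only the expected failure mode, log it, and let everything else propagate — is the point:

```python
import logging

logger = logging.getLogger(__name__)


class DeepOriginException(Exception):
    """Stand-in for the SDK's expected failure type (assumption)."""


def fetch_remote_paths(get_results):
    """Return pose file paths, hiding only the expected failure mode.

    A DeepOriginException is logged and treated as "no results";
    anything else propagates so real bugs surface during debugging.
    """
    try:
        rows = get_results()
    except DeepOriginException as exc:
        logger.warning("Could not fetch docking results: %s", exc)
        return []
    if rows is None:
        return []
    # Keep only rows that actually carry a file path.
    return [row["file_path"] for row in rows if row.get("file_path") is not None]
```

An auth or schema error raised as anything other than `DeepOriginException` would now surface immediately instead of silently producing an empty path list.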

Comment on lines +199 to +212
"""Retrieve docking results as a flat DataFrame via the result-explorer API.

Queries the result-explorer endpoint filtered by protein ID and tool ID,
then flattens nested ``data`` fields into a single tabular structure.

Args:
limit: Maximum number of result records to fetch. If None, the API
default applies.

Returns:
DataFrame with columns ``id``, ``tool_id``, ``tool_version``,
``ligand_id``, ``protein_id``, ``binding_energy``, ``pose_score``,
and ``file_path``. Returns None if no results were found.

Copilot AI Mar 2, 2026

Docstring mismatch: get_results_using_id() documents a limit argument, but the method signature doesn't accept limit and the implementation always fetches all records. Please either add the parameter and plumb it through to client.results.get_poses(limit=...), or remove/update the docstring section so it matches the actual API.

Comment on lines 333 to +338
```diff
 self.executions = Executions(self)
 self.organizations = Organizations(self)
 self.billing = Billing(self)
-self.data = Data(self)
+self.entities = Entities(self)
+self.results = Results(self)
+self.projects = Projects(self)
```
Copilot AI Mar 2, 2026

DeepOriginClient replaces the public client.data attribute with client.entities / client.results / client.projects. This is a breaking change for downstream users (and the PR title suggests a patch-level fix). Consider keeping a backward-compatible data alias (possibly deprecated) that forwards to the new wrappers, or at least providing a clear deprecation path/error message so existing integrations don’t fail at runtime without guidance.
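A deprecated forwarding alias along the lines suggested could be sketched like this. The class bodies are simplified stand-ins, not the SDK's actual definitions; only the attribute names (`data`, `entities`, `results`, `projects`) come from this PR:

```python
import warnings


class Entities:
    """Stand-in for the new entities wrapper (illustration only)."""


class DeepOriginClient:
    def __init__(self):
        self.entities = Entities()

    @property
    def data(self):
        """Deprecated alias kept for backwards compatibility."""
        warnings.warn(
            "client.data is deprecated; use client.entities "
            "(and client.results / client.projects) instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.entities
```

Existing integrations that read `client.data` keep working and get a pointed migration message, which fits the patch-level framing of the PR title better than a silent removal.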

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 33 changed files in this pull request and generated 4 comments.



Comment on lines +204 to +207
```python
Args:
    limit: Maximum number of result records to fetch. If None, the API
        default applies.
```
Copilot AI Mar 2, 2026

The get_results_using_id() docstring lists a limit argument, but the method signature doesn’t accept it and the implementation always fetches the default from client.results.get_poses(). Update the docstring (or add the parameter) to keep the public API documentation accurate.

Suggested change

```diff
-    Args:
-        limit: Maximum number of result records to fetch. If None, the API
-            default applies.
```
Comment on lines 329 to +338
```diff
 self.tools = Tools(self)
 self.functions = Functions(self)
 self.clusters = Clusters(self)
 self.files = Files(self)
 self.executions = Executions(self)
 self.organizations = Organizations(self)
 self.billing = Billing(self)
-self.data = Data(self)
+self.entities = Entities(self)
+self.results = Results(self)
+self.projects = Projects(self)
```
Copilot AI Mar 2, 2026

DeepOriginClient no longer exposes client.data (it’s replaced by client.entities, client.results, and client.projects). This is a breaking change for existing SDK users and older docs/examples. Consider keeping a backwards-compatible data alias (possibly deprecated with a warning) or documenting this as a major-version change.

Comment on lines 350 to 352
```python
for single_job in jobs_list:
    inputs = (
        single_job._attributes.get("userInputs") if single_job._attributes else None
    )
    if inputs and "smiles_list" in inputs:
        total_ligands += len(inputs["smiles_list"])
    total_ligands += len(single_job._attributes["userInputs"]["ligands"])
```
Copilot AI Mar 2, 2026

_viz_func_docking now indexes single_job._attributes["userInputs"]["ligands"] directly. This will raise TypeError/KeyError for jobs without _attributes, for older bulk-docking payloads that used userInputs.smiles_list, or for partial DTOs returned by some endpoints. Consider supporting both schemas (new ligands array and legacy smiles_list) and using .get() guards so rendering a JobList can’t crash the UI.

Comment on lines +391 to 392
```python
box_size = float(2 * np.cbrt(pocket.volume))
box_size = [box_size, box_size, box_size]
```
Copilot AI Mar 2, 2026

When pocket is provided, Docking.run() computes box_size from pocket.volume, but Pocket.volume is optional and Pocket.from_pdb_file() leaves it as None. This will raise a TypeError (np.cbrt(None)) for valid Pocket instances that don’t have volume populated. Handle the None case explicitly (e.g., require an explicit box_size, compute from coordinates, or fall back to legacy pocket.props["volume"] when present) and raise a clear error if it can’t be derived.

Suggested change

```python
# If no explicit box_size is provided, try to infer it from the pocket
if box_size is None:
    volume = pocket.volume
    # Fallback to legacy volume storage if needed
    if volume is None and hasattr(pocket, "props"):
        try:
            volume = pocket.props.get("volume")
        except AttributeError:
            # pocket.props is not a dict-like object; ignore and let volume remain None
            volume = None
    if volume is None:
        raise DeepOriginException(
            "Could not determine docking box size from the provided pocket: "
            "neither pocket.volume nor pocket.props['volume'] is set. "
            "Please provide an explicit box_size or populate the pocket volume."
        ) from None
    side_length = float(2 * np.cbrt(volume))
    box_size = [side_length, side_length, side_length]
```
Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 33 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

tests/test_entities.py:31

  • This test hardcodes the org_key segment as "/data-platform/deeporigin/..." when asserting the DELETE URL. If the test environment uses a different org_key, this will fail even if the client is correct. Use client.org_key (or derive from the instantiated client) in the expected URL instead of hardcoding "deeporigin".
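Deriving the org segment from the client under test, rather than hardcoding `"deeporigin"`, might look like the sketch below. The `StubClient` is a hypothetical stand-in, and only the URL prefix is built here because the rest of the DELETE path is elided in the comment:

```python
class StubClient:
    """Minimal stand-in exposing only the org_key attribute (assumption)."""

    def __init__(self, org_key):
        self.org_key = org_key


def expected_url_prefix(client):
    # Build the org-dependent prefix from the client itself, so the
    # assertion holds for whatever org_key the test environment uses.
    return f"/data-platform/{client.org_key}/"
```

In the real test, the assertion would start from `client.org_key` of the instantiated client the same way, so a non-default org no longer breaks the URL check.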


@sg-s sg-s merged commit 4845b61 into main Mar 2, 2026
19 of 22 checks passed
@sg-s sg-s deleted the bulk-docking-dp-support branch March 2, 2026 18:53