Conversation
Pull request overview
This PR updates the SDK and its local mock server/test suite to support the newer docking/pocketfinder tooling and the Data Platform “result-explorer” flow, including new client wrappers and schema changes.
Changes:
- Introduce new platform wrappers (`Results`, `Projects`) and rename/refocus the old "Data" wrapper as `Entities` (plus add `DeepOriginClient.health()` and `delete_json()`).
- Update docking/pocketfinder integration to newer tool versions and the newer docking `userInputs` schema (ligands array, IDs).
- Extend the mock server + fixtures and add/adjust tests to cover result-explorer queries and entity deletion.
Reviewed changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/platform/client.py | Swaps data wrapper for entities/results/projects, adds combined health() and delete_json() |
| src/platform/entities.py | Renames Data → Entities, adds generic get() + delete(), reorganizes ligand/protein helpers |
| src/platform/results.py | New result-explorer wrapper with pagination + convenience methods (get_poses, get_pockets, with_ligands) |
| src/platform/projects.py | New projects wrapper (list()) |
| src/platform/job_viz_functions.py | Updates docking viz/name functions to new userInputs ligand schema |
| src/functions/docking.py | Bumps docking function version |
| src/functions/pocket_finder.py | Bumps pocketfinder function version |
| src/drug_discovery/docking.py | Updates docking workflow to sync IDs, new inputs schema, and adds result-explorer retrieval path |
| src/drug_discovery/utils.py | Adjusts docking tool-run outputs setup |
| src/drug_discovery/structures/protein.py | Switches to client.entities, adds register(), updates sync() semantics, adds optional PDB validation flag |
| src/drug_discovery/structures/ligand.py | Adds from_id(), splits register() vs sync(), adds LigandSet.from_docking_result(), switches to client.entities |
| src/drug_discovery/structures/pocket.py | Adds explicit pocket properties, id, JSON key mapping, richer repr, and Pocket.from_result() via result-explorer |
| tests/mock_server/server.py | Adds in-memory result-explorer store and loads result-explorer fixtures |
| tests/mock_server/routers/tools.py | Adds /tools/health and injects function outputs into result-explorer store |
| tests/mock_server/routers/data_platform.py | Adds result-explorer search endpoint, delete entity endpoint, and updates protein creation behavior |
| tests/mock_server/routers/billing.py | Adds /billing/health |
| tests/fixtures/result-explorer-bulk-docking-08BSPN61NYVE3.json | Adds result-explorer fixture data for bulk-docking |
| tests/test_results.py | New tests for client.results.get_for() behavior |
| tests/test_projects.py | New tests for client.projects.list() |
| tests/test_entities.py | Refactors tests from client.data to client.entities, adds delete tests |
| tests/test_functions_with_data_platform.py | Adds fixtures for registered protein/ligand and validates result-explorer-derived pockets/poses |
| tests/test_job_list.py | Updates DTO helpers + userInputs schema for docking visualization/tests |
| tests/test_pocket.py | Updates expectations to new Pocket attributes + id behavior |
| tests/test_ligand.py | Adds lv1 tests for ligand register/from_id |
| tests/test_ligand_set.py | Adds lv1 test for batch ligand creation via sync |
| mkdocs.yaml | Updates nav to entities/results/projects docs |
| docs/platform/ref/entities.md | New Entities API docs |
| docs/platform/ref/results.md | New Results API docs |
| docs/platform/ref/projects.md | New Projects API docs |
| docs/platform/ref/data.md | Removes old Data API docs page |
| docs/dev/mock-server.md | Expands mock server architecture and result-explorer injection documentation |
| .vscode/settings.json | Adds spelling dictionary entry |
```python
"""Apply ``{"field": {"eq": value}}`` style filters to a list of records.

Each filter key maps to a dict with an ``eq`` operator. The value is
compared against top-level record fields *and* nested ``data`` fields.

Args:
    records: List of record dicts to filter.
    filter_dict: Mapping of field names to ``{"eq": value}`` dicts.

Returns:
    Filtered list of records.
"""
results = list(records)
for key, condition in filter_dict.items():
    if not isinstance(condition, dict) or "eq" not in condition:
        continue
    expected = condition["eq"]
    results = [
        r
        for r in results
        if r.get(key) == expected
        or (isinstance(r.get("data"), dict) and r["data"].get(key) == expected)
    ]
```
`_apply_eq_filters` only implements `{field: {eq: ...}}` and silently ignores other operators. `client.results.get_for()` uses `{in: [...]}` when `tool_id` / `ligand_id` are lists (e.g. `Results.get_poses()` defaults to a `tool_id` list), so the mock server will return unfiltered result-explorer records. With a session-scoped mock server, this can mix pockets + poses across tests and break consumers like `LigandSet.from_docking_result()` (attempting to load non-SDF file paths). Please add support for at least the `in` operator (and ideally validate/raise on unknown operators) while keeping the nested-`data` matching behavior.
Suggested change — current code:

```python
"""Apply ``{"field": {"eq": value}}`` style filters to a list of records.

Each filter key maps to a dict with an ``eq`` operator. The value is
compared against top-level record fields *and* nested ``data`` fields.

Args:
    records: List of record dicts to filter.
    filter_dict: Mapping of field names to ``{"eq": value}`` dicts.

Returns:
    Filtered list of records.
"""
results = list(records)
for key, condition in filter_dict.items():
    if not isinstance(condition, dict) or "eq" not in condition:
        continue
    expected = condition["eq"]
    results = [
        r
        for r in results
        if r.get(key) == expected
        or (isinstance(r.get("data"), dict) and r["data"].get(key) == expected)
    ]
```

Proposed replacement:

```python
"""Apply equality-style filters to a list of records.

Supports ``{"field": {"eq": value}}`` and ``{"field": {"in": [values]}}``
style filters. Each filter key maps to a dict with exactly one supported
operator. Values are compared against top-level record fields *and*
nested ``data`` fields when present.

Args:
    records: List of record dicts to filter.
    filter_dict: Mapping of field names to operator dicts, e.g.
        ``{"field": {"eq": value}}`` or ``{"field": {"in": [values]}}``.

Returns:
    Filtered list of records.

Raises:
    ValueError: If a filter condition uses an unknown operator, provides
        more than one operator, or has an invalid shape.
"""
results = list(records)
allowed_ops = {"eq", "in"}
for key, condition in filter_dict.items():
    if not isinstance(condition, dict):
        raise ValueError(
            f"Invalid filter for field '{key}': expected dict of operators, "
            f"got {type(condition).__name__}"
        )
    op_keys = set(condition.keys()) & allowed_ops
    unknown_ops = set(condition.keys()) - allowed_ops
    if unknown_ops:
        raise ValueError(
            f"Unsupported filter operator(s) for field '{key}': "
            f"{', '.join(sorted(unknown_ops))}"
        )
    if len(op_keys) != 1:
        raise ValueError(
            f"Filter for field '{key}' must specify exactly one of "
            f"{', '.join(sorted(allowed_ops))}, got {', '.join(sorted(op_keys)) or 'none'}"
        )
    op = next(iter(op_keys))
    if op == "eq":
        expected = condition["eq"]

        def _matches(record: dict[str, Any]) -> bool:
            if record.get(key) == expected:
                return True
            data = record.get("data")
            return isinstance(data, dict) and data.get(key) == expected
    else:  # op == "in"
        values = condition["in"]
        try:
            iterator = iter(values)
        except TypeError as exc:
            raise ValueError(
                f"Filter 'in' value for field '{key}' must be iterable"
            ) from exc
        # Materialize to support multiple passes and membership checks.
        value_set = set(iterator)

        def _matches(record: dict[str, Any]) -> bool:
            if record.get(key) in value_set:
                return True
            data = record.get("data")
            return isinstance(data, dict) and data.get(key) in value_set

    results = [r for r in results if _matches(r)]
```
```python
)
if inputs and "smiles_list" in inputs:
    total_ligands += len(inputs["smiles_list"])
total_ligands += len(single_job._attributes["userInputs"]["ligands"])
```
`_viz_func_docking` now unconditionally indexes `single_job._attributes["userInputs"]["ligands"]`. If `_attributes` is `None`, `userInputs` is missing, or the older `smiles_list` format is present, this will raise (and break `JobList` rendering). Please restore defensive access here and support both input schemas (`smiles_list` and `ligands`).
Suggested change — current code:

```python
total_ligands += len(single_job._attributes["userInputs"]["ligands"])
```

Proposed replacement:

```python
# Defensive access to input schema: support both "ligands" and legacy "smiles_list"
ligands_count = 0
attrs = getattr(single_job, "_attributes", None) or {}
user_inputs = attrs.get("userInputs") or {}
if isinstance(user_inputs, dict):
    ligands = user_inputs.get("ligands")
    if isinstance(ligands, list):
        ligands_count = len(ligands)
    else:
        smiles_list = user_inputs.get("smiles_list")
        if isinstance(smiles_list, list):
            ligands_count = len(smiles_list)
total_ligands += ligands_count
```
```python
for job in jobs_list:
    this_smiles = [
        ligand["smiles"] for ligand in job._attributes["userInputs"]["ligands"]
    ]

    unique_smiles.update(this_smiles)
```
`_name_func_docking` now assumes `job._attributes["userInputs"]["ligands"]` always exists. That will raise for jobs missing `userInputs` / `_attributes`, and it also drops support for the legacy `smiles_list` format (which you still handle elsewhere, e.g. `Docking.get_jobs_df`). Consider handling both schemas and skipping jobs without usable inputs so the name function is robust.
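A schema-tolerant version of the SMILES collection could look like the sketch below. The helper name `collect_unique_smiles` and the exact `_attributes` shape are assumptions based on the snippet above, not the SDK's actual API:

```python
def collect_unique_smiles(jobs_list: list) -> set[str]:
    """Collect SMILES across jobs, tolerating both input schemas (sketch)."""
    unique_smiles: set[str] = set()
    for job in jobs_list:
        # Guard against missing _attributes / userInputs instead of raising
        attrs = getattr(job, "_attributes", None) or {}
        inputs = attrs.get("userInputs") or {}
        if not isinstance(inputs, dict):
            continue  # skip jobs without usable inputs
        ligands = inputs.get("ligands")
        if isinstance(ligands, list):
            # New schema: list of {"smiles": ...} dicts
            unique_smiles.update(
                lig["smiles"]
                for lig in ligands
                if isinstance(lig, dict) and "smiles" in lig
            )
        elif isinstance(inputs.get("smiles_list"), list):
            # Legacy schema: flat list of SMILES strings
            unique_smiles.update(inputs["smiles_list"])
    return unique_smiles
```

Jobs with neither schema simply contribute nothing, so the name function cannot crash on partial DTOs.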
```python
try:
    df = self.get_results_using_id()
    if df is not None:
        remote_paths = df["file_path"].dropna().unique().tolist()
except Exception:
    pass
```
`get_poses()` swallows all exceptions from `get_results_using_id()` (`except Exception: pass`), which can hide real problems (auth/config errors, schema changes, etc.) and make debugging very difficult. Consider catching only the expected failure modes (e.g., HTTP/status errors / `DeepOriginException`) and either logging a brief warning or re-raising unexpected exceptions.
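One possible shape for the narrower error handling is sketched below. The helper name is hypothetical, and a local class stands in for the SDK's `DeepOriginException`; only the expected error type is swallowed (with a warning), while anything else propagates:

```python
import logging

logger = logging.getLogger(__name__)


class DeepOriginException(Exception):
    """Stand-in for the SDK's error base class (assumption)."""


def fetch_remote_paths(fetch_results):
    """Call ``fetch_results`` but only swallow expected SDK errors (sketch)."""
    try:
        results = fetch_results()
    except DeepOriginException as exc:
        # Expected failure mode (auth/transport): log briefly, don't hide silently.
        logger.warning("Could not fetch docking results: %s", exc)
        return []
    # Any other exception (schema change, programming bug) propagates to the caller.
    return [] if results is None else results
```

With this pattern, a misconfigured client still degrades gracefully, but a genuine bug surfaces immediately instead of being masked by `pass`.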
```python
"""Retrieve docking results as a flat DataFrame via the result-explorer API.

Queries the result-explorer endpoint filtered by protein ID and tool ID,
then flattens nested ``data`` fields into a single tabular structure.

Args:
    limit: Maximum number of result records to fetch. If None, the API
        default applies.

Returns:
    DataFrame with columns ``id``, ``tool_id``, ``tool_version``,
    ``ligand_id``, ``protein_id``, ``binding_energy``, ``pose_score``,
    and ``file_path``. Returns None if no results were found.
"""
```
Docstring mismatch: `get_results_using_id()` documents a `limit` argument, but the method signature doesn't accept `limit` and the implementation always fetches all records. Please either add the parameter and plumb it through to `client.results.get_poses(limit=...)`, or remove/update the docstring section so it matches the actual API.
```diff
         self.executions = Executions(self)
         self.organizations = Organizations(self)
         self.billing = Billing(self)
-        self.data = Data(self)
+        self.entities = Entities(self)
+        self.results = Results(self)
+        self.projects = Projects(self)
```
`DeepOriginClient` replaces the public `client.data` attribute with `client.entities` / `client.results` / `client.projects`. This is a breaking change for downstream users (and the PR title suggests a patch-level fix). Consider keeping a backward-compatible `data` alias (possibly deprecated) that forwards to the new wrappers, or at least providing a clear deprecation path/error message so existing integrations don't fail at runtime without guidance.
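A backward-compatible alias could be as small as a deprecated property. The sketch below uses stub classes in place of the real SDK wrappers; only the `data` property is the point:

```python
import warnings


class Entities:
    """Stand-in for the SDK's Entities wrapper (illustration only)."""

    def __init__(self, client):
        self.client = client


class DeepOriginClient:
    def __init__(self):
        self.entities = Entities(self)

    @property
    def data(self):
        """Deprecated alias for ``entities``: warns but keeps old code working."""
        warnings.warn(
            "client.data is deprecated; use client.entities, client.results, "
            "or client.projects instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.entities
```

Existing integrations keep running with a visible `DeprecationWarning`, and the alias can be removed in the next major version.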
Pull request overview
Copilot reviewed 33 out of 33 changed files in this pull request and generated 4 comments.
```python
Args:
    limit: Maximum number of result records to fetch. If None, the API
        default applies.
```
The `get_results_using_id()` docstring lists a `limit` argument, but the method signature doesn't accept it and the implementation always fetches the default from `client.results.get_poses()`. Update the docstring (or add the parameter) to keep the public API documentation accurate.
Suggested change — remove the stale docstring lines:

```python
Args:
    limit: Maximum number of result records to fetch. If None, the API
        default applies.
```
```diff
         self.tools = Tools(self)
         self.functions = Functions(self)
         self.clusters = Clusters(self)
         self.files = Files(self)
         self.executions = Executions(self)
         self.organizations = Organizations(self)
         self.billing = Billing(self)
-        self.data = Data(self)
+        self.entities = Entities(self)
+        self.results = Results(self)
+        self.projects = Projects(self)
```
`DeepOriginClient` no longer exposes `client.data` (it's replaced by `client.entities`, `client.results`, and `client.projects`). This is a breaking change for existing SDK users and older docs/examples. Consider keeping a backwards-compatible `data` alias (possibly deprecated with a warning) or documenting this as a major-version change.
```python
for single_job in jobs_list:
    inputs = (
        single_job._attributes.get("userInputs") if single_job._attributes else None
    )
    if inputs and "smiles_list" in inputs:
        total_ligands += len(inputs["smiles_list"])
    total_ligands += len(single_job._attributes["userInputs"]["ligands"])
```
`_viz_func_docking` now indexes `single_job._attributes["userInputs"]["ligands"]` directly. This will raise `TypeError`/`KeyError` for jobs without `_attributes`, for older bulk-docking payloads that used `userInputs.smiles_list`, or for partial DTOs returned by some endpoints. Consider supporting both schemas (new `ligands` array and legacy `smiles_list`) and using `.get()` guards so rendering a `JobList` can't crash the UI.
```python
box_size = float(2 * np.cbrt(pocket.volume))
box_size = [box_size, box_size, box_size]
```
When `pocket` is provided, `Docking.run()` computes `box_size` from `pocket.volume`, but `Pocket.volume` is optional and `Pocket.from_pdb_file()` leaves it as `None`. This will raise a `TypeError` (`np.cbrt(None)`) for valid `Pocket` instances that don't have `volume` populated. Handle the `None` case explicitly (e.g., require an explicit `box_size`, compute from coordinates, or fall back to legacy `pocket.props["volume"]` when present) and raise a clear error if it can't be derived.
Suggested change — current code:

```python
box_size = float(2 * np.cbrt(pocket.volume))
box_size = [box_size, box_size, box_size]
```

Proposed replacement:

```python
# If no explicit box_size is provided, try to infer it from the pocket
if box_size is None:
    volume = pocket.volume
    # Fallback to legacy volume storage if needed
    if volume is None and hasattr(pocket, "props"):
        try:
            volume = pocket.props.get("volume")
        except AttributeError:
            # pocket.props is not a dict-like object; ignore and let volume remain None
            volume = None
    if volume is None:
        raise DeepOriginException(
            "Could not determine docking box size from the provided pocket: "
            "neither pocket.volume nor pocket.props['volume'] is set. "
            "Please provide an explicit box_size or populate the pocket volume."
        ) from None
    side_length = float(2 * np.cbrt(volume))
    box_size = [side_length, side_length, side_length]
```
Pull request overview
Copilot reviewed 33 out of 33 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
tests/test_entities.py:31
- This test hardcodes the org_key segment as `"/data-platform/deeporigin/..."` when asserting the DELETE URL. If the test environment uses a different org_key, this will fail even if the client is correct. Use `client.org_key` (or derive from the instantiated client) in the expected URL instead of hardcoding `"deeporigin"`.
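For example, the expected URL could be derived from the client under test instead of being hardcoded. The helper name and the route layout below are hypothetical; substitute the real path the test asserts today, keeping only the org segment dynamic:

```python
def expected_delete_url(client, entity_id: str) -> str:
    """Build the expected DELETE URL from the client's own org key (sketch).

    The path layout used here is a hypothetical placeholder; replace it
    with the actual route from the test.
    """
    return f"/data-platform/{client.org_key}/entities/{entity_id}"
```

The assertion then passes regardless of which org_key the test environment configures.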