fix: support for new docking version #453

Merged
sg-s merged 14 commits into main from bulk-docking-dp-support
Mar 2, 2026
Conversation


@sg-s sg-s commented Feb 25, 2026

No description provided.

Copilot AI left a comment

Pull request overview

This PR updates the SDK and its local mock server/test suite to support the newer docking/pocketfinder tooling and the Data Platform “result-explorer” flow, including new client wrappers and schema changes.

Changes:

  • Introduce new platform wrappers (Results, Projects) and rename/refocus the old “Data” wrapper as Entities (plus add DeepOriginClient.health() and delete_json()).
  • Update docking/pocketfinder integration to newer tool versions and newer docking userInputs schema (ligands array, IDs).
  • Extend the mock server + fixtures and add/adjust tests to cover result-explorer queries and entity deletion.

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/platform/client.py | Swaps data wrapper for entities/results/projects, adds combined health() and delete_json() |
| src/platform/entities.py | Renames Data to Entities, adds generic get() + delete(), reorganizes ligand/protein helpers |
| src/platform/results.py | New result-explorer wrapper with pagination + convenience methods (get_poses, get_pockets, with_ligands) |
| src/platform/projects.py | New projects wrapper (list()) |
| src/platform/job_viz_functions.py | Updates docking viz/name functions to new userInputs ligand schema |
| src/functions/docking.py | Bumps docking function version |
| src/functions/pocket_finder.py | Bumps pocketfinder function version |
| src/drug_discovery/docking.py | Updates docking workflow to sync IDs, new inputs schema, and adds result-explorer retrieval path |
| src/drug_discovery/utils.py | Adjusts docking tool-run outputs setup |
| src/drug_discovery/structures/protein.py | Switches to client.entities, adds register(), updates sync() semantics, adds optional PDB validation flag |
| src/drug_discovery/structures/ligand.py | Adds from_id(), splits register() vs sync(), adds LigandSet.from_docking_result(), switches to client.entities |
| src/drug_discovery/structures/pocket.py | Adds explicit pocket properties, id, JSON key mapping, richer repr, and Pocket.from_result() via result-explorer |
| tests/mock_server/server.py | Adds in-memory result-explorer store and loads result-explorer fixtures |
| tests/mock_server/routers/tools.py | Adds /tools/health and injects function outputs into result-explorer store |
| tests/mock_server/routers/data_platform.py | Adds result-explorer search endpoint, delete entity endpoint, and updates protein creation behavior |
| tests/mock_server/routers/billing.py | Adds /billing/health |
| tests/fixtures/result-explorer-bulk-docking-08BSPN61NYVE3.json | Adds result-explorer fixture data for bulk-docking |
| tests/test_results.py | New tests for client.results.get_for() behavior |
| tests/test_projects.py | New tests for client.projects.list() |
| tests/test_entities.py | Refactors tests from client.data to client.entities, adds delete tests |
| tests/test_functions_with_data_platform.py | Adds fixtures for registered protein/ligand and validates result-explorer-derived pockets/poses |
| tests/test_job_list.py | Updates DTO helpers + userInputs schema for docking visualization/tests |
| tests/test_pocket.py | Updates expectations to new Pocket attributes + id behavior |
| tests/test_ligand.py | Adds lv1 tests for ligand register/from_id |
| tests/test_ligand_set.py | Adds lv1 test for batch ligand creation via sync |
| mkdocs.yaml | Updates nav to entities/results/projects docs |
| docs/platform/ref/entities.md | New Entities API docs |
| docs/platform/ref/results.md | New Results API docs |
| docs/platform/ref/projects.md | New Projects API docs |
| docs/platform/ref/data.md | Removes old Data API docs page |
| docs/dev/mock-server.md | Expands mock server architecture and result-explorer injection documentation |
| .vscode/settings.json | Adds spelling dictionary entry |


Comment on lines +122 to +144
"""Apply ``{"field": {"eq": value}}`` style filters to a list of records.

Each filter key maps to a dict with an ``eq`` operator. The value is
compared against top-level record fields *and* nested ``data`` fields.

Args:
records: List of record dicts to filter.
filter_dict: Mapping of field names to ``{"eq": value}`` dicts.

Returns:
Filtered list of records.
"""
results = list(records)
for key, condition in filter_dict.items():
if not isinstance(condition, dict) or "eq" not in condition:
continue
expected = condition["eq"]
results = [
r
for r in results
if r.get(key) == expected
or (isinstance(r.get("data"), dict) and r["data"].get(key) == expected)
]
Copilot AI Mar 2, 2026

_apply_eq_filters only implements {field: {eq: ...}} and silently ignores other operators. client.results.get_for() uses {in: [...]} when tool_id / ligand_id are lists (e.g. Results.get_poses() defaults to a tool_id list), so the mock server will return unfiltered result-explorer records. With a session-scoped mock server, this can mix pockets + poses across tests and break consumers like LigandSet.from_docking_result() (attempting to load non-SDF file paths). Please add support for at least the in operator (and ideally validate/raise on unknown operators) while keeping the nested-data matching behavior.

Suggested change

```python
"""Apply equality-style filters to a list of records.

Supports ``{"field": {"eq": value}}`` and ``{"field": {"in": [values]}}``
style filters. Each filter key maps to a dict with exactly one supported
operator. Values are compared against top-level record fields *and*
nested ``data`` fields when present.

Args:
    records: List of record dicts to filter.
    filter_dict: Mapping of field names to operator dicts, e.g.
        ``{"field": {"eq": value}}`` or ``{"field": {"in": [values]}}``.

Returns:
    Filtered list of records.

Raises:
    ValueError: If a filter condition uses an unknown operator, provides
        more than one operator, or has an invalid shape.
"""
results = list(records)
allowed_ops = {"eq", "in"}
for key, condition in filter_dict.items():
    if not isinstance(condition, dict):
        raise ValueError(
            f"Invalid filter for field '{key}': expected dict of operators, "
            f"got {type(condition).__name__}"
        )
    op_keys = set(condition.keys()) & allowed_ops
    unknown_ops = set(condition.keys()) - allowed_ops
    if unknown_ops:
        raise ValueError(
            f"Unsupported filter operator(s) for field '{key}': "
            f"{', '.join(sorted(unknown_ops))}"
        )
    if len(op_keys) != 1:
        raise ValueError(
            f"Filter for field '{key}' must specify exactly one of "
            f"{', '.join(sorted(allowed_ops))}, got {', '.join(sorted(op_keys)) or 'none'}"
        )
    op = next(iter(op_keys))
    if op == "eq":
        expected = condition["eq"]

        def _matches(record: dict[str, Any]) -> bool:
            if record.get(key) == expected:
                return True
            data = record.get("data")
            return isinstance(data, dict) and data.get(key) == expected

    else:  # op == "in"
        values = condition["in"]
        try:
            iterator = iter(values)
        except TypeError as exc:
            raise ValueError(
                f"Filter 'in' value for field '{key}' must be iterable"
            ) from exc
        # Materialize to support multiple passes and membership checks.
        value_set = set(iterator)

        def _matches(record: dict[str, Any]) -> bool:
            if record.get(key) in value_set:
                return True
            data = record.get("data")
            return isinstance(data, dict) and data.get(key) in value_set

    results = [r for r in results if _matches(r)]
```
```python
    )
    if inputs and "smiles_list" in inputs:
        total_ligands += len(inputs["smiles_list"])
    total_ligands += len(single_job._attributes["userInputs"]["ligands"])
```
Copilot AI Mar 2, 2026

_viz_func_docking now unconditionally indexes single_job._attributes["userInputs"]["ligands"]. If _attributes is None, userInputs is missing, or the older smiles_list format is present, this will raise (and break JobList rendering). Please restore defensive access here and support both input schemas (smiles_list and ligands).

Suggested change

```python
# Defensive access to input schema: support both "ligands" and legacy "smiles_list"
ligands_count = 0
attrs = getattr(single_job, "_attributes", None) or {}
user_inputs = attrs.get("userInputs") or {}
if isinstance(user_inputs, dict):
    ligands = user_inputs.get("ligands")
    if isinstance(ligands, list):
        ligands_count = len(ligands)
    else:
        smiles_list = user_inputs.get("smiles_list")
        if isinstance(smiles_list, list):
            ligands_count = len(smiles_list)
total_ligands += ligands_count
```

Comment on lines +404 to +409
```python
for job in jobs_list:
    this_smiles = [
        ligand["smiles"] for ligand in job._attributes["userInputs"]["ligands"]
    ]

    unique_smiles.update(this_smiles)
```
Copilot AI Mar 2, 2026

_name_func_docking now assumes job._attributes["userInputs"]["ligands"] always exists. That will raise for jobs missing userInputs / _attributes, and it also drops support for the legacy smiles_list format (which you still handle elsewhere, e.g. Docking.get_jobs_df). Consider handling both schemas and skipping jobs without usable inputs so the name function is robust.
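Since no suggested change accompanies this comment, a schema-tolerant version of the SMILES collection could look like the standalone sketch below. The `collect_unique_smiles` name and the `_Job` stub are illustrative assumptions, not SDK code; only the `userInputs.ligands` / `userInputs.smiles_list` shapes come from this PR:

```python
def collect_unique_smiles(jobs_list):
    """Collect unique SMILES from jobs, tolerating both input schemas.

    Supports the new ``userInputs.ligands`` array and the legacy
    ``userInputs.smiles_list``; jobs without usable inputs are skipped.
    """
    unique_smiles = set()
    for job in jobs_list:
        attrs = getattr(job, "_attributes", None) or {}
        user_inputs = attrs.get("userInputs") or {}
        if not isinstance(user_inputs, dict):
            continue
        ligands = user_inputs.get("ligands")
        if isinstance(ligands, list):
            unique_smiles.update(
                ligand["smiles"]
                for ligand in ligands
                if isinstance(ligand, dict) and "smiles" in ligand
            )
        elif isinstance(user_inputs.get("smiles_list"), list):
            unique_smiles.update(user_inputs["smiles_list"])
    return unique_smiles


class _Job:
    """Minimal stand-in for a job DTO (illustration only)."""

    def __init__(self, attrs):
        self._attributes = attrs


jobs = [
    _Job({"userInputs": {"ligands": [{"smiles": "CCO"}, {"smiles": "C"}]}}),
    _Job({"userInputs": {"smiles_list": ["CCO", "CCN"]}}),  # legacy schema
    _Job(None),  # no attributes: skipped rather than raising
]
```

With this shape, jobs carrying either schema contribute their SMILES and malformed jobs are simply skipped, which is what keeps the name function from crashing.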

Comment on lines +102 to +107
```python
try:
    df = self.get_results_using_id()
    if df is not None:
        remote_paths = df["file_path"].dropna().unique().tolist()
except Exception:
    pass
```
Copilot AI Mar 2, 2026

get_poses() swallows all exceptions from get_results_using_id() (except Exception: pass), which can hide real problems (auth/config errors, schema changes, etc.) and make debugging very difficult. Consider catching only the expected failure modes (e.g., HTTP/status errors / DeepOriginException) and either logging a brief warning or re-raising unexpected exceptions.
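One way to narrow the handling, sketched standalone: the `DeepOriginException` stand-in and the list-of-dicts result shape are assumptions for illustration (the real method returns a pandas DataFrame), but the pattern — catch only the expected failure mode, log it, and let everything else propagate — is the point:

```python
import logging

logger = logging.getLogger(__name__)


class DeepOriginException(Exception):
    """Stand-in for the SDK's expected failure type (assumption)."""


def fetch_remote_paths(get_results):
    """Return pose file paths, hiding only the expected failure mode.

    A DeepOriginException is logged and treated as "no results";
    anything else propagates so real bugs surface during debugging.
    """
    try:
        rows = get_results()
    except DeepOriginException as exc:
        logger.warning("Could not fetch docking results: %s", exc)
        return []
    if rows is None:
        return []
    # Keep only rows that actually carry a file path.
    return [row["file_path"] for row in rows if row.get("file_path") is not None]
```

An auth or schema error raised as anything other than `DeepOriginException` would now surface immediately instead of silently producing an empty path list.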

Comment on lines +199 to +212
"""Retrieve docking results as a flat DataFrame via the result-explorer API.

Queries the result-explorer endpoint filtered by protein ID and tool ID,
then flattens nested ``data`` fields into a single tabular structure.

Args:
limit: Maximum number of result records to fetch. If None, the API
default applies.

Returns:
DataFrame with columns ``id``, ``tool_id``, ``tool_version``,
``ligand_id``, ``protein_id``, ``binding_energy``, ``pose_score``,
and ``file_path``. Returns None if no results were found.

Copilot AI Mar 2, 2026

Docstring mismatch: get_results_using_id() documents a limit argument, but the method signature doesn't accept limit and the implementation always fetches all records. Please either add the parameter and plumb it through to client.results.get_poses(limit=...), or remove/update the docstring section so it matches the actual API.

Comment on lines 333 to +338
```diff
 self.executions = Executions(self)
 self.organizations = Organizations(self)
 self.billing = Billing(self)
-self.data = Data(self)
+self.entities = Entities(self)
+self.results = Results(self)
+self.projects = Projects(self)
```
Copilot AI Mar 2, 2026

DeepOriginClient replaces the public client.data attribute with client.entities / client.results / client.projects. This is a breaking change for downstream users (and the PR title suggests a patch-level fix). Consider keeping a backward-compatible data alias (possibly deprecated) that forwards to the new wrappers, or at least providing a clear deprecation path/error message so existing integrations don’t fail at runtime without guidance.
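A deprecated forwarding alias along the lines suggested could be sketched like this. The class bodies are simplified stand-ins, not the SDK's actual definitions; only the attribute names (`data`, `entities`, `results`, `projects`) come from this PR:

```python
import warnings


class Entities:
    """Stand-in for the new entities wrapper (illustration only)."""


class DeepOriginClient:
    def __init__(self):
        self.entities = Entities()

    @property
    def data(self):
        """Deprecated alias kept for backwards compatibility."""
        warnings.warn(
            "client.data is deprecated; use client.entities "
            "(and client.results / client.projects) instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.entities
```

Existing integrations that read `client.data` keep working and get a pointed migration message, which fits the patch-level framing of the PR title better than a silent removal.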

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 33 changed files in this pull request and generated 4 comments.



Comment on lines +204 to +207
```python
Args:
    limit: Maximum number of result records to fetch. If None, the API
        default applies.
```
Copilot AI Mar 2, 2026

The get_results_using_id() docstring lists a limit argument, but the method signature doesn’t accept it and the implementation always fetches the default from client.results.get_poses(). Update the docstring (or add the parameter) to keep the public API documentation accurate.

Suggested change

```diff
-    Args:
-        limit: Maximum number of result records to fetch. If None, the API
-            default applies.
```
Comment on lines 329 to +338
```diff
 self.tools = Tools(self)
 self.functions = Functions(self)
 self.clusters = Clusters(self)
 self.files = Files(self)
 self.executions = Executions(self)
 self.organizations = Organizations(self)
 self.billing = Billing(self)
-self.data = Data(self)
+self.entities = Entities(self)
+self.results = Results(self)
+self.projects = Projects(self)
```
Copilot AI Mar 2, 2026

DeepOriginClient no longer exposes client.data (it’s replaced by client.entities, client.results, and client.projects). This is a breaking change for existing SDK users and older docs/examples. Consider keeping a backwards-compatible data alias (possibly deprecated with a warning) or documenting this as a major-version change.

Comment on lines 350 to 352
```python
for single_job in jobs_list:
    inputs = (
        single_job._attributes.get("userInputs") if single_job._attributes else None
    )
    if inputs and "smiles_list" in inputs:
        total_ligands += len(inputs["smiles_list"])
    total_ligands += len(single_job._attributes["userInputs"]["ligands"])
```
Copilot AI Mar 2, 2026

_viz_func_docking now indexes single_job._attributes["userInputs"]["ligands"] directly. This will raise TypeError/KeyError for jobs without _attributes, for older bulk-docking payloads that used userInputs.smiles_list, or for partial DTOs returned by some endpoints. Consider supporting both schemas (new ligands array and legacy smiles_list) and using .get() guards so rendering a JobList can’t crash the UI.

Comment on lines +391 to 392
```python
box_size = float(2 * np.cbrt(pocket.volume))
box_size = [box_size, box_size, box_size]
```
Copilot AI Mar 2, 2026

When pocket is provided, Docking.run() computes box_size from pocket.volume, but Pocket.volume is optional and Pocket.from_pdb_file() leaves it as None. This will raise a TypeError (np.cbrt(None)) for valid Pocket instances that don’t have volume populated. Handle the None case explicitly (e.g., require an explicit box_size, compute from coordinates, or fall back to legacy pocket.props["volume"] when present) and raise a clear error if it can’t be derived.

Suggested change

```python
# If no explicit box_size is provided, try to infer it from the pocket
if box_size is None:
    volume = pocket.volume
    # Fallback to legacy volume storage if needed
    if volume is None and hasattr(pocket, "props"):
        try:
            volume = pocket.props.get("volume")
        except AttributeError:
            # pocket.props is not a dict-like object; ignore and let volume remain None
            volume = None
    if volume is None:
        raise DeepOriginException(
            "Could not determine docking box size from the provided pocket: "
            "neither pocket.volume nor pocket.props['volume'] is set. "
            "Please provide an explicit box_size or populate the pocket volume."
        ) from None
    side_length = float(2 * np.cbrt(volume))
    box_size = [side_length, side_length, side_length]
```
Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 33 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

tests/test_entities.py:31

  • This test hardcodes the org_key segment as "/data-platform/deeporigin/..." when asserting the DELETE URL. If the test environment uses a different org_key, this will fail even if the client is correct. Use client.org_key (or derive from the instantiated client) in the expected URL instead of hardcoding "deeporigin".
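Deriving the org segment from the client under test, rather than hardcoding `"deeporigin"`, might look like the sketch below. The `StubClient` is a hypothetical stand-in, and only the URL prefix is built here because the rest of the DELETE path is elided in the comment:

```python
class StubClient:
    """Minimal stand-in exposing only the org_key attribute (assumption)."""

    def __init__(self, org_key):
        self.org_key = org_key


def expected_url_prefix(client):
    # Build the org-dependent prefix from the client itself, so the
    # assertion holds for whatever org_key the test environment uses.
    return f"/data-platform/{client.org_key}/"
```

In the real test, the assertion would start from `client.org_key` of the instantiated client the same way, so a non-default org no longer breaks the URL check.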


@sg-s sg-s merged commit 4845b61 into main Mar 2, 2026
19 of 22 checks passed
@sg-s sg-s deleted the bulk-docking-dp-support branch March 2, 2026 18:53