
Wiped dataclips remain searchable — search_vector not cleared on wipe #4824

@stuartc


Describe the bug

When a dataclip is wiped (per-run wipe with save_dataclips: false, or the project data-retention job), its body/request are cleared to NULL and wiped_at is stamped, but its full-text search_vector is left intact. The work-order body search matches that stale vector with no wiped_at filter, so a wiped dataclip stays searchable by the exact content that was meant to be erased — a data-retention hole.

Version number

Reproduced on main (commit 6331486e8d) and on the defer-dataclip-search-vector branch (8fc24bab42).

I have reproduced this locally on main: Yes

On main the AFTER INSERT trigger builds the vector synchronously; the wipe leaves it stale. On the deferred-indexing branch the worker builds it instead, but the wipe is the same — and the worker only re-sweeps WHERE search_vector IS NULL, so it never re-indexes a wiped row.
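To illustrate why the deferred worker never repairs a wiped row, here is a toy in-memory model. It is a sketch in Python, not Lightning code: `Dataclip`, `tokenize`, `index_pending` and `wipe` are hypothetical names, and a set of tokens stands in for the Postgres tsvector. The only behaviour taken from the report is that the sweep selects rows whose `search_vector` is NULL and that the wipe leaves the vector untouched.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataclip:
    body: Optional[str]
    wiped: bool = False
    # None means "not indexed yet"; a set of tokens stands in for a tsvector.
    search_vector: Optional[frozenset] = None


def tokenize(text: Optional[str]) -> frozenset:
    """Rough stand-in for building a tsvector: lowercase alphanumeric tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", (text or "").lower()))


def index_pending(clips) -> int:
    """Model of the worker sweep: only rows whose vector is None get indexed."""
    indexed = 0
    for clip in clips:
        if clip.search_vector is None:
            clip.search_vector = tokenize(clip.body)
            indexed += 1
    return indexed


def wipe(clip: Dataclip) -> None:
    """Model of the wipe as reported: clears the body, leaves the vector alone."""
    clip.body = None
    clip.wiped = True


clip = Dataclip(body='{"secret": "zzzuniquetoken"}')
first_sweep = index_pending([clip])    # the new row is picked up and indexed
wipe(clip)
second_sweep = index_pending([clip])   # skipped: the vector is not None
still_indexed = "zzzuniquetoken" in clip.search_vector
```

In this model the second sweep touches nothing, so the token survives in the vector after the body is gone, which is the same shape as the behaviour described on the deferred-indexing branch.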

To Reproduce

  1. Create a work order whose input dataclip body contains a distinctive token, e.g. {"secret": "zzzuniquetoken …"}.
  2. In the work-order list, search that project for zzzuniquetoken with the body filter → the work order is found (expected).
  3. Wipe the dataclip — either let the project's data-retention period elapse, or trigger a run with save_dataclips: false. The dataclip's body becomes NULL and wiped_at is set.
  4. Search again for zzzuniquetoken (body filter).
  5. The wiped work order is still returned, even though its content is gone.

Expected behavior

Once a dataclip is wiped, searching for terms that were in its (now-erased) body should return nothing — the indexed content should be erased alongside the body.

Screenshots

N/A (confirmed via automated test rather than UI).

Additional context

  • Code paths:
    • Wipe sets body/request/wiped_at only — Invocation.Query.wipe_dataclips/1 (lib/lightning/invocation/query.ex), used by Runs.wipe_dataclips/1 (lib/lightning/runs.ex) and Projects.wipe_dataclips_for/1 (lib/lightning/projects.ex).
    • Read side has no wiped_at filter — Invocation.build_search_fields_where/2, :body branch (lib/lightning/invocation.ex), reached from the main UI search Invocation.search_workorders/3.
  • Pre-existing, not a regression: the original DB trigger was AFTER INSERT only (priv/repo/migrations/20240329123804_…), so it never re-ran on the wipe UPDATE.
  • A guard already exists, just not wired in: Invocation.exclude_wiped_dataclips/1 filters is_nil(d.wiped_at), but is only applied to search_workorders_for_retry/2 and cancel — not the main search_workorders/3.
  • Fix options: (1) add search_vector: nil to the wipe update_all (on the deferred branch the worker then re-indexes a NULL body to an empty vector → matches nothing) — fixes it at source for every reader; or (2) apply the wiped_at filter to search_workorders/3. Recommend (1), optionally + (2) for defence in depth. Low risk either way.
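The difference between the two fix options can be sketched with a small in-memory model. This is illustrative Python rather than the Elixir in the repository: `Dataclip`, `search`, `wipe` and the `clear_vector` / `exclude_wiped` flags are hypothetical, and a set of tokens stands in for the tsvector. Option (1) is modelled as clearing the vector during the wipe, option (2) as filtering wiped rows at read time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataclip:
    body: Optional[str]
    wiped: bool = False
    search_vector: Optional[frozenset] = None


def search(clips, term, exclude_wiped=False):
    """Model of the body search; exclude_wiped models fix option (2)."""
    hits = []
    for clip in clips:
        if exclude_wiped and clip.wiped:
            continue
        if clip.search_vector and term in clip.search_vector:
            hits.append(clip)
    return hits


def wipe(clip, clear_vector=False):
    """Model of the wipe; clear_vector models fix option (1)."""
    clip.body = None
    clip.wiped = True
    if clear_vector:
        clip.search_vector = None


def make_clip():
    token = "zzzuniquetoken"
    return Dataclip(body=token, search_vector=frozenset({token}))


# Current behaviour: the vector survives the wipe and the search has no filter.
current = make_clip()
wipe(current)
hits_current = search([current], "zzzuniquetoken")

# Option (1): clear the vector as part of the wipe.
option_1 = make_clip()
wipe(option_1, clear_vector=True)
hits_option_1 = search([option_1], "zzzuniquetoken")

# Option (2): keep the wipe as is, filter wiped rows when searching.
option_2 = make_clip()
wipe(option_2)
hits_option_2 = search([option_2], "zzzuniquetoken", exclude_wiped=True)
```

Both options return no hits in this model, but under option (2) alone the token is still held in `option_2.search_vector`, so any reader that skips the filter would still match it. That is the reasoning behind preferring (1) and treating (2) as defence in depth.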

Labels: bug
