Clarify worker cancellation requirements for stuck jobs #1264

Merged
brandur merged 3 commits into riverqueue:master from peter941221:peter/1258-stuck-job-cancellation-docs
Jun 1, 2026

@peter941221
Contributor

This follows up the stuck-job discussion in #1258.

The current docs already mention that workers should respect context cancellation, but they don't quite connect that requirement to the failure mode users are likely to see in practice: jobs that remain in running because worker code is blocked without also observing ctx.Done().

  1. This change expands Worker.Work's contract to call out blocking operations like channels, timers, and network work explicitly, and recommends a select that also watches ctx.Done().
  2. It adds the same guidance to the top-level package docs and README so that users encounter it earlier, not only in the interface comment.
  3. It adds a short diagnostic hint that points users at num_jobs_stuck in the producer job counts log line when investigating jobs that appear stuck in running.
  4. It adds a small note to the graceful shutdown example to make the cancellation expectation concrete in a worker example that's already about stop behavior.

The goal here isn't to change rescue semantics, only to make the existing behavior easier to understand and diagnose.

Verification:

  1. Formatted the touched Go files with gofmt.
  2. Ran go test ./... -run TestDoesNotExist -count=1 to confirm the repo still loads and compiles after the docs/example changes.

Caveat:

  1. I didn't run the database-backed example tests in this environment because local PostgreSQL authentication isn't configured.

@peter941221 peter941221 marked this pull request as ready for review June 1, 2026 08:28
Contributor

@brandur brandur left a comment

@peter941221 I think some of this is a little too much detail for some of the areas like README and general Godoc intro — these sections are arguably already a bit too long and we should be really prescriptive about what is allowed to go in there.

I think the piece in worker.go is a nice improvement — do we want to revert the other stuff and we can keep that?

In terms of general documentation — I think what I'll do is put together a specific "stuck jobs" advanced topics page. We have a little already at https://riverqueue.com/docs/graceful-shutdown#stuck-programs, but this could stand to be expanded a bit.

@peter941221
Contributor Author

@brandur sure, will do that.

@peter941221
Contributor Author

Reverted the broader README, Godoc, and example additions and kept the worker.go clarification only. We can move the deeper stuck-jobs guidance to a dedicated advanced topics page later.

Contributor

@brandur brandur left a comment

Great! Thx.

@brandur brandur merged commit 0808652 into riverqueue:master Jun 1, 2026
12 checks passed