Skip to content

fix: surface failing-job link + real error in timelock executor Slack alerts#1925

Open
mirooon wants to merge 3 commits into
mainfrom
fix/timelock-auto-exec-slack-job-link
Open

fix: surface failing-job link + real error in timelock executor Slack alerts#1925
mirooon wants to merge 3 commits into
mainfrom
fix/timelock-auto-exec-slack-job-link

Conversation

@mirooon

@mirooon mirooon commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Which Linear task belongs to this PR?

EXSC-505

Why did I implement it this way?

The Timelock Auto Execution workflow (runPendingTimelockTXs.yml) posts execution results to #dev-sc-timelock-executions through the executor's SlackNotifier. Two operational problems made those alerts much less useful than intended, and this PR fixes both.

Problem 1 — the actual error never showed up. On a per-operation failure (e.g. the recurring Tron Account resource insufficient broadcast failure), the ❌ Timelock Operation Failed message — the only one that carries the real error — was silently rejected by Slack. The Tron broadcast error string is the entire raw transaction JSON (~3.5k chars); it was dropped verbatim into one Slack section block, and Slack rejects any block whose text exceeds 3000 chars with 400 - invalid_blocks. So the whole message was dropped and the channel only ever showed the batch summary with a misleading • tron: Unknown error.

  • Fix: clamp error/detail text below the Slack per-block limit (SLACK_TEXT_LIMIT = 2900) in extractErrorMessage/notifyBatchSummary, so the per-operation failure (with the real error) actually posts. Verified live: the error now renders, truncated with a trailing .

Problem 2 — hard to reach the workflow logs from Slack. To investigate a failure you had to leave Slack, open GitHub Actions, find the right workflow, then the right run, then the right job — many clicks and searching. The notification had no link.

  • Fix: add a View workflow run deep-link to failure / summary / not-scheduled notifications, resolved to the specific failing job (and its failing step) via the jobs API, so a dev can jump straight to the logs in one click. The workflow resolves the job URL (needs actions: read + gh api) and passes it to the executor via TIMELOCK_RUN_URL; the failure-fallback step deep-links the same way. The link is opt-in: absent locally (process.env.TIMELOCK_RUN_URL unset), so manual/local runs are unchanged.

Files:

  • script/utils/slack-notifier.ts — optional runUrl, appendRunLink() context block on failure/summary/not-scheduled messages, truncateText() + SLACK_TEXT_LIMIT.
  • script/deploy/safe/execute-pending-timelock-tx.ts — pass process.env.TIMELOCK_RUN_URL into the notifier.
  • .github/workflows/runPendingTimelockTXs.ymlactions: read; resolve the deep job URL via the jobs API in both the execute step (→ TIMELOCK_RUN_URL) and the failure-fallback step.
  • script/utils/slack-notifier.test.ts — new colocated tests (link present/absent, batch-summary link, oversized-error clamp).

Out of scope: the unfunded Tron timelock executor wallet (the root cause of the failing ops themselves) is a separate operational issue.

Checklist before requesting a review

  • I have performed a self-review of my code
  • This pull request is as small as possible and only tackles one problem
  • I have run /pr-ready (local CodeRabbit) on this branch and resolved (or explicitly documented) all findings — see .agents/commands/pr-ready.md
  • I have added tests that cover the functionality / test the bug
  • For new facets: I have checked all points from this list: https://www.notion.so/lifi/New-Facet-Contract-Checklist-157f0ff14ac78095a2b8f999d655622e
  • I have updated any required documentation

Checklist for reviewer (DO NOT DEPLOY and contracts BEFORE CHECKING THIS!!!)

  • I have checked that any arbitrary calls to external contracts are validated and or restricted
  • I have checked that any privileged calls (i.e. storage modifications) are validated and or restricted
  • I have ensured that any new contracts have had AT A MINIMUM 1 preliminary audit conducted on by <company/auditor>

mirooon and others added 2 commits June 12, 2026 20:02
…in runPendingTimelockTXs workflow

- Added permissions for actions to resolve job URLs for Slack deep-linking.
- Implemented logic to create deep links directly to the failing job and step, improving notification clarity.
- Updated environment variables to support the new deep-linking functionality.
…p oversized errors

The executor's own SlackNotifier (which fires on the happy path and on
per-operation failures) posted neither a link to the CI run nor, for Tron
broadcast failures, the error itself — the raw-transaction error string blew
past Slack's 3000-char per-block limit and was rejected with invalid_blocks.

- SlackNotifier takes an optional runUrl and appends a "View workflow run"
  deep-link to failure/summary/not-scheduled notifications.
- Clamp error/detail text below the Slack block limit so large RPC errors post.
- Workflow resolves the deep job URL via the jobs API and passes it through
  TIMELOCK_RUN_URL; the failure-fallback step also deep-links the same way.
- Add colocated tests for the link block and truncation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@lifi-action-bot lifi-action-bot marked this pull request as draft June 12, 2026 18:33
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 18105ed9-8eec-4358-9e7d-43abc6dadc4c

📥 Commits

Reviewing files that changed from the base of the PR and between 2d519d3 and 44b6ebf.

📒 Files selected for processing (2)
  • script/utils/slack-notifier.test.ts
  • script/utils/slack-notifier.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • script/utils/slack-notifier.test.ts
  • script/utils/slack-notifier.ts

Walkthrough

Adds optional CI run deep-link support and text-truncation to SlackNotifier, updates the deploy script to pass the run URL, adds tests for deep-link rendering and payload safety, and enhances the GitHub Actions workflow to resolve job-level run URLs for Slack messages.

Changes

Slack Notification Deep-Linking and Payload Safety

Layer / File(s) Summary
SlackNotifier text safety and constructor
script/utils/slack-notifier.ts
Adds SLACK_TEXT_LIMIT, truncateText, and an optional runUrl parameter on the SlackNotifier constructor; extractErrorMessage now truncates string error fields.
SlackNotifier failure notification deep-linking
script/utils/slack-notifier.ts
notifyOperationFailed, notifyBatchSummary, and notifyNotScheduled append an optional Slack context block with a "View workflow run" deep-link when runUrl is configured; batch failure details are truncated before insertion.
SlackNotifier test coverage
script/utils/slack-notifier.test.ts
New Bun tests stub global.fetch to capture outgoing webhook payloads; tests assert presence/absence of run-link context blocks, clamping of oversized error text, and robustness against circular/non-serializable errors.
Deployment script notifier wiring
script/deploy/safe/execute-pending-timelock-tx.ts
Validates the --notify webhook URL via new URL() and passes process.env.TIMELOCK_RUN_URL as the runUrl argument when constructing SlackNotifier.
GitHub Actions workflow job URL resolution
.github/workflows/runPendingTimelockTXs.yml
Adds actions: read permission; executor step sets GH_TOKEN and RUN_OVERVIEW_URL, resolves the current job's html_url into TIMELOCK_RUN_URL via gh api/jq; Slack-failure fallback computes a job deep-link and updates WORKFLOW_URL accordingly.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • lifinance/contracts#1923: Modifies execute-pending-timelock-tx.ts Slack notification flow; related to when/what notifications are sent.
  • lifinance/contracts#1373: Earlier changes to script/utils/slack-notifier.ts; this PR builds on that notifier implementation.
  • lifinance/contracts#1718: Also updates runPendingTimelockTXs.yml Slack failure reporting and deep-link behavior.

Suggested labels

AuditNotRequired

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the main changes: surfacing the failing job link and real error in Slack alerts for the timelock executor.
Description check ✅ Passed The description comprehensively covers all required template sections: Linear task reference, detailed implementation rationale with two distinct problems and solutions, file changes, and a completed pre-review checklist. The optional reviewer checklist items are appropriately marked as not applicable.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/timelock-auto-exec-slack-job-link

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint install failed. For unrecoverable errors, disable the tool in CodeRabbit configuration.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mirooon mirooon marked this pull request as ready for review June 12, 2026 18:35

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
script/utils/slack-notifier.ts (1)

507-540: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard against Slack block-count overflow in not-scheduled alerts.

notifyNotScheduled adds one block per operation (Line 507 onward), and the new run-link block on Line 538 further increases total block count. For larger operations arrays, Slack can reject the payload once block count exceeds its limit, dropping the alert entirely.

Suggested fix (cap rendered operations + summarize remainder)
+const SLACK_MAX_BLOCKS = 50
+const RESERVED_NOT_SCHEDULED_BLOCKS = 5 // header, intro, divider, action, run-link(context)
+const MAX_NOT_SCHEDULED_OPERATION_BLOCKS =
+  SLACK_MAX_BLOCKS - RESERVED_NOT_SCHEDULED_BLOCKS

   // Add details for each not-scheduled operation
-  for (const op of operations)
+  const shownOperations = operations.slice(0, MAX_NOT_SCHEDULED_OPERATION_BLOCKS)
+  for (const op of shownOperations)
     if (message.blocks) {
       let operationText = `• *Operation ID:* \`${this.truncateHash(
         op.operationId
       )}\`\n  *MongoDB ID:* \`${
         op.transactionId
       }\`\n  *Safe Tx Hash:* \`${this.truncateHash(op.safeTxHash)}\``

       if (op.executionHash)
         operationText += `\n  *Execution Hash:* \`${this.truncateHash(
           op.executionHash
         )}\``

       message.blocks.push({
         type: 'section',
         text: {
           type: 'mrkdwn',
           text: operationText,
         },
       })
     }

+  if (operations.length > shownOperations.length && message.blocks)
+    message.blocks.push({
+      type: 'section',
+      text: {
+        type: 'mrkdwn',
+        text: `…and ${operations.length - shownOperations.length} more operation(s) not shown.`,
+      },
+    })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@script/utils/slack-notifier.ts` around lines 507 - 540, notifyNotScheduled
can exceed Slack's block limit by adding a block per operation plus the run-link
block; limit the number of per-operation blocks rendered by defining a
MAX_BLOCKS constant (respecting Slack's limit) and reserving slots for the
run-link and a summary block, compute maxOps = MAX_BLOCKS - reservedSlots,
iterate only over operations.slice(0, maxOps) when pushing operation blocks
(using truncateHash as before), and if operations.length > maxOps push an extra
summary section block like "and N more operations not shown" before calling
appendRunLink and sendNotificationWithRetry so the payload never exceeds Slack's
block count.
🧹 Nitpick comments (2)
script/utils/slack-notifier.test.ts (1)

55-111: ⚡ Quick win

Add a regression test for notifyNotScheduled run-link behavior.

This PR adds appendRunLink() to notifyNotScheduled, but the suite only asserts run-link behavior for failure and batch-summary paths. Add one test for link present/absent on notifyNotScheduled to lock the new contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@script/utils/slack-notifier.test.ts` around lines 55 - 111, Add regression
tests for SlackNotifier.notifyNotScheduled to assert run-link behavior: use
mockFetchCapturing() and call new SlackNotifier(WEBHOOK,
RUN_URL).notifyNotScheduled({ network: 'tron', operation: baseOp, reason: 'some
reason' }) and assert a context block with `<${RUN_URL}|View workflow run>` is
present, and a second test that constructs new SlackNotifier(WEBHOOK) (no
RUN_URL) and asserts no context block is emitted. Place these alongside the
existing SlackNotifier tests so notifyNotScheduled and the new appendRunLink()
behavior are covered.
.github/workflows/runPendingTimelockTXs.yml (1)

105-112: Confirm the github.job → Jobs API .name match for runPendingTimelockTXs.yml

execute-timelock in .github/workflows/runPendingTimelockTXs.yml has no job-level name: key, so the Jobs API name field (job display name) should default to the job id/key. That makes the current selector select(.name == "${{ github.job }}") appropriate, and the fallback to .jobs[0].html_url should only trigger for unexpected API/schema behavior. If a job-level name: is added later (or the workflow gains multiple jobs), the selector could stop matching and the fallback may link to the wrong job.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/runPendingTimelockTXs.yml around lines 105 - 112, Add an
explicit job-level name that matches github.job so the Jobs API selector works
reliably: update the job that runs the timelock (the job currently referenced by
github.job / the execute-timelock step) to include name: "execute-timelock" (or
whatever string github.job evaluates to), ensuring the jq selector select(.name
== "${{ github.job }}") used to compute RESOLVED_URL and export TIMELOCK_RUN_URL
continues to match the correct job; alternatively, if you prefer not to add a
name, change the jq selector to match the job key/id field instead of .name.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@script/utils/slack-notifier.ts`:
- Line 652: The fallback in extractErrorMessage uses JSON.stringify(error) which
will throw on circular or non-serializable objects; wrap serialization in a safe
try/catch or use a circular-safe serializer (e.g., a replacer that tracks seen
objects or util.inspect) to produce a string, then truncate to 500 chars. Update
extractErrorMessage to attempt JSON.stringify(error) inside try/catch (or use a
circular-safe replacer) and on error fall back to String(error) or
util.inspect(error, { depth: null }) before slicing to 500 characters.

---

Outside diff comments:
In `@script/utils/slack-notifier.ts`:
- Around line 507-540: notifyNotScheduled can exceed Slack's block limit by
adding a block per operation plus the run-link block; limit the number of
per-operation blocks rendered by defining a MAX_BLOCKS constant (respecting
Slack's limit) and reserving slots for the run-link and a summary block, compute
maxOps = MAX_BLOCKS - reservedSlots, iterate only over operations.slice(0,
maxOps) when pushing operation blocks (using truncateHash as before), and if
operations.length > maxOps push an extra summary section block like "and N more
operations not shown" before calling appendRunLink and sendNotificationWithRetry
so the payload never exceeds Slack's block count.

---

Nitpick comments:
In @.github/workflows/runPendingTimelockTXs.yml:
- Around line 105-112: Add an explicit job-level name that matches github.job so
the Jobs API selector works reliably: update the job that runs the timelock (the
job currently referenced by github.job / the execute-timelock step) to include
name: "execute-timelock" (or whatever string github.job evaluates to), ensuring
the jq selector select(.name == "${{ github.job }}") used to compute
RESOLVED_URL and export TIMELOCK_RUN_URL continues to match the correct job;
alternatively, if you prefer not to add a name, change the jq selector to match
the job key/id field instead of .name.

In `@script/utils/slack-notifier.test.ts`:
- Around line 55-111: Add regression tests for SlackNotifier.notifyNotScheduled
to assert run-link behavior: use mockFetchCapturing() and call new
SlackNotifier(WEBHOOK, RUN_URL).notifyNotScheduled({ network: 'tron', operation:
baseOp, reason: 'some reason' }) and assert a context block with
`<${RUN_URL}|View workflow run>` is present, and a second test that constructs
new SlackNotifier(WEBHOOK) (no RUN_URL) and asserts no context block is emitted.
Place these alongside the existing SlackNotifier tests so notifyNotScheduled and
the new appendRunLink() behavior are covered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 56dab192-5ef6-43c2-960a-b65d4c23a8a3

📥 Commits

Reviewing files that changed from the base of the PR and between 518b72e and 2d519d3.

📒 Files selected for processing (4)
  • .github/workflows/runPendingTimelockTXs.yml
  • script/deploy/safe/execute-pending-timelock-tx.ts
  • script/utils/slack-notifier.test.ts
  • script/utils/slack-notifier.ts

Comment thread script/utils/slack-notifier.ts Outdated
…rrors (slack-notifier.ts:652)

CodeRabbit (PR #1925): JSON.stringify(error) throws on circular/non-serializable
payloads (e.g. some viem errors), which would break notification assembly on the
failure path. Wrap in try/catch with a String() fallback. Added a regression test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mirooon mirooon enabled auto-merge (squash) June 12, 2026 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants