Skip to content

feat: add backfill support for llm-error-pages#2099

Open
ymadobe wants to merge 6 commits intomainfrom
llm-error-pages-backfill
Open

feat: add backfill support for llm-error-pages#2099
ymadobe wants to merge 6 commits intomainfrom
llm-error-pages-backfill

Conversation

@ymadobe
Copy link
Contributor

@ymadobe ymadobe commented Mar 10, 2026

Please ensure your pull request adheres to the following guidelines:

  • make sure to link the related issues in this description
  • when merging / squashing, make sure the fixed issue references are visible in the commits, for easy compilation of release notes
  • If data sources for any opportunity has been updated/added, please update the wiki for same opportunity.

Related Issues

LLMO-1678

Thanks for contributing!

@github-actions
Copy link

This PR will trigger a minor release when merged.

@codecov
Copy link

codecov bot commented Mar 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@ymadobe ymadobe requested a review from duynguyen March 10, 2026 12:15
Copy link
Contributor

@duynguyen duynguyen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Branch Review: llm-error-pages-backfill

Goal: Add backfill support for reprocessing historical weeks via auditContext.weekOffset, plus auto-process two weeks when run on Monday.


Potential Bug: Monday Week Offsets

handler.js:146-148

} else if (isMonday) {
  weekOffsets = [-1, 0];

Offset 0 is the current week. On Monday, the current week is a few hours old at most — this will produce near-empty Athena results. This looks like it should be -2 and -1 (previous week + the week before), or just -1 and nothing more. Was running the in-progress current week intentional?


Breaking Change: auditResult Shape

runAuditAndSendToMystique now returns auditResult as an array ([{...}]) instead of a single object ({...}). This is persisted to the DB as the stored audit result.

  • Step 2 (submitForScraping) reads from a different step's result (step 1), so it's not affected.
  • No opportunity-data-mapper.js exists for this audit, so local code is fine.
  • Risk: External dashboards, monitoring, or reporting tools that query auditResult.success on this audit type will silently break (they'll get undefined since arrays don't have .success).

This change deserves a callout in the PR description.


isMonday Hardcodes new Date()

handler.js:142

const isMonday = new Date().getUTCDay() === 1;

This works with sinon.useFakeTimers in the test, but it's fragile. The Monday test (handler.test.js:363-378) doesn't assert that generateReportingPeriods was called with [-1, 0] — it only checks output length (which is pre-determined by the mock). The test would pass even if the isMonday branch wasn't reached.

Tighten the assertion:

expect(mockGenerateReportingPeriods).to.have.been.calledWith(sinon.match.any, [-1, 0]);

derivePeriodFromBrokenLinks Fallback Not Directly Tested

guidance-handler.js:18-27

The null return path (no brokenLinks, or none with a matching suggestionId) falls back to:

generateReportingPeriods(new Date(), [-1]).weeks[0].periodIdentifier

The guidance-handler test now provides a suggestionId, so the regex always matches in tests. There's no coverage of the fallback. Consider adding a test case where brokenLinks is empty or has no suggestionId.


Positives

  • Replacing getWeekNumber with isoCalendarWeek from shared-utils is the right call — fewer custom week calculations.
  • periodIdentifier baked into week objects (generateReportingPeriods) reduces duplication nicely.
  • audit?.getId() optional chaining for the backfill path where no audit record exists yet is correct.
  • The eslint-disable no-await-in-loop block is appropriately scoped — serial Athena queries make sense given potential rate limits.
  • Default export routing pattern ({ run(message, context) }) is compatible with index.js:305's typeof handler.run === 'function' check.

@ymadobe
Copy link
Contributor Author

ymadobe commented Mar 11, 2026

Thanks @duynguyen for the review.

  1. The concern near-empty current-week data on Monday is valid but applies equally to cdn-logs-report which has been running this way in production. The intent: Monday is the first chance to finalize the just-completed previous week, while also starting to track the new week. I can update this if you could guide me on this.

Added tests for 2,3 and 4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants