## Problem
When running this module at enterprise scale with many runner configurations, GitHub API rate limits become a bottleneck that cannot be solved by tuning the existing setup.
## How API usage grows with runner configurations
Each runner configuration (in multi-runner mode) gets its own set of Lambda functions (scale-up, scale-down, pool). Each of these makes independent GitHub API calls through a single shared GitHub App:
**Scale-up (per batch of workflow job events):**
- 1 installation token creation per unique owner/repo group
- 1 `actions.getJobForWorkflowRun()` call per message (job queued check)
- 1 `actions.generateRunnerJitconfigForOrg/Repo()` call per runner created (ephemeral/JIT runners)
- Optionally 1 paginated `GET /orgs/{org}/actions/runner-groups` call for runner group lookup
**Scale-down (per cycle):**
- 1 paginated `actions.listSelfHostedRunnersForOrg/Repo()` call per owner
- 1 `actions.getSelfHostedRunnerForOrg/Repo()` call per runner evaluated
- 1 `actions.deleteSelfHostedRunnerFromOrg/Repo()` call per runner removed
**Pool (per adjustment cycle):**
- 1 `apps.getOrgInstallation()` + 1 paginated `actions.listSelfHostedRunnersForOrg()` call per pool
- 1 registration token or JIT config call per runner created
All of these calls go through a single GitHub App installation token, sharing the same rate limit bucket.
## Concrete example
With 10 runner configurations, a moderate burst of 50 workflow jobs, and 100 active runners:
| Component | API calls per cycle | Notes |
|---|---|---|
| Scale-up (10 configs × ~5 jobs each) | ~110 | Auth setup + JIT config per runner |
| Scale-down (10 configs × ~10 runners each) | ~180 | List + status check + deletions |
| Pool (if enabled, 10 configs) | ~60 | List + top-up |
| **Total per cycle** | **~350** | |
With scale-down running every 5 minutes and scale-up triggered on demand, this can easily consume 4,000–5,000+ calls/hour under sustained load — approaching or exceeding the rate limit.
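As a back-of-envelope check on that claim (the per-component figures come from the table above; the cycle cadence is an illustrative assumption, not a measurement):

```typescript
// Rough hourly estimate from the per-cycle figures above.
// All constants are illustrative assumptions.
const scaleUpCalls = 110;   // ~10 configs × ~5 jobs: auth setup + JIT config per runner
const scaleDownCalls = 180; // ~10 configs × ~10 runners: list + status check + deletions
const poolCalls = 60;       // ~10 pools: list + top-up

const callsPerCycle = scaleUpCalls + scaleDownCalls + poolCalls; // 350

// Scale-down runs every 5 minutes => 12 cycles/hour; under sustained load,
// assume scale-up and pool activity keep a comparable cadence.
const cyclesPerHour = 12;
const callsPerHour = callsPerCycle * cyclesPerHour; // 4,200

console.log(`~${callsPerHour} calls/hour`);
```

At ~4,200 calls/hour this already sits near a standard App's 5,000/hour limit, and any extra burst pushes past it.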
## GitHub API rate limits
- Standard GitHub App: 5,000 req/hour (scales up to 12,500 max with >20 repos/users)
- GitHub Enterprise Cloud: 15,000 req/hour (fixed)
(Source: GitHub docs)
While 15,000/hour sounds generous, it is a per-app limit. For large enterprises running 20+ runner configurations with hundreds of concurrent jobs per hour, this ceiling is reached. There is no way to increase the per-app limit — the only option is to distribute load across multiple apps.
## Proposed solution
Allow the module to be configured with multiple GitHub Apps, and have the Lambda functions distribute API calls across them randomly. Each app gets its own independent rate limit bucket, making the effective limit N × 15,000 req/hour.
Key design points:
- No breaking changes: the existing single GitHub App configuration remains unchanged. Additional apps are purely optional via a new `additional_github_apps` variable
- Each additional app accepts `id`, `key_base64`, and optionally `installation_id` (all support direct values or SSM references). All apps must be installed on the same org/repos
- The Lambda randomly selects an app for each authentication flow, distributing load evenly across all configured apps
- The selected app index is threaded through the auth chain so the same app is used for JWT → installation token → API calls within a single logical operation
- Installation ID resolution is optimized: when the primary app is selected, the webhook payload's `installation.id` is reused directly (no extra API call). For additional apps, if an `installation_id` was provided in the configuration it is used directly; otherwise it is looked up via the API. This means providing `installation_id` for additional apps avoids one API call per authentication flow
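The selection and resolution steps above can be sketched as follows. This is a minimal illustration only: names like `AppConfig`, `selectAppIndex`, and `resolveInstallationId` are hypothetical, not identifiers from the actual implementation.

```typescript
// Hypothetical sketch of random app selection and installation ID resolution.
// Names and shapes are illustrative, not the module's real API.
interface AppConfig {
  id: string;
  keyBase64: string;
  installationId?: number; // optional: skips one lookup call when provided
}

// Index 0 is the primary app; the rest come from `additional_github_apps`.
// Uniform random choice spreads load evenly across all rate limit buckets.
function selectAppIndex(apps: AppConfig[]): number {
  return Math.floor(Math.random() * apps.length);
}

// Resolve the installation ID for the selected app:
// - primary app: reuse the webhook payload's installation.id (no API call)
// - additional app with a configured installation_id: use it directly
// - otherwise: fall back to an API lookup (one extra call)
async function resolveInstallationId(
  apps: AppConfig[],
  appIndex: number,
  webhookInstallationId: number,
  lookupViaApi: (app: AppConfig) => Promise<number>,
): Promise<number> {
  if (appIndex === 0) return webhookInstallationId;
  return apps[appIndex].installationId ?? (await lookupViaApi(apps[appIndex]));
}
```

The chosen index would then be threaded through the JWT → installation token → API call chain, so one logical operation stays pinned to one app and therefore one rate limit bucket.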
I have an implementation ready and will open a PR shortly.