Open
21 commits
4ad74ec
Add deps and config for alt-text-quality rule
kzhou314 Jun 17, 2026
7681fd9
Add alt-text judge provider seam (Copilot + Azure-augmented)
kzhou314 Jun 17, 2026
27e00a0
Extend image extraction with structured context fields
kzhou314 Jun 17, 2026
3c7124c
Add alt-text-quality rule (default disabled)
kzhou314 Jun 17, 2026
1959031
Update tests and add alt-quality fixture
kzhou314 Jun 17, 2026
b863e95
Add offline probe harness
kzhou314 Jun 17, 2026
d7e1296
Add GitHub-representative alt-text-quality fixture
kzhou314 Jun 17, 2026
ad639e0
Refine Azure preamble and clarify NotImplemented vision client
kzhou314 Jun 17, 2026
947e05c
Skip Azure pre-pass for images 50px or smaller
kzhou314 Jun 18, 2026
7893ded
Pass intrinsic image dimensions through probe
kzhou314 Jun 18, 2026
436f64d
Trim verbose comments in Azure vision client
kzhou314 Jun 18, 2026
c9c5ee3
Auto-select azure-augmented mode when Azure creds present
kzhou314 Jun 18, 2026
6c6663c
Document alt-text-quality rule in README
kzhou314 Jun 18, 2026
c7d7d44
Add timeout and retry to model and vision requests
kzhou314 Jun 18, 2026
9c8cb38
Add alt-text-quality to config schema
kzhou314 Jun 18, 2026
b3e26c6
Add unit tests for alt-text-quality rule
kzhou314 Jun 19, 2026
6560deb
Pass page title and section heading to the judge
kzhou314 Jun 19, 2026
794ea25
Add judgment and vision-extraction caches to the judge layer
kzhou314 Jun 19, 2026
dcaa504
Address Copilot review: fetch timeout for images, data URL normalizat…
kzhou314 Jun 19, 2026
fc806c1
Address Copilot review: sanitize image HTML, widen data URL regex, dr…
kzhou314 Jun 19, 2026
5ae4bc1
Trim verbose comments; simplify image extraction
kzhou314 Jun 19, 2026
5 changes: 5 additions & 0 deletions .github/workflows/scan-static-sites.yml
@@ -12,6 +12,11 @@ permissions:
jobs:
  scan:
    runs-on: ubuntu-latest
    # Plugin config for the model-backed alt-text-quality rule.
    env:
      GITHUB_MODELS_TOKEN: ${{ secrets.GH_MODELS_TOKEN }}
      AZURE_VISION_ENDPOINT: ${{ secrets.AZURE_VISION_ENDPOINT }}
      AZURE_VISION_KEY: ${{ secrets.AZURE_VISION_KEY }}
    strategy:
      fail-fast: false
      matrix:
1 change: 1 addition & 0 deletions .gitignore
@@ -4,3 +4,4 @@ coverage/
.vitest/
*.log
.DS_Store
.env
51 changes: 43 additions & 8 deletions README.md
@@ -115,13 +115,14 @@ Trigger your scanner workflow manually or on its configured schedule. The plugin

The plugin runs every extracted image through an append-only registry of rules. Each rule returns a finding when an image fails its criteria, and the scanner turns each finding into an issue.

| Rule | ID | Fires when | Example (flagged) |
| ------------------------ | ---------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ |
| **Missing alt** | `missing-alt-text` | The `alt` attribute is absent (`null`) or whitespace-only (`" "`). `alt=""` is treated as intentional decorative use and is **not** flagged. | `<img src="cat.png">`<br>`<img src="cat.png" alt=" ">` |
| **Vague alt** | `vague-alt-text` | The alt text is one of a curated set of generic single words (`image`, `photo`, `icon`, `logo`, `screenshot`, `chart`, `untitled`, etc.) or short filler phrases (`an image of`, `a photo of`). Normalization is applied before matching: case-insensitive, whitespace-collapsed, surrounding punctuation stripped. | `<img alt="image">`<br>`<img alt="An image of">`<br>`<img alt="PHOTO.">` |
| **Filename as alt** | `filename-alt-text` | The alt text ends in a common image file extension (`.png`, `.jpg`, `.jpeg`, `.gif`, `.svg`, `.webp`, `.bmp`, `.ico`). | `<img alt="IMG_1234.png">`<br>`<img alt="Screenshot 2024-04-28.jpg">` |
| **Repeated alt** | `repeated-alt-text` | Two or more adjacent images on the rendered page share the same normalized alt text. Useful for patterns like five star icons all labeled `"3/5 stars"`. | Five consecutive `<img alt="3/5 stars">` elements |
| **Placeholder alt** | `placeholder-alt-text` | The alt text matches a known boilerplate string that signals it was never written (`todo`, `tbd`, `fixme`, `placeholder`, `alt text`, `insert alt text`, `image alt`). Normalization is applied before matching. | `<img alt="TODO">`<br>`<img alt="insert alt text">` |
| **Alt quality** (opt-in) | `alt-text-quality` | A vision model judges the alt text against the image itself and flags it when the text is inaccurate, incomplete, or otherwise low-quality — plausible-looking alt that the deterministic rules can't catch. **Disabled by default**; requires a GitHub Models token (optionally Azure AI Vision). See [Alt-text quality](#alt-text-quality-model-backed-opt-in). | `<img src="jane-doe-ceo.jpg" alt="a person">` |
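
As a rough illustration of how the deterministic rules in the table could work, the sketch below normalizes alt text and checks it against each rule. The function names (`normalizeAlt`, `matchRules`, `repeatedAltIndexes`) are hypothetical, and the word lists contain only the entries named in the table, so the plugin's actual implementation may differ.

```typescript
// Hypothetical sketch; names and word lists are assumptions, not the plugin's code.

// Only the entries named in the README table; the real sets are larger ("etc.").
const VAGUE = new Set([
  "image", "photo", "icon", "logo", "screenshot", "chart", "untitled",
  "an image of", "a photo of",
]);
const PLACEHOLDER = new Set([
  "todo", "tbd", "fixme", "placeholder", "alt text", "insert alt text", "image alt",
]);
const IMAGE_EXTENSION = /\.(png|jpe?g|gif|svg|webp|bmp|ico)$/i;

// Lowercase, collapse whitespace runs, and strip surrounding punctuation.
function normalizeAlt(alt: string): string {
  return alt
    .toLowerCase()
    .replace(/\s+/g, " ")
    .replace(/^[\s.,;:!?"'()\[\]-]+|[\s.,;:!?"'()\[\]-]+$/g, "");
}

// Returns the IDs of the single-image rules that would flag this alt value.
// `null` means the attribute is absent; "" is treated as decorative.
function matchRules(alt: string | null): string[] {
  if (alt === null) return ["missing-alt-text"];
  if (alt === "") return [];
  if (alt.trim() === "") return ["missing-alt-text"];

  const ids: string[] = [];
  const normalized = normalizeAlt(alt);
  if (VAGUE.has(normalized)) ids.push("vague-alt-text");
  if (IMAGE_EXTENSION.test(alt.trim())) ids.push("filename-alt-text");
  if (PLACEHOLDER.has(normalized)) ids.push("placeholder-alt-text");
  return ids;
}

// Returns the indexes of adjacent images that share the same normalized alt text.
function repeatedAltIndexes(alts: Array<string | null>): number[] {
  const flagged = new Set<number>();
  for (let i = 1; i < alts.length; i++) {
    const previous = alts[i - 1];
    const current = alts[i];
    if (previous == null || current == null) continue;
    const a = normalizeAlt(previous);
    if (a !== "" && a === normalizeAlt(current)) {
      flagged.add(i - 1);
      flagged.add(i);
    }
  }
  return Array.from(flagged).sort((x, y) => x - y);
}
```

Returning every matching ID, rather than stopping at the first, mirrors the description above of each rule producing its own finding. The filename check runs on the trimmed raw text instead of the normalized form; that choice is an assumption.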

### Image extraction

@@ -136,6 +137,40 @@ Before rules run, the plugin extracts images from the page through Playwright's

The scanner's built-in Axe scan includes a rule called [`image-alt`](https://dequeuniversity.com/rules/axe/4.10/image-alt) that catches missing and whitespace-only `alt` attributes. If you have both `"axe"` and `"alt-text-scan"` enabled, the same image may be flagged by both. The other four rules in this plugin (`vague-alt-text`, `filename-alt-text`, `repeated-alt-text`, `placeholder-alt-text`) are unique to the plugin and don't overlap with Axe.

### Alt-text quality (model-backed, opt-in)

The five rules above are deterministic pattern matches. `alt-text-quality` goes further: it sends each image and its alt text to a vision model, which judges whether the alt text accurately and sufficiently describes the image. This catches alt text that looks plausible but is wrong or incomplete, for example `alt="a person"` on a photo of a named individual.

Because the rule makes one model call per image, which adds cost and latency, it is **disabled by default**. To turn it on:

1. Enable the rule in `config.json` (see [Configuration](#configuration)):

   ```json
   {
     "rules": {
       "alt-text-quality": true
     }
   }
   ```

2. Provide a GitHub Models token as the `GITHUB_MODELS_TOKEN` environment variable (a PAT with the `models:read` scope).

Optionally, supply Azure AI Vision credentials (`AZURE_VISION_ENDPOINT` and `AZURE_VISION_KEY`) to add an OCR-and-tags pre-pass that enriches the model's context. When both variables are set, the plugin selects this augmented mode automatically. To force a mode, set `ALT_TEXT_JUDGE_MODE` to `copilot` or `azure-augmented`.
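
The mode selection described above might look roughly like the following. This is a minimal sketch: `selectJudgeMode` and `JudgeMode` are hypothetical names, and treating an unrecognized `ALT_TEXT_JUDGE_MODE` value as "not set" is an assumption. It also does not check whether credentials exist when `azure-augmented` is forced, because the README does not say how the plugin handles that case.

```typescript
type JudgeMode = "copilot" | "azure-augmented";

// Hypothetical helper: picks the judge mode from environment variables.
function selectJudgeMode(env: Record<string, string | undefined>): JudgeMode {
  // An explicit, recognized override always wins.
  const forced = env["ALT_TEXT_JUDGE_MODE"];
  if (forced === "copilot" || forced === "azure-augmented") {
    return forced;
  }

  // Otherwise use the augmented mode only when BOTH Azure values are present.
  const hasAzure =
    Boolean(env["AZURE_VISION_ENDPOINT"]) && Boolean(env["AZURE_VISION_KEY"]);
  return hasAzure ? "azure-augmented" : "copilot";
}
```

In a Node.js process, this would typically be called as `selectJudgeMode(process.env)`.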

In a workflow, store these values as repository secrets and map them to environment variables at the **job** level so that the scanner's sub-actions pass them to the process that runs the plugin. GitHub disallows secret names beginning with `GITHUB_`, so store the token under a different name (e.g. `GH_MODELS_TOKEN`) and map it:

```yaml
jobs:
  accessibility_scanner:
    runs-on: ubuntu-latest
    env:
      GITHUB_MODELS_TOKEN: ${{ secrets.GH_MODELS_TOKEN }}
      AZURE_VISION_ENDPOINT: ${{ secrets.AZURE_VISION_ENDPOINT }} # optional
      AZURE_VISION_KEY: ${{ secrets.AZURE_VISION_KEY }} # optional
    steps:
      # ...as in "Enable the plugin in your workflow" above
```

---

## Output
@@ -175,7 +210,7 @@ To override the default enabled state of one or more rules, add a `config.json`
```

- Each key under `rules` is a rule ID from the [Rules](#rules) table above; the value is `true` (run the rule) or `false` (skip it).
- - Rules you don't list keep their default behavior. Today every rule defaults to enabled.
+ - Rules you don't list keep their default behavior. Every rule defaults to enabled except `alt-text-quality`, which is opt-in (see [Alt-text quality](#alt-text-quality-model-backed-opt-in)).
- Unknown rule IDs and non-boolean values are logged as warnings and ignored (typo guard).
- A missing or malformed `config.json` causes the plugin to run with all defaults.
- The plugin reads the config once at startup, not per URL.
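
The override behavior in the list above can be sketched as follows. The names `resolveRules`, `DEFAULT_RULES`, and `isRecord` are hypothetical, the warning wording is invented for illustration, and warnings are collected in an array rather than logged so the function stays pure. Whether the plugin logs anything for a malformed file is not stated, so this sketch stays silent in that case.

```typescript
type RuleStates = Record<string, boolean>;

interface ResolvedConfig {
  rules: RuleStates;
  warnings: string[];
}

// Defaults as documented: every rule enabled except the opt-in alt-text-quality.
const DEFAULT_RULES: RuleStates = {
  "missing-alt-text": true,
  "vague-alt-text": true,
  "filename-alt-text": true,
  "repeated-alt-text": true,
  "placeholder-alt-text": true,
  "alt-text-quality": false,
};

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// `rawJson` is the text of config.json, or null when the file is missing.
function resolveRules(rawJson: string | null): ResolvedConfig {
  const rules: RuleStates = { ...DEFAULT_RULES };
  const warnings: string[] = [];
  if (rawJson === null) return { rules, warnings };

  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return { rules, warnings }; // malformed file: run with all defaults
  }

  const overrides = isRecord(parsed) ? parsed["rules"] : undefined;
  if (!isRecord(overrides)) return { rules, warnings };

  for (const id of Object.keys(overrides)) {
    const value = overrides[id];
    if (!Object.prototype.hasOwnProperty.call(DEFAULT_RULES, id)) {
      warnings.push(`unknown rule "${id}" ignored`);
    } else if (typeof value !== "boolean") {
      warnings.push(`non-boolean value for "${id}" ignored`);
    } else {
      rules[id] = value;
    }
  }
  return { rules, warnings };
}
```

Starting from a copy of the defaults means that rules absent from `config.json` keep their default state without any special handling.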
3 changes: 2 additions & 1 deletion index.ts
@@ -35,7 +35,8 @@ export default async function altTextScan({page, addFinding}: PluginArgs): Promi
   for (const rule of enabledRules) {
     let results
     try {
-      results = rule.evaluate(ctx)
+      // Rules may be sync or async; await both shapes uniformly.
+      results = await rule.evaluate(ctx)
     } catch (err) {
       console.error(`[alt-text-scan] rule "${rule.id}" threw on ${url}:`, err)
       continue
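
To illustrate why a single `await` covers both rule shapes, here is a minimal, self-contained sketch. The `Rule` and `Finding` interfaces, the `runRules` helper, and the three sample rules are hypothetical stand-ins rather than the plugin's real types, and the context is simplified to a list of alt strings. Failures are recorded in an array here instead of being logged.

```typescript
interface Finding {
  ruleId: string;
  message: string;
}

interface Rule {
  id: string;
  // A rule may return its findings directly or as a promise.
  evaluate(alts: string[]): Finding[] | Promise<Finding[]>;
}

async function runRules(
  rules: Rule[],
  alts: string[],
): Promise<{ findings: Finding[]; failed: string[] }> {
  const findings: Finding[] = [];
  const failed: string[] = [];
  for (const rule of rules) {
    try {
      // Awaiting a non-promise value yields the value itself, so sync rules
      // need no special case; a sync throw or an async rejection both land
      // in the catch block below.
      const results = await rule.evaluate(alts);
      findings.push(...results);
    } catch {
      failed.push(rule.id); // isolate the failure and continue with the next rule
    }
  }
  return { findings, failed };
}

// Sample rules covering the three cases: sync, async, and throwing.
const syncRule: Rule = {
  id: "sync-rule",
  evaluate: (alts) =>
    alts
      .filter((alt) => alt === "image")
      .map(() => ({ ruleId: "sync-rule", message: "vague alt text" })),
};

const asyncRule: Rule = {
  id: "async-rule",
  evaluate: async (alts) =>
    alts.length > 1
      ? [{ ruleId: "async-rule", message: "checked asynchronously" }]
      : [],
};

const brokenRule: Rule = {
  id: "broken-rule",
  evaluate: () => {
    throw new Error("boom");
  },
};
```

Calling `runRules([syncRule, brokenRule, asyncRule], ["image", "a cat"])` resolves with one finding from each working rule and lists `broken-rule` under `failed`.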