Commit 2660cc8

Merge pull request #410 from Integration-Automation/feat/failure-signature-batch
Add failure_signature: normalise + hash errors to stable signatures
2 parents c3a4f1a + 1d084cc commit 2660cc8

11 files changed

Lines changed: 312 additions & 0 deletions

WHATS_NEW.md

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 # What's New: AutoControl
 
+## What's new (2026-06-25): Stable Failure Signatures
+
+Match the *same kind* of failure across runs, despite differing paths and ids. Full reference: [`docs/source/Eng/doc/new_features/v191_features_doc.rst`](docs/source/Eng/doc/new_features/v191_features_doc.rst).
+
+- **`normalize_error` / `failure_signature` / `group_failures`** (`AC_failure_signature`, `AC_group_failures`): two runs that failed the same way rarely have byte-identical error text, because paths, line numbers, addresses, ids and timestamps differ every time. That defeats both "is this the same failure?" and "which tests fail together?". These functions strip the variable parts of an error to a canonical form and hash it (SHA-256), so the same kind of failure gets the same short signature across runs. That signature is the join key the other test-robustness tools (run diffing, flake clustering) group on. `group_failures` buckets a list of errors by signature, most frequent first. Pure stdlib (`re` + `hashlib`). No `PySide6`.
+
 ## What's new (2026-06-24): Visual Saliency (where to look, spectral-residual)
 
 Find the region that stands out, with no template / colour / text. Full reference: [`docs/source/Eng/doc/new_features/v190_features_doc.rst`](docs/source/Eng/doc/new_features/v190_features_doc.rst).
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+Stable Failure Signatures
+=========================
+
+Two runs that failed the *same way* almost never have byte-identical error text:
+paths, line numbers, memory addresses, ids and timestamps differ every time. That
+defeats any attempt to ask "is this the same failure as yesterday?" or "which
+tests fail *together*?". ``failure_signature`` strips the variable parts of an
+error to a canonical form and hashes it (SHA-256), so the same *kind* of failure
+gets the same short signature across runs. That signature is the join key the
+rest of the test-robustness tools (run diffing, flake clustering) group on.
+
+* :func:`normalize_error`: collapse paths / hex addresses / UUIDs / timestamps /
+  line numbers / bare integers to placeholders,
+* :func:`failure_signature`: a short stable SHA-256 of the normalised message,
+* :func:`group_failures`: group a list of errors by signature, most frequent
+  first.
+
+Pure standard library (``re`` + ``hashlib``); no device, no ``PySide6``.
+
+Headless API
+------------
+
+.. code-block:: python
+
+    from je_auto_control import (normalize_error, failure_signature,
+                                 group_failures)
+
+    a = r"Timeout at C:\app\run.py line 42 (0x7ffab12c) at 2026-06-24 11:03:21"
+    b = r"Timeout at C:\app\run.py line 88 (0x1234abcd) at 2026-06-25 09:15:00"
+    normalize_error(a)     # "Timeout at <path> line <n> (0x<addr>) at <ts>"
+    failure_signature(a) == failure_signature(b)   # True: same failure
+
+    group_failures([a, b, "Connection refused to /tmp/x.sock"])
+    # [{"signature": "...", "normalized": "...", "count": 2, "examples": [...]},
+    #  {"signature": "...", "count": 1, ...}]
+
+Windows and POSIX paths, ``0x`` addresses, UUIDs, ISO timestamps, ``line N`` and
+any leftover integers become placeholders; whitespace is squeezed.
+``group_failures`` keeps up to three distinct raw examples per group and skips
+empty / ``None`` messages.
+
+Executor commands
+-----------------
+
+``AC_failure_signature`` (``error`` / ``length``, default 12) returns
+``{signature, normalized}``; ``AC_group_failures`` (``errors``, a list or a JSON
+array string) returns ``{groups, count}``. Both are exposed as read-only ``ac_*``
+MCP tools and as Script Builder commands under **Testing**.
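
The documentation describes the signature as a join key for run diffing. A minimal sketch of that idea follows. Both names are assumptions: `diff_runs` is a hypothetical helper, and `signature` is a deliberately simplified stand-in that only collapses digit runs, not the `failure_signature` implementation from this commit.

```python
import hashlib
import re
from typing import Dict, Iterable, Set


def signature(message: str, length: int = 12) -> str:
    """Simplified stand-in for failure_signature (assumption): only runs of
    digits are collapsed before hashing, not the full placeholder rule set."""
    normalized = re.sub(r"\d+", "<n>", message)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:length]


def diff_runs(previous: Iterable[str], current: Iterable[str]) -> Dict[str, Set[str]]:
    """Hypothetical helper: classify failure signatures across two runs."""
    before = {signature(m) for m in previous if m}
    after = {signature(m) for m in current if m}
    return {
        "new": after - before,         # kinds seen only in the current run
        "fixed": before - after,       # kinds that disappeared
        "persisting": before & after,  # kinds present in both runs
    }


yesterday = ["Timeout after 30 s on attempt 1", "Element 17 not found"]
today = ["Timeout after 45 s on attempt 3", "Connection refused on port 8080"]
result = diff_runs(yesterday, today)
```

Because the comparison works on sets of signatures, the two timeout messages count as one persisting failure even though their raw text differs.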
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+Stable Failure Signatures
+=========================
+
+Two runs that fail in the *same way* almost never produce byte-identical error text: paths, line numbers, memory addresses, ids and
+timestamps differ every time. That makes it impossible to ask "is this the same failure as yesterday?" or "which tests fail *together*?".
+``failure_signature`` strips the variable parts of an error down to a canonical form and hashes it (SHA-256), so failures of the *same kind*
+get the same short signature across runs. That signature is the join key the other test-robustness tools (run diffing, flaky clustering)
+group on.
+
+* :func:`normalize_error`: collapse paths / hex addresses / UUIDs / timestamps / line numbers / bare integers to placeholders,
+* :func:`failure_signature`: a short, stable SHA-256 of the normalised message,
+* :func:`group_failures`: group a list of errors by signature, most frequent first.
+
+Pure standard library (``re`` + ``hashlib``); no device involved, and ``PySide6`` is not imported.
+
+Headless API
+------------
+
+.. code-block:: python
+
+    from je_auto_control import (normalize_error, failure_signature,
+                                 group_failures)
+
+    a = r"Timeout at C:\app\run.py line 42 (0x7ffab12c) at 2026-06-24 11:03:21"
+    b = r"Timeout at C:\app\run.py line 88 (0x1234abcd) at 2026-06-25 09:15:00"
+    normalize_error(a)     # "Timeout at <path> line <n> (0x<addr>) at <ts>"
+    failure_signature(a) == failure_signature(b)   # True: same failure
+
+    group_failures([a, b, "Connection refused to /tmp/x.sock"])
+    # [{"signature": "...", "normalized": "...", "count": 2, "examples": [...]},
+    #  {"signature": "...", "count": 1, ...}]
+
+Windows and POSIX paths, ``0x`` addresses, UUIDs, ISO timestamps, ``line N`` and any leftover integers all become placeholders;
+whitespace is squeezed. ``group_failures`` keeps at most three distinct raw examples per group and skips empty / ``None`` messages.
+
+Executor commands
+-----------------
+
+``AC_failure_signature`` (``error`` / ``length``) returns ``{signature, normalized}``;
+``AC_group_failures`` (``errors``) returns ``{groups, count}``, with the grouped list under ``groups``. Both are provided as read-only ``ac_*`` MCP tools and as Script Builder
+commands (under the **Testing** category).

je_auto_control/__init__.py

Lines changed: 5 additions & 0 deletions
@@ -88,6 +88,10 @@
 from je_auto_control.utils.saliency import (
     most_salient, salient_regions, saliency_map,
 )
+# Stable failure signatures (normalise + hash error text; group failures)
+from je_auto_control.utils.failure_signature import (
+    failure_signature, group_failures, normalize_error,
+)
 # VLM element locator (headless)
 from je_auto_control.utils.vision import (
     VLMNotAvailableError, click_by_description, locate_by_description,
@@ -1665,6 +1669,7 @@ def start_autocontrol_gui(*args, **kwargs):
     "image_quality", "is_blurry", "quality_gate",
     "detect_scale", "scale_sweep",
     "saliency_map", "salient_regions", "most_salient",
+    "normalize_error", "failure_signature", "group_failures",
     # VLM locator
     "VLMNotAvailableError", "locate_by_description", "click_by_description",
     "verify_description",

je_auto_control/gui/script_builder/command_schema.py

Lines changed: 15 additions & 0 deletions
@@ -2708,6 +2708,21 @@ def _add_audit_specs(specs: List[CommandSpec]) -> None:
         description="Aggregate the self-heal log (heal rate, brittle "
                     "locators).",
     ))
+    specs.append(CommandSpec(
+        "AC_failure_signature", "Testing", "Failure Signature",
+        fields=(
+            FieldSpec("error", FieldType.STRING,
+                      placeholder="Timeout at C:\\app.py line 42 (0x7ff..)"),
+            FieldSpec("length", FieldType.INT, optional=True, default=12),
+        ),
+        description="Normalise + hash an error to a stable failure signature.",
+    ))
+    specs.append(CommandSpec(
+        "AC_group_failures", "Testing", "Group Failures by Signature",
+        fields=(FieldSpec("errors", FieldType.STRING,
+                          placeholder='["err one", "err two"]'),),
+        description="Group error messages by failure signature (most frequent).",
+    ))
     specs.append(CommandSpec(
         "AC_scan_secrets", "Tools", "Scan for Hardcoded Secrets",
         description="Scan 'data' (JSON view) for hardcoded secrets that "
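
The `length` field defaults to 12, so a signature keeps the first 12 hex characters of the SHA-256 digest, which is 48 bits. The sketch below shows the prefix relationship and a rough birthday-bound estimate of accidental collisions. `truncated_sha256` and `collision_estimate` are hypothetical names used for illustration only, and the estimate is an approximation that holds for small probabilities.

```python
import hashlib


def truncated_sha256(text: str, length: int = 12) -> str:
    """First `length` hex characters of the SHA-256 digest (at least one)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:max(1, int(length))]


def collision_estimate(distinct_kinds: int, length: int = 12) -> float:
    """Birthday-bound approximation: n * (n - 1) / 2 pairs over 16**length values."""
    space = 16 ** length
    return distinct_kinds * (distinct_kinds - 1) / (2 * space)


normalized = "Timeout at <path> line <n> (0x<addr>) at <ts>"
full = truncated_sha256(normalized, length=64)   # the whole hex digest
short = truncated_sha256(normalized)             # default 12-character prefix
p = collision_estimate(10_000)                   # on the order of 1e-7
```

With 12 characters, even ten thousand distinct failure kinds leave the chance of any two sharing a signature at roughly two in ten million, so the default is short enough to read and still safe as a grouping key at that scale.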

je_auto_control/utils/executor/action_executor.py

Lines changed: 20 additions & 0 deletions
@@ -4348,6 +4348,24 @@ def _most_salient(source: Any = None, region: Any = None, size: Any = 64,
     return {"found": result is not None, "region": result}
 
 
+def _failure_signature(error: str, length: Any = 12) -> Dict[str, Any]:
+    """Adapter: normalise + hash an error message to a stable signature."""
+    from je_auto_control.utils.failure_signature import (
+        failure_signature, normalize_error)
+    return {"signature": failure_signature(str(error), length=int(length)),
+            "normalized": normalize_error(str(error))}
+
+
+def _group_failures(errors: Any) -> Dict[str, Any]:
+    """Adapter: group error messages by failure signature."""
+    import json
+    from je_auto_control.utils.failure_signature import group_failures
+    if isinstance(errors, str):
+        errors = json.loads(errors)
+    groups = group_failures(errors)
+    return {"groups": groups, "count": len(groups)}
+
+
 def _image_histogram(source: Any = None, bins: Any = 32, space: str = "hsv",
                      region: Any = None) -> Dict[str, Any]:
     """Adapter: per-channel colour histogram of an image / the screen."""
@@ -6576,6 +6594,8 @@ def __init__(self):
             "AC_scale_sweep": _scale_sweep,
             "AC_salient_regions": _salient_regions,
             "AC_most_salient": _most_salient,
+            "AC_failure_signature": _failure_signature,
+            "AC_group_failures": _group_failures,
             "AC_image_histogram": _image_histogram,
             "AC_histogram_changed": _histogram_changed,
             "AC_changed_regions": _changed_regions,
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+"""Normalise error messages into stable SHA-256 failure signatures + grouping."""
+from je_auto_control.utils.failure_signature.failure_signature import (
+    failure_signature, group_failures, normalize_error,
+)
+
+__all__ = ["normalize_error", "failure_signature", "group_failures"]
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+"""Normalise an error message into a stable failure signature.
+
+Two runs that failed the *same way* almost never have byte-identical error text:
+paths, line numbers, memory addresses, ids and timestamps differ every time. That
+defeats any attempt to ask "is this the same failure as yesterday?" or "which
+tests fail *together*?". ``failure_signature`` strips the variable parts of an
+error to a canonical form and hashes it (SHA-256), so the same *kind* of failure
+gets the same short signature across runs. That signature is the join key the
+rest of the test-robustness tools (run diffing, flake clustering) group on.
+
+Pure standard library (``re`` + ``hashlib``); no device, no ``PySide6``.
+"""
+import hashlib
+import re
+from typing import Any, Dict, Iterable, List
+
+# Ordered (pattern, replacement): the volatile parts of an error, most specific
+# first so e.g. a path's trailing line number isn't half-collapsed by the digit rule.
+_NORMALIZERS = [
+    (re.compile(r"[A-Za-z]:\\[^\s:*?\"<>|]+"), "<path>"),          # Windows path
+    (re.compile(r"(?:/[\w.\-]+)+/[\w.\-]+"), "<path>"),            # POSIX path
+    (re.compile(r"0x[0-9A-Fa-f]+"), "0x<addr>"),                   # memory address
+    (re.compile(r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
+                r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"), "<uuid>"),
+    (re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:\.\d+)?"), "<ts>"),
+    (re.compile(r"\bline\s+\d+\b", re.IGNORECASE), "line <n>"),
+    (re.compile(r"\b\d+\b"), "<n>"),                               # any leftover int
+]
+_WHITESPACE = re.compile(r"\s+")
+
+
+def normalize_error(message: str) -> str:
+    """Collapse the volatile parts of an error message to a canonical form.
+
+    Paths, hex addresses, UUIDs, timestamps, line numbers and bare integers
+    become placeholders, and whitespace is squeezed, so messages that differ
+    only in those details normalise to the same string.
+    """
+    text = str(message)
+    for pattern, replacement in _NORMALIZERS:
+        text = pattern.sub(replacement, text)
+    return _WHITESPACE.sub(" ", text).strip()
+
+
+def failure_signature(message: str, *, length: int = 12) -> str:
+    """Return a short stable SHA-256 signature of a normalised error message."""
+    digest = hashlib.sha256(normalize_error(message).encode("utf-8")).hexdigest()
+    return digest[:max(1, int(length))]
+
+
+def group_failures(messages: Iterable[str]) -> List[Dict[str, Any]]:
+    """Group error messages by signature, most frequent first.
+
+    Returns ``[{signature, normalized, count, examples}]`` (up to three distinct
+    raw examples per group). ``None`` / empty messages are skipped.
+    """
+    groups: Dict[str, Dict[str, Any]] = {}
+    for message in messages:
+        if not message:
+            continue
+        signature = failure_signature(message)
+        group = groups.setdefault(signature, {
+            "signature": signature, "normalized": normalize_error(message),
+            "count": 0, "examples": []})
+        group["count"] += 1
+        if len(group["examples"]) < 3 and str(message) not in group["examples"]:
+            group["examples"].append(str(message))
+    return sorted(groups.values(), key=lambda group: group["count"], reverse=True)
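
The comment on `_NORMALIZERS` says the rules run most specific first. The following sketch illustrates why, by applying the same patterns once in the commit's order and once with the bare-integer rule moved to the front. `RULES`, `DIGITS_FIRST` and `apply_rules` are names chosen for this illustration; the patterns themselves are copied from the hunk above.

```python
import re

# Same patterns and order as _NORMALIZERS in the commit.
RULES = [
    (re.compile(r"[A-Za-z]:\\[^\s:*?\"<>|]+"), "<path>"),
    (re.compile(r"(?:/[\w.\-]+)+/[\w.\-]+"), "<path>"),
    (re.compile(r"0x[0-9A-Fa-f]+"), "0x<addr>"),
    (re.compile(r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
                r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"), "<uuid>"),
    (re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:\.\d+)?"), "<ts>"),
    (re.compile(r"\bline\s+\d+\b", re.IGNORECASE), "line <n>"),
    (re.compile(r"\b\d+\b"), "<n>"),
]
# Illustrative mis-ordering: the generic integer rule runs before everything else.
DIGITS_FIRST = [RULES[-1]] + RULES[:-1]


def apply_rules(text, rules):
    """Apply (pattern, replacement) pairs in order, then squeeze whitespace."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return re.sub(r"\s+", " ", text).strip()


message = r"Timeout at C:\app\run.py line 42 (0x7ffab12c) at 2026-06-24 11:03:21"
in_order = apply_rules(message, RULES)
# "Timeout at <path> line <n> (0x<addr>) at <ts>"
digits_first = apply_rules(message, DIGITS_FIRST)
# "Timeout at <path> line <n> (0x<addr>) at <n>-<n>-<n> <n>:<n>:<n>"

plain = "Saved at 2026-06-24 11:03:21"
fractional = "Saved at 2026-06-24 11:03:21.500"
same_in_order = apply_rules(plain, RULES) == apply_rules(fractional, RULES)
same_digits_first = (apply_rules(plain, DIGITS_FIRST)
                     == apply_rules(fractional, DIGITS_FIRST))
```

With the integer rule first, the timestamp is broken into separate `<n>` pieces before the timestamp rule can see it, so a message with fractional seconds no longer normalises to the same string as one without. In the commit's order both collapse to `<ts>` and share a signature.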

je_auto_control/utils/mcp_server/tools/_factories.py

Lines changed: 23 additions & 0 deletions
@@ -7652,6 +7652,29 @@ def flakiness_tools() -> List[MCPTool]:
             handler=h.flaky_report,
             annotations=READ_ONLY,
         ),
+        MCPTool(
+            name="ac_failure_signature",
+            description=("Normalise an error message (strip paths / addresses / "
+                         "line numbers / timestamps / ids) and hash it to a stable "
+                         "SHA-256 signature, so the same kind of failure matches "
+                         "across runs. Returns {signature, normalized}."),
+            input_schema=schema({"error": {"type": "string"},
+                                 "length": {"type": "integer"}},
+                                required=["error"]),
+            handler=h.failure_signature,
+            annotations=READ_ONLY,
+        ),
+        MCPTool(
+            name="ac_group_failures",
+            description=("Group a list of error messages by failure signature, "
+                         "most frequent first: [{signature, normalized, count, "
+                         "examples}]."),
+            input_schema=schema({
+                "errors": {"type": "array", "items": {"type": "string"}}},
+                required=["errors"]),
+            handler=h.group_failures,
+            annotations=READ_ONLY,
+        ),
     ]
 

je_auto_control/utils/mcp_server/tools/_handlers.py

Lines changed: 10 additions & 0 deletions
@@ -2543,6 +2543,16 @@ def most_salient(source=None, region=None, size=64, threshold=None, min_area=4):
     return _most_salient(source, region, size, threshold, min_area)
 
 
+def failure_signature(error, length=12):
+    from je_auto_control.utils.executor.action_executor import _failure_signature
+    return _failure_signature(error, length)
+
+
+def group_failures(errors):
+    from je_auto_control.utils.executor.action_executor import _group_failures
+    return _group_failures(errors)
+
+
 def image_histogram(source=None, bins=32, space="hsv", region=None):
     from je_auto_control.utils.executor.action_executor import _image_histogram
     return _image_histogram(source, bins, space, region)
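
The `group_failures` handler forwards `errors` unchanged to the `_group_failures` adapter, which decodes it with `json.loads` when it arrives as a string. That lets the Script Builder's string field and the MCP tool's array share one code path. The sketch below isolates that coercion. `coerce_errors` is a hypothetical helper, and the explicit list check is an addition for illustration that the commit does not contain.

```python
import json
from typing import Any, List


def coerce_errors(errors: Any) -> List[str]:
    """Hypothetical helper: accept a list or a JSON array string of messages."""
    if isinstance(errors, str):
        # json.JSONDecodeError (a ValueError subclass) is raised on non-JSON text.
        errors = json.loads(errors)
    if not isinstance(errors, list):
        # Assumption: reject JSON objects / scalars instead of iterating them.
        raise TypeError("errors must be a list or a JSON array string")
    return errors


from_builder = coerce_errors('["err one", "err two"]')  # Script Builder style
from_mcp = coerce_errors(["err one", "err two"])        # MCP array style
```

One consequence of decoding every string as JSON is that a single bare error message such as `Timeout` is not valid input. It has to be wrapped in a JSON array first.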
