feat(security): add behavioral session tracker with trifecta detection #965

Open
gemini2026 wants to merge 5 commits into NVIDIA:main from gemini2026:feat/session-tracker

Conversation

@gemini2026 gemini2026 commented Mar 26, 2026

Closes #964

Summary

Adds behavioral session tracking to NemoClaw's security toolkit, detecting multi-step exfiltration attacks that per-action policy gates miss.

Trifecta detection:
Most agent security gates evaluate each tool call in isolation.
An agent that reads credentials, ingests untrusted data, and opens an outbound connection across separate actions can bypass per-action checks.
The session tracker aggregates these capabilities over the lifetime of a session and raises the risk level when all three appear together.

Three capability classes:

  • read_sensitive — agent read credential or configuration files
  • ingested_untrusted — agent fetched from external URLs or piped untrusted data
  • has_egress — agent attempted network egress

Risk classification:

  • 0 capabilities = "clean"
  • 1–2 capabilities = "elevated"
  • All 3 capabilities = "critical" (trifecta)
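
The classification above can be sketched as a small pure function. This is a minimal illustration, not the code in this PR: the `CapabilityFlags` shape and the `classifyRisk` name are assumptions chosen for the example.

```typescript
// Hypothetical types for illustration; the PR's exported types may differ.
type RiskLevel = "clean" | "elevated" | "critical";

interface CapabilityFlags {
  readonly read_sensitive: boolean;
  readonly ingested_untrusted: boolean;
  readonly has_egress: boolean;
}

// Count the capabilities seen so far and map the count to a risk level.
function classifyRisk(flags: CapabilityFlags): RiskLevel {
  const count =
    Number(flags.read_sensitive) +
    Number(flags.ingested_untrusted) +
    Number(flags.has_egress);
  if (count === 3) return "critical"; // all three present: trifecta
  return count === 0 ? "clean" : "elevated";
}
```

Keeping the mapping a function of the flags alone means the risk level never depends on the order in which the capabilities appeared.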

Session exposure API:

  • Record capability events with tool name and detail
  • Query capabilities, risk level, and trifecta status per session
  • List all sessions with summaries
  • Detailed exposure data: deduplicated sensitive files and external URLs, full egress attempt log
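
A rough sketch of how the detailed exposure data could be derived from an event log. The `CapabilityEvent` fields, the `ExposureSummary` shape, and the `summarizeExposure` helper are assumptions for illustration; the egress string format follows the behavior described in the review thread (tool name alone when the detail is empty).

```typescript
// Hypothetical shapes for illustration only.
type Capability = "read_sensitive" | "ingested_untrusted" | "has_egress";

interface CapabilityEvent {
  readonly capability: Capability;
  readonly tool: string;
  readonly detail: string;
}

interface ExposureSummary {
  readonly sensitiveFiles: readonly string[];
  readonly externalUrls: readonly string[];
  readonly egressAttempts: readonly string[];
}

function summarizeExposure(events: readonly CapabilityEvent[]): ExposureSummary {
  const files = new Set<string>(); // deduplicated sensitive file paths
  const urls = new Set<string>(); // deduplicated external URLs
  const egress: string[] = []; // every attempt is kept, duplicates included
  events.forEach((e) => {
    if (e.capability === "read_sensitive") {
      files.add(e.detail);
    } else if (e.capability === "ingested_untrusted") {
      urls.add(e.detail);
    } else {
      // No trailing space when the detail is empty.
      egress.push(e.detail === "" ? e.tool : `${e.tool} ${e.detail}`);
    }
  });
  return {
    sensitiveFiles: Array.from(files),
    externalUrls: Array.from(urls),
    egressAttempts: egress,
  };
}
```

Files and URLs are sets because the useful question is which resources were touched, while the egress log stays a plain list so repeated attempts remain visible.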

Hardening:

  • Input validation on all public methods (an empty sessionId makes record a no-op and makes query methods return null)
  • Event cap at 100 per session to bound memory usage
  • readonly on all output interface fields
  • RiskLevel string literal union type
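
The empty-sessionId guard and the event cap can be sketched roughly as follows. `SketchSessionStore`, `eventCount`, and the internal `Session` shape are hypothetical names for this example; only the cap of 100 and the rule that capability flags still update after the cap come from this description.

```typescript
// Illustrative sketch only; not the SessionStore shipped in this PR.
const MAX_EVENTS = 100; // cap stated in the PR description

type Capability = "read_sensitive" | "ingested_untrusted" | "has_egress";

interface CapabilityEvent {
  readonly capability: Capability;
  readonly tool: string;
  readonly detail: string;
}

interface Session {
  capabilities: Set<Capability>;
  events: CapabilityEvent[];
}

class SketchSessionStore {
  private readonly sessions = new Map<string, Session>();

  record(sessionId: string, capability: Capability, tool: string, detail = ""): void {
    if (sessionId === "") return; // empty id: no-op
    let sess = this.sessions.get(sessionId);
    if (sess === undefined) {
      sess = { capabilities: new Set<Capability>(), events: [] };
      this.sessions.set(sessionId, sess);
    }
    sess.capabilities.add(capability); // flag updates even when the log is full
    if (sess.events.length < MAX_EVENTS) {
      sess.events.push({ capability, tool, detail });
    }
  }

  // Returns a copy of the capability set, or null for empty or unknown ids.
  getCapabilities(sessionId: string): Capability[] | null {
    if (sessionId === "") return null;
    const sess = this.sessions.get(sessionId);
    return sess === undefined ? null : Array.from(sess.capabilities);
  }

  // Number of logged events, or null for empty or unknown ids.
  eventCount(sessionId: string): number | null {
    if (sessionId === "") return null;
    const sess = this.sessions.get(sessionId);
    return sess === undefined ? null : sess.events.length;
  }
}
```

Updating the capability set before checking the cap is the important ordering: a session that fills its log with reads still escalates when an egress attempt arrives later.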

Full Vitest test coverage (35 tests) including trifecta detection, session isolation, deduplication, boundary conditions, and empty-input edge cases.

Self-contained under nemoclaw/src/security/ with no dependencies on existing NemoClaw internals.
Reference documentation at docs/reference/session-tracker.md.

Test plan

  • cd nemoclaw && npx vitest run src/security/session-tracker.test.ts — 35/35 tests pass
  • npm test — 576/576 total tests pass, 0 regressions
  • tsc --noEmit — clean type check
  • make check — all linters and hooks pass
  • Clean 3-file diff (no extraneous files)
  • Verified trifecta detected only when all 3 capabilities present
  • Verified sessions are isolated from each other
  • Verified event cap at 100 (101st dropped, capability still recorded)
  • Verified exposure deduplicates files and URLs but not egress attempts
  • Verified empty sessionId returns null on all public methods

Summary by CodeRabbit

  • New Features

    • Added a Session Tracker to monitor per-session behaviors, detect a three-capability "trifecta", cap event logs at 100, and classify session risk (clean/elevated/critical). Includes session listings and detailed per-session exposure summaries.
  • Documentation

    • Published a reference guide describing tracker behavior, risk model, event semantics, exposure reporting, and event retention rules.
  • Tests

    • Added comprehensive tests for recording, trifecta detection, risk classification, event capping, session isolation, listing, and exposure output.

Tracks three capability classes per session: read_sensitive,
ingested_untrusted, has_egress. When all three appear (trifecta),
risk escalates to critical, detecting multi-step exfiltration
attacks that per-action gates miss.

Align getExposure with getCapabilities and hasTrifecta by adding
an explicit empty-string guard. Add corresponding test case.
@coderabbitai
Contributor

coderabbitai bot commented Mar 26, 2026

📝 Walkthrough

Adds an in-memory SessionStore that records per-session capability events (read_sensitive, ingested_untrusted, has_egress), classifies session risk (clean/elevated/critical), detects the "trifecta" (all three), and exposes listing and detailed exposure APIs; includes tests and documentation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Session tracker implementation (`nemoclaw/src/security/session-tracker.ts`) | New exported enums/types and SessionStore with record, getCapabilities, hasTrifecta, listSessions, getExposure. In-memory Map per session, event log capped at 100 items (further events dropped but capability flags still update). Deduplicates sensitive file and external URL details; egress attempts recorded without deduplication. |
| Tests for session tracker (`nemoclaw/src/security/session-tracker.test.ts`) | New Vitest suite covering recording, capability retrieval semantics (copy vs reference), trifecta detection, risk classification, event cap behavior (100 vs 101), session isolation, exposure formatting and deduplication rules, and listSessions summaries. |
| Documentation (`docs/reference/session-tracker.md`) | New reference page describing capability classes, trifecta and risk definitions, event semantics (CapabilityEvent structure, 100-event cap), and the exported TypeScript API and types (Capability, RiskLevel, CapabilityEvent, SessionSummary, SessionExposure, SessionStore methods). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hop through sessions, tally each clue,

a secret read, a fetch, an egress askew.
One hundred crumbs I nest and keep,
flags bloom quiet, then trifecta leaps.
I twitch my whiskers — watchful, spry, and true.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a behavioral session tracker with trifecta detection capability, which is the primary focus of the PR. |
| Linked Issues check | ✅ Passed | The PR implementation fully satisfies all coding requirements from issue #964: session tracking module with trifecta detection, risk classification, event recording, capability queries, session listing, and exposure API with proper deduplication and event caps. |
| Out of Scope Changes check | ✅ Passed | All changes are strictly scoped to the three new files specified in issue #964: session-tracker.ts implementation, session-tracker.test.ts tests, and session-tracker.md documentation; no modifications to existing NemoClaw code. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
docs/reference/session-tracker.md (2)

3-3: Align the H1 with title.page.

The H1 does not match the frontmatter page title.

As per coding guidelines: "H1 heading matches the title.page frontmatter value."

Also applies to: 21-21

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/session-tracker.md` at line 3, Update the document so the
top-level H1 matches the frontmatter title.page value ("Session Tracker —
Behavioral Trifecta Detection"): locate the frontmatter key title.page and
replace or edit the existing H1 heading (the leading "# ..." line) to exactly
match that value, ensuring punctuation and capitalization are identical; also
check the second occurrence (line 21) mentioned and make it consistent with
title.page as well.

6-8: Fix product-name casing in frontmatter metadata.

Use NemoClaw, OpenClaw, and OpenShell casing consistently (including keywords/tags).

As per coding guidelines: "NemoClaw, OpenClaw, and OpenShell must use correct casing."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/session-tracker.md` around lines 6 - 8, The frontmatter arrays
keywords and tags use incorrect product-name casing—update occurrences of
"nemoclaw" to "NemoClaw", "openclaw" to "OpenClaw", and "openshell" to
"OpenShell" in the keywords and tags entries in
docs/reference/session-tracker.md (the frontmatter keys: keywords and tags) so
the metadata uses the correct product-name casing consistently.
nemoclaw/src/security/session-tracker.test.ts (1)

121-150: Consolidate duplicated event-cap tests.

These two blocks validate the same boundary behavior (100 stored, 101st dropped). Merging them into one parameterized/compact suite would keep intent while reducing maintenance overhead.

Also applies to: 329-349

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/src/security/session-tracker.test.ts` around lines 121 - 150, The
two test blocks under the "event cap" suite duplicate boundary checks for
storing exactly 100 events and dropping the 101st; consolidate them by creating
a single parameterized/compact test that covers: (1) inserting N events where
N=100 asserts exposure.events length is 100 and last event detail is "/file-99",
(2) inserting N+1 events where the 101st is dropped, and (3) after the log is
full a subsequent store.record("s1", Capability.HasEgress, ...) still flips
getCapabilities("s1")[Capability.HasEgress] to true while exposure.events
remains length 100; reuse store.record, getExposure and getCapabilities calls
and remove the duplicated block (the similar tests later in the file) so the
suite tests the boundary once with clear parameterization.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/reference/session-tracker.md`:
- Around line 95-96: Update the docs for getExposure() to state that it returns
null not only for unknown sessions but also when an empty sessionId is passed;
specifically mention the empty-sessionId null behavior so it matches the
implementation of getExposure(sessionId) and references the parameter name
sessionId and the function getExposure().
- Line 101: The sentence about egressAttempts is misleading because the
implementation emits just the tool name when detail is empty; update the wording
to state that egressAttempts records every has_egress event as "tool + ' ' +
detail" when detail is non-empty, and as just the tool name when detail === ""
(i.e., no trailing space), referencing the egressAttempts array and the
has_egress event/detail variable to make the conditional behavior explicit.

In `@nemoclaw/src/security/session-tracker.ts`:
- Around line 222-223: getExposure currently shallow-copies the session.events
array (const eventsCopy: CapabilityEvent[] = [...sess.events]) which still
exposes the internal CapabilityEvent objects to external mutation; change this
to return deep copies of each event instead — for example, replace the shallow
spread with a per-item clone (e.g., map and shallow-clone each event object with
object spread or use structuredClone/JSON deep-clone if events contain nested
state) so callers of getExposure cannot mutate sess.events; update references in
getExposure and any tests to use the cloned events.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a4fe7a16-7e3b-4730-8713-6e296e3f559e

📥 Commits

Reviewing files that changed from the base of the PR and between f0f53e4 and a17e65d.

📒 Files selected for processing (3)
  • docs/reference/session-tracker.md
  • nemoclaw/src/security/session-tracker.test.ts
  • nemoclaw/src/security/session-tracker.ts

- Deep-copy CapabilityEvent objects in getExposure to prevent
  external mutation of internal session state
- Document getExposure null return for empty sessionId
- Clarify egressAttempts formatting when detail is empty
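
The difference between the shallow array copy flagged in review and a per-event clone can be illustrated with a small sketch. `copyShallow` and `copyPerEvent` are hypothetical helper names, and the event shape is assumed to be flat (no nested objects), which is why an object spread is enough here.

```typescript
// Assumed flat event shape, for illustration only.
interface CapabilityEvent {
  capability: string;
  tool: string;
  detail: string;
}

// New array, but the elements are the same objects as the internal log.
function copyShallow(events: CapabilityEvent[]): CapabilityEvent[] {
  return events.slice();
}

// New array and a new object per event, so callers cannot reach internal state.
function copyPerEvent(events: CapabilityEvent[]): CapabilityEvent[] {
  return events.map((e) => ({ ...e }));
}
```

With `copyShallow`, writing to `result[0].detail` also changes the stored event; with `copyPerEvent`, the stored event is unaffected. If events ever gain nested fields, a deeper clone would be needed.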
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/reference/session-tracker.md`:
- Line 3: The H1 heading "# Session Tracker" does not match the frontmatter key
title.page ("Session Tracker — Behavioral Trifecta Detection"); update the H1 so
it exactly matches title.page (including punctuation and spacing) to satisfy the
page-structure rule and ensure the H1 equals title.page.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 209b67da-c891-4e74-a04f-3c003296fd58

📥 Commits

Reviewing files that changed from the base of the PR and between a17e65d and 7a8ee44.

📒 Files selected for processing (2)
  • docs/reference/session-tracker.md
  • nemoclaw/src/security/session-tracker.ts
✅ Files skipped from review due to trivial changes (1)
  • nemoclaw/src/security/session-tracker.ts

@@ -0,0 +1,163 @@
---
title:
page: "Session Tracker — Behavioral Trifecta Detection"
Contributor

⚠️ Potential issue | 🟡 Minor

Align the H1 with title.page in frontmatter.

The page-structure rule requires the H1 to match title.page, but Line 21 (# Session Tracker) does not match Line 3 (Session Tracker — Behavioral Trifecta Detection).

As per coding guidelines, "H1 heading matches the title.page frontmatter value."

Also applies to: 21-21

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/session-tracker.md` at line 3, The H1 heading "# Session
Tracker" does not match the frontmatter key title.page ("Session Tracker —
Behavioral Trifecta Detection"); update the H1 so it exactly matches title.page
(including punctuation and spacing) to satisfy the page-structure rule and
ensure the H1 equals title.page.

Align with the pattern used by other reference pages where H1 is
the short name and title.page adds descriptive context.
Contributor

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
docs/reference/session-tracker.md (1)

21-21: ⚠️ Potential issue | 🟡 Minor

H1 must match title.page exactly.

The H1 heading "Session Tracker" does not match the title.page frontmatter value "Session Tracker — Detect Multi-Step Exfiltration Attacks" (line 3). Update line 21 to include the full title with the em dash and subtitle.

As per coding guidelines, "H1 heading matches the title.page frontmatter value."

📝 Proposed fix
-# Session Tracker
+# Session Tracker — Detect Multi-Step Exfiltration Attacks
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/session-tracker.md` at line 21, Update the H1 heading so it
exactly matches the frontmatter title.page value: replace the current "Session
Tracker" H1 with "Session Tracker — Detect Multi-Step Exfiltration Attacks"
(ensure you include the em dash and subtitle exactly as in title.page).
🧹 Nitpick comments (1)
docs/reference/session-tracker.md (1)

40-40: Replace colon with em dash or period.

The colon in this sentence introduces an explanation rather than a list. The formatting rule states that colons should only introduce lists. Consider using an em dash or splitting into two sentences.

As per coding guidelines, "Colons should only introduce a list. Flag colons used as general punctuation between clauses."

📝 Suggested alternatives

Option 1: Use em dash

-A trifecta indicates a possible exfiltration chain: read a secret, get instructions from an attacker, and send the secret out.
+A trifecta indicates a possible exfiltration chain — read a secret, get instructions from an attacker, and send the secret out.

Option 2: Two sentences

-A trifecta indicates a possible exfiltration chain: read a secret, get instructions from an attacker, and send the secret out.
+A trifecta indicates a possible exfiltration chain.
+The agent reads a secret, gets instructions from an attacker, and sends the secret out.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/session-tracker.md` at line 40, Replace the colon in the
sentence "A trifecta indicates a possible exfiltration chain: read a secret, get
instructions from an attacker, and send the secret out." with an em dash or
split into two sentences so it no longer uses a colon as general punctuation;
update the line to either "A trifecta indicates a possible exfiltration chain —
read a secret, get instructions from an attacker, and send the secret out." or
"A trifecta indicates a possible exfiltration chain. It means reading a secret,
getting instructions from an attacker, and sending the secret out."

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c98d63a2-4938-4370-a148-41e48fce8897

📥 Commits

Reviewing files that changed from the base of the PR and between 7a8ee44 and ce785dc.

📒 Files selected for processing (1)
  • docs/reference/session-tracker.md

Contributor

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (3)
docs/reference/session-tracker.md (3)

102-102: Use active voice.

The phrase "are not deduplicated" is passive. Rewrite in active voice.

Suggested revision:

-  Egress attempts are not deduplicated.
+  The tracker does not deduplicate egress attempts.

As per coding guidelines, "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/session-tracker.md` at line 102, Replace the passive sentence
"Egress attempts are not deduplicated." with an active-voice version (for
example: "We do not deduplicate egress attempts." or "The system does not
deduplicate egress attempts.") in docs/reference/session-tracker.md so the
statement complies with the project's active-voice guideline.

56-56: Use active voice.

The phrase "are dropped" is passive. Rewrite in active voice.

Suggested revision:

-Events beyond the 100th are dropped, but the capability set continues to update.
+The tracker drops events beyond the 100th, but continues to update the capability set.

As per coding guidelines, "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/session-tracker.md` at line 56, Replace the passive sentence
"Events beyond the 100th are dropped, but the capability set continues to
update." with an active-voice version such as "We drop events beyond the 100th,
but the capability set continues to update." to satisfy the active voice
guideline; locate that exact sentence in session-tracker.md and update it
accordingly.

77-77: Use active voice.

The phrase "are silently ignored" is passive. Rewrite in active voice.

Suggested revision:

-Empty `sessionId` values are silently ignored.
+The method silently ignores empty `sessionId` values.

As per coding guidelines, "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/session-tracker.md` at line 77, Change the passive sentence
about `sessionId` to active voice: locate the sentence containing `sessionId`
and replace "Empty `sessionId` values are silently ignored." with an active
phrasing such as "The system ignores empty `sessionId` values." (or "We ignore
empty `sessionId` values.") to meet the active-voice guideline.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4040fa89-6063-4cd9-ab1e-d75c361b8ff5

📥 Commits

Reviewing files that changed from the base of the PR and between ce785dc and 282315f.

📒 Files selected for processing (1)
  • docs/reference/session-tracker.md


Development

Successfully merging this pull request may close these issues.

feat: behavioral session tracking with multi-step attack detection

1 participant