logit-sec-probe

Statistical evaluation harness that analyzes LLM token entropy and log-probabilities to detect silent model uncertainty during insecure code generation.

Features

- A/B Testing Framework: Compare model behavior with and without safety system prompts
- CWE-based Test Cases: Security test cases for Buffer Overflow (CWE-120), SQL Injection (CWE-89), and XSS (CWE-79)
- Token-level Analysis: Entropy and probability tracking for each generated token
- Risk Tagging: Automatic detection of risky keywords in generated code
- Comparative Visualization: Multi-panel heatmaps for entropy comparison across configurations

Quick Start: Google Colab

Run the interactive tutorial (entropy_analysis_tutorial.ipynb) in Google Colab, directly in your browser. No installation is required.

Installation

Option 1: Docker (Recommended)

No local Python setup is required; you only need Docker:

# Build and run with docker-compose
docker-compose up --build

# Or build and run manually
docker build -t logit-sec-probe .
docker run -v "$(pwd)/output:/app/output" logit-sec-probe

Output files will be saved to the ./output directory.

GPU Support

To enable GPU acceleration, uncomment the GPU section in docker-compose.yml and ensure you have the NVIDIA Container Toolkit installed.

Option 2: Local Installation

pip install -r requirements.txt

Usage

With Docker

docker-compose up

Local

Run the entropy analysis experiment:

python entropy_analysis.py

This script runs the A/B testing experiment:

- Loads CWE test cases from data/cwe_prompts.json
- For each CWE, generates code with two configurations:
  - Base: No system instruction (baseline)
  - Safety: Safety system prompt enabled
- Calculates entropy for each generated token
- Tags risky tokens based on CWE-specific keywords
- Saves results and generates comparative visualizations
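
The per-token entropy step can be illustrated with a minimal sketch. It assumes the usual definition, the Shannon entropy H = -sum(p_i * log p_i) of the softmax distribution over the vocabulary at each position; the function names and toy logits below are hypothetical and are not taken from entropy_analysis.py, which may compute this differently (for example in bits rather than nats).

```python
import math

def softmax(logits):
    """Turn raw logits into probabilities; subtract the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_entropy(logits):
    """Shannon entropy (in nats) of the next-token distribution."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def selected_probability(logits, token_id):
    """Probability the model assigned to the token that was actually chosen."""
    return softmax(logits)[token_id]

# Toy 4-token vocabulary; these logits are made up, not real model output.
uniform = [0.0, 0.0, 0.0, 0.0]   # model has no preference: maximum entropy
peaked = [10.0, 0.0, 0.0, 0.0]   # model strongly prefers token 0: low entropy

print(token_entropy(uniform))            # equals ln(4) for a uniform 4-way choice
print(token_entropy(peaked))             # close to zero
print(selected_probability(peaked, 0))   # close to one
```

A token can be selected with high probability while entropy stays low, or selected from a nearly flat distribution; tracking both values per position is what lets the two cases be told apart.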

Output

- output/experiment_results.csv: CSV file with all experiment data, including:
  - Experiment_ID: CWE identifier
  - Config: Base or Safety configuration
  - Token_Pos: Position of the token in the generated sequence
  - Token_Text: Decoded token text
  - Entropy: Entropy of the token distribution at this position (higher means more uncertainty)
  - Probability: Probability of the selected token
  - Is_Risky: Whether the token contains a risky keyword
- output/comparative_entropy.png: Multi-panel heatmap comparing entropy across configurations
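
As a rough illustration of how the CSV could be consumed, the sketch below averages Entropy per (Experiment_ID, Config) pair using only the standard library. The column names come from the list above; the helper name mean_entropy_by_config and the sample rows are made up for the example, and the exact spelling of values such as "Base", "Safety", or "True" in the real file is an assumption.

```python
import csv
import io
from collections import defaultdict

def mean_entropy_by_config(csv_file):
    """Average the Entropy column for each (Experiment_ID, Config) pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        key = (row["Experiment_ID"], row["Config"])
        sums[key] += float(row["Entropy"])
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Made-up sample rows standing in for output/experiment_results.csv.
sample = io.StringIO(
    "Experiment_ID,Config,Token_Pos,Token_Text,Entropy,Probability,Is_Risky\n"
    "CWE-120,Base,0,strcpy,0.5,0.9,True\n"
    "CWE-120,Base,1,(,1.5,0.6,False\n"
    "CWE-120,Safety,0,strncpy,2.0,0.4,False\n"
)
means = mean_entropy_by_config(sample)
for (experiment, config), value in sorted(means.items()):
    print(experiment, config, value)
```

To run it on real results, pass a file opened with open("output/experiment_results.csv", newline="") instead of the in-memory sample.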

Data

Test cases are defined in data/cwe_prompts.json:

CWE ID	Vulnerability	Risky Keyword
CWE-120	Buffer Overflow	`strcpy`
CWE-89	SQL Injection	`execute`
CWE-79	XSS	`format`
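
Keyword-based risk tagging against this table might look like the sketch below. The keyword map mirrors the table; the matching rule (a case-sensitive substring test on each decoded token) and the names is_risky and tag_tokens are assumptions for illustration, not the project's confirmed logic.

```python
# Risky keyword per CWE, copied from the table above.
RISKY_KEYWORDS = {
    "CWE-120": "strcpy",
    "CWE-89": "execute",
    "CWE-79": "format",
}

def is_risky(token_text, cwe_id, keywords=RISKY_KEYWORDS):
    """Return True if the decoded token contains the keyword for this CWE.

    Assumption: a plain case-sensitive substring match on a single token.
    """
    keyword = keywords.get(cwe_id)
    return keyword is not None and keyword in token_text

def tag_tokens(tokens, cwe_id):
    """Pair every decoded token with its risk flag."""
    return [(tok, is_risky(tok, cwe_id)) for tok in tokens]

# Hypothetical decoded tokens for a CWE-120 generation.
tagged = tag_tokens([" strcpy", "(", "dest"], "CWE-120")
print(tagged)
```

One limitation of per-token matching: a tokenizer may split a keyword across several tokens (for instance "str" followed by "cpy"), in which case no single token matches and a check over the joined text would be needed.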

Repository: khuynh22/logit-sec-probe
