StegX hides files inside PNG images using password-shuffled LSB embedding and authenticated encryption. Version 2 is a ground-up security rewrite built around a versioned container format, Argon2id key derivation, domain-separated HKDF sub-keys, LSB-matching (±1) embedding that defeats chi-square steganalysis and — optionally — ChaCha20-Poly1305 dual-cipher, F5-style matrix embedding, adaptive cost-map filtering, keyfile 2FA, plausible-deniability decoy payloads, and k-of-n Shamir secret sharing across multiple cover images.
- Argon2id replaces PBKDF2 as the default password KDF (PBKDF2 is still selectable via `--kdf pbkdf2` at 600k iterations, and the legacy v1 format is still readable for backwards compatibility).
- HKDF sub-keys derive independent AES-GCM, ChaCha20-Poly1305, position-shuffle seed and sentinel keys from one master key, so the slow password KDF runs once per operation.
- AEAD with associated-data binds the entire container header (version/flags/KDF params/salts/nonces) to the ciphertext, so any tampering invalidates the GCM tag.
- Dual-cipher mode (`--dual-cipher`) layers ChaCha20-Poly1305 on top of AES-256-GCM with independent keys: defence in depth against a catastrophic break of either cipher.
- Keyfile 2FA (`--keyfile PATH`) mixes an external binary into the KDF input; password alone no longer suffices.
- Multi-algorithm compressor (`--compression best`, default): every payload is fed to zlib-9, LZMA2-extreme, bzip2-9, zstd-22, zstd-22 with a bundled pre-trained dictionary, and brotli-11 in parallel; the smallest output wins and is tagged in the metadata so the decoder knows what to reverse. On compressible data (text/JSON/code) the result is typically 40-75% smaller than with zlib alone. The dictionary (~5 KiB, shipped at `src/stegx/data/stegx_dict_v1.zstd`) was trained on a corpus of common file-type headers (PE/ELF/PDF/ZIP/JSON/text/image), so it wins particularly on small payloads under ~1 KiB where plain zstd pays its header overhead. Random/encrypted payloads fall through to `none` (storing raw bytes).
- `--compression fast` is for latency-critical scenarios: zlib only, same behaviour as pre-2.0.
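To illustrate the "smallest output wins" selection described above, here is a minimal sketch using only the codecs in the Python standard library. It is not StegX's implementation: `CODECS`, `compress_best` and `decompress_tagged` are hypothetical names, zstd and brotli are left out because they need third-party packages, and the codecs run one after another rather than in parallel.

```python
import bz2
import lzma
import zlib
from typing import Tuple

# Hypothetical codec table: tag -> (compress, decompress).
CODECS = {
    "zlib": (lambda b: zlib.compress(b, 9), zlib.decompress),
    "lzma": (lambda b: lzma.compress(b, preset=9 | lzma.PRESET_EXTREME), lzma.decompress),
    "bz2": (lambda b: bz2.compress(b, 9), bz2.decompress),
}


def compress_best(data: bytes) -> Tuple[str, bytes]:
    """Try every codec and return (tag, output) for the smallest result."""
    best_tag, best_out = "none", data  # raw bytes are the fallback
    for tag, (compress, _decompress) in CODECS.items():
        out = compress(data)
        if len(out) < len(best_out):
            best_tag, best_out = tag, out
    return best_tag, best_out


def decompress_tagged(tag: str, blob: bytes) -> bytes:
    """Reverse compress_best using the tag stored next to the payload."""
    return blob if tag == "none" else CODECS[tag][1](blob)
```

Because the raw input is the starting candidate, incompressible data (random or already encrypted bytes) keeps the `none` tag and is stored unchanged.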
`stegx encode -f a.zip b.txt c.pdf -o out.png` bundles multiple inputs into an in-memory tar archive before compression and encryption. On decode, the bundle flag in metadata triggers transparent extraction: each member is written to the destination directory with its original filename. Path traversal attempts (`../`, absolute paths, symlinks, devices) are rejected during extraction.
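The rejection rules for bundle members can be pictured with the standard `tarfile` module. This is a sketch under assumptions, not the project's extraction code: `is_safe_member` and `safe_members` are hypothetical helpers, only regular files are accepted (so directories are rejected too), and Windows-style drive paths are not handled.

```python
import io
import tarfile
from pathlib import PurePosixPath


def is_safe_member(member: tarfile.TarInfo) -> bool:
    """Accept only regular files whose path stays inside the destination."""
    if not member.isfile():  # rejects symlinks, hard links, devices, FIFOs, dirs
        return False
    path = PurePosixPath(member.name)
    return not path.is_absolute() and ".." not in path.parts


def safe_members(archive: bytes):
    """Yield (name, data) per member; raise ValueError on the first unsafe one."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        for member in tar:
            if not is_safe_member(member):
                raise ValueError(f"unsafe archive member: {member.name!r}")
            yield member.name, tar.extractfile(member).read()
```

Checking every member before any bytes are written means a single hostile entry aborts the whole extraction instead of leaving a partial result.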
`stegx pick-cover --dir ./covers --payload secret.zip` ranks every image in a directory by capacity and Shannon entropy, and picks the best fit for a given payload. Useful for choosing a cover that has enough headroom AND enough texture to hide LSB modifications.
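For reference, Shannon entropy over raw bytes can be computed as below. How `pick-cover` actually combines capacity and entropy is not stated here, so `rank_covers` is a hypothetical ranking (filter by capacity, then sort by entropy) and its input format is an assumption.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def rank_covers(covers: Dict[str, Tuple[int, float]], payload_size: int) -> List[str]:
    """Hypothetical ranking: covers maps name -> (capacity_bytes, entropy).

    Covers that cannot hold the payload are dropped; the rest are ordered
    from highest to lowest entropy.
    """
    fits = [(name, ent) for name, (cap, ent) in covers.items() if cap >= payload_size]
    return [name for name, _ in sorted(fits, key=lambda item: item[1], reverse=True)]
```

A flat, single-colour image scores near 0 bits per byte, while a noisy photograph approaches 8, which is the "texture" the paragraph above refers to.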
- LSB matching (±1) replaces LSB replacement by default — defeats the asymmetry exploited by chi-square and RS analysis.
- Adaptive embedding (`--adaptive`) filters pixel positions by cost map. Two modes are available via `--adaptive-mode`:
  - `laplacian` (default, fast): keeps positions with the highest edge response. Adequate for classical steganalysers.
  - `hill`: HILL-inspired cost map (Li et al., ICIP 2014) with a KB high-pass + double box-blur pipeline. Stronger against CNN-based steganalysers (SRNet / YeNet) at a small extra compute cost.
- Matrix (F5) embedding (`--matrix-embedding`) uses Hamming(7,3) coding to cut the per-bit change rate by ~2.3×.
- Per-image HMAC sentinel replaces the fixed `STEGX_EOD` marker, so the sentinel varies with password and cover and can't be pattern-matched.
- Capacity ceiling (`--max-fill PCT`, default 25%) rejects oversize payloads that would be trivially detectable by CNN steganalysers.
- PNG metadata stripping clears Pillow's `Software` fingerprint chunk; the output's encoder parameters mirror the cover's (`compress_level`) so file-size and chunk comparisons don't flag the stego.
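The matrix-embedding bullet above refers to the classic F5-style trick of hiding 3 message bits in 7 cover LSBs while changing at most one of them. The sketch below shows that syndrome-coding step in isolation; `syndrome`, `embed3` and `extract3` are illustrative names and do not claim to match StegX's internals.

```python
from typing import List


def syndrome(lsbs: List[int]) -> int:
    """XOR of the 1-based indices of all set bits; always a value in 0..7."""
    acc = 0
    for index, bit in enumerate(lsbs, start=1):
        if bit:
            acc ^= index
    return acc


def embed3(lsbs: List[int], message: int) -> List[int]:
    """Embed a 3-bit message (0..7) into 7 LSBs by flipping at most one bit."""
    if len(lsbs) != 7 or not 0 <= message <= 7:
        raise ValueError("need exactly 7 LSBs and a message in 0..7")
    out = list(lsbs)
    flip = syndrome(out) ^ message  # 0 means the cover already encodes the message
    if flip:
        out[flip - 1] ^= 1
    return out


def extract3(lsbs: List[int]) -> int:
    """Recover the 3-bit message: it is simply the syndrome of the 7 LSBs."""
    return syndrome(lsbs)
```

In one case out of eight the cover group already carries the desired syndrome and nothing is changed. When a bit does have to flip, the flip can be realised with a ±1 step on the sample, so matrix embedding composes with LSB matching.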
- `getpass` by default for password entry, plus `--password-stdin` for scripting. `-p` still works but warns loudly: it leaks into shell history and `ps`.
- `zxcvbn` password-strength gate: warn on score < 3; `--strict-password` refuses weak passwords outright.
- Unified decode error message: wrong password, wrong keyfile, and non-StegX image all report the same text, removing an oracle.
- Constant-time sentinel compare via `hmac.compare_digest`.
- OS-level memory locking via `mlock(2)` on Linux / macOS and `VirtualLock` on Windows for every master key and HKDF sub-key; this prevents secrets from being paged to swap / hibernation files. Falls back to plain zeroisation if the OS rejects the lock (e.g. without `CAP_IPC_LOCK` or sufficient working-set quota).
- Best-effort memory wipe of master keys and derived sub-keys after use.
- `--fips` mode restricts the pipeline to FIPS 140-validated primitives: PBKDF2-HMAC-SHA256, AES-256-GCM, HKDF-SHA256 and zlib-only compression. Refuses Argon2id, ChaCha20-Poly1305, brotli, lzma, bz2 and zstd. Suitable for compliance-bound environments.
- Versioned container (magic byte + version byte + flags) makes future algorithm upgrades non-breaking.
- Plausible-deniability decoy (`--decoy-file` / `--decoy-password`): the cover is split into two disjoint regions; either password unlocks only its own region. Without both passwords, an observer cannot tell whether a second region carries data.
- Paranoid cover split (`--always-split-cover`): always reserves the decoy half and fills it with cryptographically random bits whenever no real decoy is supplied. Equalises LSB modification density across both halves so a statistical observer cannot distinguish "decoy in use" from "no decoy" cases. Costs 50 % of cover capacity; opt-in only.
- k-of-n Shamir split (`stegx shamir-split` / `stegx shamir-combine`): distribute a secret across n cover images; any k reconstruct it.
The v2 payload format is not backwards-compatible with stego images produced by StegX ≤ 1.2.1 (sentinel, seed derivation and container layout all changed). StegX 2.0 can still read v1 stego images transparently via a fallback path, but new stego images use v2. Re-encode anything important.
- Prerequisites:
  - Python 3.8 or higher.
  - `pip` (Python package installer).

- Clone the Repository (Optional):

      git clone https://github.com/Delta-Sec/StegX
      cd stegx_project

  Alternatively, install via pip-from-git: `pip install git+https://github.com/Delta-Sec/StegX`.

- Install the package:

      pip install -e .   # editable install; creates a `stegx` binary

  Or, to run from a checkout without installing:

      pip install -r requirements.txt
      python -m stegx --help

  Optional extras: `pip install -e '.[compression,strength]'` adds `zstandard` + `brotli` (multi-codec compression) and `zxcvbn` (password-strength gate).

- Shell completions (optional): `completions/` contains ready-to-install bash / zsh / fish files. See `completions/README.md` for per-shell installation paths.

- Docker (optional): a multi-stage Dockerfile ships with the repo:

      docker build -t stegx:latest .
      docker run --rm -it -v "$PWD:/work" stegx:latest --help

      # Encode a local file using a bind-mount:
      docker run --rm -i -v "$PWD:/work" stegx:latest \
        encode -i /work/cover.png -f /work/secret.zip \
        -o /work/out.png --password-stdin <<< "$PW"

  The image runs as a non-root user (`stegx`), installs all optional extras (`[all]`), and exposes `stegx` as its `ENTRYPOINT`.
GitHub Actions workflows are checked in under .github/workflows/:
- `ci.yml` runs on every push + PR:
  - `pytest` matrix on Python 3.9 / 3.10 / 3.11 / 3.12 / 3.13 (Linux) + a 3.12 Windows row.
  - `docker build` + smoke test of `stegx --version` and `stegx benchmark` inside the built image.
  - `python -m build` + `twine check` producing an artefact for every successful run.
- `release.yml` fires on `v*.*.*` tags (or manual dispatch):
  - Builds + publishes the wheel and sdist to PyPI (expects a `PYPI_API_TOKEN` repo secret, or switch to OIDC trusted publishing; the workflow has commented guidance).
  - Builds + pushes a multi-arch (`amd64` + `arm64`) Docker image to `ghcr.io/<owner>/stegx` using the standard `GITHUB_TOKEN`.
StegX provides four core subcommands: `encode`, `decode`, `shamir-split`, `shamir-combine`. The utility subcommands `pick-cover`, `benchmark` and `rewrap` are also available.
`stegx encode -i <cover> -f <file> -o <output.png> [options]`

The password is read from a TTY prompt by default (`getpass`). To script, pipe it via `--password-stdin`. The legacy `-p` flag is still accepted but will warn.
Common options:
| Flag | Description |
|---|---|
| `-p`, `--password PW` | Password (discouraged: leaks into shell history). |
| `--password-stdin` | Read password from a single line of stdin. |
| `--keyfile PATH` | Mix an external binary into the KDF as a second factor. |
| `--yubikey` | Require a YubiKey HMAC-SHA1 response (slot 2) as an additional hardware factor. Needs `pip install ykman`. |
| `--panic-password PW` | Arm self-destruct: entering this password at decode time wipes the real region's LSBs before reporting. Mutually exclusive with `--decoy-file`. |
| `--panic-decoy PATH` | Sacrificial payload returned after panic destruction (omit = silent mode). |
| `--polyglot-zip PATH...` | After the stego PNG is written, append a ZIP archive of the listed files so the output is simultaneously a valid PNG and a valid ZIP. Public side-channel only; does not affect the hidden StegX payload. |
| `--kdf {argon2id,pbkdf2}` | Password-based KDF (default: `argon2id`). |
| `--dual-cipher` | Layer ChaCha20-Poly1305 over AES-256-GCM. |
| `--adaptive` | Embed only in high-edge-cost regions (defeats CNN steganalysers). |
| `--matrix-embedding` | F5-style Hamming(7,3) matrix embedding for the ciphertext body. |
| `--max-fill PCT` | Refuse payloads filling more than PCT % of capacity (default 25%). |
| `--strict-password` | Reject passwords with zxcvbn score < 3 (default: warn). |
| `--no-preserve-cover` | Don't mirror the cover's PNG encoder parameters on save. |
| `--no-compress` | Disable compression of the payload. |
| `--compression {fast,best}` | `fast` = zlib-9 only; `best` (default) tries zlib, LZMA, bzip2, zstd-22 (+ bundled-dictionary variant) and brotli-11, stores the smallest. |
| `--always-split-cover` | Paranoia mode: always reserve the decoy half and fill it with random bits even when no `--decoy-file` is set. Halves cover capacity; opt-in. |
| `--fips` | Restrict to FIPS 140-validated primitives (PBKDF2 + AES-GCM + zlib). Rejects Argon2id / ChaCha / brotli / lzma / bz2 / zstd. |
| `--decoy-file PATH` | Hide a decoy payload alongside the real one (plausible deniability). |
| `--decoy-password PW` | Password for the decoy (prompted if omitted). |
Cover image from a URL: the -i/--image argument accepts an
http(s)://… URL. StegX downloads the bytes to a temp file (only
Content-Type: image/* is accepted, 50 MiB cap, 30-second timeout), verifies
the image with Pillow, uses it as the cover, and deletes the temp file on
exit. Only image decoding — no scripting, no execution of any kind.
Examples:
    # Interactive: prompts for password via getpass
    stegx encode -i landscape.png -f secret.pdf -o out.png

    # Cover pulled straight from a URL (Imgur, S3, etc.)
    stegx encode -i https://i.imgur.com/abc123.png -f secret.zip -o out.png

    # Hardened: dual cipher + adaptive + matrix embedding + keyfile
    stegx encode -i cover.png -f secret.bin -o out.png \
      --dual-cipher --adaptive --matrix-embedding --keyfile token.bin

    # Plausible-deniability decoy
    stegx encode -i cover.png -f real.zip -o out.png \
      --decoy-file harmless.txt

`stegx decode -i <stego.png> (-d <output_dir> | --stdout | -d -) [--keyfile PATH]`

The password is prompted interactively unless `-p`, `--password-stdin`, or `--keyfile` changes the auth inputs. All failure modes (wrong password, wrong keyfile, non-StegX image, corrupted data) report the same error message on purpose, to avoid leaking information to an attacker.
Output destinations:
- `-d <dir>`: write the extracted file into `<dir>` (default behaviour).
- `--stdout`: write decrypted bytes to stdout (no filename preserved). Use this to pipe directly into another program without touching disk.
- `-d -`: same as `--stdout`.
Examples:
    # Normal disk output
    stegx decode -i out.png -d ./extracted

    # Pipe decrypted bytes into another tool (e.g. SSH key into ssh-agent)
    stegx decode -i out.png --stdout --password-stdin <<< "$PW" | ssh-add -

    # Pipe into jq, openssl, etc.
    stegx decode -i out.png --stdout | jq .

`stegx benchmark [--iterations N] [--size-kib K]`

`stegx benchmark --calibrate [--target-ms 500]`

Times Argon2id KDF runs and exercises the compression multiplexer over a mixed-ASCII sample. `--calibrate` sweeps Argon2id memory sizes to find the one that lands closest to `--target-ms` on your CPU, which is useful before bumping the project-wide defaults in `src/stegx/kdf.py`.
`stegx rewrap -i stego.png [-o new.png]`

Rotates the password / keyfile / YubiKey on an existing stego image without ever materialising the plaintext on disk. The old credentials decrypt the inner payload in memory, the old LSB positions are overwritten with cryptographic noise so they cannot be resurrected, and the payload is re-embedded with the new credentials. Useful when a password is suspected compromised or during scheduled key rotation.
Every encode / decode / rewrap subcommand accepts --audit-log PATH.
Each operation appends one JSONL record containing:
- UTC timestamp
- Operation name + ok/fail bit
- SHA-256 of the cover and/or stego file
- Names (never values) of the security-relevant flags used
- A `prev` link to the previous record's `chain` hash + its own `chain` hash over the canonical form of the record
Tampering with any middle record breaks every subsequent chain hash;
stegx.audit_log.verify_chain(path) walks the file and reports
the first bad line. Payload content is never logged.
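A hash-chained JSONL log of this kind can be sketched in a few lines. The `prev` and `chain` field names come from the description above; everything else is an assumption made for illustration, including the all-zero genesis value, the sorted-key JSON canonical form, and the helper names `chain_hash`, `append_record` and `first_bad_line` (this is not the body of `stegx.audit_log.verify_chain`).

```python
import hashlib
import json
from typing import List, Optional

GENESIS = "0" * 64  # assumed `prev` value for the very first record


def chain_hash(record: dict) -> str:
    """SHA-256 over a canonical JSON form of the record, excluding `chain` itself."""
    body = {k: v for k, v in record.items() if k != "chain"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_record(lines: List[str], fields: dict) -> None:
    """Append one JSONL record that links to the previous record's chain hash."""
    prev = json.loads(lines[-1])["chain"] if lines else GENESIS
    record = dict(fields, prev=prev)
    record["chain"] = chain_hash(record)
    lines.append(json.dumps(record, sort_keys=True))


def first_bad_line(lines: List[str]) -> Optional[int]:
    """Return the 1-based number of the first record that fails to verify."""
    prev = GENESIS
    for number, line in enumerate(lines, start=1):
        record = json.loads(line)
        if record.get("prev") != prev or record.get("chain") != chain_hash(record):
            return number
        prev = record["chain"]
    return None
```

Editing a record changes its recomputed hash, and recomputing its `chain` field to hide the edit breaks the `prev` link of the record after it, so a forger would have to rewrite the whole tail of the file.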
Split a secret into n shares hidden across n cover images — any k
reconstruct it.
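The k-of-n property comes from polynomial interpolation: the secret is the constant term of a random polynomial of degree k-1, and each share is one point on it. The sketch below works on a single integer over a prime field; the field choice (`PRIME = 2**127 - 1`) and the function names are assumptions for illustration, and StegX's own share encoding may differ.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this

Share = Tuple[int, int]


def split(secret: int, k: int, n: int) -> List[Share]:
    """Split `secret` into n shares such that any k of them recover it."""
    if not (1 <= k <= n and 0 <= secret < PRIME):
        raise ValueError("need 1 <= k <= n and 0 <= secret < PRIME")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def combine(shares: List[Share]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _ym) in enumerate(shares):
            if m != j:
                num = (num * -xm) % PRIME
                den = (den * (xj - xm)) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With fewer than k shares the interpolation still returns a number, but it is unrelated to the secret, which is why k-1 shares reveal nothing.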
    # Split: 3-of-5 across 5 covers
    stegx shamir-split -k 3 -n 5 -f secret.bin \
      -c c1.png c2.png c3.png c4.png c5.png -O shares/

    # Combine: any 3 shares recover the secret
    stegx shamir-combine -i shares/stego_share_01.png \
      shares/stego_share_02.png shares/stego_share_03.png \
      -d ./out -o recovered.bin

Container layout:

    [16 B sentinel ]  HMAC(sentinel_key, cover_fingerprint)[:16]
    [56 B header   ]  magic | version | kdf_id | flags | kdf_params
                      | salt(16) | aes_nonce(12) | chacha_nonce(12)
                      | inner_ct_length(4)
    [N B ciphertext]  AEAD(AES-256-GCM, optionally chained with ChaCha20-Poly1305)

Key derivation:

    position_key = Argon2id(password ‖ keyfile?, FIXED_APP_SALT, default_params)
      ├─ HKDF("stegx/v2/pixel-shuffle-seed" ‖ fingerprint) → shuffle seed
      └─ HKDF("stegx/v2/sentinel" ‖ fingerprint)           → sentinel key

    master_key = Argon2id(password ‖ keyfile?, random_salt_from_header, params_from_header)
      ├─ HKDF("stegx/v2/aes-256-gcm")       → AES key (32 B)
      └─ HKDF("stegx/v2/chacha20-poly1305") → ChaCha key (optional)

Argon2id defaults: `time_cost=3`, `memory_cost=64 MiB`, `parallelism=4`.
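The derivation tree above relies on domain separation: one master key, several HKDF labels, one independent sub-key per label. A stdlib-only sketch follows. It makes two substitutions that should be read as assumptions: PBKDF2-HMAC-SHA256 stands in for Argon2id (which is not in the standard library), and the labels are passed as the HKDF `info` parameter. `hkdf_sha256` and `derive_subkeys` are illustrative names.

```python
import hashlib
import hmac
from typing import Dict


def hkdf_sha256(key: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it under `info`."""
    prk = hmac.new(salt or b"\x00" * 32, key, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_subkeys(password: bytes, salt: bytes, iterations: int = 600_000) -> Dict[str, bytes]:
    """Run the slow KDF once, then derive one sub-key per label."""
    # Stand-in for Argon2id: PBKDF2-HMAC-SHA256 from the standard library.
    master = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
    return {
        "aes": hkdf_sha256(master, b"stegx/v2/aes-256-gcm"),
        "chacha": hkdf_sha256(master, b"stegx/v2/chacha20-poly1305"),
    }
```

Because each label feeds a separate HMAC computation, learning one sub-key does not help an attacker compute the others or the master key.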
1. Build inner payload: `[4-B metadata_len][JSON metadata][file data (optionally zlib-compressed)]`.
2. Derive `master_key` (Argon2id); derive `aes_key` and optional `chacha_key` via HKDF.
3. `aes_ct = AES-GCM(aes_key, aes_nonce, inner, aad=header_with_length_zeroed)`.
4. If `--dual-cipher`: `final_ct = ChaCha20-Poly1305(chacha_key, chacha_nonce, aes_ct, aad=header)`.
5. Final container: `header.pack() ‖ final_ct` (length field populated after step 3).
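The inner payload framing `[4-B metadata_len][JSON metadata][file data]` listed above is a plain length-prefixed layout. The sketch below packs and unpacks it; the big-endian byte order and the metadata keys in the usage note are assumptions, since the byte order and schema are not specified here, and `pack_inner` / `unpack_inner` are hypothetical names.

```python
import json
import struct
from typing import Tuple


def pack_inner(metadata: dict, data: bytes) -> bytes:
    """Frame the payload as a 4-byte length, JSON metadata, then file bytes."""
    meta = json.dumps(metadata, separators=(",", ":")).encode("utf-8")
    return struct.pack(">I", len(meta)) + meta + data  # ">I": big-endian (assumed)


def unpack_inner(blob: bytes) -> Tuple[dict, bytes]:
    """Split a framed payload back into (metadata, file bytes)."""
    if len(blob) < 4:
        raise ValueError("payload shorter than the length prefix")
    (meta_len,) = struct.unpack(">I", blob[:4])
    if meta_len > len(blob) - 4:
        raise ValueError("metadata length exceeds payload size")
    metadata = json.loads(blob[4:4 + meta_len].decode("utf-8"))
    return metadata, blob[4 + meta_len:]
```

For example, `pack_inner({"filename": "a.txt", "compression": "none"}, b"hello")` yields a blob that `unpack_inner` splits back into the same dictionary and bytes. The length check matters on decode: it is what lets a corrupted prefix be reported as an inconsistent payload instead of causing an out-of-range read.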
1. `position_key` → `seed_int` → shuffle every pixel-channel index.
2. If `--adaptive`: drop positions outside the top Laplacian-edge percentile.
3. If `--decoy-file`: partition all positions into two disjoint regions by a cover-fingerprint-only deterministic shuffle; real payload uses one region, decoy uses the other.
4. LSB-matching (±1) on sentinel + header + ciphertext; matrix (F5 Hamming 7-3) optionally on the ciphertext body only.
5. Save as PNG with stripped metadata chunks and cover-matched `compress_level`.
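The shuffle and LSB-matching steps above can be sketched on a flat list of 8-bit samples. This is illustrative only: Python's `random.Random` is used as a stand-in for whatever seeded generator StegX uses (it is deterministic for a given seed but not cryptographic), and the function names are hypothetical.

```python
import random
from typing import List, Optional


def shuffled_positions(count: int, seed: int) -> List[int]:
    """Deterministic permutation of 0..count-1 for a given seed."""
    positions = list(range(count))
    random.Random(seed).shuffle(positions)  # stand-in for the real seeded PRNG
    return positions


def embed_lsb_matching(samples: List[int], bits: List[int], seed: int,
                       rng: Optional[random.Random] = None) -> List[int]:
    """Embed bits at shuffled positions; on mismatch move the sample by +1 or -1."""
    if len(bits) > len(samples):
        raise ValueError("not enough samples for the payload")
    rng = rng or random.SystemRandom()
    out = list(samples)
    for pos, bit in zip(shuffled_positions(len(out), seed), bits):
        if out[pos] & 1 != bit:
            if out[pos] == 0:        # cannot go below 0
                out[pos] = 1
            elif out[pos] == 255:    # cannot go above 255
                out[pos] = 254
            else:
                out[pos] += rng.choice((-1, 1))
    return out


def extract_bits(samples: List[int], count: int, seed: int) -> List[int]:
    """Read back the LSBs from the same shuffled positions."""
    return [samples[p] & 1 for p in shuffled_positions(len(samples), seed)[:count]]
```

Unlike LSB replacement, which only ever turns an even value into the next odd one (or back), the random ±1 step lets values move in both directions, which removes the pairs-of-values asymmetry that chi-square tests look for.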
1. Derive `position_key` → shuffled positions → read first 16 bytes; compare against HMAC-derived sentinel (constant-time).
2. On match, read 56-byte header, parse KDF params, derive `master_key`.
3. Read `inner_ct_length` bytes of ciphertext, reverse dual-cipher if flagged, decrypt with AES-GCM. The AEAD tag verifies that the header was not tampered with.
4. Parse metadata, decompress if flagged, write to output directory with a sanitised filename.
Sentinel-then-AEAD-tag means wrong password / wrong keyfile / non-StegX image all fail with the same generic error — no oracle.
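The sentinel check that opens the decode flow can be sketched with the standard `hmac` module. The 16-byte truncation and the use of `hmac.compare_digest` come from the text above; the choice of SHA-256 as the HMAC hash and the helper names are assumptions.

```python
import hashlib
import hmac

SENTINEL_LEN = 16  # bytes, per the container layout


def make_sentinel(sentinel_key: bytes, cover_fingerprint: bytes) -> bytes:
    """HMAC(sentinel_key, cover_fingerprint) truncated to 16 bytes (SHA-256 assumed)."""
    return hmac.new(sentinel_key, cover_fingerprint, hashlib.sha256).digest()[:SENTINEL_LEN]


def sentinel_matches(sentinel_key: bytes, cover_fingerprint: bytes, candidate: bytes) -> bool:
    """Compare the extracted bytes against the expected sentinel in constant time."""
    expected = make_sentinel(sentinel_key, cover_fingerprint)
    return hmac.compare_digest(expected, candidate)
```

A plain `==` on bytes may return as soon as the first differing byte is found, so its timing can hint at how many leading bytes were right; `compare_digest` is designed to avoid that leak.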
StegX has been tested against multiple steganalysis tools and techniques. It was able to resist extraction and avoid detection by:
| Tool | Status |
|---|---|
| Stegseek | ❌ Failed to extract |
| zsteg | ❌ No patterns found |
| binwalk | ✅ Clean output |
| exiftool | ✅ Metadata clean |
| Chi-Square Test | ✅ Low anomaly (13K vs 119K in Steghide) |
| Entropy Test | ✅ 7.99 bits/byte (high randomness) |
| Histogram Check | ✅ High similarity with original |
📎 See detailed comparison in Why StegX is Better than steghide.pdf
- Error: Insufficient image capacity: The file (after potential compression and encryption overhead) is too large to fit in the LSBs of the chosen cover image. Try a larger image, ensure the cover image is PNG/BMP, or hide a smaller file.
- Error: Decryption failed. The password might be incorrect...: This `InvalidTag` error almost always means the password provided for decoding does not match the one used for encoding, or the stego-image file has been modified or corrupted.
- Error: Could not find hidden data marker...: The `STEGX_EOD` sentinel was not found. This indicates the image was likely not created by StegX or has been significantly altered (e.g., re-saved with lossy compression like JPEG).
- Error: Payload or metadata seems corrupted: The data extracted could be decrypted, but the internal structure (metadata length, JSON format, or decompressed size) is inconsistent. The image might be corrupted.
- Unsupported Image Mode: Ensure the input cover image is in a supported format (RGB, RGBA, L, P). Formats like CMYK are not directly supported for LSB embedding.
- Output Image Larger than Expected: PNG compression might vary. The primary goal of using PNG is lossless storage of LSB data, not minimal file size.
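For the capacity error in particular, a back-of-the-envelope check can save a failed run. The arithmetic below assumes one embedded bit per colour channel, the 16-byte sentinel and 56-byte header from the container layout, and that `--max-fill` applies to the whole container; these are assumptions, and adaptive or matrix embedding will reduce the usable capacity further. `capacity_bytes` and `fits` are hypothetical helpers.

```python
CONTAINER_OVERHEAD = 16 + 56  # sentinel + header, in bytes


def capacity_bytes(width: int, height: int, channels: int = 3) -> int:
    """Raw LSB capacity at one bit per channel sample, in whole bytes."""
    return width * height * channels // 8


def fits(ciphertext_len: int, width: int, height: int,
         channels: int = 3, max_fill_pct: float = 25.0) -> bool:
    """True if sentinel + header + ciphertext stay within the fill ceiling."""
    budget = capacity_bytes(width, height, channels) * max_fill_pct / 100
    return ciphertext_len + CONTAINER_OVERHEAD <= budget
```

Under these assumptions a 1920×1080 RGB cover has 777,600 bytes of raw capacity, so the default 25% ceiling leaves 194,400 bytes for the whole container.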
This project is licensed under the MIT License - see the LICENSE file for details.
