Unified wordlist generation toolkit for pentest and red team operations — 25 subcommands in a single CLI. Charset/mask generation, personal & corporate target profiling, web scraping (JS/CSS/PDF extraction), OCR, document parsing (PDF/XLSX/DOCX), leet speak permutations, XOR crypto, DNS/subdomain fuzzing, phone number generation, corporate user enumeration, healthcare/pharma patterns, default credential databases (IoT/ICS/SCADA/PLC/HMI), ISP WiFi keyspace generation, password-DNA behavioral analysis, keyword combiner, word mangling, merge & sanitize, ML-based ranking with SecLists corpus training, and statistical analysis.
Full documentation: Wiki
DISCLAIMER: This tool is intended exclusively for authorized security testing, penetration testing, and educational purposes. Unauthorized use against systems you do not own or have explicit written permission to test is illegal and unethical. The author assumes no liability for misuse.
pip install wfh-wordlist # core (charset, profile, dns, scrape, analyze, ...)
pip install wfh-wordlist[docs] # + PDF/XLSX/DOCX extraction
pip install wfh-wordlist[scrape] # + PDF crawl during web scraping
pip install wfh-wordlist[ocr] # + OCR (requires PyTorch)
pip install wfh-wordlist[full] # all extras

Verify installation:
wfh --help # should show 25 subcommands
pip show wfh-wordlist # check version

git clone https://github.com/mrhenrike/WordListsForHacking.git
cd WordListsForHacking
# Linux / macOS / Termux
chmod +x setup_venv.sh && ./setup_venv.sh && source .venv/bin/activate
# Windows PowerShell
.\setup_venv.ps1; .\.venv\Scripts\Activate.ps1

wfh # interactive menu (pip install)
python wfh.py # interactive menu (from source)
python wfh.py --help # full CLI help

OS prerequisites (OCR only): see the Installation wiki page.
| # | Command | Description |
|---|---|---|
| 1 | charset | Charset/mask generation (crunch-style + hashcat masks) |
| 2 | pattern | Template-based generation with variables |
| 3 | profile | Personal target profiling (CUPP-style) |
| 4 | corp | Corporate target profiling |
| 5 | corp-users | Corporate domain user/password generation (50+ patterns) |
| 6 | phone | Phone number wordlists (BR, US, UK) |
| 7 | scrape | Web scraping (CeWL/CeWLeR-style) with JS/CSS/PDF extraction |
| 8 | ocr | OCR text extraction from images |
| 9 | extract | Extract words from PDF/XLSX/DOCX |
| 10 | leet | Leet speak permutations |
| 11 | xor | XOR encrypt/decrypt/brute-force |
| 12 | analyze | Statistical analysis (pipal-style) |
| 13 | merge | Merge & deduplicate wordlists |
| 14 | dns | DNS/subdomain fuzzing (alterx-style) |
| 15 | pharma | Healthcare/pharmacy credential patterns |
| 16 | sanitize | Clean & normalize wordlists |
| 17 | reverse | Reverse line order |
| 18 | corp-prefixes | Corporate prefix usernames (MSP/SOC/DevOps) |
| 19 | train | Train ML pattern model (local + SecLists corpus) |
| 20 | sysinfo | Hardware & compute info |
| 21 | mangle | Word mangling rules |
| 22 | default-creds | Query default credentials database (IoT/routers/printers/ICS) |
| 23 | isp-keygen | ISP default WiFi password keyspace generator |
| 24 | combiner | Keyword combiner (intelligence-wordlist-generator style) |
| 25 | password-dna | Analyze password patterns and generate behavioral variants |
Detailed syntax and examples for each subcommand: Wiki — Subcommands
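The `charset` subcommand accepts hashcat-style masks, so it can help to estimate how many candidates a mask yields before generating them. The sketch below is not WFH's own code: it assumes hashcat's built-in classes (`?u`, `?l`, `?d`, `?s`, with `??` as a literal question mark), and the helper names `parse_mask` and `keyspace` are hypothetical.

```python
import string

# Built-in character classes as defined by hashcat's mask syntax.
HASHCAT_CLASSES = {
    "u": string.ascii_uppercase,     # ?u: 26 uppercase letters
    "l": string.ascii_lowercase,     # ?l: 26 lowercase letters
    "d": string.digits,              # ?d: 10 digits
    "s": " " + string.punctuation,   # ?s: 33 symbols, space included
}


def parse_mask(mask: str) -> list[str]:
    """Split a mask such as '?u?l?d' into one charset per output position."""
    positions = []
    i = 0
    while i < len(mask):
        if mask[i] == "?" and i + 1 < len(mask):
            key = mask[i + 1]
            if key == "?":
                positions.append("?")            # '??' is a literal question mark
            elif key in HASHCAT_CLASSES:
                positions.append(HASHCAT_CLASSES[key])
            else:
                raise ValueError(f"unsupported mask class: ?{key}")
            i += 2
        else:
            positions.append(mask[i])            # any other character is a literal
            i += 1
    return positions


def keyspace(mask: str) -> int:
    """Number of candidates the mask produces (product of charset sizes)."""
    total = 1
    for charset in parse_mask(mask):
        total *= len(charset)
    return total


print(keyspace("?u?l?l?l?d?d?d?s"))  # 26 * 26**3 * 10**3 * 33 = 15080208000
```

At roughly 15 billion candidates for an eight-character mask, checking the keyspace first is a cheap way to decide whether a mask needs narrowing.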
python wfh.py --threads 20 --compute cuda --no-ml <subcommand>

| Flag | Default | Description |
|---|---|---|
| --threads N | 5 | Thread count (1–300) |
| --compute MODE | auto | auto / cpu / gpu / cuda / rocm / mps / hybrid |
| --no-ml | off | Disable ML ranking |
| -v | off | Verbose logging |
python wfh.py corp-users --domain acme.com.br --file employees.txt --passwords --combo -o acme_combo.lst

python wfh.py profile --name "João Silva" --nick joao --birth 15/03/1990 --leet aggressive -o target.lst

python wfh.py charset 8 8 --mask "?u?l?l?l?d?d?d?s" -o passwords.lst

python wfh.py pattern -t "{company}{year}!" --vars company=acme,globex year=2020-2026 -o patterns.lst

python wfh.py dns -d acme.com.br --words dev staging api admin portal -o subdomains.lst

python wfh.py analyze passwords.lst --top 30 --masks --format json -o analysis.json

python wfh.py default-creds --list-vendors
python wfh.py default-creds --vendor mikrotik --format combo -o mikrotik_creds.lst
python wfh.py default-creds --protocol snmp --format user -o snmp_users.lst

python wfh.py isp-keygen --list
python wfh.py isp-keygen --isp xfinity_comcast --estimate
python wfh.py isp-keygen --isp xfinity_comcast --limit 100000 -o xfinity.lst

python wfh.py scrape https://target.com --include-js --include-css --include-pdf --lowercase -o words.lst
python wfh.py scrape https://target.com --emails --output-emails emails.txt --output-urls urls.txt
python wfh.py scrape https://target.com --subdomain-strategy children --stream -o stream.lst

python wfh.py merge list1.lst list2.lst --min-len 6 --sort -o merged.lst
python wfh.py sanitize merged.lst --inplace

More examples and scenarios: Wiki (Quick Start)
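To make the `pattern` example above concrete, here is a minimal sketch of what expanding `{company}{year}!` with `company=acme,globex` and `year=2020-2026` could look like. It illustrates the idea and is not WFH's implementation: it assumes comma-separated values, treats `2020-2026` as an inclusive numeric range, and uses the hypothetical helpers `expand_values` and `expand_template`.

```python
import itertools
import re


def expand_values(spec: str) -> list[str]:
    """Expand 'acme,globex' into two values and '2020-2026' into seven years.

    Assumption: a part of the form 'N-M' is an inclusive integer range.
    """
    values = []
    for part in spec.split(","):
        match = re.fullmatch(r"(\d+)-(\d+)", part)
        if match:
            start, end = int(match.group(1)), int(match.group(2))
            values.extend(str(n) for n in range(start, end + 1))
        else:
            values.append(part)
    return values


def expand_template(template: str, variables: dict[str, str]) -> list[str]:
    """Fill every {name} placeholder with each combination of its values."""
    names = list(variables)
    pools = [expand_values(variables[name]) for name in names]
    results = []
    for combo in itertools.product(*pools):
        results.append(template.format(**dict(zip(names, combo))))
    return results


candidates = expand_template(
    "{company}{year}!",
    {"company": "acme,globex", "year": "2020-2026"},
)
print(len(candidates))                 # 2 companies * 7 years = 14
print(candidates[0], candidates[-1])   # acme2020! globex2026!
```

The output size is the product of the value counts, so each extra variable multiplies the size of the list.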
Analyze password patterns and generate behavioral variants. The password-dna subcommand extracts structural "DNA" from known passwords (uppercase, lowercase, digit, symbol positions) and produces new candidates that follow the same behavioral patterns.
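As a rough illustration of what structural "DNA" can mean here, the sketch below maps each character to its class and counts the most common structures in a list. It is an assumption about the general technique, not the subcommand's actual algorithm; the `U`/`L`/`D`/`S` notation and the function names are made up for the example.

```python
from collections import Counter


def structure(password: str) -> str:
    """Map each character to U (upper), L (lower), D (digit) or S (symbol)."""
    out = []
    for ch in password:
        if ch.isupper():
            out.append("U")
        elif ch.islower():
            out.append("L")
        elif ch.isdigit():
            out.append("D")
        else:
            out.append("S")
    return "".join(out)


def top_structures(passwords: list[str], n: int = 3) -> list[tuple[str, int]]:
    """Return the n most frequent structures with their counts."""
    return Counter(structure(p) for p in passwords).most_common(n)


print(structure("Company2024!"))
# ULLLLLLDDDDS
print(top_structures(["Company2024!", "Welcome2023#", "admin123"]))
# [('ULLLLLLDDDDS', 2), ('LLLLLDDD', 1)]
```

Two passwords with different words but the same structure land in the same bucket, which is what lets new candidates follow an observed habit such as "capitalized word, year, symbol".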
# Analyze a leaked/known password list and generate variants
python wfh.py password-dna --input known_passwords.lst --depth 2 -o dna_variants.lst
# Generate variants from a single seed with aggressive expansion
python wfh.py password-dna --seed "Company2024!" --depth 3 --leet -o seed_variants.lst
# DNA analysis report only (no generation)
python wfh.py password-dna --input known_passwords.lst --analyze-only --format json -o dna_report.json

Query the built-in database of 1,329+ factory-default credentials covering 88 vendors and 14 protocols: routers, switches, printers, IP cameras, ICS/SCADA (PLCs, HMIs, RTUs), IoT gateways, and more.
# List all supported vendors
python wfh.py default-creds --list-vendors
# Export credentials for a specific vendor
python wfh.py default-creds --vendor siemens --format combo -o siemens_creds.lst
# Filter by protocol (telnet, ssh, http, snmp, modbus, s7comm, etc.)
python wfh.py default-creds --protocol modbus --format user -o modbus_users.lst
# Search by device category
python wfh.py default-creds --category ics --format combo -o ics_defaults.lst
# Export full database as JSON
python wfh.py default-creds --export-all --format json -o all_defaults.json

| File | Description | Entries |
|---|---|---|
| passwords/wlist_brasil.lst | Brazilian password corpus: cultural word banks, corporate patterns, leet speak, keyboard walks. Company names and CNPJs are public OSINT data. | ~3.88M |
| passwords/default-creds-combo.lst | Default credential user:password combos (routers, printers, ICS/SCADA) | ~3K |
| data/default_credentials.json | Structured default credentials database (1,329 entries, 88 vendors, 14 protocols) | n/a |
| fuzzing/discovery_br.lst | Brazilian web discovery & API fuzzing paths | ~900 |
| usernames/username_br.lst | Brazilian + global username patterns | ~1.6K |
| labs/*.lst | Workshop & training wordlists | n/a |
Details: Wiki — Brazilian Wordlist
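Tools that take separate user and password lists cannot consume a `user:password` combo file such as `passwords/default-creds-combo.lst` directly. The sketch below shows one way to split it. It assumes one combo per line and that the first colon is the separator (so passwords may contain colons but usernames may not); the function name `split_combos` is hypothetical.

```python
def split_combos(lines):
    """Split 'user:password' lines into de-duplicated user and password lists.

    Assumption: the first ':' separates user from password. Lines without a
    colon are skipped; empty passwords are kept, since blank defaults exist.
    """
    users, passwords = {}, {}
    for line in lines:
        line = line.rstrip("\r\n")
        if ":" not in line:
            continue
        user, password = line.split(":", 1)
        users[user] = None          # dict keys keep first-seen order
        passwords[password] = None
    return list(users), list(passwords)


sample = ["admin:admin\n", "admin:1234\n", "root:\n", "\n", "not-a-combo\n"]
users, passwords = split_combos(sample)
print(users)      # ['admin', 'root']
print(passwords)  # ['admin', '1234', '']
```

For a real file, pass an open file handle instead of `sample`, since the function only iterates over lines.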
# Linux/macOS
grep -qxF 'YourPassword' passwords/wlist_brasil.lst && echo "FOUND!" || echo "Not found"
# Windows PowerShell
Select-String -Path passwords\wlist_brasil.lst -Pattern ('^' + [regex]::Escape('YourPassword') + '$') -CaseSensitive -Quiet

If found: change it immediately, enable MFA/2FA, use a password manager, and never reuse passwords.
Full guide: Wiki — Password Check
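For a check that behaves the same on every platform, the exact-line match can also be done in a few lines of Python. This is a sketch, not part of WFH: it assumes the wordlist holds one entry per line in UTF-8, and the function name `password_in_wordlist` is hypothetical.

```python
def password_in_wordlist(password: str, path: str) -> bool:
    """Return True if some line of the file equals the password exactly.

    The file is read in binary and streamed line by line, so a multi-million
    entry list is never fully loaded and undecodable bytes cannot raise.
    Assumption: the password is stored as UTF-8 in the wordlist.
    """
    target = password.encode("utf-8")
    with open(path, "rb") as handle:
        for raw in handle:
            if raw.rstrip(b"\r\n") == target:
                return True
    return False


if __name__ == "__main__":
    import getpass

    # getpass keeps the password out of the terminal and shell history.
    secret = getpass.getpass("Password to check: ")
    found = password_in_wordlist(secret, "passwords/wlist_brasil.lst")
    print("FOUND!" if found else "Not found")
```

Unlike the one-liners, this version never places the password on the command line, where it could end up in shell history or process listings.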
WFH includes a lightweight ML model that ranks generated candidates by structural pattern probability. Train it with local data or the SecLists corpus:
python wfh.py train --auto # local wordlists only
python wfh.py train --seclists # SecLists corpus (auto-discover)
python wfh.py train --auto --seclists # combined (recommended)
python wfh.py train --seclists /path/to/SecLists --seclists-categories password frequency

The model stores only structural patterns: no PII, passwords, or company names.
Details: Wiki — ML Model
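One simple way to picture ranking by structural pattern probability is a frequency table of character-class shapes learned from a corpus. The sketch below is a deliberately naive stand-in and makes no claim about how WFH's model works; `shape`, `train`, and `rank` are hypothetical names. As in the description above, the trained table holds only shapes, never the words themselves.

```python
from collections import Counter


def shape(word: str) -> str:
    """Reduce a word to its character-class shape, e.g. 'Ab1!' -> 'ULDS'."""
    classes = []
    for ch in word:
        if ch.isupper():
            classes.append("U")
        elif ch.islower():
            classes.append("L")
        elif ch.isdigit():
            classes.append("D")
        else:
            classes.append("S")
    return "".join(classes)


def train(corpus: list[str]) -> dict[str, float]:
    """Learn the relative frequency of each shape; no word is stored."""
    counts = Counter(shape(word) for word in corpus)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def rank(candidates: list[str], model: dict[str, float]) -> list[str]:
    """Order candidates from most to least probable shape (unseen shapes last)."""
    return sorted(candidates, key=lambda w: model.get(shape(w), 0.0), reverse=True)


model = train(["Summer2024!", "Winter2023!", "password"])
print(rank(["qwertyui", "Spring2025!", "12345"], model))
# ['Spring2025!', 'qwertyui', '12345']
```

A real model would likely smooth unseen shapes and weigh more than whole-word structure, but even this crude version pushes candidates with commonly observed shapes to the top of a list.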
| Project | Inspiration |
|---|---|
| CUPP | Personal target profiling |
| Crunch | Charset-based generation |
| CeWL | Web scraping for wordlists |
| CeWLeR | Modern Python web scraping (JS/CSS/PDF) |
| routersploit | Default credentials for IoT/routers |
| alterx | DNS/subdomain fuzzing |
| pipal | Statistical analysis |
| SecLists | Curated security lists |
| elpscrk | Permutation-based generation |
| BEWGor | Biographical wordlist generator |
| pnwgen | Phone number generation |
| intelligence-wordlist-generator | Keyword combiner |
| SCaDAPass | ICS/SCADA default credentials |
Contributions welcome. See CONTRIBUTING.md.
MIT License — Copyright (c) 2026 André Henrique (@mrhenrike)
Created by André Henrique (@mrhenrike) — União Geek