PDF-PyCrack

PyPI version | License: MIT | Python 3.12+ | Code style: black | uv

A not-yet-blazing-fast, parallel PDF password cracker for Python 3.12+.


Features

  • Multi-core Cracking: Utilizes all CPU cores for maximum speed.
  • Efficient Memory Usage: Handles large PDFs with minimal RAM.
  • Resilient Workers: Worker processes handle errors gracefully; the main process continues.
  • Progress Tracking: Real-time progress bar and statistics.
  • Customizable: Tune password length, charset, batch size, and more.
  • Comprehensive Error Handling: Clear error messages, with tests covering file, PDF, parameter, and worker-process failures.
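
To make the charset, length, and batch-size options more concrete, here is a minimal sketch of how candidate passwords could be enumerated lazily and grouped into fixed-size batches for worker processes. It illustrates the general technique only and is not taken from the project's source; the names `iter_candidates` and `iter_batches` are hypothetical.

```python
from itertools import islice, product
from typing import Iterable, Iterator


def iter_candidates(charset: str, min_len: int, max_len: int) -> Iterator[str]:
    """Yield every string over `charset`, shortest lengths first."""
    for length in range(min_len, max_len + 1):
        for combo in product(charset, repeat=length):
            yield "".join(combo)


def iter_batches(candidates: Iterable[str], batch_size: int) -> Iterator[list[str]]:
    """Group a candidate stream into lists of at most `batch_size` items."""
    stream = iter(candidates)
    # islice pulls only the next batch, so the full search space
    # is never held in memory at once.
    while batch := list(islice(stream, batch_size)):
        yield batch


batches = list(iter_batches(iter_candidates("ab", 1, 2), batch_size=4))
print(batches)
```

Because both functions are generators, only one batch exists in memory at a time, which is one plausible way to keep RAM usage low while still handing each worker a reasonably sized chunk of work.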

Installation

Install from PyPI (recommended):

uv pip install pdf-pycrack

For development:

git clone https://github.com/hornikmatej/pdf_pycrack.git
cd pdf_pycrack
uv sync

Quick Start

uv run pdf-pycrack <path_to_pdf>

For all options:

uv run pdf-pycrack --help

Usage

Basic usage:

uv run pdf-pycrack tests/test_pdfs/numbers/100.pdf

Custom charset and length:

uv run pdf-pycrack tests/test_pdfs/letters/ab.pdf --min-len 2 --max-len 2 --charset abcdef
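
Before choosing a charset and length range, it can help to estimate how many candidates a run has to try. The sketch below computes that count for a brute-force search, assuming the charset contains no repeated characters; `search_space_size` is a hypothetical helper, not part of pdf-pycrack.

```python
def search_space_size(charset: str, min_len: int, max_len: int) -> int:
    """Count all strings over `charset` with length in [min_len, max_len]."""
    # Assumes every character in `charset` is distinct.
    return sum(len(charset) ** n for n in range(min_len, max_len + 1))


# The command above: 6 characters, length exactly 2.
print(search_space_size("abcdef", 2, 2))  # 36

# Digits only, lengths 4 through 6.
print(search_space_size("0123456789", 4, 6))  # 1110000
```

The count grows exponentially with the maximum length, so narrowing `--max-len` or the charset usually matters more than any other tuning option.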

Using as a Python Library

You can also use pdf-pycrack programmatically in your Python code:

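from pdf_pycrack import crack_pdf_password, PasswordFound

# Cracking uses worker processes. On platforms that start them with "spawn"
# (Windows, macOS), the main script is re-imported in each worker, so the
# call should sit behind a __main__ guard.
if __name__ == "__main__":
    result = crack_pdf_password(
        pdf_path="my_encrypted_file.pdf",
        min_len=4,
        max_len=6,
        charset="0123456789",
    )

    # A PasswordFound result carries the recovered password.
    if isinstance(result, PasswordFound):
        print(f"Password found: {result.password}")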

Benchmarking

Measure and compare password cracking speed with the bundled benchmarking tool:

uv run python benchmark/benchmark.py --standard

Performance regression testing:

# Check latest benchmark for performance regression
uv run python benchmark/regression_detector.py

# Check with custom threshold and fail on regression
uv run python benchmark/regression_detector.py --threshold 10.0 --fail-on-regression

Custom runs:

uv run python benchmark/benchmark.py --pdf tests/test_pdfs/letters/ab.pdf --min-len 1 --max-len 2 --charset abcdef
uv run python benchmark/benchmark.py --processes 4 --batch-size 1000

Results are saved in benchmark/results/ as JSON and CSV. The system automatically detects performance regressions by comparing against recent baselines. See benchmark/README.md for full details, options, and integration tips.
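
As an illustration of what a threshold-based regression check can look like, the sketch below compares the latest throughput against the mean of recent baseline runs. This is not the logic of `benchmark/regression_detector.py`, which is documented in benchmark/README.md; the function name `is_regression`, the use of passwords per second as the metric, and the reading of the threshold as a percentage drop are all assumptions made for the example.

```python
from statistics import mean


def is_regression(
    baseline_rates: list[float],
    latest_rate: float,
    threshold_pct: float = 10.0,
) -> bool:
    """Return True if `latest_rate` is more than `threshold_pct` percent
    below the mean of `baseline_rates` (all in passwords per second)."""
    baseline = mean(baseline_rates)
    drop_pct = (baseline - latest_rate) / baseline * 100.0
    return drop_pct > threshold_pct


recent = [1000.0, 1100.0, 900.0]    # mean is 1000.0
print(is_regression(recent, 850.0))  # 15% slower than baseline
print(is_regression(recent, 950.0))  # 5% slower than baseline
```

Averaging several baselines instead of comparing against a single run makes the check less sensitive to noise from one unusually fast or slow benchmark.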

Testing & Error Handling

Run all tests:

uv run pytest

Tests are marked by category:

  • numbers, letters, special_chars, mixed

Run a subset:

uv run pytest -m numbers

Error Handling:

The suite in tests/test_error_handling.py covers:

  • File not found, permission denied, directory instead of file
  • Corrupted/unencrypted PDFs
  • Empty charset, invalid parameters
  • Memory errors, worker process failures

All errors are reported with clear messages and suggested actions.
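
The following sketch shows one way the input-related cases listed above could be checked up front, with each problem paired with a suggested action. It is a hypothetical example, not the project's implementation; `validate_inputs` and the message wording are invented for illustration.

```python
from pathlib import Path


def validate_inputs(pdf_path: str, charset: str, min_len: int, max_len: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: list[str] = []
    path = Path(pdf_path)

    if not path.exists():
        problems.append(f"File not found: {pdf_path}. Check the path and try again.")
    elif path.is_dir():
        problems.append(f"{pdf_path} is a directory. Pass the path to a PDF file.")

    if not charset:
        problems.append("Charset is empty. Provide at least one character.")

    if min_len < 1 or max_len < min_len:
        problems.append(
            f"Invalid length range {min_len}..{max_len}. "
            "Use 1 <= min_len <= max_len."
        )

    return problems


for message in validate_inputs("definitely_missing_file.pdf", "", 3, 2):
    print(message)
```

Collecting every problem before reporting, instead of stopping at the first one, lets a user fix all invalid arguments in a single pass.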

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes and add/update tests
  4. Run all tests and pre-commit hooks:
    uv run pre-commit install
    uv run pre-commit run --all-files
    uv run pytest
  5. Open a pull request

License

This project is licensed under the MIT License. See the LICENSE file for details.

