YYalcinoz/malware-analyzer

Malware Analyzer

Dashboard Screen

[Image: application screenshot]

Features

  • File hashing: MD5, SHA1, SHA256
  • Entropy analysis (global and section-level for PE)
  • PE parsing (sections, imports, metadata)
  • String extraction and IOC extraction
  • YARA rule scanning
  • VirusTotal hash lookup (optional API key)
  • AlienVault OTX IP reputation lookup (optional API key)
  • Office document analysis (macros, external URLs, embedded objects, risk indicators)
  • PDF structural risk checks
  • MITRE ATT&CK technique mapping from detected behaviors
  • Risk scoring with explanation and breakdown
  • Report export:
    • JSON
    • HTML
    • STIX 2.1 bundle
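The first two features (hashing and entropy) can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the function names `compute_hashes` and `shannon_entropy` are hypothetical, and the real modules in malware_analyzer_lib/ may be organized differently.

```python
import hashlib
import math
from collections import Counter


def compute_hashes(data: bytes) -> dict[str, str]:
    """Return hex digests for the three hash types listed above."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # byte value -> number of occurrences
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())
```

For section-level entropy on a PE file, the same `shannon_entropy` function could be applied to each section's raw bytes. Values close to 8.0 often indicate packed or encrypted content, though they are not proof of it.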

Production-Grade Security & Performance Highlights

  • Custom filename hardening: upload names are sanitized by a custom regex-backed secure_filename() implementation (not solely framework defaults), including Unicode normalization, path-separator stripping, and strict character allowlisting.
  • XSS protection: HTML entity escaping (html_module.escape()) is applied to all dynamic values in generated HTML reports, so strings extracted from a malicious file cannot inject script into the report.
  • Path traversal defense: all report endpoints validate the analysis ID as a UUID and apply strict bounds checking so that paths cannot escape the reports directory.
  • Non-root container runtime: the Docker image creates and runs as a non-privileged appuser (UID 10001) instead of root.
  • Guaranteed sample cleanup: uploaded binaries are removed in a finally block after analysis so potentially dangerous files are not retained on disk.
  • YARA performance caching: YARA rules are lazy-loaded and cached in-memory at module level for faster sequential scans.
  • Storage clarity: uploaded source files are deleted after analysis, while generated reports are persisted in reports/ (or Docker volume) for retrieval/export.
  • Optional API hardening: POST /analyze supports bearer-token protection and per-IP rate limiting via environment variables.
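To illustrate the filename-hardening steps named above (Unicode normalization, path-separator stripping, strict allowlisting), here is a minimal sketch. It is an assumption about how such a function could look, not the project's secure_filename() itself; the allowlist, the `max_length` parameter, and the `"upload"` fallback are all hypothetical choices.

```python
import re
import unicodedata

# Hypothetical allowlist: anything outside it is replaced with "_".
_DISALLOWED = re.compile(r"[^A-Za-z0-9._-]")


def secure_filename(name: str, max_length: int = 128) -> str:
    """Reduce an untrusted upload name to a safe, flat ASCII filename."""
    # Decompose accented characters, then drop whatever is not ASCII.
    normalized = unicodedata.normalize("NFKD", name)
    ascii_name = normalized.encode("ascii", "ignore").decode("ascii")
    # Keep only the last path component, for both "/" and "\" separators.
    base = ascii_name.replace("\\", "/").split("/")[-1]
    # Apply the allowlist, then trim leading/trailing dots and underscores
    # so names such as ".." or ".htaccess" lose their special meaning.
    cleaned = _DISALLOWED.sub("_", base).strip("._")
    # Fall back to a fixed name if nothing usable is left.
    return cleaned[:max_length] or "upload"
```

With this sketch, a name like `../../etc/passwd` collapses to `passwd`, and an empty or dots-only name becomes `upload`.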

Supported File Types

  • PE and binaries: .exe, .dll, .sys, .bin, .dat
  • Office: .doc, .docx, .xls, .xlsx, .ppt, .pptx, .docm, .xlsm, .dotm, .pptm
  • PDF: .pdf
  • Scripts: .ps1, .vbs, .js, .hta, .bat, .cmd, .py, .pyw, .pyc
  • Archives: .zip, .rar, .7z
  • Others: .apk, .elf

Tech Stack

  • Python 3.11+
  • Flask
  • YARA (via yara-python)
  • python-magic
  • pefile
  • oletools
  • pdfid

Project Structure

  • app.py: Flask application and API routes
  • malware_analyzer_lib/: analysis modules
  • templates/index.html: frontend UI
  • yara_rules/: YARA signatures
  • Dockerfile: container build recipe
  • docker-compose.yml: one-command local container run
  • requirements.txt: used libraries

Screenshot After Analysis

[Image: application screenshot after analysis]

Quick Start (Local)

1. Clone and enter project

git clone https://github.com/YYalcinoz/malware-analyzer
cd malware-analyzer

2. Create virtual environment

python -m venv venv
# Windows PowerShell
venv\Scripts\Activate.ps1
# Linux/macOS
source venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Configure environment

Copy .env.example to .env and set values:

SECRET_KEY=replace_with_a_long_random_secret
FLASK_DEBUG=false
PORT=5000
VT_API_KEY=
OTX_API_KEY=
ANALYZE_AUTH_TOKEN=
ANALYZE_RATE_LIMIT_PER_MINUTE=30

5. Run app

python app.py

Open: http://127.0.0.1:5000

Quick Start (Docker)

Option A: Docker Compose (recommended)

# Compose reads SECRET_KEY and optional API/auth settings from your local .env file.
docker compose up --build

Reports are stored in a Docker named volume (analyzer_reports) to avoid Linux host-permission issues.

Open: http://127.0.0.1:5000

Option B: Docker CLI

docker build -t malware-analyzer .
docker run --rm -p 5000:5000 -e SECRET_KEY=change-me malware-analyzer

# To pass the optional API/auth settings as well:
docker run --rm -p 5000:5000 \
  -e SECRET_KEY=change-me \
  -e VT_API_KEY= \
  -e OTX_API_KEY= \
  -e ANALYZE_AUTH_TOKEN= \
  -e ANALYZE_RATE_LIMIT_PER_MINUTE=30 \
  malware-analyzer

API Endpoints

  • GET / - Web UI
  • POST /analyze - Upload and analyze file
  • GET /report/<analysis_id>/json - Download JSON report
  • GET /report/<analysis_id>/html - Download HTML report
  • GET /export/stix/<analysis_id> - Download STIX 2.1 bundle
  • GET /health - Health check

Security Notes

  • Do not upload live malware to systems you cannot isolate.
  • Use this in a lab/sandbox environment.
  • Keep API keys and auth tokens in .env only (never commit .env).
  • Uploaded files are removed after analysis; generated reports are stored in reports/ (or analyzer_reports volume in Docker).
  • Optional hardening for public deployments:
    • Set ANALYZE_AUTH_TOKEN to require Authorization: Bearer <token> on POST /analyze.
    • Set ANALYZE_RATE_LIMIT_PER_MINUTE to cap analyze requests per client IP (0 disables limiter).

CI

A GitHub Actions workflow is provided in .github/workflows/ci.yml. It runs on pushes and pull requests with:

  • python -m compileall
  • python -m pytest -q
  • ruff check .
  • bandit basic scan

Limitations

  • Primarily static analysis; no full behavioral sandbox execution.
  • Accuracy depends on available signatures/rules and external intel coverage.

Troubleshooting

RuntimeError: SECRET_KEY is required when FLASK_DEBUG is false

This error occurs when FLASK_DEBUG=false but no SECRET_KEY is set in .env.

Solution for local development:

  • Set FLASK_DEBUG=true in .env (SECRET_KEY will not be required)
  • Or provide a SECRET_KEY value in .env

Solution for production:

  • Keep FLASK_DEBUG=false
  • Generate and set a strong SECRET_KEY in .env:
    python -c "import secrets; print(secrets.token_hex(32))"
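The startup behavior described above can be pictured with a short sketch. The function name `load_secret_key`, the accepted truthy values for FLASK_DEBUG, and the throwaway debug key are assumptions for illustration; only the error message is taken from this section.

```python
import os
import secrets


def load_secret_key(env=None) -> str:
    """Return the configured SECRET_KEY, enforcing it outside debug mode."""
    env = os.environ if env is None else env
    debug = env.get("FLASK_DEBUG", "false").strip().lower() in {"1", "true", "yes"}
    secret = env.get("SECRET_KEY", "").strip()
    if secret:
        return secret
    if debug:
        # Assumption: in debug mode a random per-process key is acceptable.
        return secrets.token_hex(32)
    raise RuntimeError("SECRET_KEY is required when FLASK_DEBUG is false")
```

A key generated this way in debug mode changes on every restart, so existing sessions are invalidated. That is one reason to set a fixed SECRET_KEY in production.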

Agent Definitions

The Markdown definition files for the AI agents used in this project are stored in:

.github/agents/

About

Python/Flask-based Static Malware Analyzer. Features file hashing, PE/Office/PDF analysis, YARA scanning, VirusTotal/OTX lookups, and MITRE ATT&CK mapping. Generates STIX 2.1, HTML, and JSON reports.
