PBZ PDF to CSV parser

This parser converts PBZ PDF statements to CSV.

Requirements

Python 3.9+ (stdlib only)
pdftotext from Poppler

Install Poppler:

macOS: brew install poppler
Ubuntu/Debian: sudo apt-get install poppler-utils

Python setup

python3 -m venv .venv
source .venv/bin/activate

Usage

python parser.py data/pdfs/*.pdf \
  --out out/pbz_statements.csv \
  --validation-out out/pbz_validation.csv

Notes:

Output includes raw_text_block for audit and tx_fingerprint for idempotent imports.
Dedupe by tx_fingerprint is on by default. Use --no-dedupe to keep all rows.
raw_text_block contains embedded newlines and will be CSV-quoted.
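
As an illustration of the notes above, here is a minimal sketch of fingerprint-based dedupe and CSV quoting. It is not the code in parser.py: the function names and the hashed fields (date, amount, description) are assumptions made for the example, and the real parser may build tx_fingerprint from different columns.

```python
import csv
import hashlib
import io

# Hypothetical column set; only raw_text_block and tx_fingerprint
# are named in this README.
FIELDS = ["date", "amount", "description", "raw_text_block", "tx_fingerprint"]


def tx_fingerprint(row):
    """Return a stable hash of the fields assumed to identify a transaction."""
    key = "|".join([row["date"], row["amount"], row["description"]])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def dedupe(rows):
    """Keep the first row for each fingerprint and attach the fingerprint."""
    seen = set()
    unique = []
    for row in rows:
        fp = tx_fingerprint(row)
        if fp in seen:
            continue
        seen.add(fp)
        unique.append({**row, "tx_fingerprint": fp})
    return unique


def to_csv(rows):
    """Serialize rows; the csv module quotes fields with embedded newlines."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Because the fingerprint depends only on the transaction fields, re-importing the same statement produces the same values, so a downstream import can skip rows it has already seen. When reading the CSV back, use a real CSV reader rather than splitting on line breaks, since raw_text_block spans several physical lines.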

JavaScript parser (Bun)

Requires Bun 1.0+ and pdftotext (Poppler).

bun parser.js data/pdfs/*.pdf \
  --out out/pbz_statements_js.csv \
  --validation-out out/pbz_validation_js.csv

Run tests (compares JS output to the sample CSVs in out/):

bun test

Static app (Bun build)

Install dependencies:

bun install

Build the static app:

bun run build

The output is written to dist/. Host that folder on any static provider.

Notes:

The browser app uses PDF.js to rebuild a fixed-width layout before parsing.
If columns drift, tune the line grouping and x-position tolerance in src/app.js.
UI strings live in src/strings.json (Croatian by default).
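
To show what the tuning knobs mentioned above control, here is a rough sketch of rebuilding a fixed-width layout from positioned text items. It is written in Python for illustration only and is not the code in src/app.js. The parameter names y_tol and char_width are hypothetical, and the sketch assumes each item is an (x, y, text) tuple with y increasing upward, as in PDF coordinates.

```python
def group_lines(items, y_tol=2.0):
    """Group (x, y, text) items into lines when their y values are within y_tol."""
    lines = []
    # Sort top to bottom (larger y first), then left to right.
    for x, y, text in sorted(items, key=lambda it: (-it[1], it[0])):
        if lines and abs(lines[-1]["y"] - y) <= y_tol:
            lines[-1]["items"].append((x, text))
        else:
            lines.append({"y": y, "items": [(x, text)]})
    return lines


def render_line(items, char_width=5.0):
    """Place each item at the character column nearest to x / char_width."""
    out = ""
    for x, text in sorted(items):
        col = round(x / char_width)
        if col > len(out):
            out = out.ljust(col)  # pad with spaces up to the target column
        elif out:
            out += " "  # items overlap: keep at least one separating space
        out += text
    return out


def rebuild(items, y_tol=2.0, char_width=5.0):
    """Return the page as a list of fixed-width text lines."""
    return [render_line(line["items"], char_width)
            for line in group_lines(items, y_tol)]
```

In this sketch, a y_tol that is too small splits one visual row into several lines, while one that is too large merges neighbouring rows. A char_width that does not match the font makes columns land at inconsistent offsets, which is the kind of drift the note above describes.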
