This repo shows a controlled spreadsheet automation workflow: reviewed formulas are copied from an HTML catalog into an Excel-compatible workbook, while unsafe or invalid rows are rejected and preserved in audit outputs.
- Public problem: workshop formulas need to move from a shared catalog into a workbook without hidden copy-paste drift.
- Visible check: open
examples/outputs/reader-summary.mdand the workbook'sFormula Auditsheet. - Main limit: this is offline, fixture-backed workbook automation; browser, authenticated-source, and enterprise workbook integration work remain outside this slice.
The broader pattern is a controlled handoff from one source of rules into a spreadsheet people already use. The risk is not just whether a formula can be copied; it is whether each copied formula has an allowed destination, a clear decision trail, and a visible place for a person to review anything held back.
Start with the human-readable proof surface, not the code: read examples/outputs/reader-summary.md, then open examples/outputs/community_workshop_formula_sync.xlsx and review the Formula Audit sheet. The workbook is intentionally small so the automation boundary, the review trail, and the rejected examples are easy to inspect.
A community workshop coordinator keeps planning formulas on a simple web-style page. The workbook used for weekly planning has cells for attendance, material kits, remaining seats, and volunteer coverage.
The useful work is not advanced spreadsheet modeling. It is keeping the copied formulas aligned with the catalog while making every automated change reviewable.
Manual copy and paste is easy to get wrong. A formula can land in the wrong cell, a stale external workbook reference can sneak in, or a malformed formula can be pasted because it looked harmless in the source page.
This reference workflow treats the HTML catalog as an input, checks each formula against a declared workbook plan, writes only approved formulas, and records the decision for every catalog row.
The goal is a small, inspectable automation that handles repetitive workbook updates without removing human judgment. The script handles parsing, allow-list checks, workbook writing, and audit output. A reviewer still decides whether the resulting workbook logic is appropriate.
No AI model or external decision service is used. The automation is deterministic Python code with explicit validation rules.
examples/fixtures/workshop_formula_catalog.html: a checked-in HTML table of formula definitions.examples/fixtures/workshop_plan.csv: workbook cells, sample inputs, and the explicit list of formula destinations the automation may update.- Optional live HTML source: an unauthenticated URL configured through
FORMULA_SOURCE_URLand, when needed,FORMULA_SOURCE_SELECTOR.
examples/outputs/community_workshop_formula_sync.xlsx: an Excel-compatible workbook with aPlansheet and aFormula Auditsheet.examples/outputs/reader-summary.md: the first review artifact for non-technical readers.examples/outputs/formula-audit.csv: a diff-friendly technical audit file with the same accept/reject decisions.
The workflow parses the catalog, normalizes destinations, checks whether each target cell is allowed, rejects external workbook or URL references, rejects malformed formulas, writes accepted formulas, and creates audit evidence.
The default fixture deliberately includes rejected examples so reviewers can see how the workflow behaves when a source row should not be applied.
A person should still review the workbook logic, check whether the accepted formulas match the planning intent, inspect rejected rows, and recalculate the workbook in Excel or LibreOffice. Python writes formulas into the .xlsx file; it does not evaluate Excel formulas as Excel would.
- Read
examples/outputs/reader-summary.md. - Open
examples/outputs/community_workshop_formula_sync.xlsx. - Check the
Plansheet formulas in cellsB10:B13. - Check the
Formula Auditsheet and confirm the three rejected rows were not applied. - Notice that formula text in audit outputs is shown as safe text evidence, while the
Plancells still contain workbook formulas.
Evidence-to-decision example: the source row legacy_budget_lookup contains an external workbook reference, so the validator rejects it, writes the reason into the audit outputs, and leaves the workbook cell unchanged.
Many teams keep formulas, rules, or small operational calculations in one place and need those rules copied into spreadsheets without manual drift. The important pattern is not the community workshop theme; it is the controlled movement of workbook logic from a source catalog into an auditable workbook update.
This repo maps that problem to a Python runtime, a bounded HTML scraping boundary, an Excel-compatible .xlsx output, a small VBA import template for native Excel users, a deterministic validation layer, and human-readable audit files. The fixture path is offline and reproducible. The optional live path proves the same parser can read an unauthenticated HTML endpoint when a reviewer provides FORMULA_SOURCE_URL.
The pattern fits small formula catalogs, planned workbook destinations, and workflows where humans approve the final workbook. It does not fit login-protected scraping, large spreadsheet refactors, macros that must run inside a locked-down enterprise Excel environment, or workbooks whose correctness depends on volatile external services.
This slice does not automate a browser, authenticated source, or existing enterprise workbook; those require environment-specific ingestion, template, security, and operations work.
Extension points are intentionally visible: replace the HTML parser boundary, expand the workbook destination plan, add stricter formula rules, or wire the generated audit CSV into a workbook-side import process. Those are productionization steps, not claims made by this reference slice.
Use Python 3.11 or another currently supported Python 3 release. The command path below uses python3.11 because the local validation machine has a newer default python3 than this small reference needs. It keeps the virtual environment and scratch outputs under ${TMPDIR:-/tmp}, installs only the direct review dependencies, runs the source package through PYTHONPATH=src, and disables test cache and bytecode output so an ordinary review run does not leave noisy local files in the public clone.
python3.11 -m venv --clear "${TMPDIR:-/tmp}/reference-representative-workflow-venv"
. "${TMPDIR:-/tmp}/reference-representative-workflow-venv/bin/activate"
PIP_DISABLE_PIP_VERSION_CHECK=1 python -m pip install --no-cache-dir -r requirements.txt 'pytest==8.3.4'
export PYTHONPATH=src
export WORKSHOP_REVIEW_DIR="${TMPDIR:-/tmp}/community-workshop-workflow-review"
rm -rf "$WORKSHOP_REVIEW_DIR"
mkdir -p "$WORKSHOP_REVIEW_DIR"
PYTHONDONTWRITEBYTECODE=1 python -m formula_catalog_sync --source examples/fixtures/workshop_formula_catalog.html --plan examples/fixtures/workshop_plan.csv --output "$WORKSHOP_REVIEW_DIR/community_workshop_formula_sync.xlsx" --summary "$WORKSHOP_REVIEW_DIR/reader-summary.md" --audit-csv "$WORKSHOP_REVIEW_DIR/formula-audit.csv"
PYTHONDONTWRITEBYTECODE=1 python -m pytest -p no:cacheprovider
git status --shortAfter the ordinary public review path, git status --short should print nothing. Remove the temporary review files when finished with rm -rf "${TMPDIR:-/tmp}/community-workshop-workflow-review" "${TMPDIR:-/tmp}/reference-representative-workflow-venv".
There are no servers, watchers, browser sessions, or long-lived processes in this repo.
The Python tests confirm that the catalog parser reads the HTML fixture, the validation layer accepts four allow-listed formulas, the rejected examples stay out of the workbook, and the generated workbook contains the expected sheets and formula cells.
The workbook writer pins Office document timestamps and package metadata so rerunning the documented fixture command does not create a binary diff when the source inputs have not changed.
Microsoft Excel is not available in this local validation environment. The checked validation covers workbook structure and stored formulas through openpyxl; final formula calculation remains a human Excel or LibreOffice review step.
This reference stops before production scheduling, login-protected scraping, browser automation, real organizational data, advanced Excel deployment packaging, and environment-specific workbook tuning.
The live URL mode supports unauthenticated HTML with the same table shape as the fixture. Authenticated sources, custom selectors beyond the documented table id forms, and organization-specific workbook templates should be added only after the target environment is known.
- Author: Juan Luis Herrera Cortijo
- Contact:
juan.luis.herrera.cortijo@gmail.com - GitHub:
https://github.com/JLHerreraCortijo - Copyright: Copyright (c) 2026 Juan Luis Herrera Cortijo. All rights reserved.
- License:
Portfolio Review License - Third-party dependencies retain their own licenses.