Add font embedding to HTML export and vegalite_fonts/vega_fonts API #247

Draft
jonmmease wants to merge 5 commits into jonmmease/auto-fonts from jonmmease/html-export

@jonmmease jonmmease commented Feb 25, 2026

Summary

Connect the font pipeline from PRs 1 and 2 to the existing HTML export methods (vegalite_to_html / vega_to_html). Previously, auto_install_fonts had no effect on HTML output — fonts were referenced by name but never included.

This PR adds:

  • CDN mode (bundle=false): auto-installed fonts get <link> tags pointing to Google Fonts API or Fontsource jsDelivr CDN
  • Bundle mode (bundle=true): auto-installed fonts are subsetted to only the characters actually rendered, encoded as WOFF2, and inlined as base64 @font-face CSS — producing fully self-contained HTML
  • embed_local_fonts option for embedding system-installed fonts discovered via fontdb
  • vegalite_fonts() / vega_fonts() public API for retrieving font information in multiple formats, usable alongside javascript_bundle() for building custom HTML
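
To make the bundle-mode output concrete, here is a minimal Python sketch of inlining font bytes as a base64 @font-face rule. The helper name `font_face_css` and the placeholder bytes are hypothetical. The real pipeline subsets the font and encodes it as WOFF2 first, which this sketch does not attempt.

```python
import base64


def font_face_css(family: str, weight: int, style: str, woff2_bytes: bytes) -> str:
    """Build one @font-face block with the font inlined as a data URI.

    Hypothetical helper for illustration; `woff2_bytes` is assumed to be an
    already-subsetted WOFF2 file.
    """
    payload = base64.b64encode(woff2_bytes).decode("ascii")
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        f"  font-weight: {weight};\n"
        f"  font-style: {style};\n"
        f"  src: url(data:font/woff2;base64,{payload}) format('woff2');\n"
        "}"
    )


# Placeholder bytes stand in for real font data ("wOF2" is the WOFF2 signature).
css = font_face_css("Roboto", 400, "normal", b"wOF2" + bytes(8))
print(css)
```

Because the font data travels inside the CSS, the resulting HTML needs no network access to render with the intended typeface.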

Motivation

The existing HTML export produces <script> tags for the Vega runtime but has no font handling at all. Even with auto_install_fonts=true, the HTML output references font families by name without including any font data or CDN links. This means bundled HTML depends on the viewer's system fonts and unbundled HTML has no way to load the correct fonts.

Users building custom HTML with javascript_bundle() also had no way to get font information for their charts. The new vegalite_fonts / vega_fonts API exposes font data as composable building blocks.

Changes

New module: font_embed.rs

Font subsetting and WOFF2 encoding pipeline:

  • aggregate_chars_by_font_key() — collects characters used per (font, weight, style) from the scenegraph text extraction
  • subset_and_encode() / subset_and_encode_bytes() — subsets TTF files to required glyphs and encodes as base64 WOFF2
  • generate_font_face_css() — orchestrates the pipeline, dispatching to Fontsource-cached or fontdb-local font paths; returns Vec<String> of individual @font-face blocks
  • Priority-based deduplication across Unicode subset ranges (latin first, then extensions)
  • 15 unit tests
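
The aggregation and priority-based deduplication steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the Rust code from this PR: the function names, the input shape, and the code point ranges are hypothetical, and the two ranges overlap on purpose so the priority rule is visible.

```python
from collections import defaultdict


def aggregate_chars(entries):
    """Merge text fragments into one character set per (family, weight, style)."""
    chars = defaultdict(set)
    for family, weight, style, text in entries:
        chars[(family, weight, style)].update(text)
    return dict(chars)


def assign_to_subsets(chars, subsets):
    """Give each character to the first subset, in priority order, that covers it.

    `subsets` is an ordered list of (name, [(low, high), ...]) with inclusive
    code point ranges. Returns (assignments, characters no subset covers).
    """
    assigned = {}
    remaining = set(chars)
    for name, ranges in subsets:
        hit = {c for c in remaining if any(lo <= ord(c) <= hi for lo, hi in ranges)}
        if hit:
            assigned[name] = hit
            remaining -= hit
    return assigned, remaining


# Illustrative ranges only; they are not the real Fontsource unicode-range values.
SUBSETS = [("latin", [(0x20, 0xFF)]), ("latin-ext", [(0xA0, 0x24F)])]

by_key = aggregate_chars([
    ("Roboto", "400", "normal", "Caf\u00e9"),
    ("Roboto", "400", "normal", "\u0141\u00f3d\u017a"),
    ("Roboto", "700", "normal", "Title"),
])
assigned, leftover = assign_to_subsets(by_key[("Roboto", "400", "normal")], SUBSETS)
```

Characters such as U+00E9 fall in both illustrative ranges but are claimed by `latin` because it comes first, so each glyph would be embedded only once.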

Additions to html.rs

Per-font formatting functions (replacing the earlier generate_font_tags):

  • font_cdn_url() — CDN stylesheet URL (Google Fonts API or Fontsource jsDelivr). Returns None for local fonts
  • font_link_tag() — wraps URL in <link rel="stylesheet">
  • font_import_rule() — wraps URL in @import url(...)
  • 9 unit tests covering Google, Fontsource, and local font cases
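
A rough Python sketch of how the three per-font functions compose is shown below. The `Font` dataclass and both URL patterns are assumptions made for illustration; this description does not show the exact URLs the Rust code emits.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote_plus


@dataclass
class Font:
    family: str
    font_id: Optional[str] = None    # Fontsource id; None for local fonts (assumed)
    font_type: Optional[str] = None  # "google" or "other"; None for local (assumed)


def font_cdn_url(font: Font) -> Optional[str]:
    """Return a CDN stylesheet URL, or None for local fonts."""
    if font.font_type == "google":
        # Assumed URL shape for the Google Fonts CSS API.
        return "https://fonts.googleapis.com/css2?family=" + quote_plus(font.family)
    if font.font_type is not None:
        # Assumed URL shape for a Fontsource stylesheet on jsDelivr.
        return f"https://cdn.jsdelivr.net/npm/@fontsource/{font.font_id}/index.css"
    return None


def font_link_tag(font: Font) -> Optional[str]:
    url = font_cdn_url(font)
    return None if url is None else f'<link rel="stylesheet" href="{url}">'


def font_import_rule(font: Font) -> Optional[str]:
    url = font_cdn_url(font)
    return None if url is None else f"@import url({url});"


google = Font("Open Sans", "open-sans", "google")
local = Font("My Local Font")
```

Keeping the URL logic in one function means the tag and rule wrappers stay trivial and inherit the `None` result for local fonts.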

Public API: vegalite_fonts / vega_fonts (converter.rs)

New methods on VlConverter that return Vec<String> of font information in a requested format:

  • FontFormat::Name — font family names (["Roboto", "Open Sans"])
  • FontFormat::Url — CDN stylesheet URLs
  • FontFormat::LinkTag — <link rel="stylesheet"> tags
  • FontFormat::ImportRule — @import url(...) CSS rules
  • FontFormat::FontFace — @font-face blocks with subsetted base64 WOFF2 data

Each function accepts auto_install_fonts and embed_local_fonts parameters (defaulting to converter config). vegalite_fonts compiles VL→Vega, then delegates to vega_fonts. Local fonts have no CDN URL, so they are filtered out of the url, link_tag, and import_rule formats and appear only in the name and font_face formats.
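
The format dispatch and the filtering of local fonts can be illustrated with a small Python sketch. The `format_fonts` function and its dict-based input are hypothetical stand-ins; only the five format names come from this PR.

```python
from enum import Enum


class FontFormat(Enum):
    NAME = "name"
    URL = "url"
    LINK_TAG = "link_tag"
    IMPORT_RULE = "import_rule"
    FONT_FACE = "font_face"


def format_fonts(fonts, fmt):
    """Render each font in the requested format, dropping fonts with no value.

    `fonts` is a list of dicts with 'name' plus optional 'url' and 'font_face'
    (a hypothetical shape chosen for this sketch).
    """
    out = []
    for font in fonts:
        url = font.get("url")
        if fmt is FontFormat.NAME:
            value = font["name"]
        elif fmt is FontFormat.URL:
            value = url
        elif fmt is FontFormat.LINK_TAG:
            value = f'<link rel="stylesheet" href="{url}">' if url else None
        elif fmt is FontFormat.IMPORT_RULE:
            value = f"@import url({url});" if url else None
        else:
            value = font.get("font_face")
        if value is not None:  # local fonts have no URL, so they drop out here
            out.append(value)
    return out


fonts = [
    {"name": "Roboto", "url": "https://example.com/roboto.css",
     "font_face": "@font-face { /* Roboto */ }"},
    {"name": "My Local Font", "font_face": "@font-face { /* local */ }"},
]
```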

HTML export refactored to use vega_fonts

vegalite_to_html / vega_to_html now use vega_fonts() internally:

  • Bundle mode: single call with FontFormat::FontFace
  • CDN mode: call with FontFormat::LinkTag for auto-installed fonts, plus optional second call with FontFormat::FontFace for local fonts when embed_local_fonts is set
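
As a sketch of the two modes, the following Python function assembles the font portion of the HTML head from pre-rendered strings. The function name, its parameters, and the use of a single `<style>` element are assumptions for illustration; the actual build_html wiring may differ.

```python
def font_head_html(bundle, link_tags, font_faces,
                   local_font_faces=(), embed_local_fonts=False):
    """Combine font markup for one export (hypothetical helper).

    Bundle mode: only inline @font-face CSS.
    CDN mode: <link> tags, plus inline CSS for local fonts when opted in.
    """
    if bundle:
        parts = []
        css = list(font_faces)
    else:
        parts = list(link_tags)
        css = list(local_font_faces) if embed_local_fonts else []
    if css:
        parts.append("<style>\n" + "\n".join(css) + "\n</style>")
    return "\n".join(parts)


tag = '<link rel="stylesheet" href="https://example.com/a.css">'
cdn = font_head_html(False, [tag], [], ["@font-face { /* local */ }"],
                     embed_local_fonts=True)
bundled = font_head_html(True, [tag], ["@font-face { /* a */ }"])
plain = font_head_html(False, [tag], [], ["@font-face { /* local */ }"])
```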

Other converter changes

  • vega_to_text_by_font() — new method + embedded JS function that walks the rendered Vega scenegraph, collecting unique characters per (font, weight, style) tuple
  • preprocess_fonts extended to collect locally-available fonts when embed_local_fonts is set
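
The scenegraph walk can be approximated in Python as below. The real implementation is an embedded JS function; this sketch assumes a simplified JSON scenegraph in which text marks carry `text`, `font`, `fontWeight`, and `fontStyle` on their items, and the default values used for missing fields are also assumptions.

```python
def text_by_font(node, acc=None):
    """Collect unique characters per (font, weight, style) from a scenegraph dict."""
    acc = {} if acc is None else acc
    if node.get("marktype") == "text":
        for item in node.get("items", []):
            text = item.get("text", "")
            # Multi-line text may be given as a list of strings.
            lines = text if isinstance(text, list) else [text]
            key = (
                item.get("font", "sans-serif"),
                str(item.get("fontWeight", "normal")),
                item.get("fontStyle", "normal"),
            )
            for line in lines:
                acc.setdefault(key, set()).update(str(line))
    else:
        # Groups and group items nest further marks under "items".
        for child in node.get("items", []):
            text_by_font(child, acc)
    return acc


scene = {"marktype": "group", "items": [{"items": [
    {"marktype": "text", "items": [
        {"text": "Sales", "font": "Roboto", "fontWeight": "bold"},
        {"text": ["2024", "2025"], "font": "Roboto"},
    ]},
    {"marktype": "rect", "items": [{"x": 0, "y": 0}]},
]}]}
result = text_by_font(scene)
```

Working from the rendered scenegraph rather than the spec means computed strings such as axis tick labels are counted too.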

Data model (extract.rs)

  • FontForHtml refactored: flat font_id/font_type fields replaced with FontSource enum (Fontsource { font_id, font_type } | Local)

CLI and Python bindings

  • --embed-local-fonts global CLI flag
  • embed_local_fonts parameter in Python configure_converter() and ConverterConfig
  • vegalite_fonts() / vega_fonts() Python functions (sync + async) with format parameter
  • FontFormat type alias and full type stubs

Build

  • font-subset crate dependency for TTF subsetting + WOFF2 encoding

Design decisions

  • Subsetting vs full embedding: Fonts are subsetted to only the characters actually rendered. A typical Latin-text chart embeds ~5-15KB of WOFF2 per font weight, vs 200-500KB for full fonts.
  • embed_local_fonts is opt-in (default false): System fonts may have licensing restrictions. Explicit opt-in avoids accidental embedding.
  • embed_local_fonts is orthogonal to auto_install_fonts and bundle: Local fonts produce inline @font-face CSS even in CDN mode (they have no CDN URL).
  • Per-font functions over combined tag generation: font_cdn_url/font_link_tag/font_import_rule operate on single fonts for composability. One URL per Google font family (not combined). No preconnect hints.
  • CFF/OTF and TTC fonts not supported for embedding: The font-subset crate only handles TTF/glyf. Failures follow the missing_fonts policy.
  • fontdb clone pattern: The fontdb::Database is cloned out of the USVG_OPTIONS mutex so the lock is released before CPU-intensive subsetting.

Commits

  1. aebd8c8 — Add font_embed.rs module and per-font formatting functions in html.rs, with unit tests
  2. 0faf83e — Integrate font embedding into vegalite_to_html/vega_to_html: add vega_to_text_by_font JS bridge, extend preprocess_fonts, wire font CSS into build_html
  3. 2b52123 — Add embed_local_fonts config option with FontSource enum, fontdb integration, CLI flag, and Python bindings
  4. 85885a3 — Add vegalite_fonts/vega_fonts public API with FontFormat enum; refactor HTML export to use it internally; Python bindings and type stubs

Testing

  • pixi run check-rs — clean
  • pixi run fmt-rs / pixi run fmt-py — clean
  • pixi run clippy — no new warnings
  • pixi run test-rs — 102 passed, 0 failed (includes new unit tests in font_embed.rs, html.rs, converter.rs)
  • pixi run test-cli — 106 passed, 0 failed
  • pixi run test-py — 131 passed, 1 skipped

@jonmmease jonmmease force-pushed the jonmmease/html-export branch from c61a02d to 0faf83e February 25, 2026 18:56
Add a new `embed_local_fonts: bool` config flag (default false) that enables
embedding locally-available fonts (system, --font-dir, vendored) as inline
@font-face CSS with base64-encoded WOFF2 data in HTML output.

Key changes:
- Add FontSource enum (Fontsource/Local) to distinguish font origins
- Extract subset_and_encode_bytes for in-memory font data from fontdb
- Query fontdb for local font faces by family/weight/style
- Handle CFF/TTC subset failures via missing_fonts policy
- Local fonts get inline @font-face CSS even in CDN mode (no CDN URL)
- Propagate to CLI (--embed-local-fonts) and Python bindings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jonmmease jonmmease changed the title from "Add HTML export with font embedding" to "Add font management and HTML export with font embedding" Feb 25, 2026
Add vegalite_fonts() and vega_fonts() as public methods on VlConverter
with a FontFormat enum (name, url, link_tag, import_rule, font_face).
Refactor vegalite_to_html/vega_to_html to use vega_fonts() internally.

Replace generate_font_tags() with per-font functions (font_cdn_url,
font_link_tag, font_import_rule). Change generate_font_face_css() to
return Vec<String> for per-block access.

Add Python bindings (sync + async) and type stubs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jonmmease jonmmease changed the title from "Add font management and HTML export with font embedding" to "Add font embedding to HTML export and vegalite_fonts/vega_fonts API" Feb 26, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>