Render EMF+ (GDI+) fills and strokes #51

Open
ssubbotin wants to merge 9 commits into kakwa:master from ssubbotin:feature/emfplus-rendering

ssubbotin commented Jun 4, 2026

EMF+ (GDI+) records embedded in EMF comment records were dispatched to stub handlers that emitted nothing, so EMF+ content was silently dropped and only the GDI fallback rendered. For Enterprise Architect (and similar) dual EMF/EMF+ exports this produced opaque dark rectangles where soft drop shadows should be, and EMF+-only documents converted to an empty SVG (issue #12).

This implements EMF+ fill and stroke rendering, enabled by the existing -p / options->emfplus flag.

What renders now

  • SetWorldTransform applied to all EMF+ coordinates, plus a 64-slot EMF+ object table.
  • Fills: FillRects, FillPath, FillPolygon -- inline ARGB colours (alpha mapped to fill-opacity), object-table SolidColor brushes, and LinearGradient brushes (emitted as <linearGradient> + url(#...)).
  • Strokes: DrawPath, DrawRects, DrawLines -- pen colour (from the pen's embedded brush), width scaled by the world transform, fill="none".
  • Dual-mode arbitration: when the EMF+ layer has actually drawn and we are outside a GetDC window, the source-less GDI BITBLT brush-fill that EA bakes its drop shadows with is suppressed as a duplicate, so shadows render as soft semi-transparency instead of dark rectangles. Suppression is deliberately limited to that case -- general GDI drawing, raster images and text still render, because a dual file may carry its whole picture in GDI with only a sparse EMF+ layer. With no rendered EMF+, output is byte-identical to before.

Not yet handled (the GDI fallback still renders these): EMF+ text (DrawString), images (DrawImage/DrawImagePoints), hatch/texture/path-gradient brushes, multi-stop gradient blends, dashed pens and EMF+ clipping.
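
As a rough illustration of the alpha-to-fill-opacity mapping described above, the sketch below splits a 32-bit ARGB value into an SVG fill colour and a fill-opacity. The helper name `argb_to_svg_fill` and the exact attribute formatting are assumptions for illustration, not the code in this PR.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not from the PR): format an EMF+ ARGB value,
 * laid out as 0xAARRGGBB, as SVG fill and fill-opacity attributes.
 * Returns the snprintf result: the number of characters the full
 * string needs, excluding the terminating NUL. */
int argb_to_svg_fill(uint32_t argb, char *out, size_t out_len)
{
    unsigned a = (argb >> 24) & 0xFFu; /* alpha, 0..255 */
    unsigned r = (argb >> 16) & 0xFFu;
    unsigned g = (argb >> 8) & 0xFFu;
    unsigned b = argb & 0xFFu;

    /* Alpha becomes a 0..1 opacity; the colour itself stays opaque RGB. */
    return snprintf(out, out_len,
                    "fill=\"#%02X%02X%02X\" fill-opacity=\"%.3f\"",
                    r, g, b, a / 255.0);
}
```

A half-transparent red (`0x80FF0000`) would come out as `fill="#FF0000"` with an opacity of roughly 0.5, which is how a soft shadow can replace an opaque rectangle.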

Safety / testing

The EMF+ parsing paths run on untrusted input, so they are guarded against malformed records (record-extent bounds without 32-bit pointer overflow, the vendored VAR*_get dangling-pointer behaviour, non-finite coordinates, and truncated pen/brush objects). A new tests/resources/check_emfplus.sh plus generated fixtures and the issue #12 document exercise the new code, and a content-collapse guard was added to check_correctness.sh. The full check_correctness.sh suite (good, corrupted and Enterprise-Architect corpora) passes clean under valgrind and xmllint.

This also fixes goodies/format, whose find -exec clang-format invocation was missing line continuations and never ran (its column limit is set to 80 to match the formatting the tree actually uses).

Fixes #12

ssubbotin added 9 commits June 4, 2026 18:28
EMF+ records embedded in EMF comment records were dispatched to stub
handlers that emitted nothing, so EMF+ content (notably Enterprise
Architect drop shadows) was silently dropped and only the GDI fallback
rendered. Implement the first rendering slice:

* SetWorldTransform: store the EMF+ world matrix in drawingStates and
  apply it to all EMF+ coordinates (matrix then global scaling).
* FillRects / FillPath with an inline ARGB brush (btype=1): emit SVG
  paths carrying fill and fill-opacity derived from the alpha channel.
* Accumulate completed EMF+ objects into a 64-slot object table so
  FillPath can resolve a path by ID; convert it to an SVG d= attribute
  (Start/Line/Bezier point types, CloseSubpath, Int16 and Float coords).
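
A minimal sketch of "matrix then global scaling", assuming the usual GDI+ row-vector convention for the six matrix values (m11, m12, m21, m22, dx, dy). The type and function names are hypothetical and do not mirror the fields actually stored in drawingStates.

```c
/* Hypothetical stand-ins for the stored world matrix and a point. */
typedef struct { double m11, m12, m21, m22, dx, dy; } emfplus_matrix;
typedef struct { double x, y; } point_d;

/* Apply the EMF+ world matrix first, then the converter's global
 * scaling factor (assumed here to be one uniform scalar). */
point_d emfplus_apply(const emfplus_matrix *m, double scaling, point_d p)
{
    point_d q;
    q.x = (p.x * m->m11 + p.y * m->m21 + m->dx) * scaling;
    q.y = (p.x * m->m12 + p.y * m->m22 + m->dy) * scaling;
    return q;
}
```

With the matrix (2, 0, 0, 3, 10, 20) and a scaling of 0.5, the point (1, 1) maps to (6, 11.5).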

Harden the new untrusted-input paths:

* Do not trust the Rects pointer from U_PMR_FILLRECTS_get: the vendored
  U_PMF_VARRECTS_get frees its buffer but leaves the pointer non-NULL on
  a bounds failure, and the record getter swallows that failure. Gate
  reads and the free on the element count fitting the validated record
  size; this is safe with both the vendored and system libuemf.
* Suppress EMF+ fill emission while a GDI path is open (inPath), so a
  comment record between BEGINPATH and ENDPATH cannot inject markup into
  the open d= attribute and produce non-well-formed SVG.
* Compute the record extent bound by subtraction to avoid a 32-bit
  pointer wrap, and stop latching states->Error on a truncated EMF+
  record (which aborted the whole conversion and discarded all output).
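
The "bound by subtraction" idea can be sketched as follows. Computing `rec + need` and comparing it to the end pointer can wrap on a 32-bit target when `need` is attacker-controlled, whereas subtracting two pointers into the same buffer cannot. `record_fits` is a hypothetical name, not a function from this PR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when `need` bytes starting at `rec`
 * lie within the buffer that ends at `end`, and 0 otherwise.
 * Both pointers are assumed to point into the same buffer. */
int record_fits(const char *rec, const char *end, uint32_t need)
{
    if (rec == NULL || end == NULL || rec > end)
        return 0;
    /* end - rec is the space actually left; no pointer addition, so
     * a huge `need` cannot wrap the comparison. */
    return (size_t)(end - rec) >= (size_t)need;
}
```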

Add tests/resources/check_emfplus.sh and three generated fixtures
(generator: tests/resources/gen_emfplus_fixtures.py) covering the fills
and each hardening case; the fixtures join the emf-ea and emf-corrupted
corpora so the existing valgrind/xmllint CI jobs exercise them under -p.

Fill records referencing a brush from the EMF+ object table (btype=0)
previously emitted nothing; only inline-ARGB fills rendered. Resolve the
two brush types Enterprise Architect uses for shape bodies:

* SolidColor: emit fill (and fill-opacity from the alpha channel).
* LinearGradient: emit an SVG <linearGradient> definition (user-space
  coordinates from the brush RectF mapped through the world transform,
  start and end color stops) and reference it by url(). This is a
  two-stop approximation; the optional multi-stop blend data is not yet
  used.

Both Fill records now route inline and object-table brushes through a
shared resolver; unsupported brush types (hatch, texture, path gradient)
still emit nothing. Guard the gradient coordinates with isfinite() so a
non-finite brush RectF cannot emit nan/inf into the SVG, which DTD
validation would not catch.
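
For illustration, a two-stop gradient emitter with the isfinite() guard might look like the sketch below. The function name, the `emfplus-grad-%d` id scheme and the number formatting are assumptions; only the guard-before-emit pattern reflects the description above.

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical emitter: writes a two-stop <linearGradient> in user-space
 * coordinates, or writes nothing and returns 0 when any coordinate is
 * NaN or infinite. Returns 1 when the definition was written. */
int emit_linear_gradient(FILE *out, int id,
                         double x1, double y1, double x2, double y2,
                         const char *start_rgb, const char *end_rgb)
{
    if (!isfinite(x1) || !isfinite(y1) || !isfinite(x2) || !isfinite(y2))
        return 0; /* would otherwise print "nan"/"inf" into the SVG */

    fprintf(out,
            "<linearGradient id=\"emfplus-grad-%d\" "
            "gradientUnits=\"userSpaceOnUse\" "
            "x1=\"%.4f\" y1=\"%.4f\" x2=\"%.4f\" y2=\"%.4f\">\n"
            "  <stop offset=\"0\" stop-color=\"%s\"/>\n"
            "  <stop offset=\"1\" stop-color=\"%s\"/>\n"
            "</linearGradient>\n",
            id, x1, y1, x2, y2, start_rgb, end_rgb);
    return 1;
}
```

A fill would then reference the definition with `fill="url(#emfplus-grad-1)"`; when the guard rejects the brush, the caller has to decide whether to skip the fill or fall back to a solid colour.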

Extend the EMF+ test with solid, gradient and non-finite-gradient
fixtures (generator: tests/resources/gen_emfplus_fixtures.py); the
solid/gradient fixtures join the emf-ea corpus and the non-finite case
joins emf-corrupted, so the existing valgrind/xmllint CI jobs exercise
them under -p.

DrawPath records emitted nothing; EMF+ shape and connector outlines were
absent (fills rendered, but boxes had no borders). Resolve the referenced
Pen from the EMF+ object table into SVG stroke attributes:

* stroke color from the pen's embedded SolidColor brush (default black),
* stroke-width from the pen width in world units, magnified by the world
  transform's linear scale (the square root of the absolute value of the
  matrix determinant, i.e. the geometric mean of its two scale factors) so
  the stroke thickens with the geometry, and scaled to device units,
* fill="none" on the stroked path.
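
One way to read the stroke-width rule, written out as a formula. This is an interpretation of the description above, not a transcription of the code. The symbols $s$ (linear scale), $k$ (device-unit scaling) and $w_{\text{pen}}$ (pen width in world units) are introduced here for illustration.

$$
s = \sqrt{\lvert m_{11} m_{22} - m_{12} m_{21} \rvert}, \qquad
w_{\text{svg}} = w_{\text{pen}} \cdot s \cdot k
$$

For example, a world matrix that scales x by 2 and y by 8 with no shear has determinant 16, so $s = 4$, and a pen of width 1.5 strokes at $1.5 \cdot 4 = 6$ units before the device scaling $k$ is applied.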

The pen width is read via U_PMF_PENDATA_get (which bounds-checks), and the
embedded brush is resolved with U_PMF_PEN_get only when the pen declares no
variable-length optional data (dashed/compound line data, custom caps):
that getter sizes the brush offset through U_PMF_LEN_PENDATA, whose walk of
those fields is not bounds-checked and would over-read a truncated pen
object. Such pens still stroke with the correct width in the default color.

Add a DrawPath fixture (emf-ea) and a truncated-dashed-pen fixture
(emf-corrupted) so the existing valgrind/xmllint CI jobs cover both the
happy path and the over-read guard under -p.

Extend EMF+ stroke coverage to the two remaining outline records used by
Enterprise Architect diagrams:

* DrawRects: stroke each rectangle as a closed SVG path (fill="none"),
  reusing the pen resolver and the FillRects bounds guard against the
  vendored U_PMF_VARRECTS_get dangling-pointer behaviour (the rects begin
  16 bytes into a DrawRects record, after the header and element count).
* DrawLines: stroke the points as an SVG polyline, closed when the record
  sets the closed-path flag. The element count is clamped to what the
  record actually holds, relative-coordinate point lists are skipped, and
  the whole polyline is suppressed if any transformed point is non-finite
  (which would otherwise emit nan/inf coordinates that DTD validation, all
  CDATA, would not catch).
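
The two DrawLines guards, clamping the declared element count and rejecting non-finite points, can be sketched roughly as follows. `pointf`, `clamp_count` and `all_finite` are hypothetical names and the layout is simplified to already-decoded float points.

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { float x, y; } pointf; /* simplified decoded point */

/* Clamp the element count a record declares to the number of elements
 * that actually fit in the bytes remaining after the record header. */
uint32_t clamp_count(uint32_t declared, size_t bytes_available,
                     size_t elem_size)
{
    size_t fit = elem_size ? bytes_available / elem_size : 0;
    return declared > fit ? (uint32_t)fit : declared;
}

/* Returns 1 only when every coordinate is finite; a single NaN or
 * infinity means the whole polyline should be dropped. */
int all_finite(const pointf *pts, uint32_t n)
{
    for (uint32_t i = 0; i < n; i++) {
        if (!isfinite(pts[i].x) || !isfinite(pts[i].y))
            return 0;
    }
    return 1;
}
```

Dropping the whole polyline rather than just the bad point avoids emitting a shape whose geometry silently differs from the source.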

Add DrawRects and DrawLines fixtures (emf-ea) and a non-finite DrawLines
fixture (emf-corrupted) so the existing valgrind/xmllint CI jobs cover the
happy paths and the finite-coordinate guard under -p.

FillPolygon emitted nothing; small filled EMF+ shapes (e.g. arrowheads)
were missing. Fill the polygon as a closed SVG path, resolving the fill
through the shared brush resolver (inline color or object-table brush,
including gradients). The point count is clamped to what the record holds
(points start 20 bytes in, after the brush id and element count),
relative-coordinate lists are skipped, and the polygon is suppressed when
it has fewer than three points or any transformed point is non-finite
(which would otherwise emit nan/inf coordinates).
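
As an illustration of "fill the polygon as a closed SVG path", the sketch below builds a d= string from already-transformed points and refuses polygons with fewer than three points. The names and the two-decimal formatting are assumptions, and the finite-coordinate check is presumed to have run beforehand.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct { double x, y; } pt; /* hypothetical transformed point */

/* Writes "M x y L x y ... Z" into out. Returns the number of characters
 * written (excluding the NUL), or 0 when the polygon has fewer than
 * three points or the buffer is too small. */
size_t polygon_to_d(const pt *p, size_t n, char *out, size_t cap)
{
    size_t used = 0;
    if (n < 3 || cap == 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        int w = snprintf(out + used, cap - used, "%c %.2f %.2f ",
                         i == 0 ? 'M' : 'L', p[i].x, p[i].y);
        if (w < 0 || (size_t)w >= cap - used)
            return 0; /* truncated: give up rather than emit a partial path */
        used += (size_t)w;
    }
    if (cap - used < 2)
        return 0;
    out[used++] = 'Z'; /* close the subpath */
    out[used] = '\0';
    return used;
}
```

For the triangle (0,0), (10,0), (0,10) this yields `M 0.00 0.00 L 10.00 0.00 L 0.00 10.00 Z`.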

Add a FillPolygon fixture (emf-ea) and a non-finite-point fixture
(emf-corrupted) for the valgrind/xmllint CI jobs under -p.

Enterprise Architect dual EMF/EMF+ files bake their drop shadows twice:
once as a soft semi-transparent EMF+ fill (now rendered) and once as an
opaque source-less GDI BITBLT brush-fill; the latter is what produced the
dark shadow rectangles in the output.

Track dual-mode state on drawingStates: emfPlusDrew is set the moment an
EMF+ record actually emits SVG (not on mere presence), and gdiPlay marks a
GetDC window where GDI output is intended rather than fallback. When the
EMF+ layer has drawn and we are outside a GetDC window, U_EMRBITBLT_draw
drops the source-less brush-fill BITBLT as a duplicate of the EMF+
rendering. Suppression is deliberately limited to that case: a BITBLT
carrying real bitmap data, and all other GDI drawing, still render, because
a dual file may carry its whole picture in GDI with only a sparse EMF+
layer and muting it would erase primary content.
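
The arbitration rule reduces to a small predicate. The sketch below uses the flag names emfPlusDrew and gdiPlay from the description, but the struct and function are hypothetical stand-ins rather than the actual drawingStates layout or the code in U_EMRBITBLT_draw.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two dual-mode fields on drawingStates. */
typedef struct {
    bool emfPlusDrew; /* an EMF+ record has actually emitted SVG */
    bool gdiPlay;     /* inside a GetDC window: GDI output is intended */
} dual_state;

/* Returns true only for the narrow duplicate case: EMF+ has drawn, we
 * are outside a GetDC window, and the BITBLT carries no source bitmap
 * (a pure brush fill). Everything else still renders. */
bool suppress_bitblt(const dual_state *s, bool has_source_bitmap)
{
    return s->emfPlusDrew && !s->gdiPlay && !has_source_bitmap;
}
```

Because emfPlusDrew starts out false and is only set on real EMF+ output, a pure-GDI file never takes the suppression branch, which is consistent with the byte-identical claim for files without rendered EMF+.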

With no rendered EMF+, the state stays inert and output is byte-identical
to a pure-GDI conversion. Add a content-collapse guard to
check_correctness.sh (fail if -p output is a tiny fraction of the no-p
output) and assertions to check_emfplus.sh: image4 loses its opaque GDI
shadows while keeping the EMF+ soft shadows and GetDC text, and the
image-bearing dual files test-150/test-155 keep their rasters.

The EMF document reported in issue kakwa#12 (LibreOffice tdf#107034 attachment
132406) is a dual EMF/EMF+ file that produced an empty SVG before EMF+
rendering existed. With -p it now renders its EMF+ fills, strokes and
gradients (an empty 0-path conversion becomes 500+ paths). Add it to the
emf-ea corpus and assert it renders non-empty, well-formed output.

The find -exec invocation was missing line continuations, so clang-format
never ran. Restore them and set ColumnLimit to 80 to match the formatting
the tree actually uses (it was 120, which no committed file follows).

Update the EMF+ record coverage table (1 supported, 10 partial) and add a
changelog entry describing the EMF+ fill/stroke rendering, the dual-mode
shadow-fallback suppression, and the parser hardening.

Development

Successfully merging this pull request may close these issues.

libemf2svg doesn't handle EMF+ records conversion
