Add Accept: text/markdown content negotiation for NiceGUI pages #5889
evnchn wants to merge 4 commits into zauberzeug:main
Allow LLMs, CLI tools, and agents to request a markdown representation of any NiceGUI page by sending `Accept: text/markdown` in the HTTP request. Browsers continue to receive HTML as usual. The implementation adds a `_to_markdown()` method to the base `Element` class with smart duck-typing dispatch (content → text → label+value → recurse children), minimizing per-element overrides. Non-visual elements opt out via a `MARKDOWN_SKIP = True` class attribute. Clients created for markdown responses are cleaned up immediately via deferred deletion since they will never receive a WebSocket connection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
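A minimal sketch of the dispatch order described above (content → text → label+value → recurse children). `SketchElement`, `Label`, `Input`, and `Timer` are hypothetical stand-ins written for illustration; they are not NiceGUI's classes, and the actual `_to_markdown()` in the PR may differ in detail.

```python
class SketchElement:
    """Hypothetical stand-in for an element; not NiceGUI's Element class."""

    MARKDOWN_SKIP = False  # subclasses set this to True to opt out

    def __init__(self, *children: 'SketchElement', visible: bool = True) -> None:
        self.children = list(children)
        self.visible = visible

    def to_markdown(self) -> str:
        if not self.visible or self.MARKDOWN_SKIP:
            return ''
        # 1. elements that carry raw content (e.g. markdown or HTML)
        content = getattr(self, 'content', None)
        if isinstance(content, str):
            return content
        # 2. elements that carry plain text
        text = getattr(self, 'text', None)
        if isinstance(text, str):
            return text
        # 3. labelled value elements
        if hasattr(self, 'label') and hasattr(self, 'value'):
            return f'{self.label}: {self.value}'
        # 4. containers: join the non-empty output of the children
        parts = [child.to_markdown() for child in self.children]
        return '\n\n'.join(part for part in parts if part)


class Label(SketchElement):
    def __init__(self, text: str) -> None:
        super().__init__()
        self.text = text


class Input(SketchElement):
    def __init__(self, label: str, value: str) -> None:
        super().__init__()
        self.label = label
        self.value = value


class Timer(SketchElement):
    MARKDOWN_SKIP = True  # non-visual element, produces no markdown


page = SketchElement(Label('Hello'), Input('Name', 'Ada'), Timer())
print(page.to_markdown())
```

Because the fallback only looks at attribute names, a new element that exposes `text` or `label` and `value` gets markdown output without its own override.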
rodja
left a comment
I really like the concept. Might also be noteworthy in #5834. I have not tested/reviewed the PR in detail and hope that @falkoschindler can do that part.
@rodja If we have
We should also track markdown deliveries in Plausible so we can see how often LLMs are looking things up.
Perhaps, but it will consume custom events, and we would need to run Plausible on the server side, which could be a hassle (we aren't doing that right now, so it would need bootstrapping).
Yes, it's only a nice-to-have and should not block the merge of this feature.
I'd say 3.11?
falkoschindler
left a comment
Yes, 3.11 sounds good. I just did a first review and there are still some things to do and/or to think about:
- `ChatMessage._markdown_text` stores HTML-processed text, making the roundtrip lossy

  In `chat_message.py:58`, `self._markdown_text = text` is assigned after the text has been HTML-escaped and `\n` has been replaced with `<br />` (lines 49-51). The `_to_markdown` method then tries to reverse this with `html.unescape()` and `.replace('<br />', '\n')`, but this roundtrip is lossy:

  - If the original text literally contained `<br />` or HTML entities like `&amp;`, these would be incorrectly transformed.
  - When `text_html=True`, `_markdown_text` stores raw HTML which `_to_markdown` doesn't properly handle (it would leave HTML tags in the markdown output).

  Fix: Store the original, unprocessed text before escaping. Add a `self._original_text = text` (or rename) before line 49, and use that in `_to_markdown`.

  ```diff
  + self._original_text = list(text)  # store before HTML processing
    if not text_html:
        text = [html.escape(part) for part in text]
        text = [part.replace('\n', '<br />') for part in text]
        sanitize = False
  - self._markdown_text = text
  ```

  And adjust `_to_markdown` to use `self._original_text` directly instead of reversing the escaping.
- Duck-typing dispatch can produce misleading markdown for layout/value elements

  The base `_to_markdown()` in `element.py:229` checks `hasattr(self, 'label') and hasattr(self, 'value')` as a fallback. This matches any `ValueElement` with a label property, including elements where `value` has a non-textual meaning:

  - `Splitter` (value = split percentage, e.g. `50`) → would produce `": 50"` or just `"50"`
  - `Drawer` (value = open/closed boolean) → `": True"`
  - `Carousel` (value = current slide name) → could be confusing
  - `LinearProgress`/`CircularProgress` (value = progress float) → `": 0.7"`
  - `Rating` (value = star count) → `": 3"`
  - `Knob` (value = angle/number) → `": 42"`
  - `Pagination` (value = current page) → `": 1"`

  These elements should either override `_to_markdown()` to skip their `value` and recurse into children (for containers like `Drawer`, `Footer`, `ScrollArea` whose children carry the real content), or produce meaningful output (for `Rating`, `Progress`), or be skipped entirely (for `Splitter` where the value is a split percentage).

  Suggestion: Containers (`Drawer`, `Footer`, `ScrollArea`, `Carousel`) should override to recurse into children only. `Splitter` should skip its value and recurse. `Fullscreen` should use `MARKDOWN_SKIP`. For `Slider`, `Knob`, `Rating`, `Pagination`, and `Progress`, consider custom overrides or skipping.

- `MARKDOWN_SKIP` class attribute placement in `Element`

  The `MARKDOWN_SKIP` class attribute at `element.py:215` is placed in the middle of the class body between `_collect_slot_dict()` and `_to_dict()`. Class-level attributes should typically be grouped at the top of the class body, near other class-level declarations. This improves discoverability and follows the existing pattern in NiceGUI's element subclasses.

- Repeated visibility check pattern in every override

  Every `_to_markdown` override begins with:

  ```python
  if not self.visible:
      return ''
  ```

  This is needed because overrides don't call `super()._to_markdown()` (which already has this check). Consider a template method pattern:

  ```python
  def _to_markdown(self) -> str:
      if not self.visible or self.MARKDOWN_SKIP:
          return ''
      return self._render_markdown()

  def _render_markdown(self) -> str:
      """Override this in subclasses."""
      # ... current duck-typing dispatch ...
  ```

  This eliminates the duplicated guard in every override and makes it impossible to forget the visibility check.
- `Dialog` and `Menu` should not unconditionally skip markdown

  `Dialog`, `Menu`, and `Notification` all have `MARKDOWN_SKIP = True`, but all can be visible to the user:

  - `Dialog` and `Menu` can be open by default (`value=True`), so they should render children only when open.
  - `Notification` is always visible when created and carries a user-facing `message`; it should render that message instead of being skipped.

  The test `test_dialog_skipped` would need updating to cover both open/closed cases. Similar tests should be added for `Menu` and `Notification`.

- `Fullscreen` missing `MARKDOWN_SKIP`

  `Fullscreen` is a non-visual control element (`ValueElement` with a boolean value) that would hit the label+value fallback and produce misleading output. It should have `MARKDOWN_SKIP = True`. (Other childless non-visual elements like `PageScroller`, `Skeleton`, and `Joystick` would produce empty strings via the base class recurse-into-children path, so they're harmless, but `Fullscreen` has a `value` that triggers the wrong dispatch.)

- Accept header parsing is simplistic

  `'text/markdown' in accept` (`client.py:155-156`) doesn't handle quality values (`text/markdown;q=0.9`) or wildcard patterns (`text/*`). Starlette doesn't provide a built-in Accept parser, so a proper implementation would require a small custom parser. The current approach works correctly for the real-world use case (agents sending exactly `Accept: text/markdown`), but a code comment noting this simplification would be helpful.

- Button markdown: ambiguous syntax and icon-only buttons silently dropped

  `[Click me]` (`button.py:48`) looks like an incomplete markdown link. Worse, icon-only buttons (no label) return an empty string and are silently omitted from the output. An icon button is still a meaningful interactive element. Consider using the icon name as fallback (e.g. `[icon:thumb_up]`) or always emitting `[Button]` when the label is empty. The `[...]` syntax itself could also be reconsidered: `**Click me**` or `[Button: Click me]` would be less ambiguous.

- Expansion hardcodes `###` (h3)

  In `expansion.py:49`, the label is always rendered as `### label`. Nested expansions would all render at the same heading level. Consider making this relative to nesting depth, or use bold (`**Details**`) to avoid heading hierarchy issues.

- Simple `_to_markdown` overrides should be one-liners

  Most overrides (`Checkbox`, `Switch`, `Button`, `Link`, `Image`, `Separator`, `Code`) are simple enough to condense into a single expression, e.g.:

  ```python
  def _to_markdown(self) -> str:
      return f'- [{"x" if self.value else " "}] {self._text or ""}' if self.visible else ''
  ```

  This reduces visual noise across 15+ overrides and makes the pattern scannable. Multi-step methods like `Table._to_markdown` and `Expansion._to_markdown` should stay as-is.

- Test coverage gaps

  No tests for:

  - `ui.select`/`ui.radio`/`ui.slider`/`ui.number` without label
  - `ui.badge`/`ui.chip` (text elements)
  - `ui.html` element (content passthrough)
  - `ui.mermaid` (content passthrough)
  - `ui.date`/`ui.time` (value elements)
  - Nested elements that trigger the duck-typing fallback in unexpected ways
  - `text_html=True` on `ChatMessage`
  - Pages with `@ui.page` async handlers
  - Error scenarios (e.g., what happens when `_to_markdown` raises)
- `X-NiceGUI-Content: page` header

  The custom header in `markdown_response.py:25` is undocumented. If it's intended for programmatic detection, a brief comment explaining its purpose would help.
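For the Accept header item above, a small custom parser could look roughly like the following sketch. The function names `quality` and `prefers_markdown` are hypothetical. Serving markdown only when `text/markdown` is strictly preferred over `text/html` is an assumption made here so that browsers and `*/*` clients keep getting HTML; it is not what the PR implements.

```python
def quality(media_type: str, accept_header: str) -> float:
    """Return the q-value the header assigns to media_type (0.0 if unmatched)."""
    wanted_type, wanted_subtype = media_type.lower().split('/')
    best_specificity, best_q = -1, 0.0
    for item in accept_header.split(','):
        media_range, *params = [part.strip() for part in item.split(';')]
        if '/' not in media_range:
            continue  # skip empty or malformed entries
        range_type, range_subtype = media_range.lower().split('/', 1)
        if (range_type, range_subtype) == (wanted_type, wanted_subtype):
            specificity = 2  # exact match, e.g. text/markdown
        elif range_type == wanted_type and range_subtype == '*':
            specificity = 1  # subtype wildcard, e.g. text/*
        elif (range_type, range_subtype) == ('*', '*'):
            specificity = 0  # full wildcard
        else:
            continue
        q = 1.0  # default when no q parameter is given
        for param in params:
            name, _, value = param.partition('=')
            if name.strip().lower() == 'q':
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q-value as not acceptable
        # the most specific matching range decides the q-value
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def prefers_markdown(accept_header: str) -> bool:
    """True only if text/markdown is strictly preferred over text/html (assumed policy)."""
    markdown_q = quality('text/markdown', accept_header)
    return markdown_q > 0 and markdown_q > quality('text/html', accept_header)


browser = 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
print(prefers_markdown('text/markdown'))
print(prefers_markdown(browser))
```

With this policy, a bare `Accept: text/markdown` selects markdown, while a typical browser header or a plain `*/*` falls back to HTML.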
…Menu, tests
- Template method: _to_markdown() guards visibility/skip, delegates to _render_markdown()
- ChatMessage: store original text before HTML escaping (lossy roundtrip fix)
- Dialog/Menu: render children only when open, not unconditional MARKDOWN_SKIP
- Notification: render message text instead of skipping
- Fullscreen: add MARKDOWN_SKIP = True
- Button: [Button: label] syntax, icon-only fallback
- Expansion: bold instead of h3 to avoid heading hierarchy issues
- Containers (Drawer/Footer/ScrollArea/Carousel/Splitter): recurse children only
- Non-text value widgets (Slider/Knob/Rating/Pagination/Progress): MARKDOWN_SKIP
- ChoiceElement: resolve selected value to display label
- Condense simple overrides to one-liners
- Accept header: add comment noting simplistic parsing
- X-NiceGUI-Content header: add purpose comment
- 15 new tests covering gaps (select, radio, badge, chip, html, dialog, menu, notification)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
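To illustrate the lossy roundtrip that the ChatMessage change addresses, here is a small standalone sketch. The helper names `to_html` and `naive_reverse` are hypothetical, and the reversal order (unescape first, then replace `<br />`) is an assumption; the code in the PR may have differed.

```python
import html


def to_html(text: str) -> str:
    """Escape the text and turn newlines into <br />, as described in the review."""
    return html.escape(text).replace('\n', '<br />')


def naive_reverse(processed: str) -> str:
    """Assumed reversal order: unescape first, then map <br /> back to newlines."""
    return html.unescape(processed).replace('<br />', '\n')


original = 'Use <br /> for a line break.\nSecond line'
processed = to_html(original)
recovered = naive_reverse(processed)

print(repr(processed))
print(repr(recovered))
print(recovered == original)
```

In this sketch the literal `<br />` typed by the user comes back as a newline, so the recovered text differs from the original. Reversing in the opposite order would recover this particular input, but keeping the unprocessed text avoids relying on such details and also covers the `text_html=True` case.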
@falkoschindler Your 12 pieces of feedback were handled in evnchn#130, and I looked over the changes before merging them into this PR's branch.
Motivation
As mentioned in #5833 (comment), there is now the `Accept: text/markdown` header, which I believe Claude Code actively uses (when it visits https://code.claude.com/docs/en/setup#alpine-linux-and-musl-based-distributions, it correctly gets the Markdown version of the site).
However, NiceGUI websites currently show up as a monolithic, incomprehensible JSON blob to AI agents, as evidenced by #5833 (comment).
I would like to implement a NiceGUI-level API such that all NiceGUI websites are automatically agent-friendly (short of interactivity, but that's expected).
Implementation
Allow LLMs, CLI tools, and agents to request a markdown representation of any NiceGUI page by sending `Accept: text/markdown` in the HTTP request. Browsers continue to receive HTML as usual.
The implementation adds a `_to_markdown()` method to the base `Element` class with smart duck-typing dispatch (content → text → label+value → recurse children), minimizing per-element overrides. Non-visual elements opt out via a `MARKDOWN_SKIP = True` class attribute. Clients created for markdown responses are cleaned up immediately via deferred deletion since they will never receive a WebSocket connection.
End-to-end results
(note: cannot use the typical Fetch tool, since that only works for publicly accessible URLs)
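For a locally running app, the markdown representation can be requested with any HTTP client that sets the header. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and the URL `http://localhost:8080/` assumes an app running on NiceGUI's default port.

```python
import urllib.request


def build_markdown_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks for the markdown representation."""
    return urllib.request.Request(url, headers={'Accept': 'text/markdown'})


def fetch_markdown(url: str, timeout: float = 5.0) -> str:
    """Send the request and decode the response body (needs a running server)."""
    with urllib.request.urlopen(build_markdown_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or 'utf-8'
        return response.read().decode(charset)


# Example, assuming an app is running locally:
# print(fetch_markdown('http://localhost:8080/'))
request = build_markdown_request('http://localhost:8080/')
print(request.get_header('Accept'))
```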
Progress