Fix HTML entity encoding/decoding in markdown conversion #7565
Open
bdbch wants to merge 12 commits into main from claude/fix-tiptap-issue-7539-NyhyF
+300
−4
3613fef fix: handle HTML character escaping in MarkdownManager for proper rou… (claude)
fa4bc05 refactor: move HTML entity utilities into @tiptap/core to avoid dupli… (claude)
e22fcbf chore: add changeset for HTML entity escaping fix (claude)
111fc6e fix: make encodeHtmlEntities symmetric by encoding " (claude)
e635ec9 chore: update changeset to mention " roundtrip fix (claude)
4c7430d fix: skip HTML entity encoding for text nodes with code marks (claude)
15b753c test: add &quot; entity tests for markdown conversion (claude)
f73da0f fix: don't encode " to &quot; in markdown output (claude)
b39e6c7 fix: use spec.code instead of hardcoded type names for code detection (claude)
eb505a5 fix: address review feedback on changeset wording and &nbsp; edge… (claude)
27583ba refactor: remove dead decode call from Text extension and clean up im… (claude)
3774f9e refactor: extract encodeTextForMarkdown helper and fix JSDoc wording (claude)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| --- | ||
| '@tiptap/core': patch | ||
| '@tiptap/markdown': patch | ||
| --- | ||
|
|
||
| Fix HTML character escaping in markdown roundtrip. HTML entities (`&lt;`, `&gt;`, `&amp;`, `&quot;`) are now decoded to literal characters when parsing markdown into the editor. `<`, `>`, and `&` are re-encoded when serializing back to markdown, while `"` is preserved as a literal character since double quotes are ordinary in markdown. Code detection for skipping encoding now uses the `code: true` extension spec instead of hardcoded type names. Literal characters inside code blocks and inline code are always preserved. |
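The code-skipping rule described in the changeset can be sketched roughly as follows. This is an illustration only, not the PR's `encodeTextForMarkdown` helper, whose body is not shown on this page. `MarkLike`, `TextNodeLike`, and `encodeTextUnlessCode` are hypothetical names, and the mark shape is a simplified stand-in that assumes only that a mark type exposes `spec.code`.

```typescript
// Simplified stand-ins for editor mark/node shapes (assumed, for illustration only).
interface MarkLike {
  type: { name: string; spec: { code?: boolean } }
}

interface TextNodeLike {
  text: string
  marks: MarkLike[]
}

// Same replacements as the changeset describes: `&` first, `"` left alone.
function encodeHtmlEntities(text: string): string {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
}

// Hypothetical helper: encode text unless any mark declares `code: true`
// in its spec, instead of comparing against hardcoded mark names.
function encodeTextUnlessCode(node: TextNodeLike): string {
  const isCode = node.marks.some(mark => mark.type.spec.code === true)
  return isCode ? node.text : encodeHtmlEntities(node.text)
}

const plain: TextNodeLike = { text: 'x < y & "z"', marks: [] }
const inlineCode: TextNodeLike = {
  text: 'x < y & "z"',
  marks: [{ type: { name: 'code', spec: { code: true } } }],
}

console.log(encodeTextUnlessCode(plain)) // x &lt; y &amp; "z"
console.log(encodeTextUnlessCode(inlineCode)) // x < y & "z"
```

Checking the spec flag rather than the mark name means a custom code-like mark would be skipped too, provided it sets `code: true`.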
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| import { describe, expect, it } from 'vitest' | ||
|
|
||
| import { decodeHtmlEntities, encodeHtmlEntities } from '../utilities/htmlEntities.js' | ||
|
|
||
| describe('decodeHtmlEntities', () => { | ||
| it('decodes &lt; to <', () => { | ||
| expect(decodeHtmlEntities('&lt;div&gt;')).toBe('<div>') | ||
| }) | ||
|
|
||
| it('decodes &amp; to &', () => { | ||
| expect(decodeHtmlEntities('a &amp; b')).toBe('a & b') | ||
| }) | ||
|
|
||
| it('decodes &quot; to "', () => { | ||
| expect(decodeHtmlEntities('&quot;hello&quot;')).toBe('"hello"') | ||
| }) | ||
|
|
||
| it('handles doubly-encoded sequences like &amp;lt;', () => { | ||
| expect(decodeHtmlEntities('&amp;lt;')).toBe('&lt;') | ||
| }) | ||
|
|
||
| it('returns plain text unchanged', () => { | ||
| expect(decodeHtmlEntities('hello world')).toBe('hello world') | ||
| }) | ||
| }) | ||
|
|
||
| describe('encodeHtmlEntities', () => { | ||
| it('encodes < to &lt;', () => { | ||
| expect(encodeHtmlEntities('<div>')).toBe('&lt;div&gt;') | ||
| }) | ||
|
|
||
| it('encodes & to &amp;', () => { | ||
| expect(encodeHtmlEntities('a & b')).toBe('a &amp; b') | ||
| }) | ||
|
|
||
| it('does not encode " (quotes are valid in markdown)', () => { | ||
| expect(encodeHtmlEntities('"hello"')).toBe('"hello"') | ||
| }) | ||
|
|
||
| it('returns plain text unchanged', () => { | ||
| expect(encodeHtmlEntities('hello world')).toBe('hello world') | ||
| }) | ||
| }) | ||
|
|
||
| describe('roundtrip', () => { | ||
| it.each(['<div>', 'a & b', 'x < y & y > z'])('encode then decode roundtrips: %s', input => { | ||
| expect(decodeHtmlEntities(encodeHtmlEntities(input))).toBe(input) | ||
| }) | ||
|
|
||
| it('decode is a superset of encode – &quot; decodes but " is not encoded', () => { | ||
| // " passes through encode unchanged, &quot; decodes to " | ||
| expect(encodeHtmlEntities('"hello"')).toBe('"hello"') | ||
| expect(decodeHtmlEntities('&quot;hello&quot;')).toBe('"hello"') | ||
| }) | ||
| }) |
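The roundtrip tests above cover three fixed inputs. As a rough sanity check, the same property can be tried exhaustively over short strings. The sketch below copies the two utilities from this PR and adds a hypothetical `allStrings` helper; the alphabet and length limit are arbitrary choices, so this shows the property for these inputs only, not in general.

```typescript
function decodeHtmlEntities(text: string): string {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&')
}

function encodeHtmlEntities(text: string): string {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
}

// Hypothetical helper: every string of length 0..maxLength over `alphabet`.
function allStrings(alphabet: string[], maxLength: number): string[] {
  const result: string[] = ['']
  let current: string[] = ['']
  for (let length = 1; length <= maxLength; length += 1) {
    const next: string[] = []
    for (const prefix of current) {
      for (const ch of alphabet) {
        next.push(prefix + ch)
      }
    }
    for (const value of next) {
      result.push(value)
    }
    current = next
  }
  return result
}

// Includes the special characters plus the letters needed to spell `&lt;`.
const alphabet = ['<', '>', '&', '"', ';', 'l', 't', 'a']
const failures = allStrings(alphabet, 4).filter(
  input => decodeHtmlEntities(encodeHtmlEntities(input)) !== input,
)
console.log(failures.length) // 0

// The reverse direction is not an identity: &quot; decodes but is never re-encoded.
const reencoded = encodeHtmlEntities(decodeHtmlEntities('&quot;hello&quot;'))
console.log(reencoded) // "hello"
```

The last two lines mirror the "decode is a superset of encode" test: text that arrives as `&quot;` is serialized back as a literal `"`.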
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| /** | ||
| * Decode common HTML entities in text content so they display as literal | ||
| * characters inside the editor. The decode order matters: `&amp;` must be | ||
| * decoded **last** so that doubly-encoded sequences like `&amp;lt;` first | ||
| * survive the `&lt;` pass and then correctly become `&lt;` (not `<`). | ||
| */ | ||
| export function decodeHtmlEntities(text: string): string { | ||
| return text | ||
| .replace(/&lt;/g, '<') | ||
| .replace(/&gt;/g, '>') | ||
| .replace(/&quot;/g, '"') | ||
| .replace(/&amp;/g, '&') | ||
| } | ||
|
|
||
| /** | ||
| * Encode HTML special characters so they roundtrip safely through markdown. | ||
| * `&` is encoded **first** to avoid double-encoding the ampersand in other | ||
| * entities (e.g. `<` → `&lt;`, not `&amp;lt;`). | ||
| * | ||
| * Note: `"` is intentionally NOT encoded here because double quotes are | ||
| * ordinary characters in markdown and do not need escaping. The decode | ||
| * function still handles `&quot;` because the markdown tokenizer may emit it. | ||
| */ | ||
| export function encodeHtmlEntities(text: string): string { | ||
| return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;') | ||
| } | ||
bdbch marked this conversation as resolved.
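The ordering notes in the two JSDoc comments can be made concrete by comparing each function with a deliberately mis-ordered variant. The `...WrongOrder` functions below are hypothetical counterexamples written for this illustration and are not part of the PR.

```typescript
// Order used in the PR: `&amp;` is decoded last.
function decodeHtmlEntities(text: string): string {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&')
}

// Hypothetical counterexample: decoding `&amp;` first exposes a fresh `&lt;`
// that the later pass then decodes a second time.
function decodeWrongOrder(text: string): string {
  return text
    .replace(/&amp;/g, '&')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
}

// Order used in the PR: `&` is encoded first.
function encodeHtmlEntities(text: string): string {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
}

// Hypothetical counterexample: encoding `&` last re-encodes the ampersands
// that the `<` and `>` passes just produced.
function encodeWrongOrder(text: string): string {
  return text.replace(/</g, '&lt;').replace(/>/g, '&gt;').replace(/&/g, '&amp;')
}

console.log(decodeHtmlEntities('&amp;lt;')) // &lt;
console.log(decodeWrongOrder('&amp;lt;')) // <
console.log(encodeHtmlEntities('<')) // &lt;
console.log(encodeWrongOrder('<')) // &amp;lt;
```

In both wrong-order variants a single pass ends up being applied twice to the same character, which is the double-decoding and double-encoding the comments warn about.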