Closed
Labels
Bug
Description
Describe the bug
When parsing invalid HTML with an extra closing tag and no matching opening tag (for example, </a>), the open-tag stack context gets corrupted.
Reproduction Steps
Run:
import net.html

fn main() {
	content := '<!doctype html>
<html>
<body>
<div>
<a href="#">x</a></a>
</div>
<article class="news-post">hello</article>
</body>
</html>'
	mut doc := html.parse(content)
	// this works because it uses the global index, not the local stack that gets corrupted
	by_attr := doc.get_tags_by_attribute_value('class', 'news-post')
	println('by_attr: ${by_attr.len}')
	// this does not work
	by_class := doc.get_tags_by_class_name('news-post')
	println('by_class: ${by_class.len}')
}
Expected Behavior
In my opinion, invalid unmatched closing tags should be safely ignored.
Current Behavior
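Invalid unmatched closing tags corrupt the parser's open-tag stack. In the reproduction above, get_tags_by_class_name('news-post') fails to find the <article> tag that follows the stray </a>, while get_tags_by_attribute_value still finds it.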
Possible Solution
Codex suggestion:
In https://github.com/vlang/v/blob/master/vlib/net/html/dom.v#L125, preserve the stack size before the pop loop; if no matching opener is found, restore it and continue (ignore the unmatched closing tag).
  if is_close_tag(tag) {
      temp_int = stack.peek()
      temp_string = tag.name[1..]
+     old_stack_size := stack.size
      for !is_null(temp_int) && temp_string != tag_list[temp_int].name
          && !tag_list[temp_int].closed {
          dom.print_debug(temp_string + ' >> ' + tag_list[temp_int].name + ' ' +
              (temp_string == tag_list[temp_int].name).str())
          stack.pop()
          temp_int = stack.peek()
      }
+
+     if is_null(temp_int) || temp_string != tag_list[temp_int].name {
+         stack.size = old_stack_size
+         continue
+     }
      temp_int = stack.peek()
      temp_int = if !is_null(temp_int) { stack.pop() } else { root_index }
      if is_null(temp_int) {
          stack.push(root_index)
      }
      dom.print_debug('Removed ' + temp_string + ' -- ' + tag_list[temp_int].name)
  }
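The save-and-restore idea in the suggestion above can be tried in isolation. The following is a minimal sketch, not the vlib/net/html implementation: `TagStack`, `push`, and `close` are hypothetical names, and the stack holds tag names instead of indexes into `tag_list`. It assumes, as the suggested patch does, that popping only decrements a size counter, so restoring the saved size brings back the popped entries.

```v
// Hypothetical, simplified model of an open-tag stack.
// It is NOT the Stack type from vlib/net/html/dom.v.
struct TagStack {
mut:
	names []string
	size  int
}

// push stores a tag name, reusing a slot left behind by an earlier pop.
fn (mut s TagStack) push(name string) {
	if s.size < s.names.len {
		s.names[s.size] = name
	} else {
		s.names << name
	}
	s.size++
}

// close pops entries down to and including the matching opener.
// If no opener matches, it restores the saved size and returns false,
// so the unmatched closing tag is ignored.
fn (mut s TagStack) close(name string) bool {
	old_size := s.size
	for s.size > 0 && s.names[s.size - 1] != name {
		s.size--
	}
	if s.size == 0 {
		s.size = old_size
		return false
	}
	s.size-- // drop the matching opener
	return true
}

fn main() {
	mut s := TagStack{}
	for name in ['html', 'body', 'div', 'a'] {
		s.push(name)
	}
	first := s.close('a') // matches the opener on top of the stack
	second := s.close('a') // stray closing tag with no opener left
	println('first: ${first}, second: ${second}')
	println('still open: ${s.names[..s.size]}')
}
```

In this model the second `close('a')` leaves `html`, `body`, and `div` open, which is the state the parser needs in order to attach the following `<article>` to `<body>` correctly.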
Additional Information/Context
No response
V version
V 0.5.0
Environment details (OS name and version, etc.)
| V full version | V 0.5.0 0d00c76.5e0489f |
|---|---|
| OS | linux, Linux version 6.6.87.2-microsoft-standard-WSL2 (root@439a258ad544) (gcc (GCC) 11.2.0, GNU ld (GNU Binutils) 2.37) #1 SMP PREEMPT_DYNAMIC Thu Jun 5 18:30:46 UTC 2025 (WSL 2) |
| Processor | 6 cpus, 64bit, little endian, AMD Ryzen 5 3500 6-Core Processor |
| Memory | 5.34GB/7.72GB |
| V executable | /root/v/v |
| V last modified time | 2026-02-17 00:36:34 |
| V home dir | OK, value: /root/v |
| VMODULES | OK, value: /root/.vmodules |
| VTMP | OK, value: /tmp/v_0 |
| Current working dir | OK, value: /root/workspace/testando |
| Git version | git version 2.53.0 |
| V git status | weekly.2026.07-76-g5e0489fe-dirty |
| .git/config present | true |
| cc version | cc (GCC) 15.2.1 20260209 |
| gcc version | gcc (GCC) 15.2.1 20260209 |
| clang version | clang version 21.1.8 |
| tcc version | tcc version 0.9.28rc 2025-02-13 HEAD@f8bd136d (x86_64 Linux) |
| tcc git status | thirdparty-linux-amd64 696c1d84 |
| emcc version | N/A |
| glibc version | ldd (GNU libc) 2.43 |
Note
You can use the 👍 reaction to increase the issue's priority for developers.
Please note that only the 👍 reaction to the issue itself counts as a vote.
Other reactions and those to comments will not be taken into account.