Description
I noticed something peculiar about how Dom\HTMLDocument handles HTML closing tags within script tags. My expectation was that it wouldn't do anything at all and would just treat anything between <script></script> tags as a string, but it's modifying the closing tags of heading elements.
Here's a small sample PHP script:
<?php
/**
* Compare old DOMDocument vs new Dom\HTMLDocument for script content handling.
*/
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<head></head>
<body>
<h3>body heading</h3>
<script type="text/html" id="tmpl-test">
<h3>template heading</h3>
</script>
<script type="text/javascript">
var a = "<h1>asdf</h1>";
var b = `<h1>asdf</h1>`;
<h1>asdf</h1>
</script>
</body>
</html>
HTML;
echo "\n\n=== Old DOMDocument ===\n";
$oldDom = new DOMDocument();
@$oldDom->loadHTML( $html );
echo $oldDom->saveHTML();
echo "\n\n=== New Dom\\HTMLDocument ===\n";
$newDom = \Dom\HTMLDocument::createFromString( $html );
echo $newDom->saveHTML();
It outputs the following. In the Dom\HTMLDocument output, the closing heading tags inside the script elements have lost their "h" and appear as </1> and </3>:
=== Old DOMDocument ===
<!DOCTYPE html>
<html>
<head></head>
<body>
<h3>body heading</h3>
<script type="text/html" id="tmpl-test">
<h3>template heading</h3>
</script>
<script type="text/javascript">
var a = "<h1>asdf</h1>";
var b = `<h1>asdf</h1>`;
<h1>asdf</h1>
</script>
</body>
</html>
=== New Dom\HTMLDocument ===
<!DOCTYPE html><html><head></head>
<body>
<h3>body heading</h3>
<script type="text/html" id="tmpl-test">
<h3>template heading</3>
</script>
<script type="text/javascript">
var a = "<h1>asdf</1>";
var b = `<h1>asdf</1>`;
<h1>asdf</1>
</script>
</body></html>
It's strange because DOMDocument was known to have issues with closing tags within script tags, and apparently Dom\HTMLDocument was supposed to fix this. But in this case it's the opposite: DOMDocument leaves the script content untouched, while Dom\HTMLDocument alters it.
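To help narrow this down, here is a minimal sketch that checks whether the closing tag is already altered in the parsed tree or only when the document is serialized. It reuses the "tmpl-test" script from the sample above; the variable names are just for illustration, and I have not confirmed which of the two stages is responsible.

```php
<?php
// Sketch: compare the script text stored in the DOM tree with the
// text produced when that same element is serialized.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<head></head>
<body>
<script type="text/html" id="tmpl-test">
<h3>template heading</h3>
</script>
</body>
</html>
HTML;

$dom = \Dom\HTMLDocument::createFromString( $html );
$script = $dom->querySelector( '#tmpl-test' );

// Text content as held in the tree after parsing.
$parsed = $script->textContent;

// Markup emitted when only the script element is serialized.
$serialized = $dom->saveHtml( $script );

echo "parsed tree keeps </h3>:    ";
var_dump( str_contains( $parsed, '</h3>' ) );
echo "serialized output keeps </h3>: ";
var_dump( str_contains( $serialized, '</h3>' ) );
```

If the first check is true and the second is false, the problem would be on the serialization side; if both are false, the text was already changed during parsing.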
PHP Version
PHP 8.4.17 (cli) (built: Jan 16 2026 02:36:09) (ZTS gcc 10.2.1 x86_64)
Copyright (c) The PHP Group
Built by Static PHP <https://static-php.dev> #StandWithUkraine
Zend Engine v4.4.17, Copyright (c) Zend Technologies
with Zend OPcache v8.4.17, Copyright (c), by Zend Technologies
Operating System
Ubuntu 24.04