# DOM (built-in, enabled by default)

The DOM extension lets you operate on XML documents through the DOM API with PHP.

This extension requires the [libxml](https://www.php.net/manual/en/book.libxml.php) PHP extension.

This extension is enabled by default.

> **Note**:
> The DOM extension uses UTF-8 encoding. Use [utf8\_encode()](https://www.php.net/manual/en/function.utf8-encode.php) and [utf8\_decode()](https://www.php.net/manual/en/function.utf8-decode.php) to work with ISO-8859-1 encoded text (both functions are deprecated as of PHP 8.2), or [iconv](https://www.php.net/manual/en/ref.iconv.php) for other encodings.

Be careful when using this with partial HTML. The loader only accepts a complete HTML document containing at least an HTML element and a BODY element. If you are working with an HTML fragment, wrap it in the missing elements yourself, and do not specify the character encoding in a META element, the input is treated as ISO-8859-1 and UTF-8 strings come out garbled. Example:

```php
<?php
// getHtmlBody() stands in for whatever returns your HTML fragment.
$body = getHtmlBody();
$doc = new DOMDocument();
$doc->loadHTML("<html><body>" . $body . "</body></html>");
// $doc parses the HTML as ISO-8859-1.
// That is correct behaviour, but probably not what you want if your source is UTF-8.
?>

<?php
$body = getHtmlBody();
$doc = new DOMDocument();
$doc->loadHTML("<html><head><meta charset=\"UTF-8\"><meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\"></head><body>" . $body . "</body></html>");
// $doc parses the HTML as UTF-8.
?>
```
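To see the effect with a concrete value, the following minimal sketch parses the same UTF-8 fragment with and without a charset declaration and compares the text that comes back. The fragment and the helper name `bodyText()` are illustrative assumptions, not part of the extension. What the undeclared case returns depends on the libxml version PHP is linked against, so the script prints the comparison instead of assuming a result.

```php
<?php
// Hypothetical UTF-8 fragment standing in for getHtmlBody(): "café".
$body = "<p>caf\u{00E9}</p>";

// Hypothetical helper: wrap a fragment in a full document, parse it,
// and return the text content of the BODY element.
function bodyText(string $head, string $fragment): string
{
    $doc = new DOMDocument();
    $doc->loadHTML(
        "<html><head>" . $head . "</head><body>" . $fragment . "</body></html>"
    );
    return $doc->getElementsByTagName('body')->item(0)->textContent;
}

// No charset declared: the parser has to guess the encoding.
$withoutCharset = bodyText('', $body);

// Charset declared in a META element: the parser reads the input as UTF-8.
$withCharset = bodyText(
    '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">',
    $body
);

// Compare each result against the original text.
var_dump($withCharset === "caf\u{00E9}");
var_dump($withoutCharset === "caf\u{00E9}");
```

If the second comparison is false on your installation, the undeclared document was decoded as a single-byte encoding, which is the garbling described above.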