php - Convert spaces between PRE tags, via DOM parser -




Regex was my original thought for a solution, although it soon became apparent that a DOM parser would be more appropriate. I'd like to convert spaces to &nbsp; between PRE tags within a string of HTML text. For example:

    <table atrr="zxzx"><tr>
    <td>adfa adfadfaf></td><td><br /> dfa dfa</td>
    </tr></table>
    <pre class="abc" id="abc">
    abc 123
    <span class="abc">abc 123</span>
    </pre>
    <pre>123 123</pre>

into (note that the space in the span tag attribute is preserved):

    <table atrr="zxzx"><tr>
    <td>adfa adfadfaf></td><td><br /> dfa dfa</td>
    </tr></table>
    <pre class="abc" id="abc">
    abc&nbsp;123
    <span class="abc">abc&nbsp;123</span>
    </pre>
    <pre>123&nbsp;123</pre>

The result needs to be serialised back into string format, for use elsewhere.

This is somewhat tricky when you want to insert &nbsp; entities without DOM converting the ampersand to &amp; entities, because entities are nodes and spaces are just character data. Here is how to do it:

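    $dom = new DOMDocument;
    $dom->loadHTML($html);
    $xp = new DOMXPath($dom);
    // Select every text node that sits anywhere inside a pre element.
    foreach ($xp->query('//text()[ancestor::pre]') as $textNode) {
        $remaining = $textNode;
        while (($nextSpace = strpos($remaining->wholeText, ' ')) !== false) {
            // Split at the space; $remaining now starts with that space.
            $remaining = $remaining->splitText($nextSpace);
            // Drop the space itself.
            $remaining->nodeValue = substr($remaining->nodeValue, 1);
            // Insert an nbsp entity reference where the space used to be.
            $remaining->parentNode->insertBefore(
                $dom->createEntityReference('nbsp'),
                $remaining
            );
        }
    }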
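To see why an entity reference node is used instead of plain text, here is a minimal sketch. The two-element input and the variable names are made up for illustration; it only compares how a text node and an entity reference node are written out.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<pre id="a">1</pre><pre id="b">2</pre>');
$pres = $dom->getElementsByTagName('pre');

// Character data: the ampersand is escaped when the node is serialised.
$first = $pres->item(0);
$first->appendChild($dom->createTextNode('&nbsp;'));
$asText = $dom->saveHTML($first);

// Entity reference node: written out as the entity itself.
$second = $pres->item(1);
$second->appendChild($dom->createEntityReference('nbsp'));
$asEntity = $dom->saveHTML($second);

echo $asText, "\n", $asEntity, "\n";
```

The first pre should come back containing &amp;nbsp;, the second containing &nbsp;.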

Fetching all the pre elements and working with their nodeValues doesn't work here, because the nodeValue attribute would contain the combined DOMText values of all the children, e.g. it would include the nodeValue of the span children. Setting the nodeValue on the pre element would delete those.

So instead of fetching the pre nodes, we fetch all the DOMText nodes that have a pre element as a parent somewhere up on their axis:
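A small sketch of that problem, using a made-up one-line input: reading nodeValue on the pre element merges the span's text into it, and assigning to it removes the span.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<pre>abc 123 <span class="abc">abc 123</span></pre>');
$pre = $dom->getElementsByTagName('pre')->item(0);

// nodeValue concatenates the text of all descendants, the span included.
$combined = $pre->nodeValue;

// Assigning to nodeValue replaces every child with a single text node.
$pre->nodeValue = 'replaced';
$spansLeft = $pre->getElementsByTagName('span')->length;

var_dump($combined, $spansLeft);
```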

    DOMElement pre
        DOMText "abc 123"         <-- picking this
        DOMElement span
            DOMText "abc 123"     <-- and this one
    DOMElement pre
        DOMText "123 123"         <-- and this one
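The selection can be checked with a short sketch. The input is a shortened version of the example from the question, and $selected is just an illustrative variable name; text outside any pre element should not show up in the list.

```php
<?php
$html = '<table><tr><td>dfa dfa</td></tr></table>'
      . '<pre class="abc" id="abc">abc 123 <span class="abc">abc 123</span></pre>'
      . '<pre>123 123</pre>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

// Collect the value of every text node that has a pre ancestor.
$selected = [];
foreach ($xp->query('//text()[ancestor::pre]') as $textNode) {
    $selected[] = $textNode->nodeValue;
}

print_r($selected);
```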

We then go through each of those DOMText nodes and split them into separate DOMText nodes at each space. We remove the space and insert an nbsp entity node before the split node, so in the end you get a tree like

    DOMElement pre
        DOMText "abc"
        DOMEntity nbsp
        DOMText "123"
        DOMElement span
            DOMText "abc"
            DOMEntity nbsp
            DOMText "123"
    DOMElement pre
        DOMText "123"
        DOMEntity nbsp
        DOMText "123"

Because we only worked with the DOMText nodes, any DOMElements were left untouched, so the span elements within the pre element are preserved.

Caveat:

Your snippet is not valid because it doesn't have a root element. When using loadHTML, libxml will add any missing structure to the DOM, which means you will get your snippet back including a DOCTYPE, html and body tag.

If you want the original snippet back, you'd have to getElementsByTagName the body node and fetch all the children to get the innerHTML. Unfortunately, there is no innerHTML function or property in PHP's DOM implementation, so we have to do that manually:
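Put together, the whole round trip might look like the sketch below. The function name convertPreSpaces and the shortened input are assumptions for illustration, and the loop reads nodeValue where the code above reads wholeText; otherwise it follows the same split, trim and insert steps.

```php
<?php
// Hypothetical helper; the name is not taken from the original answer.
function convertPreSpaces(DOMDocument $dom): void
{
    $xp = new DOMXPath($dom);
    foreach ($xp->query('//text()[ancestor::pre]') as $textNode) {
        $remaining = $textNode;
        while (($pos = strpos($remaining->nodeValue, ' ')) !== false) {
            // Cut the node at the space; $remaining now starts with that space.
            $remaining = $remaining->splitText($pos);
            // Drop the space itself.
            $remaining->nodeValue = substr($remaining->nodeValue, 1);
            // Put an nbsp entity reference where the space used to be.
            $remaining->parentNode->insertBefore(
                $dom->createEntityReference('nbsp'),
                $remaining
            );
        }
    }
}

$html = <<<'HTML'
<table><tr><td>dfa dfa</td></tr></table>
<pre class="abc" id="abc">
abc 123
<span class="abc">abc 123</span>
</pre>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
convertPreSpaces($dom);

// Serialise the whole document back into a string.
$result = $dom->saveHTML();
echo $result;
```

The span element and its attribute should survive unchanged, and the text in the table cell should keep its ordinary space.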
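A quick way to see the added structure is to load a bare fragment and serialise it straight back (a sketch with a made-up one-element input):

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<pre>123 123</pre>');

// The fragment comes back wrapped in a doctype, html and body.
$serialised = $dom->saveHTML();
echo $serialised;
```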

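    $innerHtml = '';
    // Serialise each child of body on its own and concatenate the results.
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $tmp_doc = new DOMDocument();
        // Import a deep copy of the child into a temporary document.
        $tmp_doc->appendChild($tmp_doc->importNode($child, true));
        $innerHtml .= $tmp_doc->saveHTML();
    }
    echo $innerHtml;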
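On PHP 5.3.6 and later, saveHTML() also accepts a node argument, so a shorter variant without the temporary document may be enough. This is a sketch of that alternative, not the method from the answer above; the input is made up, and the serialiser may add line breaks between block elements.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<p>one</p><pre>123 123</pre>');
$body = $dom->getElementsByTagName('body')->item(0);

$innerHtml = '';
foreach ($body->childNodes as $child) {
    // Passing a node makes saveHTML() serialise only that subtree.
    $innerHtml .= $dom->saveHTML($child);
}

echo $innerHtml;
```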

Also see:

innerHTML in PHP's DOMDocument?
Noob question about DOMDocument in PHP
http://stackoverflow.com/search?q=user%3a208809+dom

php html dom html-parsing
