
I want to access a custom attribute that I added to some elements in an HTML file. Here's an example of the littleBox="someValue" attribute:

<div id="someId" littleBox="someValue">inner text</div>

The following doesn't work:

foreach ($html->find('div') as $element) {
    echo $element;
    if (isset($element->type)) {
        echo $element->littleBox;
    }
}

I saw an article with a similar problem, but I couldn't replicate it for some reason. Here is what I tried:

function retrieveValue($str){
    if (stripos($str, 'littleBox')){ // check if element has it
        $var = preg_split("/littleBox=\"/", $str);
        //echo $var[1];
        $var1 = preg_split("/\"/", $var[1]);
        echo $var1[0];
    }
    else
        return false;
}

Whenever I call the retrieveValue() function, nothing happens. Is $element (in the first PHP example above) not a string? I don't know if I missed something, but it's not returning anything.

Here's the script in its entirety:

<?php
require("../../simplehtmldom/simple_html_dom.php");

if (isset($_POST['submit'])){

$html = file_get_html($_POST['webURL']);

// Find all divs
foreach($html->find('div') as $element){
    echo $element;
    if(isset($element->type)!= false){
        echo retrieveValue($element);
    }
}
}


function retrieveValue($str){
    if (stripos($str, 'littleBox')){ // check if element has it
        $var = preg_split("/littleBox=\"/", $str);
        //echo $var[1];
        $var1 = preg_split("/\"/", $var[1]);
        return $var1[0];
    }
    else
        return false;
}

?>

<form method="post">
Website URL<input type="text" name="webURL">
<br />
<input type="submit" name="submit">
</form>

 Answers

1

Have you tried:

$html->getElementById("someId")->getAttribute('littleBox');

You could also use SimpleXML:

$html = '<div id="someId" littleBox="someValue">inner text</div>';
$dom = new DOMDocument;
$dom->loadXML($html);
$div = simplexml_import_dom($dom);
echo $div->attributes()->littleBox;
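
As a rough sketch of the same idea for HTML that is not well-formed XML, the built-in DOMDocument HTML parser can be used instead of loadXML. The helper name attributeValues() is hypothetical. The comparison is case-insensitive on purpose, because libxml's HTML parser usually lowercases attribute names (littleBox tends to arrive as littlebox).

```php
<?php
// Hypothetical helper: collect the values of one attribute from every <$tag> element.
function attributeValues(string $html, string $tag, string $attribute): array
{
    // Keep parser warnings about imperfect markup out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $values = [];
    foreach ($doc->getElementsByTagName($tag) as $node) {
        foreach ($node->attributes as $attr) {
            // Compare names case-insensitively, since the parser may lowercase them.
            if (strcasecmp($attr->nodeName, $attribute) === 0) {
                $values[] = $attr->nodeValue;
            }
        }
    }
    return $values;
}

$html = '<div id="someId" littleBox="someValue">inner text</div><div>no attribute</div>';
print_r(attributeValues($html, 'div', 'littleBox'));
```

Elements without the attribute are simply skipped, so the result only contains values that are actually present.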

I would advise against using regex to parse HTML, but shouldn't this part be like this:

$str = $html->getElementById("someId")->outertext;
$var = preg_split('/littleBox="/', $str);
$var1 = preg_split('/"/',$var[1]);
echo $var1[0];
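
If a regex really is wanted, a single preg_match with a capture group is one possible alternative to the two preg_split calls. This is only a sketch: the function name extractAttribute() is made up here, and it assumes the value is wrapped in double quotes. Note also that stripos() returns 0 for a match at the very start of the string, which is falsy, so such checks are usually written as stripos(...) !== false.

```php
<?php
// Hypothetical helper: return the double-quoted value of $name, or null if absent.
function extractAttribute(string $str, string $name): ?string
{
    // preg_quote() escapes any regex metacharacters in the attribute name.
    $pattern = '/\b' . preg_quote($name, '/') . '\s*=\s*"([^"]*)"/i';
    if (preg_match($pattern, $str, $matches) === 1) {
        return $matches[1]; // first capture group: the text between the quotes
    }
    return null;
}

$str = '<div id="someId" littleBox="someValue">inner text</div>';
var_dump(extractAttribute($str, 'littleBox'));
```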

Also see this answer https://stackoverflow.com/a/8851091/1059001

Thursday, April 1, 2021
 
Maury
 
3

You're not creating the DOM correctly, you must do it like this:

// Create a DOM object
$dom = new simple_html_dom();
// Load HTML from a string
$dom->load(curl_exec($ch));

print_r( $dom );

Check the Manual for more details...

Edit

It seems to be a cURL settings problem; please refer to the documentation to configure it correctly.

This is a function I usually use to download pages; feel free to adjust it to your needs:

function dlPage($href) {

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
    curl_setopt($curl, CURLOPT_HEADER, false);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl, CURLOPT_URL, $href);
    curl_setopt($curl, CURLOPT_REFERER, $href);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4");
    $str = curl_exec($curl);
    curl_close($curl);

    // Create a DOM object
    $dom = new simple_html_dom();
    // Load HTML from a string
    $dom->load($str);

    return $dom;
}

$url = 'http://www.example.com/';
$data = dlPage($url);
print_r($data);
Thursday, April 1, 2021
 
Xavio
 
4

You can use DOMDocument to get at the attributes:

$html = '<li data-docid="thisisthevaluetoget" class="search-results-item"></li>';
$doc = new DOMDocument;
$doc->loadHTML($html);
$nodes = $doc->getElementsByTagName('li');
foreach ($nodes as $node) {
    if ($node->hasAttributes()) {
        foreach ($node->attributes as $a) {
            echo $a->nodeName.': '.$a->nodeValue.'<br/>';
        }
    }
}
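
When only one known attribute is needed, an XPath query is a possible shortcut to looping over every attribute. A minimal sketch, assuming the same markup as above; the helper name valuesByXPath() is hypothetical.

```php
<?php
// Hypothetical helper: run an XPath query and return the matched nodes' values.
function valuesByXPath(string $html, string $query): array
{
    $previous = libxml_use_internal_errors(true); // silence markup warnings
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $values = [];
    foreach ($xpath->query($query) as $node) {
        $values[] = $node->nodeValue;
    }
    return $values;
}

$html = '<ul><li data-docid="thisisthevaluetoget" class="search-results-item"></li>'
      . '<li class="other"></li></ul>';
// "//li/@data-docid" selects the data-docid attribute node of every <li> that has one.
print_r(valuesByXPath($html, '//li/@data-docid'));
```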
Thursday, April 1, 2021
 
ajreal
 
4

Isn't it easy? Try things first, then ask. (:

<?php
include 'simple_html_dom.php';
$html = file_get_html('http://www.weather.gov.sg/lws/zoneInfo.do');

$n = 0;
$table = $html->find('table',3)->find('table',0)->find('table',0)->find('table',0)->find('table',3)->find('table',0);

$i = -3;
$rows = $table->find('tr');
$holder = array();

foreach($rows as $element){
    $i++;
    if($i < 0) continue;

    $holder[$i]['name'] = $element->find('td',0)->plaintext;
    $holder[$i]['zone_or_school'] = $element->find('td',1)->plaintext;
    $holder[$i]['risk'] = $element->find('td',2)->plaintext;
    $holder[$i]['from'] = $element->find('td',3)->plaintext;
    $holder[$i]['till'] = $element->find('td',4)->plaintext;
}

var_dump($holder);
?>

If you want a particular entry, you can filter for it:

foreach ($holder as $key => $val) {
    if ($holder[$key]['name'] == 'Bedoc') {
        $my_data = $holder[$key];
    }
}
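
The same filtering can also be sketched with array_filter. The helper rowsByName() and the sample rows are made up for illustration; only the 'name' key and the 'Bedoc' value come from the code above. Unlike the loop, which keeps the last match, this returns every matching row, and it trims the name first because scraped text often carries surrounding whitespace.

```php
<?php
// Hypothetical helper: return every row whose trimmed 'name' field equals $name.
function rowsByName(array $rows, string $name): array
{
    $matches = array_filter($rows, function (array $row) use ($name): bool {
        return isset($row['name']) && trim($row['name']) === $name;
    });
    return array_values($matches); // reindex the result from 0
}

// Made-up sample rows in the same shape as $holder.
$holder = [
    ['name' => ' Bedoc ', 'risk' => 'low'],
    ['name' => 'Elsewhere', 'risk' => 'high'],
];
print_r(rowsByName($holder, 'Bedoc'));
```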

This code isn't debugged because I am on mobile right now, but hopefully you get the idea even if it doesn't work as is. Thanks.

Saturday, May 29, 2021
 
Claudio
 
2
$dom = new DOMDocument();
$dom->loadHTML('<a href="http://foo.bar/">Click here</a>');

foreach ($dom->getElementsByTagName('a') as $item) {

    $item->setAttribute('href', 'http://google.com/');
    echo $dom->saveHTML();
    exit;
}
Friday, June 18, 2021
 
anjan
 