PHP Screen Scraping

Status
Not open for further replies.

Sean

Don't Worry, Be Happy
Dec 12, 2011
1,121
405
I want to take a small piece of text from another site and echo it on my site :)

So, for example, say I want to take the number of posts from my account. I have this:

PHP:
<?php
$page = file_get_contents("http://devbest.com/members/seandavies.10142/");
echo $page;
 
?>

This displays the entire page on my site / localhost. What I want to do is narrow the scraping down to the contents of a single DIV or class in devbest's markup, so that, say, the element looks like this:
<div class="post_count">300</div>

I want to echo just that div from devbest, so it displays "300" :) Any help?
 

Sean

Don't Worry, Be Happy
Dec 12, 2011
1,121
405
Search up Ubercms auto config, there's a code there that could help
I would prefer someone just post a snippet that would do the job, or explain to me how to do it, rather than me searching Google and deciphering code for hours :)
 

TesoMayn

Boredom, it vexes me.
Oct 30, 2011
1,482
1,482
Untested since I no longer have a webhost...

PHP:
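// Download the raw HTML of the profile page.
$page = file_get_contents('http://devbest.com/members/seandavies.10142/');
// Parse it into a DOM tree.
$doc = new DOMDocument();
$doc->loadHTML($page);
// Walk every <div> and print the text of the one whose id is "content".
$divs = $doc->getElementsByTagName('div');
foreach ($divs as $div) {
    if ($div->getAttribute('id') === 'content') {
        echo $div->nodeValue;
    }
}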
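
The loop above matches on an id, while the original question asks about a class. A minimal sketch of matching by class with DOMXPath follows. The function name extractDivTextByClass and the sample markup are illustrative assumptions, not taken from the real devbest page, and the class name is assumed to contain no quote characters.

```php
<?php
// Sketch: return the trimmed text of the first <div> carrying a given class.
// extractDivTextByClass() is a hypothetical helper, not a built-in function.
function extractDivTextByClass(string $html, string $class): ?string
{
    $doc = new DOMDocument();
    // Real-world pages may trigger libxml warnings here; they do not stop parsing.
    $doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    // Pad the class attribute with spaces so "post_count" is matched as a
    // whole class name and does not also match e.g. "post_count_big".
    $query = sprintf(
        "//div[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );
    $nodes = $xpath->query($query);

    if ($nodes === false || $nodes->length === 0) {
        return null; // no matching <div> found
    }
    return trim($nodes->item(0)->textContent);
}

// Sample markup modelled on the <div> from the question (an assumption).
$sample = '<html><body><div class="post_count">300</div></body></html>';
echo extractDivTextByClass($sample, 'post_count'); // prints 300
```

To try it against a live page, the string returned by file_get_contents() could be passed in place of $sample.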
 

Sean

Don't Worry, Be Happy
Dec 12, 2011
1,121
405
Loads of errors are appearing from line 7, "$doc->loadHTML($page);"

Some of the errors are:

Warning: DOMDocument::loadHTML(): Unexpected end tag : head in Entity, line: 68 in C:\xampp\htdocs\index.php on line 7

Warning: DOMDocument::loadHTML(): htmlParseStartTag: misplaced <body> tag in Entity, line: 70 in C:\xampp\htdocs\index.php on line 7

Warning: DOMDocument::loadHTML(): Tag header invalid in Entity, line: 91 in C:\xampp\htdocs\index.php on line 7

Warning: DOMDocument::loadHTML(): Tag nav invalid in Entity, line: 116 in C:\xampp\htdocs\index.php on line 7

Warning: DOMDocument::loadHTML(): Tag nav invalid in Entity, line: 373 in C:\xampp\htdocs\index.php on line 7
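
These are warnings rather than fatal errors: the libxml2 parser behind DOMDocument::loadHTML() follows HTML 4 rules, so HTML5 elements such as header and nav get reported as invalid while the document is still parsed. One common way to quiet them is libxml_use_internal_errors(). The sketch below assumes a hypothetical helper name, loadHtmlQuietly, and made-up sample markup.

```php
<?php
// Sketch: parse HTML5 markup without emitting libxml warnings.
// loadHtmlQuietly() is a hypothetical helper name, not part of PHP.
function loadHtmlQuietly(string $html): DOMDocument
{
    // Collect parser errors internally instead of emitting PHP warnings,
    // remembering the previous setting so it can be restored afterwards.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected errors and put the earlier setting back.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}

// <header> and <nav> are HTML5 elements like the ones named in the warnings.
$sample = '<!DOCTYPE html><html><head><title>Demo</title></head>'
        . '<body><header><nav>Menu</nav></header>'
        . '<div id="content">Hello</div></body></html>';

$doc = loadHtmlQuietly($sample);

// Same loop as in the earlier post, now running without warnings.
foreach ($doc->getElementsByTagName('div') as $div) {
    if ($div->getAttribute('id') === 'content') {
        echo $div->nodeValue; // prints Hello
    }
}
```

If the messages are worth inspecting, libxml_get_errors() can be called before libxml_clear_errors() to read them.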
 

TesoMayn

Boredom, it vexes me.
Oct 30, 2011
1,482
1,482
Are you using PHP5?

You can try this, but you'll need phpQuery.

Also, I don't believe that any of these will work, as the message count is not in its own DIV.

PHP:
require_once('phpQuery/phpQuery.php');
$html = file_get_contents('http://devbest.com/members/seandavies.10142/');
phpQuery::newDocumentHTML($html);
$resultData = pq('div#results-data');
echo $resultData;
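
If the count is not in its own div, as noted above, one option is to locate it by the label next to it. This is only a sketch under stated assumptions: the dt/dd markup and the "Messages:" label are hypothetical stand-ins for whatever the real page uses, extractLabelledCount is an invented helper name, and the label is assumed to contain no quote characters.

```php
<?php
// Sketch: read a number that sits next to a text label rather than in its
// own uniquely named element. extractLabelledCount() is a hypothetical helper.
function extractLabelledCount(string $html, string $label): ?int
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    // Find the <dt> whose text equals the label, then take the first <dd>
    // that follows it at the same level.
    $query = sprintf(
        "//dt[normalize-space(.) = '%s']/following-sibling::dd[1]",
        $label
    );
    $nodes = $xpath->query($query);

    if ($nodes === false || $nodes->length === 0) {
        return null; // label not present
    }

    // Drop everything that is not a digit, e.g. "1,121" becomes "1121".
    $digits = preg_replace('/\D/', '', $nodes->item(0)->textContent);
    return $digits === '' ? null : (int) $digits;
}

// Hypothetical markup; the real profile page may be structured differently.
$sample = '<html><body><dl><dt>Messages:</dt><dd>1,121</dd></dl></body></html>';
var_dump(extractLabelledCount($sample, 'Messages:')); // int(1121)
```

The XPath expression would need adjusting to the actual markup once the page source has been inspected.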
 