Readability
This is an extract of the Readability class from the full-text-rss fork. It can be described as a better version of the original php-readability.
Differences
The default php-readability lib is really old and needs to be improved. I found a great fork of full-text-rss from @Dither which improves the Readability class.
- I've extracted the class from its fork to be able to use it out of the box
- I've added some simple tests
- I've changed the coding style (ran php-cs-fixer) and added a namespace
But the code is still really hard to read and understand.
Usage
    use Readability\Readability;

    $url = 'http://www.medialens.org/index.php/alerts/alert-archive/alerts-2013/729-thatcher.html';

    // you can use whatever you want to retrieve the html content (Guzzle, Buzz, cURL ...)
    $html = file_get_contents($url);

    $readability = new Readability($html, $url);
    $result = $readability->init();

    if ($result) {
        // display the title of the page
        echo $readability->getTitle()->textContent;
        // display the *readability* content
        echo $readability->getContent()->textContent;
    } else {
        echo 'Looks like we couldn\'t find the content. :(';
    }
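
The usage example passes the result of file_get_contents() straight to Readability, but that function returns false when the fetch fails. Below is a minimal sketch of a more defensive fetch. It assumes a hypothetical helper named fetchHtml(), which is not part of this library, and the timeout and user agent values are arbitrary placeholders.

```php
<?php

/**
 * Hypothetical helper (not part of Readability): fetch a page and return
 * its HTML, or null when the fetch fails or the body is empty.
 */
function fetchHtml(string $url, int $timeoutSeconds = 10): ?string
{
    // Options for PHP's HTTP stream wrapper; the values are illustrative only.
    $context = stream_context_create([
        'http' => [
            'timeout' => $timeoutSeconds,
            'user_agent' => 'readability-example/1.0',
            'follow_location' => 1,
        ],
    ]);

    // Suppress the warning and rely on the return value instead.
    $html = @file_get_contents($url, false, $context);

    // file_get_contents() returns false on failure; also treat a blank body as a failure.
    if ($html === false || trim($html) === '') {
        return null;
    }

    return $html;
}
```

A non-null result can be passed to new Readability($html, $url) exactly as in the usage example; a null result means there is nothing to parse, so init() can be skipped.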
