diff --git a/README.md b/README.md
index 05d62aa..27d933a 100644
--- a/README.md
+++ b/README.md
@@ -3,11 +3,11 @@
 [![Build Status](https://travis-ci.org/j0k3r/php-readability.svg?branch=master)](https://travis-ci.org/j0k3r/php-readability)
 [![Code Coverage](https://scrutinizer-ci.com/g/j0k3r/php-readability/badges/coverage.png?b=master)](https://scrutinizer-ci.com/g/j0k3r/php-readability/?branch=master)
 
-This is an extract of the Readability class from the [full-text-rss](https://github.com/Dither/full-text-rss) fork. It kind be defined as a better version of the original [php-readability](https://bitbucket.org/fivefilters/php-readability/overview).
+This is an extract of the Readability class from this [full-text-rss](https://github.com/Dither/full-text-rss) fork. It can be defined as a better version of the original [php-readability](https://bitbucket.org/fivefilters/php-readability/overview).
 
 ## Differences
 
-The default php-readability lib is really old and needs to be improved. I found a great fork of [full-text-rss](http://fivefilters.org/content-only/) from @Dither which improve the Readability class.
+The default php-readability lib is really old and needs to be improved. I found a great fork of full-text-rss from [@Dither](https://github.com/Dither/full-text-rss) which improve the Readability class.
 
 - I've extracted the class from its fork to be able to use it out of the box
 - I've added some simple tests
@@ -15,6 +15,12 @@ The default php-readability lib is really old and needs to be improved. I found
 
 **But** the code is still really hard to understand / read ...
 
+## Requirements
+
+By default, this lib will use the [Tidy extension](https://github.com/htacg/tidy-html5) if it's available. Tidy is only used to cleanup the given HTML and avoid problems with bad HTML structure, etc ..
+
+Since Composer doesn't support suggestion on PHP extension, I write this suggestion here.
+
 ## Usage
 
 ```php
@@ -26,6 +32,8 @@ $url = 'http://www.medialens.org/index.php/alerts/alert-archive/alerts-2013/729-
 $html = file_get_contents($url);
 $readability = new Readability($html, $url);
+// or without Tidy
+// $readability = new Readability($html, $url, 'libxml', false);
 $result = $readability->init();
 
 if ($result) {