Do not set domainRegExp for local files

`parse_url($this->url, \PHP_URL_HOST)` returns `null` for a local filesystem path.
Casting it to `string` produces an empty regular expression,
which matches any link when computing link density.

(cherry picked from commit c7208f6ad2)

This also fixes a warning, since 1.x passes `null` directly to `preg_replace` instead of explicitly casting it to `string`.
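
The failure mode described in this message can be reproduced in isolation. The following is only a sketch, not code from the library: `buildDomainRegExp` is a hypothetical helper that mirrors the guarded expression, and the path and URLs are made-up examples.

```php
<?php

// Hypothetical helper mirroring the guarded expression; not part of the library.
// Returns null when the URL has no host component (e.g. a local file path).
function buildDomainRegExp(string $url): ?string
{
    $host = parse_url($url, \PHP_URL_HOST);
    if (!is_string($host)) {
        // parse_url() yields null for a missing host and false for a malformed URL.
        return null;
    }

    return '/' . strtr(preg_replace('/www\d*\./', '', $host), ['.' => '\.']) . '/';
}

// Unguarded variant with an explicit string cast: a local path has no host,
// so the resulting pattern is empty.
$host = parse_url('/tmp/article.html', \PHP_URL_HOST);
$unguarded = '/' . strtr(preg_replace('/www\d*\./', '', (string) $host), ['.' => '\.']) . '/';

var_dump($unguarded);
// An empty pattern matches every subject, including unrelated links.
var_dump(preg_match($unguarded, 'https://unrelated.example/page'));

// Guarded variant: no pattern is built for the local path.
var_dump(buildDomainRegExp('/tmp/article.html'));
var_dump(buildDomainRegExp('https://www.example.com/post'));
```

With the guard in place, a local path leaves the pattern unset, so no link is counted as belonging to the document's own domain by accident.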
pull/103/head
Jan Tojnar 1 year ago
parent 487ce3a517
commit 235baf965c
src/Readability.php (5 changed lines)

@@ -1396,7 +1396,10 @@ class Readability implements LoggerAwareInterface
         $this->logger->debug('Parsing URL: ' . $this->url);
         if ($this->url) {
-            $this->domainRegExp = '/' . strtr(preg_replace('/www\d*\./', '', parse_url($this->url, \PHP_URL_HOST)), ['.' => '\.']) . '/';
+            $host = parse_url($this->url, \PHP_URL_HOST);
+            if (null !== $host) {
+                $this->domainRegExp = '/' . strtr(preg_replace('/www\d*\./', '', $host), ['.' => '\.']) . '/';
+            }
         }
         mb_internal_encoding('UTF-8');
