path: root/lib/Scraper
Age | Commit message | Author
2023-10-19 | Update Scraper.php | IgorA100
Use FetcherConfig::DEFAULT_USER_AGENT for Curl
Signed-off-by: IgorA100 <igora100@gmail.com>
2023-10-19 | Fix: Set CURLOPT_USERAGENT | IgorA100
Some sites do not serve content without a User Agent
Set CURLOPT_USERAGENT= Google Chrome
Signed-off-by: IgorA100 <igora100@gmail.com>
2023-01-11 | Workaround for #2048 | Benjamin Brahmer
The league/uri version that we inherit in Nextcloud is a bit outdated. That version can't handle certain uris.
Signed-off-by: Benjamin Brahmer <info@b-brahmer.de>
2022-11-17 | Bump fivefilters/readability.php from 2.1.0 to 3.1.0 (#1989) | dependabot[bot]
* Bump fivefilters/readability.php from 2.1.0 to 3.1.0

Bumps [fivefilters/readability.php](https://github.com/fivefilters/readability.php) from 2.1.0 to 3.1.0.
- [Release notes](https://github.com/fivefilters/readability.php/releases)
- [Changelog](https://github.com/fivefilters/readability.php/blob/master/CHANGELOG.md)
- [Commits](https://github.com/fivefilters/readability.php/compare/v2.1.0...v3.1.0)

---
updated-dependencies:
- dependency-name: fivefilters/readability.php
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* update Authors

Signed-off-by: Benjamin Brahmer <info@b-brahmer.de>

* Change namespace for fivefilters

Signed-off-by: Benjamin Brahmer <info@b-brahmer.de>

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Benjamin Brahmer <info@b-brahmer.de>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Benjamin Brahmer <info@b-brahmer.de>
2021-10-03 | Fix spelling of receive | Daniel Rheinbay
recive => receive
Signed-off-by: Daniel Rheinbay <danielrheinbay@gmail.com>
2020-09-25 | Move to nextcloud config and update phpunit | Sean Molenaar
Signed-off-by: Sean Molenaar <sean@seanmolenaar.eu>
2020-01-07 | Allow getContent() in Scraper and IScraper to return null (#606) | Petros Koutsolampros
Allow getContent() in Scraper and IScraper to return null
2019-12-24 | Reimplement full-text scraping (#563) | DriverXX
Add readability.php scraper
Fixes #482
Signed-off-by: Gioele Falcetti <thegio.f@gmail.com>