Diffstat (limited to 'vendor/fguillot/picofeed/docs/grabber.markdown')
-rw-r--r-- vendor/fguillot/picofeed/docs/grabber.markdown | 35
1 files changed, 25 insertions, 10 deletions
diff --git a/vendor/fguillot/picofeed/docs/grabber.markdown b/vendor/fguillot/picofeed/docs/grabber.markdown
index 6a7dd2ada..2098b25d0 100644
--- a/vendor/fguillot/picofeed/docs/grabber.markdown
+++ b/vendor/fguillot/picofeed/docs/grabber.markdown
@@ -6,33 +6,48 @@ The web scraper is useful for feeds that display only a summary of articles, the
How the content grabber works?
------------------------------
-1. Try with rules first (xpath patterns) for the domain name (see `PicoFeed\Rules\`)
+1. Try with rules first (XPath queries) for the domain name (see `PicoFeed\Rules\`)
2. Try to find the text content by using common attributes for class and id
3. Finally, if nothing is found, the feed content is displayed
-**The best results are obtained with Xpath rules file.**
+**The best results are obtained with XPath rules file.**
How to use the content scraper?
-------------------------------
+Before parsing all items, just call the method `$parser->enableContentGrabber()`:
+
```php
-use PicoFeed\Reader;
+use PicoFeed\Reader\Reader;
+use PicoFeed\PicoFeedException;
+
+try {
-$reader = new Reader;
-$reader->download('http://www.egscomics.com/rss.php');
+    $reader = new Reader;
-$parser = $reader->getParser();
+    // Return a resource
+    $resource = $reader->download('http://www.egscomics.com/rss.php');
-if ($parser !== false) {
+    // Return the right parser instance according to the feed format
+    $parser = $reader->getParser(
+        $resource->getUrl(),
+        $resource->getContent(),
+        $resource->getEncoding()
+    );
-    $parser->enableContentGrabber(); // <= Enable the content grabber
+    // Enable content grabber before parsing items
+    $parser->enableContentGrabber();
+
+    // Return a Feed object
    $feed = $parser->execute();
-    // ...
+}
+catch (PicoFeedException $e) {
+    // Do Something...
}
```
When the content scraper is enabled, everything will be slower.
-For each item a new HTTP request is made and the HTML downloaded is parsed with XML/Xpath.
+**For each item a new HTTP request is made** and the HTML downloaded is parsed with XML/XPath.
Configuration
-------------