Compare commits

147 Commits

Author SHA1 Message Date
Knah Tsaeb 9ebd496c2d Merge remote-tracking branch 'origin/master' into kt_bridge 2018-10-18 11:16:17 +02:00
logmanoriginal 717b0bdd9c Fix items link to localhost
References #864
2018-10-16 19:16:51 +02:00
logmanoriginal 62d737efe2 Replace emoticon images by their textual representation
References #850
2018-10-16 19:02:55 +02:00
triatic 6fce03daa7 [FB2Bridge] Add updated timestamps to each post (#849)
Additionally, exclude shared posts from output since they already exist inside other posts.
2018-10-16 18:34:39 +02:00
logmanoriginal 7561c0685d [FacebookBridge] Fix 'SpSonSsoSredS' text in title
Applying the function 'defaultLinkTo' to the source HTML breaks
regex matches later in the bridge. The function must be applied
right before adding the contents to the item for the bridge to work
properly.

References #856
2018-10-15 19:53:46 +02:00
logmanoriginal f48eac854f Bump version to 'dev.2018-10-15' 2018-10-15 18:59:03 +02:00
logmanoriginal a87e7781b1 Bump version to 2018-10-15 2018-10-15 18:54:53 +02:00
logmanoriginal 0dc761d6cf [README] Update authors
Not sure why, but the GitHub API responded with false results the
last time. Cleaning up to reflect current list of contributors.
2018-10-15 18:53:27 +02:00
logmanoriginal d14f8e3c83 [BundesbankBridge] Add new bridge 2018-10-15 18:38:42 +02:00
logmanoriginal b4aea21f71 [DesoutterBridge] Add new bridge 2018-10-15 18:35:49 +02:00
logmanoriginal c06a09fe99 [GlassdoorBridge] Add new bridge 2018-10-15 18:33:02 +02:00
sysadminstory 704ad50607 [DealabsBridge] Follow website changes (#852)
Pepper changed the CSS class of some elements. The bridge was changed to
follow these changes.
2018-10-15 18:25:04 +02:00
sysadminstory d89c65d219 [ZoneTelechargementBridge] Update the base URL and make URI unique (#853)
- Base URL updated
- Show name has different styles on the Website, use another way to get the show name
- Entry URIs are now unique to make sure RSS readers don't treat episodes as duplicates
- No more new lines in the feed or item title
2018-10-15 18:23:08 +02:00
sysadminstory 9a3c776096 [ExtremeDownloadBridge] Make URI and titles unique (#854)
- Entry URIs are unique to make sure RSS readers don't treat episodes as duplicates
- Titles are unique to make sure RSS readers don't treat streams and downloads as duplicates
2018-10-15 18:19:57 +02:00
triatic 85e8a67568 [MrssFormat.php] Prevent PHP Notice (#858)
Prevent PHP Notice when running in CLI mode
2018-10-15 18:14:06 +02:00
Nicolas Delsaux ee158468fa Expanded Sexactu to cover the whole GQ magazine (#861)
The bridge has been expanded to better cover the whole GQ magazine.
It should support all countries (provided they all use the same absurdly shitty publication system).
It has only been tested with sexactu articles (which I now obtain by loading Maïa Mazaurette's author page).
2018-10-15 18:09:20 +02:00
logmanoriginal 5779f641c0 [FacebookBridge] Add option to limit number of returned items
This commit adds a new optional parameter 'limit' which can be used
to limit the number of items returned by this bridge (i.e. '&limit=10')

As requested in #669
2018-10-15 17:35:10 +02:00
LogMANOriginal b90bcee1fc
Return exceptions in requested feed formats (#841)
* [Exceptions] Don't return header for bridge exceptions
* [Exceptions] Add link to list in exception message

This is an alternative when the button is not rendered
for some reason.

* [index] Don't return bridge exception for formats
* [index] Return feed item for bridge exceptions
* [BridgeAbstract] Rename 'getCacheTime' to 'getModifiedTime'
* [BridgeAbstract] Move caching to index.php to separate concerns

index.php needs more control over caching behavior in order to cache
exceptions. This cannot be done in a bridge, as the bridge might be
broken, thus preventing caching from working.

This also (and more importantly) separates concerns. The bridge should
not even care if caching is involved or not. Its purpose is to collect
and provide data.

Response times should be faster, as more complex bridge functions like
'setDatas' (evaluates all input parameters to predict the current
context) and 'collectData' (collects data from sites) can be skipped
entirely.

Notice: In its current form, index.php takes care of caching. This
could, however, be moved into a separate class (i.e. CacheAbstract)
in order to make implementation details cache specific.

* [index] Add '_error_time' parameter to $item['uri']

This ensures that error messages are recognized by feed readers as
new errors after 24 hours. During that time the same item is returned
no matter how often the cache is cleared.

References https://github.com/RSS-Bridge/rss-bridge/issues/814#issuecomment-420876162

* [index] Include '_error_time' in the title for errors

This prevents feed readers from "updating" feeds based on the title

* [index] Handle "HTTP_IF_MODIFIED_SINCE" client requests

Implementation is based on `BridgeAbstract::dieIfNotModified()`,
introduced in 422c125d8e and
simplified based on https://stackoverflow.com/a/10847262

Basically, before returning cached data we check if the client sent
the "HTTP_IF_MODIFIED_SINCE" header. If the modification time is
more recent than or equal to the cache time, we reply with "HTTP/1.1 304
Not Modified" (same as before). Otherwise we send the cached data.

* [index] Don't encode exception message with `htmlspecialchars`
* [Exceptions] Include error message in exception
* [index] Show different error message for error code 0
2018-10-15 17:21:43 +02:00
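
Note on b90bcee1fc: the "HTTP_IF_MODIFIED_SINCE" check described above can be sketched as follows. This is a minimal Python illustration of the comparison only, not the PHP code from index.php; the function name and the fallback for unparseable headers are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def respond_from_cache(if_modified_since, cache_mtime):
    """Return the HTTP status to use when cached data is available.

    if_modified_since: raw header value sent by the client, or None.
    cache_mtime: Unix timestamp (seconds) of the cached data.
    Hypothetical helper for illustration only.
    """
    if if_modified_since is None:
        return 200
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Assumption: an unparseable header is treated like a missing one.
        return 200
    if client_time.tzinfo is None:
        # HTTP dates are always expressed in GMT.
        client_time = client_time.replace(tzinfo=timezone.utc)
    # Client copy is as recent as (or newer than) the cache: nothing to send.
    return 304 if client_time.timestamp() >= cache_mtime else 200
```

A client whose copy is at least as recent as the cache receives 304; every other client receives the cached data with status 200.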
logmanoriginal 996295e82f Add 'dev.' to the release version in master
This helps to (roughly) identify versions when opening issues on
GitHub, using the latest ZIP file for master.

References #773
2018-09-26 20:04:27 +02:00
logmanoriginal 13bd7fe21b [contents] Return error if the server responded with any code other than 200 2018-09-26 19:16:02 +02:00
logmanoriginal fcc9f9fd61 [FacebookBridge] Use alternative URI to load more posts
The URI "https://facebook.com/username?_fb_noscript=1" returns two
posts per user. Some profiles, however, are very active, causing the
bridge to miss items if more than two posts are sent within the cache
duration (5 minutes).

The alternative suggested in #669 is to use a different URI:
"https://facebook.com/pg/username/posts?_fb_noscript=1"

While the contents of this URI essentially look the same when viewed
in a browser, it actually returns more than 10 posts depending on the
profile.

References #669
2018-09-26 18:24:46 +02:00
logmanoriginal e1c4914b1c [FacebookBridge] Optimize for readability 2018-09-25 18:56:33 +02:00
logmanoriginal 93e7ea9fea [HtmlFormat] Make feeds available via syndication links 2018-09-22 19:51:18 +02:00
logmanoriginal 2d1b446bd1 [DevToBridge] Add new bridge
Returns feeds for tags from https://dev.to

References #840
2018-09-22 18:57:07 +02:00
logmanoriginal 1d451610d6 [ParameterValidator] Move 'getQueriedContext' from BridgeAbstract 2018-09-22 17:04:55 +02:00
logmanoriginal f853ffc07c [ParameterValidator] Refactor 'validation' into 'ParameterValidator'
Adds a new class 'ParameterValidator' to replace the functions from
'validator.php', separating private functions from 'validateData' to
class private functions in the process.

Instead of echoing error messages, adds messages to a private variable,
accessible via 'getInvalidParameters'.

BridgeAbstract now adds invalid parameter names to the error message.
2018-09-22 16:42:04 +02:00
logmanoriginal e3a5a6a170 [index] Update and improve parameter handling for bridge and cache
- Use 'array_diff_key' instead of 'unset'
- Remove parameters for caches

By removing certain parameters for caches, the loading times can be
improved considerably:

* action: It doesn't matter which action the user took to generate
feed items.

* format: This has the biggest impact on performance, because cached
items are now shared between different formats (i.e. try switching
between Atom, Html and Mrss and compare previous vs. now). If a
server handles lots of requests, this may even reduce bandwidth if
the same contents are requested for different formats.

* _noproxy: The proxy behavior has no impact on the produced items,
so it can be ignored.

* _cache_timeout: This is another option which might impact performance
for some servers, especially if 'custom_timeout' has been enabled in
the configuration. Requests with different cache timeouts no longer
result in separate cache files.
2018-09-22 15:44:03 +02:00
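
Note on e3a5a6a170: the effect of removing parameters before caching can be sketched as below. The list of ignored parameters is taken from the commit message; the hashing scheme and the function name are assumptions made for this Python illustration and do not describe how RSS-Bridge actually names its cache files.

```python
import hashlib
import json

# Parameters that, per the commit message, do not influence the collected items.
CACHE_IGNORED_PARAMS = {"action", "format", "_noproxy", "_cache_timeout"}


def cache_key(params):
    """Build a cache key from the request parameters that affect the items.

    Hypothetical helper: the JSON + SHA-1 scheme is an assumption.
    """
    relevant = {k: v for k, v in params.items() if k not in CACHE_IGNORED_PARAMS}
    encoded = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha1(encoded).hexdigest()
```

With this approach, two requests that differ only in 'format' (for example Atom and Mrss) map to the same key and therefore share one cache entry.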
logmanoriginal 243e324efc [NineGagBridge] Fix missing sections breaking feeds
Posts may supply a list of 'sections' or a single 'postSection'

References #844
2018-09-22 15:19:14 +02:00
logmanoriginal ae58b1566e [NineGagBridge] Remove type hinting
Type hinting for strings doesn't work prior to PHP 7, see
http://php.net/manual/en/functions.arguments.php#functions.arguments.type-declaration

References #837
2018-09-22 15:19:14 +02:00
sysadminstory c044694b21 [ZoneTelechargementBridge] Sort episodes from newest to oldest (#835)
References #834
2018-09-21 20:22:49 +02:00
triatic db24f55c86 [FB2Bridge] Do not strip <h3> and <h4> (#836)
Do not strip <h3> and <h4>. Output looks better when they are retained. See attached.
2018-09-21 20:19:22 +02:00
logmanoriginal eb30038d6b [README] Update and reorganize 2018-09-16 18:20:35 +02:00
logmanoriginal 712a581ed6 [README] Add badge for Guix release
Unfortunately there is no way to query the current package version,
so this is only a placeholder
2018-09-16 16:01:51 +02:00
logmanoriginal d3df4b51b8 [README] Add badge for current debian release 2018-09-16 15:13:30 +02:00
logmanoriginal e6476a600d [KununuBridge] Fix broken bridge and simplify implementation 2018-09-16 09:55:35 +02:00
Grégory T 811e8d8c88 [ETTVBridge] Improvements and bug fixes (#682)
* Fix typo with status field
* Comply with other bridges

Change the uri element of an item to point to the page instead of the magnet link, as similar bridges do.

* Improved to return name & uri matching with query

This change makes it possible for the feed reader to discover a title and url consistent with the user's search.
2018-09-15 17:11:36 +02:00
logmanoriginal adc6f72e97 [style] Fix first letter of labels not capitalized
This error is caused by setting label::before { content: " "; },
which makes the first letter a whitespace on all labels, necessary
for browsers that don't support the grid layout.

This commit clears the content if the browser supports the grid layout,
properly capitalizing labels again. If a browser doesn't support grid
layout, labels stay as they are provided by the bridge.
2018-09-15 17:04:20 +02:00
logmanoriginal 182153485c [Arte7Bridge] Move parameter examples into tool tip for readability 2018-09-15 16:50:10 +02:00
LogMANOriginal bf9946d1fc
CSS adjustments to improve readability for bridge parameters (#763)
* Group common selectors
* Fix indentation using tabs
* Use same styles for number and text inputs
* Use grid layout for parameters

Introduces the grid layout for bridge parameters. All parameters are
arranged in a grid to improve readability. Read more on grid layouts
at

- https://www.w3schools.com/css/css_grid.asp
- https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Grid_Layout

Notice:

Grid layouts are not supported in very old browser versions:
https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Grid_Layout/CSS_Grid_and_Progressive_Enhancement

This is why @supports checks for browser support (not supported in IE)
https://developer.mozilla.org/en-US/docs/Web/CSS/@supports#Browser_compatibility

In case grid layout is not supported, the displayed form is usable
but not very pretty due to <br> being removed by this commit for
cosmetic reasons (breaks grid layout).

Unfortunately it doesn't seem possible to insert line breaks manually
via '::after { content: '\A' }' in cases where grid layout isn't
supported.

* Add padding to card parameters

Adds padding to parameters to improve readability. For bridges without
parameters (count($parameters) === 0), the parameter 'div' is no longer
created.

* Add colon ':' after label via CSS
* Capitalize first letter of label for readability
* Fix checkbox isn't aligned left

Sets the size of the checkbox to 20x20 px for good measure.

* Harmonize formatting
* Add new style to number and select boxes

References #797

* Add fallback solution for browsers without grid support
2018-09-15 16:39:50 +02:00
triatic ec60752650 [FB2Bridge] Prevent Facebook link href's ending in two quotes (#831)
Additionally prevent Facebook links having two forward slashes after the hostname.
2018-09-15 15:16:15 +02:00
sysadminstory 6688cf0c3b [AutoJMBridge] Fix concatenation bug (#833) 2018-09-15 15:12:34 +02:00
ORelio ae45a8cfee [contents] Fix open_basedir warning (#832)
References #818
2018-09-15 14:46:11 +02:00
Matthew Seal e34ef6cb4f [MrssFormat] Escape double quotes in XML attributes (#813)
XML attributes need to have certain characters escaped to be valid. The title attribute can have double quotes in it which need to be properly encoded for attributes.
2018-09-15 14:13:05 +02:00
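
Note on e34ef6cb4f: escaping a value for use inside a double-quoted XML attribute can be sketched as follows. This is a Python illustration using the standard library, not the PHP change made to MrssFormat; the function name is hypothetical.

```python
from xml.sax.saxutils import escape


def escape_xml_attribute(value):
    """Escape a string so it can be placed inside an XML attribute."""
    # escape() handles &, < and > by default; the mapping adds both quote types.
    return escape(value, {'"': "&quot;", "'": "&apos;"})
```

The ampersand is replaced first, so the entities added for quotes are not escaped a second time.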
sysadminstory 5c92a736fa [ZoneTelechargementBridge] Added Bridge for ww2.zone-telechargement1.org (#829)
* [ZoneTelechargementBridge] Added Bridge for ww2.zone-telechargement1.org

The goal of this bridge is to follow the episode releases of a TV show
season while it is being broadcast on TV.
2018-09-13 19:36:48 +01:00
Eugene Molotov 911bcfb246 [PikabuBridge] Implemented bridge (#830)
* [PikabuBridge] Implemented bridge
2018-09-13 12:52:26 +01:00
ZeNairolf efa550ef61 Add 9gag.com bridge (#801)
* Add 9gag.com bridge
2018-09-13 10:11:42 +01:00
sysadminstory d5d7683ed3 [AutoJMBridge] New Bridge (#827)
* [AutoJMBridge] New Bridge

This bridge will show all the car offers AutoJM has for the model you
chose, using your filter. Very useful when waiting for a cheap price on
a new car!
2018-09-13 10:05:07 +01:00
triatic fe94914eb5 [AtomFormat.php] Eliminate PHP Notice when running in CLI mode (#824) 2018-09-12 14:37:27 +01:00
Quentin Delmas 622802e5d4 Fix multiple warnings.
Fix JSON request string in case of empty location
2018-09-12 13:31:11 +01:00
sysadminstory 6da8daf1a3 [DealabsBridge] Fix for #782 and all categories are now available (#821)
This commit fixes #782 by updating the parameter value of 'Maison &
Jardin', but this means users have to update their RSS feed URL (because
of the bridge structure, it would be a nightmare to fix it any other
way).

This commit also adds all the categories available on the Dealabs website.
2018-09-11 22:11:00 +01:00
la Bécasse 654e502e84 Arte7 collection support (#819)
* Arte7 collection support
2018-09-11 22:09:47 +01:00
sysadminstory c8ace9e3bd [ExtremeDownloadBridge] Added Bridge for ww1.extreme-d0wn.com (#820)
* [ExtremeDownloadBridge] Added Bridge for ww1.extreme-d0wn.com

The goal of this bridge is to follow the episode releases of a TV show season
while it is being broadcast on TV.
2018-09-11 20:10:46 +01:00
Monsieur Poutounours 5722a6c139 Adding a bridge for theyetee.com (#809)
* Adding a bridge for theyetee.com

The bridge fetches daily shirts at theyetee.com.
The Yetee offers two new shirts each day, but you can buy them only for a few hours!
Unfortunately, the site doesn't provide an RSS feed, so the only way to keep up to date on new shirts is their daily mailing ... until now!
2018-09-10 20:56:55 +01:00
Quentin Delmas 458b826871 Remove declaration of extractFromDelimiters, it is now a reusable function. Fixes #815 2018-09-10 09:29:19 +01:00
Quentin Delmas b397a42876 version: Bump to 2018-09-09 2018-09-09 21:00:10 +01:00
Corentin Garcia 111c45d010 [GithubSearchBridge] Fix content parsing, add tags if present (#803)
* [GithubSearchBridge] Fix content parsing, add tags if present

* [GithubSearchBridge] Add categories (from tags)
2018-09-09 20:30:29 +01:00
Corentin Garcia 55b36b0455 [DauphineLibereBridge] Use https, fix content parsing (fix issue #780) (#811) 2018-09-09 20:23:59 +01:00
ORelio de8cee6a1c Catching up | [Main] Debug mode, parse utils, MIME | [Bridges] Add/Improve 20 bridges (#802)
* Debug mode improvements

 - Improve debug warning message
 - Restore error reporting in debug mode
 - Fix 'notice' messages for unset fields

* Add parsing utility functions

html.php
 - extractFromDelimiters
 - stripWithDelimiters
 - stripRecursiveHTMLSection
 - markdownToHtml (partial)

bridges
 - remove now-duplicate functions
 - call functions from html.php instead

* [Anidex] New bridge

Anime torrent tracker

* [Anime-Ultime] Restore thumbnail

* [CNET] Recreate bridge

Full rewrite as the previous one was broken

* [Dilbert] Minor URI fix

Use new self::URI property

* [EstCeQuonMetEnProd] Fix content extraction

Bridge was broken

* [Facebook] Fix "SpSonsSoriSsés" label

... which was taking space in item title

* [Futura-Sciences] Use HTTPS, More cleanup

Use HTTPS as FS now offer HTTPS
Clean additional useless HTML elements

* [GBATemp] Multiple fixes

- Fix categories: missing "break" statements
- Restore thumbnail as enclosure
- Fix date extraction
- Fix user blog post extraction
- Use getSimpleHTMLDOMCached

* [JapanExpo] Fix bridge, HTTPS, thumbnails

- Fix getSimpleHTMLDOMCached call
- Upgrade to HTTPS as JE now offers HTTPS
- Restore thumbnails as enclosures

* [LeMondeInformatique] Fix bridge, HTTPS

- Upgrade to HTTPS as LMI now offers HTTPS
- Restore thumbnails using small images
- Fix content extraction
- Fix text encoding issue

* [Nextgov] Fix content extraction

- Restore thumbnail and use small image
- Field extraction fixes

* [NextInpact] Add categories and filtering by type

- Offer all RSS feeds
- Allow filtering by article type
- Implement extraction for brief articles
- Remove article limit, many brief articles are published all at once

* [NyaaTorrents] New bridge

Anime torrent tracker

* [Releases3DS] Cache content, restore thumbnail

- Use getSimpleHTMLDOMCached
- Restore thumbnail as enclosure

* [TheHackerNews] Fix bridge

 - Fix content extraction including article body
 - Restore thumbnail as enclosure

* [WeLiveSecurity] HTTPS, Fix content extraction

- Upgrade to HTTPS as WLS now offers HTTPS
- Fix content extraction including article body

* [WordPress] Reduce timeout, more content selectors

- Reduce timeout to use default one (1h)
- Add new content selector (articleBody)
- Find thumbnail and set as enclosure
- Fix <script> cleanup

* [YGGTorrent] Increase limit, use cache

- Increase item limit as uploads are very frequent
- Use getSimpleHTMLDOMCached

* [ZDNet] Rewrite with FeedExpander

- Upgrade to HTTPS as ZD now offers HTTPS
- Use FeedExpander for secondary fields
- Fix content extraction for article body

* [Main] Handle MIME type for enclosures

Many feed readers will ignore enclosures (e.g. thumbnails) with no MIME type. This commit adds automatic MIME type detection based on file extension (which may be inaccurate but is the only way without fetching the content).

One can force enclosure type using #.ext anchor (hacky, needs improving)

* [FeedExpander] Improve field extraction

- Add support for passing enclosures
- Improve author and uri extraction
- Fix 'notice' PHP error messages

* [Pull] Coding style fixes for #802

* [Pull] Implementing changes for #802

 - Fix coding style issues with str append
 - Remove useless CACHE_TIMEOUT
 - Use count() instead of $limit
 - Use defaultLinkTo() + handle strings
 - Use http_build_query()
 - Fix missing </em>
 - Remove error_reporting(0)
 - warning CSS (@LogMANOriginal)
 - Fix typo in FeedExpander comment

* [Main] More documentation for markdownToHtml

See #802 for more details
2018-09-09 20:20:13 +01:00
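
Note on de8cee6a1c, "[Main] Handle MIME type for enclosures": guessing the MIME type from the file extension, with a "#.ext" anchor overriding the path, can be sketched as below. This is a Python illustration built on the standard mimetypes module, not the PHP implementation from the commit; the function name and the generic fallback type are assumptions.

```python
import mimetypes
from urllib.parse import urlparse


def enclosure_mime_type(url):
    """Guess the MIME type of an enclosure from its URL without fetching it."""
    parsed = urlparse(url)
    if parsed.fragment.startswith("."):
        # An anchor such as '#.jpg' forces the type, as described in the commit.
        name = "enclosure" + parsed.fragment
    else:
        name = parsed.path
    mime, _encoding = mimetypes.guess_type(name)
    # Assumption: fall back to a generic type when the extension is unknown.
    return mime or "application/octet-stream"
```

As the commit message points out, a guess based on the extension may be inaccurate, since the content itself is never inspected.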
Quentin Delmas 123fce4394 [ForGifsBridge] Fix permissions of ForGifsBridge 2018-09-09 17:34:36 +01:00
Quentin Delmas a3f99c9c3f [GOGBridge] Added bridge for GOG.com 2018-09-09 17:32:36 +01:00
Eugene Molotov bf30ad127c [FacebookBridge] Removes query string from post links
* [FacebookBridge] Removes query string from post links
2018-09-09 16:31:15 +01:00
logmanoriginal 37f84196b7 [GooglePlusPostBridge] Fix title is empty if content is too short
The bridge would generate empty titles if the content is longer than
50 characters, but doesn't have further spaces in it. With this commit
the title is correctly generated based on the contents, taking missing
spaces into account.

References #786
2018-09-08 17:07:57 +02:00
Corentin Garcia 44764f7182 [GrandComicsDatabaseBridge] Fix links in content (#804) 2018-09-08 11:12:27 +01:00
Antoine Cadoret 19f294d71d Add fields to leboncoin bridge (#783)
* [LeBonCoinBridge] Add fields to LeBonCoinBridge
2018-08-31 14:34:41 +01:00
Teromene b0e33e4e01
Update LeBonCoinBridge to use the site's API (#795)
* Update LeBonCoinBridge to use the site's API
2018-08-28 14:20:02 +01:00
Eugene Molotov 558fa50a2a [core] Enabled debug mode before including core files (#790) 2018-08-25 20:02:47 +01:00
Eugene Molotov ffb8b82c73 [FileCache] resetting cached file stat result to have correct getTime() result (#792)
* [FileCache] resetting cached file stat result to have correct getTime() result
2018-08-25 20:00:51 +01:00
Eugene Molotov 422c125d8e [core] Returning 304 http code when returning cached data (#793) 2018-08-25 20:00:38 +01:00
Quentin Delmas 059656c370 Fix phpcs. 2018-08-22 16:25:08 +01:00
Quentin Delmas 9fc1e97efe Avoid bot exclusion. 2018-08-22 16:21:39 +01:00
Quentin Delmas be3620acb7 Add extension check for the "json" extension. 2018-08-22 16:21:20 +01:00
LogMANOriginal 16c0a61232
[README] Add a "Deploy to Cloud" button for Docker
Adds a button to deploy RSS-Bridge to the Docker Cloud as described here: https://docs.docker.com/docker-cloud/apps/deploy-to-cloud-btn/
2018-08-21 18:40:39 +02:00
Walter Barrett 704a87ad97 Icons: Allow Bridge-specified icons (#788) 2018-08-21 17:46:47 +02:00
sysadminstory c4cccfe0f3 [LesJoiesDuCode] Switch to HTTPS and remove author (#787)
The website now offers HTTPS, therefore the bridge was switched to it.
The post author is no longer displayed on the homepage, so it has been
removed.
2018-08-21 17:41:56 +02:00
Marcin C d07deb0930 css: Modern look for RSS-Bridge (#781) 2018-08-21 17:22:46 +02:00
Piranhaplant e7dab5d351 Fixed timestamp on Pixiv bridge (#785) 2018-08-18 16:54:24 -03:00
logmanoriginal ad82d50bbd [CNETBridge] Remove bridge
CNET now provides public feeds at https://www.cnet.com/rss/

References #775
2018-08-12 11:02:44 +02:00
logmanoriginal c305c1ded7 [BlaguesDeMerdeBridge] Adjust to layout changes
References #767
2018-08-10 21:08:47 +02:00
logmanoriginal f14a5bd771 [CADBridge] Remove bridge
https://cad-comic.com/ now provides feeds at

- https://cad-comic.com/feed (rss)
- https://cad-comic.com/feed/atom (atom)

Thus multiple alternatives are available to choose from, making this
bridge obsolete:

- FilterBridge (using one of the feeds above)
- WordPressBridge (on the main site)
- One of the two available feeds

References #752
2018-08-10 19:53:32 +02:00
logmanoriginal a20d5f9af0 tests: reuse RssBridge.php instead of implementing a custom solution 2018-08-10 15:33:32 +02:00
logmanoriginal ee28b124e0 [DanbooruBridge] Fix bridge
This commit fixes an issue caused by self closing tags not supported
by simplehtmldom (<source>).

Adds a monkey patch to extend simplehtmldom with the ability to detect
that particular tag. Most of the code added is copied directly from
simplehtmldom (see vendor/simplehtmldom) with adjustments to account
for RSS-Bridge formatting.

Related to: https://sourceforge.net/p/simplehtmldom/bugs/83/

Notice: The tag itself is valid according to Mozilla:

The HTML <picture> element serves as a container for zero or more
<source> elements and one <img> element to provide versions of an
image for different display device scenarios. The browser will
consider each of the child <source> elements and select one
corresponding to the best match found; if no matches are found
among the <source> elements, the file specified by the <img>
element's src attribute is selected. The selected image is then
presented in the space occupied by the <img> element.

-- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/picture

References #753
2018-08-09 21:55:43 +02:00
LogMANOriginal 7dee3a175a
[index] Add '?action=list' to list bridges (#493)
Adds a new action '?action=list' to return a list of bridges as JSON formatted text. Each bridge provides the following information:

- status (active/inactive)
- uri
- name
- parameters
- maintainer
- description

For inactive bridges only the status is returned.
Bridges that cannot be instantiated are considered inactive.
2018-08-09 19:14:10 +02:00
logmanoriginal 5fea9fc1f5 bridges: Fix bridges failing unit test 2018-08-09 17:04:16 +02:00
logmanoriginal 6bceb2b2db [tests] Add unit test for bridge implementation
Adds unit test for bridge implementations:

- Custom functions must be in protected or private scope
- getName() must return a valid string (non-empty)
- getURI() must return a valid URI
- Each bridge must define constants for NAME, URI, DESCRIPTION and
  MAINTAINER. CACHE_TIMEOUT and PARAMETERS are optional.

The unit test is written for PHPUnit 6.x and will automatically be
tested by Travis-CI for PHP 7.0 (see .travis.yml).

Remarks:

Unit tests for bridge data were scrapped in #378 for complexity
reasons (tests would have to be maintained for each bridge). This
unit test, however, is written for testing all bridges without
taking specific implementation details into account.
2018-08-09 17:04:16 +02:00
Eugene Molotov df81fa62d1 [VkBridge] Video attachment fixes (#766)
* use defaultLinkTo
* remove duplicate video links
* remove line ending before "Reposted" label
* return newline before reposted string
* remove comments
* use video links that won't require login
* set title if video has no title
2018-08-09 17:02:36 +02:00
Eugene Molotov f8c6400373 [HtmlFormat] Hide "Categories" label, if array of categories is empty (#765) 2018-08-09 16:46:53 +02:00
logmanoriginal de7622ebbf version: Bump to 2018-08-07 2018-08-07 18:37:38 +02:00
logmanoriginal 09c9d015b4 [ForGifsBridge] Add new bridge 2018-08-04 23:42:58 +02:00
logmanoriginal 3a496e3b18 [FilterBridge] Add option to build title from content
Adds a new option '&title_from_content=on' to build the title for feed
items from the feed's content. The title is generated from the first
whitespace after 50 characters of the content or the entire content if
the total size is less than 50 characters.

References #587
2018-08-04 20:46:59 +02:00
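
Note on 3a496e3b18: the title rule described above can be sketched as follows. This is a Python illustration, not the FilterBridge code; the function name is hypothetical, and returning the whole text when no whitespace follows the limit is an assumption borrowed from the related GooglePlusPostBridge fix (37f84196b7).

```python
import re

_WHITESPACE = re.compile(r"\s")


def title_from_content(content, limit=50):
    """Cut the content at the first whitespace found after `limit` characters."""
    text = content.strip()
    if len(text) <= limit:
        return text
    match = _WHITESPACE.search(text, limit)
    if match is None:
        # Assumption: without further whitespace, keep the entire text.
        return text
    return text[:match.start()]
```

Cutting at whitespace keeps the last word of the title intact instead of truncating it at exactly 50 characters.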
Eugene Molotov df58f5bbdb [core] Add urljoin (#756)
Adds php-urljoin from https://github.com/fluffy-critter/php-urljoin to replace the custom implementation of 'defaultLinkTo'
2018-08-02 06:31:56 +02:00
logmanoriginal 9d0452d11b [.travis] Use composer for HHVM
This fixes the HHVM build failing because pear doesn't exist in HHVM.
2018-08-01 19:37:10 +02:00
sublimz f92ac49947 [LeBonCoinBridge] Add cities support (#751) 2018-08-01 17:25:18 +02:00
Benasse a574fa15ac [YGGTorrentBridge] Order search result by publish date (#762) 2018-07-31 21:46:10 +02:00
Nemo 8f9a385b4d [AmazonPriceTrackerBridge] Improve Amazon scraper logic (#761)
- Now works on all websites, and even with products
  with multiple prices
- Closes #750
2018-07-31 21:44:37 +02:00
logmanoriginal 53bdfa3bf0 [GooglePlusPostBridge] Skip posts without message 2018-07-31 19:15:09 +02:00
logmanoriginal 53278b2eed [GooglePlusPostBridge] Add option to include image in content
References #600
2018-07-31 19:09:12 +02:00
logmanoriginal 5f3c55b808 [GooglePlusPostBridge] General cleanup 2018-07-31 18:55:35 +02:00
logmanoriginal fb79a67370 [GooglePlusPostBridge] Normalize static::URI usage
This commit fixes a few things related to static::URI

1) Remove trailing slash from the URI to simplify using 'defaultLinkTo'
2) Use static::URI instead of self::URI for consistency
3) Remove custom implementation of 'defaultLinkTo'
2018-07-31 18:29:14 +02:00
logmanoriginal 3c4e12ceba [GooglePlusPostBridge] Add images to enclosures
Images are collected for each post and added to enclosures. Images or
animations from lh3.googleusercontent.com are specifically handled in
order to return the animated version of the gif and the original sized
image (this is normally taken care of by JS in the browser).
2018-07-31 18:18:22 +02:00
logmanoriginal 0d1923c52f [GitHubGistBridge] Add new bridge
Adds a new bridge for https://gist.github.com

The bridge generates feeds for comments on a particular gist based on
the gist ID or full URI. For better readability the general behavior
of code sections is manually restored with the original CSS styles
from GitHub.
2018-07-29 16:31:47 +02:00
logmanoriginal ce896b4247 [SkimfeedBridge] Add new bridge
New bridge for Skimfeed: https://skimfeed.com

Generates feeds for all features of Skimfeed:

- News (the ones displayed on the front page)
- Hot topics ("What's Hot" section on the front page)
- Tech news (preconfigured feeds in the menu bar)
- Custom feeds (using the configuration system of Skimfeed), see
https://skimfeed.com/custom.php

The number of items returned by the bridge can be limited for all
categories ('&limit=...'). This parameter is optional, all categories
are unlimited by default!

Authors are added with HTML anchors in order to allow quick navigation
to source channels.

The bridge ships with developer tools to auto-generate lists in the
future (especially useful for 'Tech news'!)

References #748
2018-07-27 23:18:32 +02:00
sysadminstory a4b2d88dbe [DealabsBridge] Follow website change (#758) 2018-07-25 20:02:31 +02:00
logmanoriginal 65ec04ea98 [contents] Remove superfluous debug log from getContents
References #757
2018-07-25 19:56:46 +02:00
logmanoriginal afb4de318b [FlickrBridge] Fix missing scheme for image URLs
References #754
2018-07-23 20:14:46 +02:00
Eugene Molotov 43bb17f995 [VkBridge] Converting hashtags to categories (#755)
* [VkBridge] Converting hashtags to categories
2018-07-22 16:43:00 +02:00
logmanoriginal bae7a5879f [FlickrBridge] Fixed broken bridge
Following changes in the JSON data and selecting images for the
content (320x240 or bigger) and enclosure (largest version). All of
the data is now extracted from the JSON data instead of parsing the
DOM.

References #754
2018-07-22 14:06:04 +02:00
LogMANOriginal bd760cbcee
[README] Add docker build status 2018-07-21 21:59:48 +02:00
LogMANOriginal cd20b4476f
[README] Add label for latest release 2018-07-21 21:54:46 +02:00
LogMANOriginal d83f2f285b
Separate index and bridge card generating code into separate classes (#734)
[html] Generate index and bridge cards using separate classes

Move HTML generating code from 'index.php' to 'Index.php', separating components into static functions.

Move HTML generation code for bridge cards from 'html.php' to 'BridgeCard.php', separating components into static functions.
2018-07-21 18:15:07 +02:00
logmanoriginal 15e6d77569 [FierPandaBridge] Fix bridge
This bridge now returns all articles from the front page, following
layout changes in the past.

References #679
2018-07-21 18:07:03 +02:00
logmanoriginal f97d2ef254 [Torrent9Bridge] Remove bridge
The site moved from www.torrent9.pe to www.t9.pe and is now protected
by Cloudflare challenges, making it inaccessible to RSS-Bridge.
2018-07-21 17:45:22 +02:00
logmanoriginal 91ae2a23d7 [CpasbienBridge] Remove bridge
Removing this bridge for two reasons:

1) The service moved from www.cpasbien.cm to www.torrents9.blue,
changing the layout in the process (incompatible).

2) The new site is permanently protected by Cloudflare IUAM, making
it inaccessible by RSS-Bridge.

While it would certainly be possible to rewrite the bridge to work
with the new layout, the site is still inaccessible.

References #605
2018-07-21 17:43:29 +02:00
logmanoriginal 066ef1d7db [contents] Add Cloudflare challenge detection
Adds detection for servers responding with Cloudflare challenges,
throwing a server error if detected:

"The server responded with a Cloudflare challenge, which is not
supported by RSS-Bridge! If this error persists longer than a week,
please consider opening an issue on GitHub!"

This is meant to help maintainers identify broken bridges
for sites with Cloudflare enabled permanently. It doesn't circumvent
the protection in any way, shape or form!

The Cloudflare challenge is detected by analyzing the last response
header received from the server. If the HTTP Code is not 200 (OK)
and the server name contains 'cloudflare' ('Server: cloudflare'),
RSS-Bridge assumes the server responded with a challenge.

The header parsing is based on https://stackoverflow.com/a/18682872
2018-07-21 17:43:29 +02:00
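
The detection rule described in commit 066ef1d7db above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names `parseLastResponseHeader` and `looksLikeCloudflareChallenge` are hypothetical, and the sketch assumes the raw response headers are available as one string in which the header blocks of redirects are separated by blank lines.

```php
<?php
// Hypothetical helper (not part of RSS-Bridge): extracts the status code and
// the header fields of the LAST header block in a raw header string.
function parseLastResponseHeader($rawHeaders) {
    // Redirects produce several header blocks separated by blank lines
    $blocks = preg_split('/\r?\n\r?\n/', trim($rawHeaders));
    $last = end($blocks);

    $result = array('status' => 0, 'headers' => array());
    foreach (preg_split('/\r?\n/', $last) as $index => $line) {
        // The first line is the status line, e.g. "HTTP/1.1 503 Service Unavailable"
        if ($index === 0 && preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $matches)) {
            $result['status'] = (int)$matches[1];
            continue;
        }
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            // Header names are case-insensitive, so store them lower-cased
            $result['headers'][strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return $result;
}

// Hypothetical helper: applies the rule from the commit message, i.e. the
// status is not 200 and the 'Server' header contains 'cloudflare'.
function looksLikeCloudflareChallenge($rawHeaders) {
    $response = parseLastResponseHeader($rawHeaders);
    $server = isset($response['headers']['server']) ? $response['headers']['server'] : '';
    return $response['status'] !== 200 && stripos($server, 'cloudflare') !== false;
}
```

Only the last header block is inspected, so a redirect served by another server in front of the final response does not influence the result.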
LogMANOriginal 4facbf32e3
[InstructableBridge] Add new bridge (#724)
This commit adds a new bridge for http://www.instructables.com. This bridge
currently supports fetching content by category (all 200+ categories are available),
using available filters (featured, recent, popular, views, contest winners).
2018-07-21 15:25:13 +02:00
logmanoriginal 6bd76af326 [YoutubeBridge] Add duration limits for all modes
Adds duration limits (minimum duration, maximum duration) for all
modes (user/id/playlist/search). Duration limits are optional, so
existing subscriptions don't break.

The limits are specified by two separate parameters, each of which
is optional:

- `&duration_min=` (minimum duration in minutes, default: -1)
- `&duration_max=` (maximum duration in minutes, default: INF)

If duration limits are specified in either user, id or playlist mode,
the bridge defaults to fetching data from HTML instead of XML feeds,
which requires more bandwidth and takes longer, because each video is
loaded individually!

References #670
2018-07-21 14:33:07 +02:00
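
As an illustration of the duration limits from commit 6bd76af326 above, a filter of this kind might look like the sketch below. The function name `isWithinDurationLimits` is hypothetical, and two details are assumptions rather than facts from the commit: that the video duration is known in seconds, and that both limits are inclusive. The defaults (-1 and INF, in minutes) are taken from the commit message.

```php
<?php
// Hypothetical helper (not the bridge's actual code): decides whether a video
// passes the optional duration limits given via duration_min / duration_max.
function isWithinDurationLimits($durationSeconds, $durationMin = null, $durationMax = null) {
    // Missing or non-numeric parameters fall back to the documented defaults
    $min = is_numeric($durationMin) ? (float)$durationMin : -1;
    $max = is_numeric($durationMax) ? (float)$durationMax : INF;

    // Limits are given in minutes; the duration is assumed to be in seconds
    $minutes = $durationSeconds / 60;

    // Assumption: both limits are inclusive
    return $minutes >= $min && $minutes <= $max;
}
```

Because both parameters default to values that accept every video, a feed URL without `duration_min` and `duration_max` behaves as before.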
logmanoriginal caa622ffec [search] Support searching by URI
Adds matching for URIs to the search bar, using the format
<scheme>://<host>/<path>

Searching by URI scheme is also supported:

"http://"  (returns all bridges with 'http'  scheme)
"https://" (returns all bridges with 'https' scheme)

The following examples are equivalent and will return both of the
Facebook bridges (FacebookBridge and FB2Bridge):

"https://www.facebook.com/facebook"
"https://www.facebook.com/facebook?..."
"https://www.facebook.com"
"http://www.facebook.com"
"https://facebook.com"
"http://facebook.com"
"facebook.com"
"facebook"

Notice: When the URI scheme is omitted, the search algorithm falls back
to regex matching. Searching for "www.facebook.com" doesn't work, as it
is missing the scheme and doesn't match via regex!

Omitting the 'www.', however, does work. This was a design decision,
because some bridges specify their URI with and others without 'www.'

A search term can still be specified in the browser URL using parameter
'q' => '?q=searchterm'.

References #743
2018-07-20 22:44:13 +02:00
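
The URI matching rules listed in commit caa622ffec above can be illustrated with the following sketch. It is written in PHP for consistency with the rest of the project and is not the search code from the commit; the function names `normalizeHost` and `searchTermMatchesBridgeUri` are hypothetical. It covers only search terms that include a scheme: the regex fallback for terms without a scheme is not modelled here.

```php
<?php
// Hypothetical helper: returns the lower-cased host of a URI without a
// leading 'www.', or null if the URI has no host part (e.g. no scheme).
function normalizeHost($uri) {
    $host = parse_url($uri, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    $host = strtolower($host);
    return strpos($host, 'www.') === 0 ? substr($host, 4) : $host;
}

// Hypothetical helper: checks whether a search term with a scheme matches
// the URI of a bridge.
function searchTermMatchesBridgeUri($searchTerm, $bridgeUri) {
    // Scheme-only search such as "https://" compares the schemes
    if (preg_match('#^([a-z][a-z0-9+.\-]*)://$#i', $searchTerm, $matches)) {
        $scheme = (string)parse_url($bridgeUri, PHP_URL_SCHEME);
        return strcasecmp($scheme, $matches[1]) === 0;
    }

    // Otherwise compare hosts; scheme, path, query and 'www.' are ignored,
    // in line with the equivalent examples in the commit message
    $searchHost = normalizeHost($searchTerm);
    $bridgeHost = normalizeHost($bridgeUri);
    return $searchHost !== null && $searchHost === $bridgeHost;
}
```

Stripping 'www.' on both sides reflects the design decision mentioned in the commit message, since bridges specify their URI both with and without it.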
teromene c4d489f018 Add URI to ElloBridge elements. 2018-07-19 17:07:54 +02:00
logmanoriginal 6a98293fb3 [Configuration] Bump version to 2018-07-17 2018-07-17 20:44:01 +02:00
logmanoriginal d79630e3b8 [Configuration] Remove check for allow_url_fopen
This commit follows the changes done in commits

fbf874cb29
ead7b2e8de
2018-07-17 20:39:14 +02:00
teromene 1f2fe25471 Fix LeBonCoinBridge, now uses getContents correctly 2018-07-17 10:50:30 +02:00
Antoine Cadoret 87fc9e9156 fix LeBonCoin bridge (#747) 2018-07-16 20:13:08 +02:00
Nemo c7b0c9fd31 Amazon Price Tracker Bridge (#741)
* [amazonprice] Adds AmazonPriceTracker bridge
2018-07-16 14:54:52 +02:00
Teromene fbf874cb29
Update README.md
Remove allow_url_fopen requirement. This should no longer be necessary. Added requirement for curl.
2018-07-16 12:37:09 +02:00
Eugene Molotov 049ee52fb5 Implemented feed item categories (#746) 2018-07-16 12:32:24 +02:00
TheRadialActive 3f41d0593a Added RSS bridge for zenodo.org (#749)
* added RSS bridge for zenodo.org
2018-07-16 12:02:41 +02:00
sysadminstory 7126f5e838 [DealabsBridge] First version of the generic "Pepper" Bridge (#726)
* [DealabsBridge] First version of the generic "Pepper" Bridge
2018-07-13 00:35:13 +01:00
Nemo ead7b2e8de [fb2] Switches to getContents (#742) 2018-07-10 02:29:47 +01:00
LogMANOriginal 0d80a19e84
[FacebookBridge] Add context for public Facebook groups (#739)
The previous context is now labeled 'User', while the new context is
labeled 'Group'. The existing code was not changed; instead, new group*
functions were implemented to handle groups.

The general principle of capturing groups is the same as done for users
with adjustments to account for different HTML structures.

Captcha responses are currently not supported for groups! There doesn't
seem to be a way to trigger them consistently, which makes it hard to
handle them properly.

Features of the group context:

- The feed title is based on the group name
- The group URI used for capturing is returned for the feed URI
- Author names and timestamps are reproduced from the source
- Post titles are reproduced from the source if they exist, otherwise
the title is built manually from the author name and the content
- Original contents are included with the feed
- All images are attached as enclosures as well

Closes #
2018-07-08 17:16:00 +02:00
logmanoriginal 42c699f474 formats: Fix favicon not found if url contains path 2018-06-30 10:27:05 +02:00
logmanoriginal 2bc8daa101 [JustETFBridge] Add new bridge
Supports latest news and profiling a given ETF in English, German
or Italian. Cover images are attached as enclosures and not
as part of the content.

News:

Optionally loads the full article for each news item. Some articles
may include scripts to provide interactive graphs. These scripts are
removed as they would be rendered as pure text and a message is shown
instead: "[Content removed! Visit site to see full contents!]"

Profile:

Optionally includes the ETF strategy and description.
2018-06-30 10:27:05 +02:00
logmanoriginal bca79d3f88 [KununuBridge] Fix broken page layout and sort reviews 2018-06-30 10:27:05 +02:00
logmanoriginal 90dc968fd1 Fix PHPCS error 2018-06-30 10:26:48 +02:00
Teromene da6b98851c Add retrieval of the current version from git if available (#731)
* Add retrieval of the current version from git if available
* Include version when auto-reporting an error
2018-06-30 10:24:22 +02:00
teromene 71c29d4192 Fix phpcs for master. 2018-06-29 23:15:22 +01:00
LogMANOriginal 193ca87afa [phpcs] enforce single quotes (#732)
* [phpcs] Add rule to enforce single quoted strings
2018-06-29 22:55:33 +01:00
Nemo 5ea79ac1fc Add markdown support to Container Linux Feed (#730) 2018-06-28 20:54:42 +02:00
Teromene 937ea49271 Add basic authentication support (#728)
* Move configuration in its own class in order to reduce the verbosity of index.php
* Add authentication mechanism using HTTP auth
* Add a method to get the config parameters
* Remove the installation checks from the index page
* Log all failed authentication attempts
2018-06-27 19:09:41 +02:00
logmanoriginal 95686b803c [IsoHuntBridge] Remove bridge
isoHunt has discontinued services due to legal reasons and is now
accessible via https://isohunts.to

While it is certainly possible to rewrite the bridge to fetch some
information from the new site, it wouldn't be able to provide as
much functionality as before. This is due to isoHunt having removed
all searching and filtering options, only providing static HTML pages
for general categories (anime, movies, etc...). Those pages, however,
are heavily broken.

Unless someone is interested in monitoring the general categories,
upgrading the bridge to the new site is not worth the effort.

Users of isoHunt are asked to make use of their client application,
as they don't provide online services anymore (it's now in the darknet)

Here is the statement from isoHunt:

"Due to hard regulations and security issues for bittorrent users, we
have moved into a more secure and even faster district of the internet!

[...]

Torrent Downloads have a high risk of getting legal problems. That is
why we do not offer torrentfiles any more. [...]"

-- source: https://isohunts.to
2018-06-24 18:33:50 +02:00
logmanoriginal 5087f5f79e [FacebookBridge] Support facebook links as user name
Allows users to paste Facebook links as the user name. The link must contain
the correct host (www.facebook.com) and a valid path (/user-name/...).
The first part of the path is used for the user name. Errors are returned
in case something went wrong.

References #706
2018-06-24 11:14:08 +02:00
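
The rule from commit 5087f5f79e above (correct host, first path segment used as the user name) can be sketched as follows. The function name `extractFacebookUserName` is hypothetical and the error handling via `InvalidArgumentException` is an assumption; the commit only states that errors are returned when something goes wrong.

```php
<?php
// Hypothetical helper (not the bridge's actual code): accepts either a plain
// user name or a Facebook link and returns the user name.
function extractFacebookUserName($input) {
    $input = trim($input);
    $parts = parse_url($input);
    if ($parts === false) {
        throw new InvalidArgumentException('Malformed input: ' . $input);
    }

    // No scheme means the input is treated as a plain user name
    if (!isset($parts['scheme'])) {
        return $input;
    }

    // The link must point to the correct host
    if (!isset($parts['host']) || strtolower($parts['host']) !== 'www.facebook.com') {
        throw new InvalidArgumentException('Host must be www.facebook.com');
    }

    // The first non-empty part of the path is the user name
    $path = isset($parts['path']) ? $parts['path'] : '';
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    if (count($segments) === 0) {
        throw new InvalidArgumentException('The link does not contain a user name');
    }
    return $segments[0];
}
```

For instance, `extractFacebookUserName('https://www.facebook.com/facebook/posts/123')` yields `facebook`, while a link to a different host raises an error.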
logmanoriginal 4a5f190e0e [FacebookBridge] Add option to skip reviews
Reviews are provided the same way as summary posts and are therefore
returned as a separate feed item for each review. This commit adds a new
option '&skip_reviews=on' to skip reviews entirely.

References #706
2018-06-24 10:52:22 +02:00
logmanoriginal 01a2746715 [YoutubeBridge] Fix sniff violation
This is a fix for a sniff violation not detected by newer versions
of phpcs (not sure why though, it's detected in version 2.7.1).
2018-06-23 21:28:30 +02:00
Nemo f4a60c1777 Add dockerfile to create an official docker image (#720) 2018-06-23 16:51:48 +02:00
sysadminstory 1b08bce779 [DealabsBridge] Follow site changes (#721)
- Changed some CSS class to follow the website changes (again)
2018-06-21 13:14:59 +01:00
Nemo 9fa74a36c6 Adds Container Linux releases RSS Feed (#718)
* Adds Container Linux releases RSS Feed
2018-06-19 19:39:08 +01:00
Corentin Garcia 7493e2b5b8 [GrandComicsDatabaseBridge] Add bridge (#717)
closes #709
2018-06-15 21:09:09 +02:00
Corentin Garcia 8e468a9ca7 [SuperSmashBlogBridge] Added bridge (#716) 2018-06-15 21:05:31 +02:00
Joe Digilio 50924b9213 Abort on parse error of config.default.ini.php (#714)
If there is an error parsing the default config file, then abort.
2018-06-15 21:02:06 +02:00
124 changed files with 11268 additions and 3360 deletions

8
.dockerignore Normal file

@@ -0,0 +1,8 @@
.git
cache/*
DEBUG
Dockerfile
whitelist.txt
phpcs.xml
CHANGELOG.md
CONTRIBUTING.md


@@ -3,12 +3,26 @@ sudo: false
language: php
install:
- pear channel-update pear.php.net
- pear install PHP_CodeSniffer
- if [[ $TRAVIS_PHP_VERSION == "hhvm" ]]; then
composer global require squizlabs/PHP_CodeSniffer;
else
pear channel-update pear.php.net;
pear install PHP_CodeSniffer;
fi
- if [[ $TRAVIS_PHP_VERSION == "7.0" ]]; then
composer global require phpunit/phpunit ^6;
fi
script:
- phpenv rehash
- phpcs . --standard=phpcs.xml --warning-severity=0 --extensions=php -p
- if [[ $TRAVIS_PHP_VERSION == "hhvm" ]]; then
/home/travis/.composer/vendor/bin/phpcs . --standard=phpcs.xml --warning-severity=0 --extensions=php -p;
else
phpcs . --standard=phpcs.xml --warning-severity=0 --extensions=php -p;
fi
- if [[ $TRAVIS_PHP_VERSION == "7.0" ]]; then
phpunit --configuration=phpunit.xml --include-path=lib/;
fi
matrix:
fast_finish: true

5
Dockerfile Normal file

@@ -0,0 +1,5 @@
FROM ulsmith/alpine-apache-php7
COPY ./ /app/public/
RUN chown -R apache:root /app/public

224
README.md

@@ -1,10 +1,10 @@
rss-bridge
===
[![LICENSE](https://img.shields.io/badge/license-UNLICENSE-blue.svg)](UNLICENSE) [![Build Status](https://travis-ci.org/RSS-Bridge/rss-bridge.svg?branch=master)](https://travis-ci.org/RSS-Bridge/rss-bridge)
[![LICENSE](https://img.shields.io/badge/license-UNLICENSE-blue.svg)](UNLICENSE) [![GitHub release](https://img.shields.io/github/release/rss-bridge/rss-bridge.svg)](https://github.com/rss-bridge/rss-bridge/releases/latest) [![Debian Release](https://img.shields.io/badge/dynamic/json.svg?label=debian%20release&url=https%3A%2F%2Fsources.debian.org%2Fapi%2Fsrc%2Frss-bridge%2F&query=%24.versions%5B0%5D.version&colorB=blue)](https://tracker.debian.org/pkg/rss-bridge) [![Guix Release](https://img.shields.io/badge/guix%20release-unknown-light--gray.svg)](https://www.gnu.org/software/guix/packages/R/) [![Build Status](https://travis-ci.org/RSS-Bridge/rss-bridge.svg?branch=master)](https://travis-ci.org/RSS-Bridge/rss-bridge) [![Docker Build Status](https://img.shields.io/docker/build/rssbridge/rss-bridge.svg)](https://hub.docker.com/r/rssbridge/rss-bridge/)
rss-bridge is a PHP project capable of generating ATOM feeds for websites which don't have one.
RSS-Bridge is a PHP project capable of generating RSS and Atom feeds for websites which don't have one. It can be used on web servers or as a stand-alone application in CLI mode.
Supported sites/pages (main)
Supported sites/pages (examples)
===
* `Bandcamp` : Returns last release from [bandcamp](https://bandcamp.com/) for a tag
@@ -25,106 +25,188 @@ Supported sites/pages (main)
* `Wikipedia`: highlighted articles from [Wikipedia](https://wikipedia.org/) in English, German, French or Esperanto
* `YouTube` : YouTube user channel, playlist or search
Plus [many other bridges](bridges/) to enable, thanks to the community
And [many more](bridges/), thanks to the community!
Output format
===
Output format can take several forms:
* `Atom` : ATOM Feed, for use in RSS/Feed readers
* `Html` : Simple html page.
* `Json` : Json, for consumption by other applications.
* `Mrss` : MRSS Feed, for use in RSS/Feed readers
* `Plaintext` : raw text (php object, as returned by print_r)
RSS-Bridge is capable of producing several output formats:
* `Atom` : Atom feed, for use in feed readers
* `Html` : Simple HTML page
* `Json` : JSON, for consumption by other applications
* `Mrss` : MRSS feed, for use in feed readers
* `Plaintext` : Raw text, for consumption by other applications
You can extend RSS-Bridge with your own format, using the [Format API](https://github.com/RSS-Bridge/rss-bridge/wiki/Format-API)!
Screenshot
===
Welcome screen:
![Screenshot](https://github.com/RSS-Bridge/rss-bridge/wiki/images/screenshot_rss-bridge_welcome.png)
RSS-Bridge hashtag (#rss-bridge) search on Twitter, in ATOM format (as displayed by Firefox):
***
RSS-Bridge hashtag (#rss-bridge) search on Twitter, in Atom format (as displayed by Firefox):
![Screenshot](https://github.com/RSS-Bridge/rss-bridge/wiki/images/screenshot_twitterbridge_atom.png)
Requirements
===
* PHP 5.6, e.g. `AddHandler application/x-httpd-php56 .php` in `.htaccess`
* `openssl` extension enabled in PHP config (`php.ini`)
* `allow_url_fopen=1` in `php.ini`
RSS-Bridge requires PHP 5.6 or higher with the following extensions enabled:
Enabling/Disabling bridges
- [`openssl`](https://secure.php.net/manual/en/book.openssl.php)
- [`libxml`](https://secure.php.net/manual/en/book.libxml.php)
- [`mbstring`](https://secure.php.net/manual/en/book.mbstring.php)
- [`simplexml`](https://secure.php.net/manual/en/book.simplexml.php)
- [`curl`](https://secure.php.net/manual/en/book.curl.php)
- [`json`](https://secure.php.net/manual/en/book.json.php)
Find more information on our [Wiki](https://github.com/rss-bridge/rss-bridge/wiki)
Enable / Disable bridges
===
By default, the script creates `whitelist.txt` and adds the main bridges (see above). `whitelist.txt` is ignored by git, you can edit it:
* to enable extra bridges (one bridge per line)
* to disable main bridges (remove the line)
* to enable all bridges (just one wildcard `*` as file content)
RSS-Bridge allows you to take full control over which bridges are displayed to the user. That way you can host your own RSS-Bridge service with your favorite collection of bridges!
New bridges are disabled by default, so make sure to check regularly what's new and whitelist what you want!
Find more information on the [Wiki](https://github.com/RSS-Bridge/rss-bridge/wiki/Whitelisting)
**Notice**: By default RSS-Bridge will only show a small subset of bridges. Make sure to read up on [whitelisting](https://github.com/RSS-Bridge/rss-bridge/wiki/Whitelisting) to unlock the full potential of RSS-Bridge!
Deploy
===
Thanks to the community, hosting your own instance of RSS-Bridge is as easy as clicking a button!
[![Deploy on Scalingo](https://cdn.scalingo.com/deploy/button.svg)](https://my.scalingo.com/deploy?source=https://github.com/sebsauvage/rss-bridge)
[![Deploy to Docker Cloud](https://files.cloud.docker.com/images/deploy-to-dockercloud.svg)](https://cloud.docker.com/stack/deploy/?repo=https://github.com/rss-bridge/rss-bridge)
Getting involved
===
There are many ways for you to get involved with RSS-Bridge. Here are a few things:
- Share RSS-Bridge with your friends (Twitter, Facebook, ..._you name it_...)
- Report broken bridges or bugs by opening [Issues](https://github.com/RSS-Bridge/rss-bridge/issues) on GitHub
- Request new features or suggest ideas (via [Issues](https://github.com/RSS-Bridge/rss-bridge/issues))
- Discuss bugs, features, ideas or [issues](https://github.com/RSS-Bridge/rss-bridge/issues)
- Add new bridges or improve the API
- Improve the [Wiki](https://github.com/RSS-Bridge/rss-bridge/wiki)
- Host an instance of RSS-Bridge for your personal use or make it available to the community :sparkling_heart:
Authors
===
We are RSS Bridge Community, a group of developers continuing the project initiated by sebsauvage, webmaster of [sebsauvage.net](http://sebsauvage.net), author of [Shaarli](http://sebsauvage.net/wiki/doku.php?id=php:shaarli) and [ZeroBin](http://sebsauvage.net/wiki/doku.php?id=php:zerobin).
Patch/contributors :
We are the RSS-Bridge community, a group of developers continuing the project initiated by sebsauvage, webmaster of [sebsauvage.net](http://sebsauvage.net), author of [Shaarli](http://sebsauvage.net/wiki/doku.php?id=php:shaarli) and [ZeroBin](http://sebsauvage.net/wiki/doku.php?id=php:zerobin).
* Yves ASTIER ([Draeli](https://github.com/Draeli)) : PHP optimizations, fixes, dynamic brigde/format list with all stuff behind and extend cache system. Mail : contact /at\ yves-astier.com
* [Mitsukarenai](https://github.com/Mitsukarenai) : Initial inspiration, collaborator
* [ArthurHoaro](https://github.com/ArthurHoaro)
* [BoboTiG](https://github.com/BoboTiG)
* [Astalaseven](https://github.com/Astalaseven)
* [qwertygc](https://github.com/qwertygc)
* [Djuuu](https://github.com/Djuuu)
* [Anadrark](https://github.com/Anadrark])
* [Grummfy](https://github.com/Grummfy)
* [Polopollo](https://github.com/Polopollo)
* [16mhz](https://github.com/16mhz)
* [kranack](https://github.com/kranack)
* [logmanoriginal](https://github.com/logmanoriginal)
* [polo2ro](https://github.com/polo2ro)
* [Riduidel](https://github.com/Riduidel)
* [superbaillot.net](http://superbaillot.net/)
* [vinzv](https://github.com/vinzv)
* [teromene](https://github.com/teromene)
* [nel50n](https://github.com/nel50n)
* [nyutag](https://github.com/nyutag)
* [ORelio](https://github.com/ORelio)
* [Pitchoule](https://github.com/Pitchoule)
* [pit-fgfjiudghdf](https://github.com/pit-fgfjiudghdf)
* [aledeg](https://github.com/aledeg)
* [alexAubin](https://github.com/alexAubin)
* [cnlpete](https://github.com/cnlpete)
* [corenting](https://github.com/corenting)
* [Daiyousei](https://github.com/Daiyousei)
* [erwang](https://github.com/erwang)
* [gsurrel](https://github.com/gsurrel)
* [kraoc](https://github.com/kraoc)
* [lagaisse](https://github.com/lagaisse)
* [az5he6ch](https://github.com/az5he6ch)
* [niawag](https://github.com/niawag)
* [JeremyRand](https://github.com/JeremyRand)
* [mro](https://github.com/mro)
**Contributors** (sorted alphabetically):
<!--
Use this script to generate the list automatically (using the GitHub API):
https://gist.github.com/LogMANOriginal/da00cd1e5f0ca31cef8e193509b17fd8
-->
* [16mhz](https://api.github.com/users/16mhz)
* [Ahiles3005](https://api.github.com/users/Ahiles3005)
* [Albirew](https://api.github.com/users/Albirew)
* [AmauryCarrade](https://api.github.com/users/AmauryCarrade)
* [ArthurHoaro](https://api.github.com/users/ArthurHoaro)
* [Astalaseven](https://api.github.com/users/Astalaseven)
* [Astyan-42](https://api.github.com/users/Astyan-42)
* [Daiyousei](https://api.github.com/users/Daiyousei)
* [Djuuu](https://api.github.com/users/Djuuu)
* [Draeli](https://api.github.com/users/Draeli)
* [EtienneM](https://api.github.com/users/EtienneM)
* [Frenzie](https://api.github.com/users/Frenzie)
* [Ginko-Aloe](https://api.github.com/users/Ginko-Aloe)
* [Glandos](https://api.github.com/users/Glandos)
* [GregThib](https://api.github.com/users/GregThib)
* [Grummfy](https://api.github.com/users/Grummfy)
* [JackNUMBER](https://api.github.com/users/JackNUMBER)
* [JeremyRand](https://api.github.com/users/JeremyRand)
* [Jocker666z](https://api.github.com/users/Jocker666z)
* [LogMANOriginal](https://api.github.com/users/LogMANOriginal)
* [MonsieurPoutounours](https://api.github.com/users/MonsieurPoutounours)
* [ORelio](https://api.github.com/users/ORelio)
* [PaulVayssiere](https://api.github.com/users/PaulVayssiere)
* [Piranhaplant](https://api.github.com/users/Piranhaplant)
* [Riduidel](https://api.github.com/users/Riduidel)
* [Strubbl](https://api.github.com/users/Strubbl)
* [TheRadialActive](https://api.github.com/users/TheRadialActive)
* [TwizzyDizzy](https://api.github.com/users/TwizzyDizzy)
* [WalterBarrett](https://api.github.com/users/WalterBarrett)
* [ZeNairolf](https://api.github.com/users/ZeNairolf)
* [adamchainz](https://api.github.com/users/adamchainz)
* [aledeg](https://api.github.com/users/aledeg)
* [alexAubin](https://api.github.com/users/alexAubin)
* [az5he6ch](https://api.github.com/users/az5he6ch)
* [b1nj](https://api.github.com/users/b1nj)
* [benasse](https://api.github.com/users/benasse)
* [captn3m0](https://api.github.com/users/captn3m0)
* [chemel](https://api.github.com/users/chemel)
* [ckiw](https://api.github.com/users/ckiw)
* [cnlpete](https://api.github.com/users/cnlpete)
* [corenting](https://api.github.com/users/corenting)
* [da2x](https://api.github.com/users/da2x)
* [eMerzh](https://api.github.com/users/eMerzh)
* [em92](https://api.github.com/users/em92)
* [griffaurel](https://api.github.com/users/griffaurel)
* [hunhejj](https://api.github.com/users/hunhejj)
* [j0k3r](https://api.github.com/users/j0k3r)
* [jdigilio](https://api.github.com/users/jdigilio)
* [kranack](https://api.github.com/users/kranack)
* [kraoc](https://api.github.com/users/kraoc)
* [laBecasse](https://api.github.com/users/laBecasse)
* [lagaisse](https://api.github.com/users/lagaisse)
* [lalannev](https://api.github.com/users/lalannev)
* [ldidry](https://api.github.com/users/ldidry)
* [m0zes](https://api.github.com/users/m0zes)
* [matthewseal](https://api.github.com/users/matthewseal)
* [mcbyte-it](https://api.github.com/users/mcbyte-it)
* [mdemoss](https://api.github.com/users/mdemoss)
* [melangue](https://api.github.com/users/melangue)
* [metaMMA](https://api.github.com/users/metaMMA)
* [mickael-bertrand](https://api.github.com/users/mickael-bertrand)
* [mitsukarenai](https://api.github.com/users/mitsukarenai)
* [mro](https://api.github.com/users/mro)
* [mxmehl](https://api.github.com/users/mxmehl)
* [nel50n](https://api.github.com/users/nel50n)
* [niawag](https://api.github.com/users/niawag)
* [pellaeon](https://api.github.com/users/pellaeon)
* [pit-fgfjiudghdf](https://api.github.com/users/pit-fgfjiudghdf)
* [pitchoule](https://api.github.com/users/pitchoule)
* [pmaziere](https://api.github.com/users/pmaziere)
* [prysme01](https://api.github.com/users/prysme01)
* [quentinus95](https://api.github.com/users/quentinus95)
* [qwertygc](https://api.github.com/users/qwertygc)
* [regisenguehard](https://api.github.com/users/regisenguehard)
* [rogerdc](https://api.github.com/users/rogerdc)
* [sebsauvage](https://api.github.com/users/sebsauvage)
* [sublimz](https://api.github.com/users/sublimz)
* [sysadminstory](https://api.github.com/users/sysadminstory)
* [tameroski](https://api.github.com/users/tameroski)
* [teromene](https://api.github.com/users/teromene)
* [triatic](https://api.github.com/users/triatic)
* [wtuuju](https://api.github.com/users/wtuuju)
Licenses
===
Code is [Public Domain](UNLICENSE).
Including `PHP Simple HTML DOM Parser` under the [MIT License](http://opensource.org/licenses/MIT)
The source code for RSS-Bridge is [Public Domain](UNLICENSE).
RSS-Bridge uses third party libraries with their own license:
* [`PHP Simple HTML DOM Parser`](http://simplehtmldom.sourceforge.net/) licensed under the [MIT License](http://opensource.org/licenses/MIT)
* [`php-urljoin`](https://github.com/fluffy-critter/php-urljoin) licensed under the [MIT License](http://opensource.org/licenses/MIT)
Technical notes
===
* There is a cache so that source services won't ban you even if you hammer the rss-bridge with requests. Each bridge can have a different duration for the cache. The `cache` subdirectory will be automatically created and cached objects older than 24 hours get purged.
* To implement a new Bridge, [follow the specifications](https://github.com/RSS-Bridge/rss-bridge/wiki/Bridge-API) and take a look at existing Bridges for examples.
* To enable debug mode (disabling cache and enabling error reporting), create an empty file named `DEBUG` in the root directory (next to `index.php`).
* For more information refer to the [Wiki](https://github.com/RSS-Bridge/rss-bridge/wiki)
* RSS-Bridge uses caching to prevent services from banning your server for repeatedly updating feeds. The specific cache duration can be different between bridges. Cached files are deleted automatically after 24 hours.
* You can implement your own bridge, [following these instructions](https://github.com/RSS-Bridge/rss-bridge/wiki/Bridge-API).
* You can enable debug mode to disable caching. Find more information on the [Wiki](https://github.com/RSS-Bridge/rss-bridge/wiki/Debug-mode)
Rant
===
@@ -133,10 +215,10 @@ Rant
Your catchword is "share", but you don't want us to share. You want to keep us within your walled gardens. That's why you've been removing RSS links from webpages, hiding them deep on your website, or removed feeds entirely, replacing it with crippled or demented proprietary API. **FUCK YOU.**
You're not social when you hamper sharing by removing feeds. You're happy to have customers creating content for your ecosystem, but you don't want this content out - a content you do not even own. Google Takeout is just a gimmick. We want our data to flow, we want RSS or ATOM feeds.
You're not social when you hamper sharing by removing feeds. You're happy to have customers creating content for your ecosystem, but you don't want this content out - a content you do not even own. Google Takeout is just a gimmick. We want our data to flow, we want RSS or Atom feeds.
We want to share with friends, using open protocols: RSS, ATOM, XMPP, whatever. Because no one wants to have *your* service with *your* applications using *your* API force-feeding them. Friends must be free to choose whatever software and service they want.
We want to share with friends, using open protocols: RSS, Atom, XMPP, whatever. Because no one wants to have *your* service with *your* applications using *your* API force-feeding them. Friends must be free to choose whatever software and service they want.
We are rebuilding bridges you have wilfully destroyed.
Get your shit together: Put RSS/ATOM back in.
Get your shit together: Put RSS/Atom back in.


@@ -0,0 +1,187 @@
<?php
class AmazonPriceTrackerBridge extends BridgeAbstract {
const MAINTAINER = 'captn3m0';
const NAME = 'Amazon Price Tracker';
const URI = 'https://www.amazon.com/';
const CACHE_TIMEOUT = 3600; // 1h
const DESCRIPTION = 'Tracks price for a single product on Amazon';
const PARAMETERS = array(
array(
'asin' => array(
'name' => 'ASIN',
'required' => true,
'exampleValue' => 'B071GB1VMQ',
// https://stackoverflow.com/a/12827734
'pattern' => 'B[\dA-Z]{9}|\d{9}(X|\d)',
),
'tld' => array(
'name' => 'Country',
'type' => 'list',
'required' => true,
'values' => array(
'Australia' => 'com.au',
'Brazil' => 'com.br',
'Canada' => 'ca',
'China' => 'cn',
'France' => 'fr',
'Germany' => 'de',
'India' => 'in',
'Italy' => 'it',
'Japan' => 'co.jp',
'Mexico' => 'com.mx',
'Netherlands' => 'nl',
'Spain' => 'es',
'United Kingdom' => 'co.uk',
'United States' => 'com',
),
'defaultValue' => 'com',
),
));
protected $title;
/**
* Generates the domain name for a given Amazon TLD
*/
private function getDomainName() {
return 'https://www.amazon.' . $this->getInput('tld');
}
/**
* Generates the URI for an Amazon product page
*/
public function getURI() {
if (!is_null($this->getInput('asin'))) {
return $this->getDomainName() . '/dp/' . $this->getInput('asin') . '/';
}
return parent::getURI();
}
/**
* Scrapes the product title from the html page
* returns the default title if scraping fails
*/
private function getTitle($html) {
$titleTag = $html->find('#productTitle', 0);
if (!$titleTag) {
return $this->getDefaultTitle();
} else {
return trim(html_entity_decode($titleTag->innertext, ENT_QUOTES));
}
}
/**
* Title used by the feed if none could be found
*/
private function getDefaultTitle() {
return 'Amazon.' . $this->getInput('tld') . ': ' . $this->getInput('asin');
}
/**
* Returns name for the feed
* Uses title (already scraped) if it has one
*/
public function getName() {
if (isset($this->title)) {
return $this->title;
} else {
return parent::getName();
}
}
private function parseDynamicImage($attribute) {
$json = json_decode(html_entity_decode($attribute), true);
if ($json and count($json) > 0) {
return array_keys($json)[0];
}
}
/**
* Returns a generated image tag for the product
*/
private function getImage($html) {
// Initialize to avoid an undefined variable if no image element is found
$image = null;
$imageSrc = $html->find('#main-image-container img', 0);
if ($imageSrc) {
$hiresImage = $imageSrc->getAttribute('data-old-hires');
$dynamicImageAttribute = $imageSrc->getAttribute('data-a-dynamic-image');
$image = $hiresImage ?: $this->parseDynamicImage($dynamicImageAttribute);
}
// Fall back to a placeholder if the page provides no usable product image
$image = $image ?: 'https://placekitten.com/200/300';
return <<<EOT
<img width="300" style="max-width:300px;max-height:300px" src="$image" alt="{$this->title}" />
EOT;
}
/**
* Return \simple_html_dom object
* for the entire html of the product page
*/
private function getHtml() {
$uri = $this->getURI();
return getSimpleHTMLDOM($uri) ?: returnServerError('Could not request Amazon.');
}
private function scrapePriceFromMetrics($html) {
$asinData = $html->find('#cerberus-data-metrics', 0);
// <div id="cerberus-data-metrics" style="display: none;"
// data-asin="B00WTHJ5SU" data-asin-price="14.99" data-asin-shipping="0"
// data-asin-currency-code="USD" data-substitute-count="-1" ... />
if ($asinData) {
return [
'price' => $asinData->getAttribute('data-asin-price'),
'currency' => $asinData->getAttribute('data-asin-currency-code'),
'shipping' => $asinData->getAttribute('data-asin-shipping')
];
}
return false;
}
private function scrapePriceGeneric($html) {
$priceDiv = $html->find('span.offer-price', 0) ?: $html->find('.a-color-price', 0);
// Neither price element exists on the page, so there is nothing to parse
if (!$priceDiv) {
return false;
}
preg_match('/^\s*([A-Z]{3}|£|\$)\s?([\d.,]+)\s*$/', $priceDiv->plaintext, $matches);
if (count($matches) === 3) {
return [
'price' => $matches[2],
'currency' => $matches[1],
'shipping' => '0'
];
}
return false;
}
/**
* Scrapes the Amazon product page and adds a single feed item
* containing the product title, image, price and shipping costs
*/
public function collectData() {
$html = $this->getHtml();
$this->title = $this->getTitle($html);
$imageTag = $this->getImage($html);
$data = $this->scrapePriceFromMetrics($html) ?: $this->scrapePriceGeneric($html);
// Abort with a server error if neither scraping strategy found a price
if (!$data) {
returnServerError('Could not find a price on the Amazon product page.');
}
$item = array(
'title' => $this->title,
'uri' => $this->getURI(),
'content' => "$imageTag<br/>Price: {$data['price']} {$data['currency']}",
);
if ($data['shipping'] !== '0') {
$item['content'] .= "<br/>Shipping: {$data['shipping']} {$data['currency']}";
}
$this->items[] = $item;
}
}

207
bridges/AnidexBridge.php Normal file

@@ -0,0 +1,207 @@
<?php
class AnidexBridge extends BridgeAbstract {
const MAINTAINER = 'ORelio';
const NAME = 'Anidex';
const URI = 'https://anidex.info/';
const DESCRIPTION = 'Returns the newest torrents, with optional search criteria.';
const PARAMETERS = array(
array(
'id' => array(
'name' => 'Category',
'type' => 'list',
'values' => array(
'All categories' => '0',
'Anime' => '1,2,3',
'Anime - Sub' => '1',
'Anime - Raw' => '2',
'Anime - Dub' => '3',
'Live Action' => '4,5',
'Live Action - Sub' => '4',
'Live Action - Raw' => '5',
'Light Novel' => '6',
'Manga' => '7,8',
'Manga - Translated' => '7',
'Manga - Raw' => '8',
'Music' => '9,10,11',
'Music - Lossy' => '9',
'Music - Lossless' => '10',
'Music - Video' => '11',
'Games' => '12',
'Applications' => '13',
'Pictures' => '14',
'Adult Video' => '15',
'Other' => '16'
)
),
'lang_id' => array(
'name' => 'Language',
'type' => 'list',
'values' => array(
'All languages' => '0',
'English' => '1',
'Japanese' => '2',
'Polish' => '3',
'Serbo-Croatian' => '4',
'Dutch' => '5',
'Italian' => '6',
'Russian' => '7',
'German' => '8',
'Hungarian' => '9',
'French' => '10',
'Finnish' => '11',
'Vietnamese' => '12',
'Greek' => '13',
'Bulgarian' => '14',
'Spanish (Spain)' => '15',
'Portuguese (Brazil)' => '16',
'Portuguese (Portugal)' => '17',
'Swedish' => '18',
'Arabic' => '19',
'Danish' => '20',
'Chinese (Simplified)' => '21',
'Bengali' => '22',
'Romanian' => '23',
'Czech' => '24',
'Mongolian' => '25',
'Turkish' => '26',
'Indonesian' => '27',
'Korean' => '28',
'Spanish (LATAM)' => '29',
'Persian' => '30',
'Malaysian' => '31'
)
),
'group_id' => array(
'name' => 'Group ID',
'type' => 'number'
),
'r' => array(
'name' => 'Hide Remakes',
'type' => 'checkbox'
),
'b' => array(
'name' => 'Only Batches',
'type' => 'checkbox'
),
'a' => array(
'name' => 'Only Authorized',
'type' => 'checkbox'
),
'q' => array(
'name' => 'Keyword',
'description' => 'Keyword(s)',
'type' => 'text'
),
'h' => array(
'name' => 'Adult content',
'type' => 'list',
'values' => array(
'No filter' => '0',
'Hide +18' => '1',
'Only +18' => '2'
)
)
)
);
public function collectData() {
// Build Search URL from user-provided parameters
$search_url = self::URI . '?s=upload_timestamp&o=desc';
foreach (array('id', 'lang_id', 'group_id') as $param_name) {
$param = $this->getInput($param_name);
if (!empty($param) && intval($param) != 0 && ctype_digit(str_replace(',', '', $param))) {
$search_url .= '&' . $param_name . '=' . $param;
}
}
foreach (array('r', 'b', 'a') as $param_name) {
$param = $this->getInput($param_name);
if (!empty($param) && boolval($param)) {
$search_url .= '&' . $param_name . '=1';
}
}
$query = $this->getInput('q');
if (!empty($query)) {
$search_url .= '&q=' . urlencode($query);
}
$opt = array();
$h = $this->getInput('h');
if (!empty($h) && intval($h) != 0 && ctype_digit($h)) {
$opt[CURLOPT_COOKIE] = 'anidex_h_toggle=' . $h;
}
// Retrieve torrent listing from search results, which does not contain torrent description
$html = getSimpleHTMLDOM($search_url, array(), $opt)
or returnServerError('Could not request Anidex: ' . $search_url);
$links = $html->find('a');
$results = array();
foreach ($links as $link)
if (strpos($link->href, '/torrent/') === 0 && !in_array($link->href, $results))
$results[] = $link->href;
if (empty($results) && empty($this->getInput('q')))
returnServerError('No results from Anidex: '.$search_url);
//Process each item individually
foreach ($results as $element) {
//Limit total amount of requests
if(count($this->items) >= 20) {
break;
}
$torrent_id = str_replace('/torrent/', '', $element);
//Ignore entries without valid torrent ID
if ($torrent_id != 0 && ctype_digit($torrent_id)) {
//Retrieve data for this torrent ID
$item_uri = self::URI . 'torrent/'.$torrent_id;
//Retrieve full description from torrent page
if ($item_html = getSimpleHTMLDOMCached($item_uri)) {
//Retrieve data from page contents
$item_title = str_replace(' (Torrent) - AniDex ', '', $item_html->find('title', 0)->plaintext);
$item_desc = $item_html->find('div.panel-body', 0);
$item_author = trim($item_html->find('span.fa-user', 0)->parent()->plaintext);
$item_date = strtotime(trim($item_html->find('span.fa-clock', 0)->parent()->plaintext));
$item_image = $this->getURI() . 'images/user_logos/default.png';
//Check for description-less torrent and optionally extract image
$desc_title_found = false;
foreach ($item_html->find('h3.panel-title') as $h3) {
if (strpos($h3, 'Description') !== false) {
$desc_title_found = true;
break;
}
}
if ($desc_title_found) {
//Retrieve image for thumbnail or generic logo fallback
foreach ($item_desc->find('img') as $img) {
if (strpos($img->src, 'prez') === false) {
$item_image = $img->src;
break;
}
}
$item_desc = trim($item_desc->innertext);
} else {
$item_desc = '<em>No description.</em>';
}
//Build and add final item
$item = array();
$item['uri'] = $item_uri;
$item['title'] = $item_title;
$item['author'] = $item_author;
$item['timestamp'] = $item_date;
$item['enclosures'] = array($item_image);
$item['content'] = $item_desc;
$this->items[] = $item;
}
}
$element = null;
}
$results = null;
}
}
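
The search URL construction in collectData() above can be sketched as a standalone function. The name buildAnidexSearchUrl and the plain $input array (standing in for the values returned by getInput()) are assumptions made for illustration; the validation rules are taken from the bridge code.

```php
<?php
// Hypothetical helper; $input stands in for the bridge's getInput() values.
function buildAnidexSearchUrl(array $input) {
	$url = 'https://anidex.info/?s=upload_timestamp&o=desc';

	// Numeric filters: accept only digits and commas, and skip "0" (no filter)
	foreach (array('id', 'lang_id', 'group_id') as $name) {
		$value = isset($input[$name]) ? (string)$input[$name] : '';
		if ($value !== '' && intval($value) != 0 && ctype_digit(str_replace(',', '', $value))) {
			$url .= '&' . $name . '=' . $value;
		}
	}

	// Checkboxes: any truthy value is sent as "1"
	foreach (array('r', 'b', 'a') as $name) {
		if (!empty($input[$name])) {
			$url .= '&' . $name . '=1';
		}
	}

	// Free-text keyword is the only value that needs URL encoding
	if (!empty($input['q'])) {
		$url .= '&q=' . urlencode($input['q']);
	}

	return $url;
}
```

Stripping the commas before calling ctype_digit() is what lets grouped categories such as '1,2,3' pass while anything containing other characters is silently dropped from the query.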


@ -5,7 +5,7 @@ class AnimeUltimeBridge extends BridgeAbstract {
const NAME = 'Anime-Ultime';
const URI = 'http://www.anime-ultime.net/';
const CACHE_TIMEOUT = 10800; // 3h
const DESCRIPTION = 'Returns the 10 newest releases posted on Anime-Ultime';
const DESCRIPTION = 'Returns the newest releases posted on Anime-Ultime.';
const PARAMETERS = array( array(
'type' => array(
'name' => 'Type',
@ -65,6 +65,13 @@ class AnimeUltimeBridge extends BridgeAbstract {
$item_link_element = $release->find('td', 0)->find('a', 0);
$item_uri = self::URI . $item_link_element->href;
$item_name = html_entity_decode($item_link_element->plaintext);
$item_image = self::URI . substr(
$item_link_element->onmouseover,
37,
strpos($item_link_element->onmouseover, ' ', 37) - 37
);
$item_episode = html_entity_decode(
str_pad(
$release->find('td', 1)->plaintext,
@ -79,8 +86,7 @@ class AnimeUltimeBridge extends BridgeAbstract {
if(!empty($item_uri)) {
// Retrieve description from description page and
// convert relative image src info absolute image src
// Retrieve description from description page
$html_item = getContents($item_uri)
or returnServerError('Could not request Anime-Ultime: ' . $item_uri);
$item_description = substr(
@ -91,10 +97,9 @@ class AnimeUltimeBridge extends BridgeAbstract {
0,
strpos($item_description, '<div id="table">')
);
$item_description = str_replace(
'src="images', 'src="' . self::URI . 'images',
$item_description
);
// Convert relative image src into absolute image src, remove line breaks
$item_description = defaultLinkTo($item_description, self::URI);
$item_description = str_replace("\r", '', $item_description);
$item_description = str_replace("\n", '', $item_description);
$item_description = utf8_encode($item_description);
@ -105,6 +110,7 @@ class AnimeUltimeBridge extends BridgeAbstract {
$item['title'] = $item_name . ' ' . $item_type . ' ' . $item_episode;
$item['author'] = $item_fansub;
$item['timestamp'] = $item_date;
$item['enclosures'] = array($item_image);
$item['content'] = $item_description;
$this->items[] = $item;
$processedOK++;


@ -28,6 +28,13 @@ class Arte7Bridge extends BridgeAbstract {
)
)
),
'Collection (Français)' => array(
'colfr' => array(
'name' => 'Collection id',
'required' => true,
'title' => 'ex. RC-014095 pour https://www.arte.tv/fr/videos/RC-014095/blow-up/'
)
),
'Catégorie (Allemand)' => array(
'catde' => array(
'type' => 'list',
@ -45,6 +52,13 @@ class Arte7Bridge extends BridgeAbstract {
'Sonstiges' => 'AUT'
)
)
),
'Collection (Allemand)' => array(
'colde' => array(
'name' => 'Collection id',
'required' => true,
'title' => 'ex. RC-014095 pour https://www.arte.tv/de/videos/RC-014095/blow-up/'
)
)
);
@ -54,15 +68,24 @@ class Arte7Bridge extends BridgeAbstract {
$category = $this->getInput('catfr');
$lang = 'fr';
break;
case 'Collection (Français)':
$lang = 'fr';
$collectionId = $this->getInput('colfr');
break;
case 'Catégorie (Allemand)':
$category = $this->getInput('catde');
$lang = 'de';
break;
case 'Collection (Allemand)':
$lang = 'de';
$collectionId = $this->getInput('colde');
break;
}
$url = 'https://api.arte.tv/api/opa/v3/videos?sort=-lastModified&limit=10&language='
. $lang
. ($category != null ? '&category.code=' . $category : '');
. ($category != null ? '&category.code=' . $category : '')
. ($collectionId != null ? '&collections.collectionId=' . $collectionId : '');
$header = array(
'Authorization: Bearer ' . self::API_TOKEN

62
bridges/AutoJMBridge.php Normal file

@ -0,0 +1,62 @@
<?php
class AutoJMBridge extends BridgeAbstract {
const NAME = 'AutoJM';
const URI = 'http://www.autojm.fr/';
const DESCRIPTION = 'Suivre les offres de véhicules proposés par AutoJM en fonction des critères de filtrages';
const MAINTAINER = 'sysadminstory';
const PARAMETERS = array(
'Afficher les offres de véhicules disponible en fonction des critères du site AutoJM' => array(
'url' => array(
'name' => 'URL de la recherche',
'type' => 'text',
'required' => true,
'title' => 'URL d\'une recherche avec filtre de véhicules sans le http://www.autojm.fr/',
'exampleValue' => 'gammes/index/398?order_by=finition_asc&energie[]=3&transmission[]=2&dispo=all'
)
)
);
const CACHE_TIMEOUT = 3600;
public function collectData() {
$html = getSimpleHTMLDOM(self::URI . $this->getInput('url'))
or returnServerError('Could not request AutoJM.');
$list = $html->find('div[class*=ligne_modele]');
foreach($list as $element) {
$image = $element->find('img[class=width-100]', 0)->src;
$serie = $element->find('div[class=serie]', 0)->find('span', 0)->plaintext;
$url = $element->find('div[class=serie]', 0)->find('a[class=btn_ligne color-black]', 0)->href;
if($element->find('div[class*=hasStock-info]', 0) != null) {
$dispo = 'Disponible';
} else {
$dispo = 'Sur commande';
}
$carburant = str_replace('dispo |', '', $element->find('div[class=carburant]', 0)->plaintext);
$transmission = $element->find('div[class*=bv]', 0)->plaintext;
$places = $element->find('div[class*=places]', 0)->plaintext;
$portes = $element->find('div[class*=nb_portes]', 0)->plaintext;
$carosserie = $element->find('div[class*=coloris]', 0)->plaintext;
$remise = $element->find('div[class*=remise]', 0)->plaintext;
$prix = $element->find('div[class*=prixjm]', 0)->plaintext;
$item = array();
$item['uri'] = $url;
$item['title'] = $serie;
$item['content'] = '<p><img style="vertical-align:middle ; padding: 10px" src="' . $image . '" />'. $serie . '</p>';
$item['content'] .= '<ul><li>Disponibilité : ' . $dispo . '</li>';
$item['content'] .= '<li>Carburant : ' . $carburant . '</li>';
$item['content'] .= '<li>Transmission : ' . $transmission . '</li>';
$item['content'] .= '<li>Nombre de places : ' . $places . '</li>';
$item['content'] .= '<li>Nombre de portes : ' . $portes . '</li>';
$item['content'] .= '<li>Série : ' . $serie . '</li>';
$item['content'] .= '<li>Carosserie : ' . $carosserie . '</li>';
$item['content'] .= '<li>Remise : ' . $remise . '</li>';
$item['content'] .= '<li>Prix : ' . $prix . '</li></ul>';
$this->items[] = $item;
}
}
}
?>


@ -1,31 +1,43 @@
<?php
class BlaguesDeMerdeBridge extends BridgeAbstract {
const MAINTAINER = 'superbaillot.net';
const MAINTAINER = 'superbaillot.net, logmanoriginal';
const NAME = 'Blagues De Merde';
const URI = 'http://www.blaguesdemerde.fr/';
const CACHE_TIMEOUT = 7200; // 2h
const DESCRIPTION = 'Blagues De Merde';
public function collectData(){
$html = getSimpleHTMLDOM(self::URI)
or returnServerError('Could not request BDM.');
foreach($html->find('article.joke_contener') as $element) {
$item = array();
$temp = $element->find('a');
foreach($html->find('div.blague') as $element) {
$item = array();
$item['uri'] = static::URI . '#' . $element->id;
$item['author'] = $element->find('div[class="blague-footer"] p strong', 0)->plaintext;
// Let the title be everything up to the first <br>
$item['title'] = trim(explode("\n", $element->find('div.text', 0)->plaintext)[0]);
$item['content'] = strip_tags($element->find('div.text', 0));
// timestamp is part of:
// <p>Par <strong>{author}</strong> le {date} dans <strong>{category}</strong></p>
preg_match(
'/.+le(.+)dans.*/',
$element->find('div[class="blague-footer"]', 0)->plaintext,
$matches
);
$item['timestamp'] = strtotime($matches[1]);
$this->items[] = $item;
if(isset($temp[2])) {
$item['content'] = trim($element->find('div.joke_text_contener', 0)->innertext);
$uri = $temp[2]->href;
$item['uri'] = $uri;
$item['title'] = substr($uri, (strrpos($uri, "/") + 1));
$date = $element->find('li.bdm_date', 0)->innertext;
$time = mktime(0, 0, 0, substr($date, 3, 2), substr($date, 0, 2), substr($date, 6, 4));
$item['timestamp'] = $time;
$item['author'] = $element->find('li.bdm_pseudo', 0)->innertext;
$this->items[] = $item;
}
}
}
}


@ -0,0 +1,83 @@
<?php
class BundesbankBridge extends BridgeAbstract {
const PARAM_LANG = 'lang';
const LANG_EN = 'en';
const LANG_DE = 'de';
const NAME = 'Bundesbank Bridge';
const URI = 'https://www.bundesbank.de/';
const DESCRIPTION = 'Returns the latest studies of the Bundesbank (Germany)';
const MAINTAINER = 'logmanoriginal';
const CACHE_TIMEOUT = 86400; // 24 hours
const PARAMETERS = array(
array(
self::PARAM_LANG => array(
'name' => 'Language',
'type' => 'list',
'required' => true,
'defaultValue' => self::LANG_DE,
'values' => array(
'English' => self::LANG_EN,
'Deutsch' => self::LANG_DE
)
)
)
);
public function getURI() {
switch($this->getInput(self::PARAM_LANG)) {
case self::LANG_EN: return self::URI . 'en/publications/reports/studies';
case self::LANG_DE: return self::URI . 'de/publikationen/berichte/studien';
}
return parent::getURI();
}
public function collectData() {
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('No response for ' . $this->getURI());
$html = defaultLinkTo($html, $this->getURI());
foreach($html->find('ul.resultlist li') as $study) {
$item = array();
$item['uri'] = $study->find('.teasable__link', 0)->href;
// Get title without child elements (i.e. subtitle)
$title = $study->find('.teasable__title div.h2', 0);
foreach($title->children as &$child) {
$child->outertext = '';
}
$item['title'] = $title->innertext;
// Add subtitle to the content if it exists
$item['content'] = '';
if($subtitle = $study->find('.teasable__subtitle', 0)) {
$item['content'] .= '<strong>' . $study->find('.teasable__subtitle', 0)->plaintext . '</strong>';
}
$item['content'] .= '<p>' . $study->find('.teasable__text', 0)->plaintext . '</p>';
$item['timestamp'] = strtotime($study->find('.teasable__date', 0)->plaintext);
// Downloads and older studies don't have images
if($study->find('.teasable__image', 0)) {
$item['enclosures'] = array(
$study->find('.teasable__image img', 0)->src
);
}
$this->items[] = $item;
}
}
}


@ -1,45 +0,0 @@
<?php
class CADBridge extends FeedExpander {
const MAINTAINER = 'nyutag';
const NAME = 'CAD Bridge';
const URI = 'http://www.cad-comic.com/';
const CACHE_TIMEOUT = 7200; //2h
const DESCRIPTION = 'Returns the newest articles.';
public function collectData(){
$this->collectExpandableDatas('http://cdn2.cad-comic.com/rss.xml', 10);
}
protected function parseItem($newsItem){
$item = parent::parseItem($newsItem);
$item['content'] = $this->extractCADContent($item['uri']);
return $item;
}
private function extractCADContent($url) {
$html3 = getSimpleHTMLDOMCached($url);
// The request might fail due to missing https support or wrong URL
if($html3 == false)
return 'Daily comic not released yet';
$htmlpart = explode("/", $url);
switch ($htmlpart[3]) {
case 'cad':
preg_match_all("/http:\/\/cdn2\.cad-comic\.com\/comics\/cad-\S*png/", $html3, $url2);
break;
case 'sillies':
preg_match_all("/http:\/\/cdn2\.cad-comic\.com\/comics\/sillies-\S*gif/", $html3, $url2);
break;
default:
return 'Daily comic not released yet';
}
$img = implode($url2[0]);
$html3->clear();
unset($html3);
if ($img == '')
return 'Daily comic not released yet';
return '<img src="' . $img . '"/>';
}
}


@ -3,91 +3,107 @@ class CNETBridge extends BridgeAbstract {
const MAINTAINER = 'ORelio';
const NAME = 'CNET News';
const URI = 'http://www.cnet.com/';
const CACHE_TIMEOUT = 1800; // 30min
const DESCRIPTION = 'Returns the newest articles. <br /> You may specify a
topic found in some section URLs, else all topics are selected.';
const PARAMETERS = array( array(
'topic' => array(
'name' => 'Topic name'
const URI = 'https://www.cnet.com/';
const CACHE_TIMEOUT = 3600; // 1h
const DESCRIPTION = 'Returns the newest articles.';
const PARAMETERS = array(
array(
'topic' => array(
'name' => 'Topic',
'type' => 'list',
'values' => array(
'All articles' => '',
'Apple' => 'apple',
'Google' => 'google',
'Microsoft' => 'tags-microsoft',
'Computers' => 'topics-computers',
'Mobile' => 'topics-mobile',
'Sci-Tech' => 'topics-sci-tech',
'Security' => 'topics-security',
'Internet' => 'topics-internet',
'Tech Industry' => 'topics-tech-industry'
)
)
)
));
);
public function collectData(){
private function cleanArticle($article_html) {
$offset_p = strpos($article_html, '<p>');
$offset_figure = strpos($article_html, '<figure');
$offset = ($offset_figure < $offset_p ? $offset_figure : $offset_p);
$article_html = substr($article_html, $offset);
$article_html = str_replace('href="/', 'href="' . self::URI, $article_html);
$article_html = str_replace(' height="0"', '', $article_html);
$article_html = str_replace('<noscript>', '', $article_html);
$article_html = str_replace('</noscript>', '', $article_html);
$article_html = stripWithDelimiters($article_html, '<a class="clickToEnlarge', '</a>');
$article_html = stripWithDelimiters($article_html, '<span class="nowPlaying', '</span>');
$article_html = stripWithDelimiters($article_html, '<span class="duration', '</span>');
$article_html = stripWithDelimiters($article_html, '<script', '</script>');
$article_html = stripWithDelimiters($article_html, '<svg', '</svg>');
return $article_html;
}
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
public function collectData() {
// Retrieve and check user input
$topic = str_replace('-', '/', $this->getInput('topic'));
if (!empty($topic) && (substr_count($topic, '/') > 1 || !ctype_alpha(str_replace('/', '', $topic))))
returnClientError('Invalid topic: ' . $topic);
// Retrieve webpage
$pageUrl = self::URI . (empty($topic) ? 'news/' : $topic.'/');
$html = getSimpleHTMLDOM($pageUrl)
or returnServerError('Could not request CNET: '.$pageUrl);
// Process articles
foreach($html->find('div.assetBody, div.riverPost') as $element) {
if(count($this->items) >= 10) {
break;
}
return false;
}
$article_title = trim($element->find('h2, h3', 0)->plaintext);
$article_uri = self::URI . substr($element->find('a', 0)->href, 1);
$article_thumbnail = $element->parent()->find('img[src]', 0)->src;
$article_timestamp = strtotime($element->find('time.assetTime, div.timeAgo', 0)->plaintext);
$article_author = trim($element->find('a[rel=author], a.name', 0)->plaintext);
$article_content = '<p><b>' . trim($element->find('p.dek', 0)->plaintext) . '</b></p>';
function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
if (is_null($article_thumbnail))
$article_thumbnail = extractFromDelimiters($element->innertext, '<img src="', '"');
return $string;
}
if (!empty($article_title) && !empty($article_uri) && strpos($article_uri, self::URI . 'news/') !== false) {
function cleanArticle($article_html){
$article_html = '<p>' . substr($article_html, strpos($article_html, '<p>') + 3);
$article_html = stripWithDelimiters($article_html, '<span class="credit">', '</span>');
$article_html = stripWithDelimiters($article_html, '<script', '</script>');
$article_html = stripWithDelimiters($article_html, '<div class="shortcode related-links', '</div>');
$article_html = stripWithDelimiters($article_html, '<a class="clickToEnlarge">', '</a>');
return $article_html;
}
$article_html = getSimpleHTMLDOMCached($article_uri) or $article_html = null;
$pageUrl = self::URI . (empty($this->getInput('topic')) ? '' : 'topics/' . $this->getInput('topic') . '/');
$html = getSimpleHTMLDOM($pageUrl) or returnServerError('Could not request CNET: ' . $pageUrl);
$limit = 0;
if (!is_null($article_html)) {
foreach($html->find('div.assetBody') as $element) {
if($limit < 8) {
$article_title = trim($element->find('h2', 0)->plaintext);
$article_uri = self::URI . ($element->find('a', 0)->href);
$article_timestamp = strtotime($element->find('time.assetTime', 0)->plaintext);
$article_author = trim($element->find('a[rel=author]', 0)->plaintext);
if (empty($article_thumbnail))
$article_thumbnail = $article_html->find('div.originalImage', 0);
if (empty($article_thumbnail))
$article_thumbnail = $article_html->find('span.imageContainer', 0);
if (is_object($article_thumbnail))
$article_thumbnail = $article_thumbnail->find('img', 0)->src;
if(!empty($article_title) && !empty($article_uri) && strpos($article_uri, '/news/') !== false) {
$article_html = getSimpleHTMLDOM($article_uri)
or returnServerError('Could not request CNET: ' . $article_uri);
$article_content = trim(
cleanArticle(
$article_content .= trim(
$this->cleanArticle(
extractFromDelimiters(
$article_html,
'<div class="articleContent',
'<footer>'
$article_html, '<article', '<footer'
)
)
);
$item = array();
$item['uri'] = $article_uri;
$item['title'] = $article_title;
$item['author'] = $article_author;
$item['timestamp'] = $article_timestamp;
$item['content'] = $article_content;
$this->items[] = $item;
$limit++;
}
$item = array();
$item['uri'] = $article_uri;
$item['title'] = $article_title;
$item['author'] = $article_author;
$item['timestamp'] = $article_timestamp;
$item['enclosures'] = array($article_thumbnail);
$item['content'] = $article_content;
$this->items[] = $item;
}
}
}
public function getName(){
if(!is_null($this->getInput('topic'))) {
$topic = $this->getInput('topic');
return 'CNET News Bridge' . (empty($topic) ? '' : ' - ' . $topic);
}
return parent::getName();
}
}
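
A minimal sketch of the topic check performed at the start of the new collectData(), isolated so its behaviour is easy to try. The function name normalizeCnetTopic is hypothetical, and returning false stands in for the bridge's call to returnClientError().

```php
<?php
// Hypothetical helper mirroring the input check in the new collectData().
function normalizeCnetTopic($raw) {
	// List values use "-" where the URL path uses "/"
	$topic = str_replace('-', '/', $raw);

	// Allow at most one "/" and only alphabetic characters otherwise
	if (!empty($topic)
	&& (substr_count($topic, '/') > 1 || !ctype_alpha(str_replace('/', '', $topic)))) {
		return false; // the bridge reports 'Invalid topic' here
	}

	return $topic;
}
```

As written, list values containing two hyphens (for example 'topics-sci-tech') turn into two slashes and are therefore rejected by this check, whereas single-hyphen values such as 'topics-mobile' become 'topics/mobile'.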


@ -0,0 +1,93 @@
<?php
class ContainerLinuxReleasesBridge extends BridgeAbstract {
const MAINTAINER = 'captn3m0';
const NAME = 'Core OS Container Linux Releases Bridge';
const URI = 'https://coreos.com/releases/';
const DESCRIPTION = 'Returns the releases notes for Container Linux';
const STABLE = 'stable';
const BETA = 'beta';
const ALPHA = 'alpha';
const PARAMETERS = [
[
'channel' => [
'name' => 'Release Channel',
'type' => 'list',
'required' => true,
'defaultValue' => self::STABLE,
'values' => [
'Stable' => self::STABLE,
'Beta' => self::BETA,
'Alpha' => self::ALPHA,
],
]
]
];
private function getReleaseFeed($jsonUrl) {
$json = getContents($jsonUrl)
or returnServerError('Could not request Core OS Website.');
return json_decode($json, true);
}
public function collectData() {
$data = $this->getReleaseFeed($this->getJsonUri());
foreach ($data as $releaseVersion => $release) {
$item = [];
$item['uri'] = "https://coreos.com/releases/#$releaseVersion";
$item['title'] = $releaseVersion;
$content = $release['release_notes'];
$content .= <<<EOT
Major Software:
* Kernel: {$release['major_software']['kernel'][0]}
* Docker: {$release['major_software']['docker'][0]}
* etcd: {$release['major_software']['etcd'][0]}
EOT;
$item['timestamp'] = strtotime($release['release_date']);
// Based on https://gist.github.com/jbroadway/2836900
// Links
$regex = '/\[([^\[]+)\]\(([^\)]+)\)/';
$replacement = '<a href=\'\2\'>\1</a>';
$item['content'] = preg_replace($regex, $replacement, $content);
// Headings
$regex = '/^(.*)\:\s?$/m';
$replacement = '<h3>\1</h3>';
$item['content'] = preg_replace($regex, $replacement, $item['content']);
// Lists
$regex = '/\n\s*[\*|\-](.*)/';
$item['content'] = preg_replace_callback ($regex, function($regs) {
$item = $regs[1];
return sprintf ('<ul><li>%s</li></ul>', trim ($item));
}, $item['content']);
$this->items[] = $item;
}
}
private function getJsonUri() {
$channel = $this->getInput('channel');
return "https://coreos.com/releases/releases-$channel.json";
}
public function getURI() {
return self::URI;
}
public function getName(){
if(!is_null($this->getInput('channel'))) {
return 'Container Linux Releases: ' . $this->getInput('channel') . ' Channel';
}
return parent::getName();
}
}
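
The three regular-expression substitutions in collectData() above amount to a very small Markdown-to-HTML conversion. The sketch below wraps them in one function for experimentation; the name releaseNotesToHtml is an assumption, while the patterns and replacements are copied from the bridge.

```php
<?php
// Hypothetical wrapper around the substitutions used in collectData().
function releaseNotesToHtml($content) {
	// Links: [text](url) becomes <a href='url'>text</a>
	$html = preg_replace('/\[([^\[]+)\]\(([^\)]+)\)/', '<a href=\'\2\'>\1</a>', $content);

	// Headings: a line that ends in ":" becomes <h3>
	$html = preg_replace('/^(.*)\:\s?$/m', '<h3>\1</h3>', $html);

	// Lists: every "* item" or "- item" line becomes its own one-item <ul>
	return preg_replace_callback('/\n\s*[\*|\-](.*)/', function ($regs) {
		return sprintf('<ul><li>%s</li></ul>', trim($regs[1]));
	}, $html);
}
```

Note that each bullet ends up in a separate `<ul>` element rather than one shared list, which feed readers generally render the same way but which is worth knowing when post-processing the output.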


@ -25,7 +25,7 @@ class CopieDoubleBridge extends BridgeAbstract {
} elseif(strpos($element->innertext, '/images/suivant.gif') === false) {
$a = $element->find('a', 0);
$item['uri'] = self::URI . $a->href;
$content = str_replace('src="/', 'src="/' . self::URI, $element->find("td", 0)->innertext);
$content = str_replace('src="/', 'src="/' . self::URI, $element->find('td', 0)->innertext);
$content = str_replace('href="/', 'href="' . self::URI, $content);
$item['content'] = $content;
$this->items[] = $item;


@ -11,7 +11,7 @@ class CourrierInternationalBridge extends BridgeAbstract {
$html = getSimpleHTMLDOM(self::URI)
or returnServerError('Error.');
$element = $html->find("article");
$element = $html->find('article');
$article_count = 1;
foreach($element as $article) {


@ -1,74 +0,0 @@
<?php
class CpasbienBridge extends BridgeAbstract {
const MAINTAINER = 'lagaisse';
const NAME = 'Cpasbien Bridge';
const URI = 'http://www.cpasbien.cm';
const CACHE_TIMEOUT = 86400; // 24h
const DESCRIPTION = 'Returns latest torrents from a request query';
const PARAMETERS = array( array(
'q' => array(
'name' => 'Search',
'required' => true,
'title' => 'Type your search'
)
));
public function collectData(){
$request = str_replace(" ", "-", trim($this->getInput('q')));
$html = getSimpleHTMLDOM(self::URI . '/recherche/' . urlencode($request) . '.html')
or returnServerError('No results for this query.');
foreach($html->find('#gauche', 0)->find('div') as $episode) {
if($episode->getAttribute('class') == 'ligne0'
|| $episode->getAttribute('class') == 'ligne1') {
$urlepisode = $episode->find('a', 0)->getAttribute('href');
$htmlepisode = getSimpleHTMLDOMCached($urlepisode, 86400 * 366 * 30);
$item = array();
$item['author'] = $episode->find('a', 0)->text();
$item['title'] = $episode->find('a', 0)->text();
$item['pubdate'] = $this->getCachedDate($urlepisode);
$textefiche = $htmlepisode->find('#textefiche', 0)->find('p', 1);
if(isset($textefiche)) {
$item['content'] = $textefiche->text();
} else {
$p = $htmlepisode->find('#textefiche', 0)->find('p');
if(!empty($p)) {
$item['content'] = $htmlepisode->find('#textefiche', 0)->find('p', 0)->text();
}
}
$item['id'] = $episode->find('a', 0)->getAttribute('href');
$item['uri'] = self::URI . $htmlepisode->find('#telecharger', 0)->getAttribute('href');
$this->items[] = $item;
}
}
}
public function getName(){
if(!is_null($this->getInput('q'))) {
return $this->getInput('q') . ' : ' . self::NAME;
}
return parent::getName();
}
private function getCachedDate($url){
debugMessage('getting pubdate from url ' . $url . '');
// Initialize cache
$cache = Cache::create('FileCache');
$cache->setPath(CACHE_DIR . '/pages');
$params = [$url];
$cache->setParameters($params);
// Get cachefile timestamp
$time = $cache->getTime();
return ($time !== false ? $time : time());
}
}


@ -1,7 +1,7 @@
<?php
class DanbooruBridge extends BridgeAbstract {
const MAINTAINER = 'mitsukarenai';
const MAINTAINER = 'mitsukarenai, logmanoriginal';
const NAME = 'Danbooru';
const URI = 'http://donmai.us/';
const CACHE_TIMEOUT = 1800; // 30min
@ -41,7 +41,7 @@ class DanbooruBridge extends BridgeAbstract {
$item = array();
$item['uri'] = $element->find('a', 0)->href;
$item['postid'] = (int)preg_replace("/[^0-9]/", '', $element->getAttribute(static::IDATTRIBUTE));
$item['postid'] = (int)preg_replace('/[^0-9]/', '', $element->getAttribute(static::IDATTRIBUTE));
$item['timestamp'] = time();
$thumbnailUri = $element->find('img', 0)->src;
$item['tags'] = $this->getTags($element);
@ -57,11 +57,80 @@ class DanbooruBridge extends BridgeAbstract {
}
public function collectData(){
$html = getSimpleHTMLDOM($this->getFullURI())
$content = getContents($this->getFullURI())
or returnServerError('Could not request ' . $this->getName());
$html = Fix_Simple_Html_Dom::str_get_html($content);
foreach($html->find(static::PATHTODATA) as $element) {
$this->items[] = $this->getItemFromElement($element);
}
}
}
/**
* This class is a monkey patch to 'extend' simplehtmldom to recognize <source>
* tags (HTML5) as self-closing tags. This patch should be removed once
* simplehtmldom is fixed. This seems to be an issue with more tags:
* https://sourceforge.net/p/simplehtmldom/bugs/83/
*
* The tag itself is valid according to Mozilla:
*
* The HTML <picture> element serves as a container for zero or more <source>
* elements and one <img> element to provide versions of an image for different
* display device scenarios. The browser will consider each of the child <source>
* elements and select one corresponding to the best match found; if no matches
* are found among the <source> elements, the file specified by the <img>
* element's src attribute is selected. The selected image is then presented in
* the space occupied by the <img> element.
*
* -- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/picture
*
* Notice: This class uses parts of the original simplehtmldom, adjusted to pass
* the guidelines of RSS-Bridge (formatting)
*/
final class Fix_Simple_Html_Dom extends simple_html_dom {
/* copy from simple_html_dom, added 'source' at the end */
protected $self_closing_tags = array(
'img' => 1,
'br' => 1,
'input' => 1,
'meta' => 1,
'link' => 1,
'hr' => 1,
'base' => 1,
'embed' => 1,
'spacer' => 1,
'source' => 1
);
/* copy from simplehtmldom, changed 'simple_html_dom' to 'Fix_Simple_Html_Dom' */
public static function str_get_html($str,
$lowercase = true,
$forceTagsClosed = true,
$target_charset = DEFAULT_TARGET_CHARSET,
$stripRN = true,
$defaultBRText = DEFAULT_BR_TEXT,
$defaultSpanText = DEFAULT_SPAN_TEXT)
{
$dom = new Fix_Simple_Html_Dom(null,
$lowercase,
$forceTagsClosed,
$target_charset,
$stripRN,
$defaultBRText,
$defaultSpanText);
if (empty($str) || strlen($str) > MAX_FILE_SIZE) {
$dom->clear();
return false;
}
$dom->load($str, $lowercase, $stripRN);
return $dom;
}
}


@ -3,7 +3,7 @@ class DauphineLibereBridge extends FeedExpander {
const MAINTAINER = 'qwertygc';
const NAME = 'Dauphine Bridge';
const URI = 'http://www.ledauphine.com/';
const URI = 'https://www.ledauphine.com/';
const CACHE_TIMEOUT = 7200; // 2h
const DESCRIPTION = 'Returns the newest articles.';
@ -49,8 +49,9 @@ class DauphineLibereBridge extends FeedExpander {
private function extractContent($url){
$html2 = getSimpleHTMLDOMCached($url);
$text = $html2->find('div.column', 0)->innertext;
$text = preg_replace('@<script[^>]*?>.*?</script>@si', '', $text);
return $text;
foreach ($html2->find('.noprint, link, script, iframe, .shareTool, .contentInfo') as $remove) {
$remove->outertext = '';
}
return $html2->find('div.content', 0)->innertext;
}
}

File diff suppressed because it is too large


@ -35,11 +35,11 @@ class DemoBridge extends BridgeAbstract {
public function collectData(){
$item = array();
$item['author'] = "Me!";
$item['title'] = "Test";
$item['content'] = "Awesome content !";
$item['id'] = "Lalala";
$item['uri'] = "http://example.com/test";
$item['author'] = 'Me!';
$item['title'] = 'Test';
$item['content'] = 'Awesome content !';
$item['id'] = 'Lalala';
$item['uri'] = 'http://example.com/test';
$this->items[] = $item;
}

240
bridges/DesoutterBridge.php Normal file

@ -0,0 +1,240 @@
<?php
class DesoutterBridge extends BridgeAbstract {
const CATEGORY_NEWS = 'News & Events';
const CATEGORY_INDUSTRY = 'Industry 4.0 News';
const NAME = 'Desoutter Bridge';
const URI = 'https://www.desouttertools.com';
const DESCRIPTION = 'Returns feeds for news from Desoutter';
const MAINTAINER = 'logmanoriginal';
const CACHE_TIMEOUT = 86400; // 24 hours
const PARAMETERS = array(
self::CATEGORY_NEWS => array(
'news_lang' => array(
'name' => 'Language',
'type' => 'list',
'required' => true,
'title' => 'Select your language',
'defaultValue' => 'Corporate',
'values' => array(
'Corporate'
=> 'https://www.desouttertools.com/about-desoutter/news-events',
'Česko'
=> 'https://www.desouttertools.cz/o-desoutter/aktuality-udalsoti',
'Deutschland'
=> 'https://www.desoutter.de/ueber-desoutter/news-events',
'España'
=> 'https://www.desouttertools.es/sobre-desoutter/noticias-eventos',
'México'
=> 'https://www.desouttertools.mx/acerca-desoutter/noticias-eventos',
'France'
=> 'https://www.desouttertools.fr/a-propos-de-desoutter/actualites-evenements',
'Magyarország'
=> 'https://www.desouttertools.hu/a-desoutter-vallalatrol/hirek-esemenyek',
'Italia'
=> 'https://www.desouttertools.it/su-desoutter/news-eventi',
'日本'
=> 'https://www.desouttertools.jp/desotanituite/niyusu-ibento',
'대한민국'
=> 'https://www.desouttertools.co.kr/desoteoe-daehaeseo/nyuseu-mic-ibenteu',
'Polska'
=> 'https://www.desouttertools.pl/o-desoutter/aktualnosci-wydarzenia',
'Brasil'
=> 'https://www.desouttertools.com.br/sobre-desoutter/noti%C2%ADcias-eventos',
'Portugal'
=> 'https://www.desouttertools.pt/sobre-desoutter/notIcias-eventos',
'România'
=> 'https://www.desouttertools.ro/despre-desoutter/noutati-evenimente',
'Российская Федерация'
=> 'https://www.desouttertools.com.ru/o-desoutter/novosti-mieropriiatiia',
'Slovensko'
=> 'https://www.desouttertools.sk/o-spolocnosti-desoutter/novinky-udalosti',
'Slovenija'
=> 'https://www.desouttertools.si/o-druzbi-desoutter/novice-dogodki',
'Sverige'
=> 'https://www.desouttertools.se/om-desoutter/nyheter-evenemang',
'Türkiye'
=> 'https://www.desoutter.com.tr/desoutter-hakkinda/haberler-etkinlikler',
'中国'
=> 'https://www.desouttertools.com.cn/guan-yu-ma-tou/xin-wen-he-huo-dong',
)
),
),
self::CATEGORY_INDUSTRY => array(
'industry_lang' => array(
'name' => 'Language',
'type' => 'list',
'required' => true,
'title' => 'Select your language',
'defaultValue' => 'Corporate',
'values' => array(
'Corporate'
=> 'https://www.desouttertools.com/industry-4-0/news',
'Česko'
=> 'https://www.desouttertools.cz/prumysl-4-0/novinky',
'Deutschland'
=> 'https://www.desoutter.de/industrie-4-0/news',
'España'
=> 'https://www.desouttertools.es/industria-4-0/noticias',
'México'
=> 'https://www.desouttertools.mx/industria-4-0/noticias',
'France'
=> 'https://www.desouttertools.fr/industrie-4-0/actualites',
'Magyarország'
=> 'https://www.desouttertools.hu/industry-4-0/hirek',
'Italia'
=> 'https://www.desouttertools.it/industry-4-0/news',
'日本'
=> 'https://www.desouttertools.jp/industry-4-0/news',
'대한민국'
=> 'https://www.desouttertools.co.kr/industry-4-0/news',
'Polska'
=> 'https://www.desouttertools.pl/przemysl-4-0/wiadomosci',
'Brasil'
=> 'https://www.desouttertools.com.br/industria-4-0/noticias',
'Portugal'
=> 'https://www.desouttertools.pt/industria-4-0/noticias',
'România'
=> 'https://www.desouttertools.ro/industry-4-0/noutati',
'Российская Федерация'
=> 'https://www.desouttertools.com.ru/industry-4-0/news',
'Slovensko'
=> 'https://www.desouttertools.sk/priemysel-4-0/novinky',
'Slovenija'
=> 'https://www.desouttertools.si/industrija-4-0/novice',
'Sverige'
=> 'https://www.desouttertools.se/industri-4-0/nyheter',
'Türkiye'
=> 'https://www.desoutter.com.tr/endustri-4-0/haberler',
'中国'
=> 'https://www.desouttertools.com.cn/industry-4-0/news',
)
),
),
'global' => array(
'full' => array(
'name' => 'Load full articles',
'type' => 'checkbox',
'required' => false,
'title' => 'Enable to load the full article for each item'
)
)
);
private $title;
public function getURI() {
switch($this->queriedContext) {
case self::CATEGORY_NEWS:
return $this->getInput('news_lang') ?: parent::getURI();
case self::CATEGORY_INDUSTRY:
return $this->getInput('industry_lang') ?: parent::getURI();
}
return parent::getURI();
}
public function getName() {
return isset($this->title) ? $this->title . ' - ' . parent::getName() : parent::getName();
}
public function collectData() {
// Uncomment to generate the list of languages automatically (dev mode)
/*
switch($this->queriedContext) {
case self::CATEGORY_NEWS:
$this->extractNewsLanguages(); die;
case self::CATEGORY_INDUSTRY:
$this->extractIndustryLanguages(); die;
}
*/
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Could not request ' . $this->getURI());
$html = defaultLinkTo($html, $this->getURI());
$this->title = html_entity_decode($html->find('title', 0)->plaintext, ENT_QUOTES);
foreach($html->find('article') as $article) {
$item = array();
$item['uri'] = $article->find('[itemprop="name"]', 0)->href;
$item['title'] = $article->find('[itemprop="name"]', 0)->title;
if($this->getInput('full')) {
$item['content'] = $this->getFullNewsArticle($item['uri']);
} else {
$item['content'] = $article->find('[itemprop="description"]', 0)->plaintext;
}
$this->items[] = $item;
}
}
private function getFullNewsArticle($uri) {
$html = getSimpleHTMLDOMCached($uri)
or returnServerError('Unable to load full article!');
$html = defaultLinkTo($html, $this->getURI());
return $html->find('section.article', 0);
}
/**
* Generates an HTML page with a PHP formatted array of languages,
* pointing to the corresponding news pages. Implementation is based
* on the 'Corporate' site.
* @return void
*/
private function extractNewsLanguages() {
$html = getSimpleHTMLDOMCached('https://www.desouttertools.com/about-desoutter/news-events')
or returnServerError('Error loading news!');
$html = defaultLinkTo($html, static::URI);
$items = $html->find('ul[class="dropdown-menu"] li');
$list = "\t'Corporate'\n\t=> 'https://www.desouttertools.com/about-desoutter/news-events',\n";
foreach($items as $item) {
$lang = trim($item->plaintext);
$uri = $item->find('a', 0)->href;
$list .= "\t'{$lang}'\n\t=> '{$uri}',\n";
}
echo $list;
}
/**
* Generates an HTML page with a PHP formatted array of languages,
* pointing to the corresponding Industry 4.0 news pages. Implementation
* is based on the 'Corporate' site.
* @return void
*/
private function extractIndustryLanguages() {
$html = getSimpleHTMLDOMCached('https://www.desouttertools.com/industry-4-0/news')
or returnServerError('Error loading news!');
$html = defaultLinkTo($html, static::URI);
$items = $html->find('ul[class="dropdown-menu"] li');
$list = "\t'Corporate'\n\t=> 'https://www.desouttertools.com/industry-4-0/news',\n";
foreach($items as $item) {
$lang = trim($item->plaintext);
$uri = $item->find('a', 0)->href;
$list .= "\t'{$lang}'\n\t=> '{$uri}',\n";
}
echo $list;
}
}

105
bridges/DevToBridge.php Normal file
@ -0,0 +1,105 @@
<?php
class DevToBridge extends BridgeAbstract {
const CONTEXT_BY_TAG = 'By tag';
const NAME = 'dev.to Bridge';
const URI = 'https://dev.to';
const DESCRIPTION = 'Returns feeds for tags';
const MAINTAINER = 'logmanoriginal';
const CACHE_TIMEOUT = 10800; // 3 hours
const PARAMETERS = array(
self::CONTEXT_BY_TAG => array(
'tag' => array(
'name' => 'Tag',
'type' => 'text',
'required' => true,
'title' => 'Insert your tag',
'exampleValue' => 'python'
),
'full' => array(
'name' => 'Full article',
'type' => 'checkbox',
'required' => false,
'title' => 'Enable to receive the full article for each item',
'defaultValue' => false
)
)
);
public function getURI() {
switch($this->queriedContext) {
case self::CONTEXT_BY_TAG:
if($tag = $this->getInput('tag')) {
return static::URI . '/t/' . urlencode($tag);
}
break;
}
return parent::getURI();
}
public function getIcon() {
return 'https://practicaldev-herokuapp-com.freetls.fastly.net/assets/'
. 'apple-icon-5c6fa9f2bce280428589c6195b7f1924206a53b782b371cfe2d02da932c8c173.png';
}
public function collectData() {
$html = getSimpleHTMLDOMCached($this->getURI())
or returnServerError('Could not request ' . $this->getURI());
$html = defaultLinkTo($html, static::URI);
$articles = $html->find('div[class="single-article"]')
or returnServerError('Could not find articles!');
foreach($articles as $article) {
if($article->find('[class*="cta"]', 0)) { // Skip ads
continue;
}
$item = array();
$item['uri'] = $article->find('a[id*=article-link]', 0)->href;
$item['title'] = $article->find('h3', 0)->plaintext;
// e.g. "Charlie Harrington・Sep 21"
$item['timestamp'] = strtotime(explode('・', $article->find('h4 a', 0)->plaintext, 2)[1]);
$item['author'] = explode('・', $article->find('h4 a', 0)->plaintext, 2)[0];
// Profile image
$item['enclosures'] = array($article->find('img', 0)->src);
if($this->getInput('full')) {
$fullArticle = $this->getFullArticle($item['uri']);
$item['content'] = <<<EOD
<img src="{$item['enclosures'][0]}" alt="{$item['author']}">
<p>{$fullArticle}</p>
EOD;
} else {
$item['content'] = <<<EOD
<img src="{$item['enclosures'][0]}" alt="{$item['author']}">
<p>{$item['title']}</p>
EOD;
}
$item['categories'] = array_map(function($e){ return $e->plaintext; }, $article->find('div.tags span.tag'));
$this->items[] = $item;
}
}
private function getFullArticle($url) {
$html = getSimpleHTMLDOMCached($url)
or returnServerError('Unable to load article from "' . $url . '"!');
$html = defaultLinkTo($html, static::URI);
return $html->find('[id="article-body"]', 0);
}
}
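
The byline handling in `collectData()` above splits a string such as "Charlie Harrington・Sep 21" on the `・` separator and reads both halves by index. A minimal sketch of the same split, assuming a hypothetical helper `splitByline` (not part of the bridge) that tolerates a missing separator instead of reading index 1 unconditionally:

```php
<?php
// Hypothetical helper, not part of DevToBridge: splits a dev.to byline
// such as "Charlie Harrington・Sep 21" into its author and date parts.
function splitByline(string $byline): array {
    // Limit of 2 keeps any further separators inside the date part.
    $parts = explode('・', $byline, 2);
    $author = trim($parts[0]);
    // Guard against a byline without a separator.
    $date = isset($parts[1]) ? trim($parts[1]) : null;
    return array('author' => $author, 'date' => $date);
}
```

The date part could then be passed to `strtotime()` only when it is not `null`, which would avoid an undefined-index notice on bylines that do not follow the expected shape.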

@ -9,8 +9,8 @@ class DilbertBridge extends BridgeAbstract {
public function collectData(){
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Could not request Dilbert: ' . $this->getURI());
$html = getSimpleHTMLDOM(self::URI)
or returnServerError('Could not request Dilbert: ' . self::URI);
foreach($html->find('section.comic-item') as $element) {

@ -42,59 +42,59 @@ class DiscogsBridge extends BridgeAbstract {
if(!empty($this->getInput('artistid')) || !empty($this->getInput('labelid'))) {
if(!empty($this->getInput('artistid'))) {
$data = getContents("https://api.discogs.com/artists/"
$data = getContents('https://api.discogs.com/artists/'
. $this->getInput('artistid')
. "/releases?sort=year&sort_order=desc")
or returnServerError("Unable to query discogs !");
. '/releases?sort=year&sort_order=desc')
or returnServerError('Unable to query discogs !');
} elseif(!empty($this->getInput('labelid'))) {
$data = getContents("https://api.discogs.com/labels/"
$data = getContents('https://api.discogs.com/labels/'
. $this->getInput('labelid')
. "/releases?sort=year&sort_order=desc")
or returnServerError("Unable to query discogs !");
. '/releases?sort=year&sort_order=desc')
or returnServerError('Unable to query discogs !');
}
$jsonData = json_decode($data, true);
foreach($jsonData["releases"] as $release) {
foreach($jsonData['releases'] as $release) {
$item = array();
$item["author"] = $release["artist"];
$item["title"] = $release["title"];
$item["id"] = $release["id"];
$resId = array_key_exists("main_release", $release) ? $release["main_release"] : $release["id"];
$item["uri"] = self::URI . $this->getInput('artistid') . "/release/" . $resId;
$item["timestamp"] = DateTime::createFromFormat("Y", $release["year"])->getTimestamp();
$item["content"] = $item["author"] . " - " . $item["title"];
$item['author'] = $release['artist'];
$item['title'] = $release['title'];
$item['id'] = $release['id'];
$resId = array_key_exists('main_release', $release) ? $release['main_release'] : $release['id'];
$item['uri'] = self::URI . $this->getInput('artistid') . '/release/' . $resId;
$item['timestamp'] = DateTime::createFromFormat('Y', $release['year'])->getTimestamp();
$item['content'] = $item['author'] . ' - ' . $item['title'];
$this->items[] = $item;
}
} elseif(!empty($this->getInput("username_wantlist")) || !empty($this->getInput("username_folder"))) {
} elseif(!empty($this->getInput('username_wantlist')) || !empty($this->getInput('username_folder'))) {
if(!empty($this->getInput("username_wantlist"))) {
$data = getContents("https://api.discogs.com/users/"
if(!empty($this->getInput('username_wantlist'))) {
$data = getContents('https://api.discogs.com/users/'
. $this->getInput('username_wantlist')
. "/wants?sort=added&sort_order=desc")
or returnServerError("Unable to query discogs !");
$jsonData = json_decode($data, true)["wants"];
. '/wants?sort=added&sort_order=desc')
or returnServerError('Unable to query discogs !');
$jsonData = json_decode($data, true)['wants'];
} elseif(!empty($this->getInput("username_folder"))) {
$data = getContents("https://api.discogs.com/users/"
} elseif(!empty($this->getInput('username_folder'))) {
$data = getContents('https://api.discogs.com/users/'
. $this->getInput('username_folder')
. "/collection/folders/"
. $this->getInput("folderid")
."/releases?sort=added&sort_order=desc")
or returnServerError("Unable to query discogs !");
$jsonData = json_decode($data, true)["releases"];
. '/collection/folders/'
. $this->getInput('folderid')
.'/releases?sort=added&sort_order=desc')
or returnServerError('Unable to query discogs !');
$jsonData = json_decode($data, true)['releases'];
}
foreach($jsonData as $element) {
$infos = $element["basic_information"];
$infos = $element['basic_information'];
$item = array();
$item["title"] = $infos["title"];
$item["author"] = $infos["artists"][0]["name"];
$item["id"] = $infos["artists"][0]["id"];
$item["uri"] = self::URI . $infos["artists"][0]["id"] . "/release/" . $infos["id"];
$item["timestamp"] = strtotime($element["date_added"]);
$item["content"] = $item["author"] . " - " . $item["title"];
$item['title'] = $infos['title'];
$item['author'] = $infos['artists'][0]['name'];
$item['id'] = $infos['artists'][0]['id'];
$item['uri'] = self::URI . $infos['artists'][0]['id'] . '/release/' . $infos['id'];
$item['timestamp'] = strtotime($element['date_added']);
$item['content'] = $item['author'] . ' - ' . $item['title'];
$this->items[] = $item;
}

@ -1,7 +1,7 @@
<?php
class ETTVBridge extends BridgeAbstract {
const MAINTAINER = "GregThib";
const MAINTAINER = 'GregThib';
const NAME = 'ETTV';
const URI = 'https://www.ettv.tv/';
const DESCRIPTION = 'Returns list of 20 latest torrents for a specific search.';
@ -94,17 +94,20 @@ class ETTVBridge extends BridgeAbstract {
)
));
protected $results_link;
public function collectData(){
// No control on inputs, because all have defaultValue set
// No control on inputs, because all defaultValue are set
$query_str = 'torrents-search.php';
$query_str .= '?search=' . urlencode('+'.str_replace(' ', ' +', $this->getInput('query')));
$query_str .= '&cat=' . $this->getInput('cat');
$query_str .= 'incldead&=' . $this->getInput('status');
$query_str .= '&incldead=' . $this->getInput('status');
$query_str .= '&lang=' . $this->getInput('lang');
$query_str .= '&sort=id&order=desc';
// Get results page
$html = getSimpleHTMLDOM(self::URI . $query_str)
$this->results_link = self::URI . $query_str;
$html = getSimpleHTMLDOM($this->results_link)
or returnServerError('Could not request ' . $this->getName());
// Loop on each entry
@ -113,7 +116,7 @@ class ETTVBridge extends BridgeAbstract {
$entry = $element->find('td', 1)->find('a', 0);
// retrieve result page to get more details
$link = rtrim(self::URI, "/") . $entry->href;
$link = rtrim(self::URI, '/') . $entry->href;
$page = getSimpleHTMLDOM($link)
or returnServerError('Could not request page ' . $link);
@ -125,7 +128,7 @@ class ETTVBridge extends BridgeAbstract {
$item = array();
$item['author'] = $details->children(6)->children(1)->plaintext;
$item['title'] = $entry->title;
$item['uri'] = $dllinks->children(0)->children(0)->children(0)->href;
$item['uri'] = $link;
$item['timestamp'] = strtotime($details->children(7)->children(1)->plaintext);
$item['content'] = '';
$item['content'] .= '<br/><b>Name: </b>' . $details->children(0)->children(1)->innertext;
@ -139,4 +142,20 @@ class ETTVBridge extends BridgeAbstract {
$this->items[] = $item;
}
}
public function getName(){
if($this->getInput('query')) {
return '[' . self::NAME . '] ' . $this->getInput('query');
}
return self::NAME;
}
public function getURI(){
if(isset($this->results_link) && !empty($this->results_link)) {
return $this->results_link;
}
return self::URI;
}
}
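
The ETTV hunk above replaces the malformed `'incldead&='` fragment with `'&incldead='` while still concatenating the query string by hand. As a sketch of an alternative that makes this kind of slip harder, the same URL could be assembled with `http_build_query()`. The function name `buildSearchUrl` and its parameter list are assumptions for illustration; the bridge itself reads these values through `getInput()`.

```php
<?php
// Hypothetical helper, not part of ETTVBridge: builds the search URL
// from plain values instead of concatenating 'key=value' fragments.
function buildSearchUrl(string $base, string $query, string $cat, string $status, string $lang): string {
    // Prefix every word with '+', mirroring the bridge's str_replace call.
    $search = '+' . str_replace(' ', ' +', $query);
    $params = array(
        'search' => $search,
        'cat' => $cat,
        'incldead' => $status,
        'lang' => $lang,
        'sort' => 'id',
        'order' => 'desc',
    );
    // http_build_query URL-encodes each value and joins the pairs with '&'.
    return $base . 'torrents-search.php?' . http_build_query($params, '', '&');
}
```

Because every key is paired with its value in one array, a misplaced `&` or `=` cannot end up in the generated string.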

@ -1,7 +1,7 @@
<?php
class EZTVBridge extends BridgeAbstract {
const MAINTAINER = "alexAubin";
const MAINTAINER = 'alexAubin';
const NAME = 'EZTV';
const URI = 'https://eztv.ch/';
const DESCRIPTION = 'Returns list of *recent* torrents for a specific show
@ -23,15 +23,15 @@ on EZTV. Get showID from URLs in https://eztv.ch/shows/showID/show-full-name.';
$relativeDays = 0;
$relativeHours = 0;
foreach(explode(" ", $relativeReleaseTime) as $relativeTimeElement) {
if(substr($relativeTimeElement, -1) == "d") $relativeDays = substr($relativeTimeElement, 0, -1);
if(substr($relativeTimeElement, -1) == "h") $relativeHours = substr($relativeTimeElement, 0, -1);
foreach(explode(' ', $relativeReleaseTime) as $relativeTimeElement) {
if(substr($relativeTimeElement, -1) == 'd') $relativeDays = substr($relativeTimeElement, 0, -1);
if(substr($relativeTimeElement, -1) == 'h') $relativeHours = substr($relativeTimeElement, 0, -1);
}
return mktime(date('h') - $relativeHours, 0, 0, date('m'), date('d') - $relativeDays, date('Y'));
}
// Loop on show ids
$showList = explode(",", $this->getInput('i'));
$showList = explode(',', $this->getInput('i'));
foreach($showList as $showID) {
// Get show page
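
The relative-time parsing in this hunk walks over tokens such as `2d` and `5h` and keeps the last day and hour values it sees. A minimal sketch of that parsing step, assuming a hypothetical helper `relativeAgeInSeconds` that returns the age as a number of seconds (the bridge itself feeds the values to `mktime()`; note that `date('h')` is a 12-hour value, so subtracting seconds from `time()` may be the more predictable choice):

```php
<?php
// Hypothetical helper, not part of EZTVBridge: converts a relative age
// such as "2d 5h" into a number of seconds.
function relativeAgeInSeconds(string $relative): int {
    $days = 0;
    $hours = 0;
    foreach (explode(' ', $relative) as $element) {
        // The last character is the unit, everything before it the amount.
        $unit = substr($element, -1);
        $value = (int) substr($element, 0, -1);
        if ($unit === 'd') {
            $days = $value;
        } elseif ($unit === 'h') {
            $hours = $value;
        }
    }
    return $days * 86400 + $hours * 3600;
}
```

A release timestamp could then be estimated as `time() - relativeAgeInSeconds($relativeReleaseTime)`.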

@ -45,9 +45,10 @@ class ElloBridge extends BridgeAbstract {
$item = array();
$item['author'] = $this->getUsername($post, $postData);
$item['timestamp'] = strtotime($post->created_at);
$item['title'] = $this->findText($post->summary);
$item['title'] = strip_tags($this->findText($post->summary));
$item['content'] = $this->getPostContent($post->body);
$item['enclosures'] = $this->getEnclosures($post, $postData);
$item['uri'] = self::URI . $item['author'] . '/post/' . $post->token;
$content = $post->body;
$this->items[] = $item;
@ -57,7 +58,7 @@ class ElloBridge extends BridgeAbstract {
}
public function findText($path) {
private function findText($path) {
foreach($path as $summaryElement) {
@ -71,7 +72,7 @@ class ElloBridge extends BridgeAbstract {
}
public function getPostContent($path) {
private function getPostContent($path) {
$content = '';
foreach($path as $summaryElement) {
@ -92,7 +93,7 @@ class ElloBridge extends BridgeAbstract {
}
public function getEnclosures($post, $postData) {
private function getEnclosures($post, $postData) {
$assets = [];
foreach($post->links->assets as $asset) {
@ -108,7 +109,7 @@ class ElloBridge extends BridgeAbstract {
}
public function getUsername($post, $postData) {
private function getUsername($post, $postData) {
foreach($postData->linked->users as $user) {
if($user->id == $post->links->author->id) {
@ -118,7 +119,7 @@ class ElloBridge extends BridgeAbstract {
}
public function getAPIKey() {
private function getAPIKey() {
$cache = Cache::create('FileCache');
$cache->setPath(CACHE_DIR);
$cache->setParameters(['key']);

@ -7,19 +7,9 @@ class EstCeQuonMetEnProdBridge extends BridgeAbstract {
const CACHE_TIMEOUT = 21600; // 6h
const DESCRIPTION = 'Should we put a website in production today? (French)';
public function collectData(){
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
return false;
}
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Could not request EstCeQuonMetEnProd: ' . $this->getURI());
public function collectData() {
$html = getSimpleHTMLDOM(self::URI)
or returnServerError('Could not request EstCeQuonMetEnProd: ' . self::URI);
$item = array();
$item['uri'] = $this->getURI() . '#' . date('Y-m-d');
@ -28,8 +18,8 @@ class EstCeQuonMetEnProdBridge extends BridgeAbstract {
$item['timestamp'] = strtotime('today midnight');
$item['content'] = str_replace(
'src="/',
'src="' . $this->getURI(),
trim(extractFromDelimiters($html->outertext, '<body role="document">', '<br /><br />'))
'src="' . self::URI,
trim(extractFromDelimiters($html->outertext, '<body role="document">', '<div id="share'))
);
$this->items[] = $item;
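
This hunk drops the inline `extractFromDelimiters` helper while the new code keeps calling a function of that name, presumably one provided elsewhere in the project (an assumption, since that definition is not shown here). The sketch below mirrors the removed inline logic under the hypothetical name `extractBetween`, with one deliberate difference: it returns `false` when the end delimiter is missing rather than an empty string.

```php
<?php
// Hypothetical stand-in for the removed inline helper: returns the text
// between the first $start and the next $end, or false if either is missing.
function extractBetween(string $string, string $start, string $end) {
    $from = strpos($string, $start);
    if ($from === false) {
        return false;
    }
    // Skip past the start delimiter itself.
    $rest = substr($string, $from + strlen($start));
    $to = strpos($rest, $end);
    if ($to === false) {
        return false;
    }
    return substr($rest, 0, $to);
}
```

For example, `extractBetween('<body role="document">Oui<div id="share', '<body role="document">', '<div id="share')` yields `Oui`.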

@ -0,0 +1,104 @@
<?php
class ExtremeDownloadBridge extends BridgeAbstract {
const NAME = 'Extreme Download';
const URI = 'https://ww1.extreme-d0wn.com/';
const DESCRIPTION = 'Suivi de série sur Extreme Download';
const MAINTAINER = 'sysadminstory';
const PARAMETERS = array(
'Suivre la publication des épisodes d\'une série en cours de diffusion' => array(
'url' => array(
'name' => 'URL de la série',
'type' => 'text',
'required' => true,
'title' => 'URL d\'une série sans le https://ww1.extreme-d0wn.com/',
'exampleValue' => 'series-hd/hd-series-vostfr/46631-halt-and-catch-fire-saison-04-vostfr-hdtv-720p.html'),
'filter' => array(
'name' => 'Type de contenu',
'type' => 'list',
'required' => true,
'title' => 'Type de contenu à suivre : Téléchargement, Streaming ou les deux',
'values' => array(
'Streaming et Téléchargement' => 'both',
'Téléchargement' => 'download',
'Streaming' => 'streaming'
)
)
)
);
public function collectData(){
$html = getSimpleHTMLDOM(self::URI . $this->getInput('url'))
or returnServerError('Could not request Extreme Download.');
$filter = $this->getInput('filter');
$typesText = array(
'download' => 'Téléchargement',
'streaming' => 'Streaming'
);
// Get the TV show title
$this->showTitle = trim($html->find('span[id=news-title]', 0)->plaintext);
$list = $html->find('div[class=prez_7]');
foreach($list as $element) {
$add = false;
// The link type is needed to generate a unique link
$type = $this->findLinkType($element);
if($filter == 'both') {
$add = true;
} else {
if($type == $filter) {
$add = true;
}
}
if($add == true) {
$item = array();
// Get the element name
$title = $element->plaintext;
// Get the element links
$links = $element->next_sibling()->innertext;
$item['content'] = $links;
$item['title'] = $this->showTitle . ' ' . $title . ' - ' . $typesText[$type];
// RSS-Bridge uses the URI as GUID, so URIs must be unique. Appending an MD5 hash of the
// item title should generate unique URIs and prevent confusion for RSS readers.
$item['uri'] = self::URI . $this->getInput('url') . '#' . hash('md5', $item['title']);
$this->items[] = $item;
}
}
}
public function getName(){
switch($this->queriedContext) {
case 'Suivre la publication des épisodes d\'une série en cours de diffusion':
return $this->showTitle . ' - ' . self::NAME;
break;
default:
return self::NAME;
}
}
private function findLinkType($element)
{
$return = '';
// Walk through all elements in reverse order until finding one with class 'prez_2'
while($element->class != 'prez_2') {
$element = $element->prev_sibling();
}
$text = html_entity_decode($element->plaintext);
// Depending on the text of the element, return the corresponding link type
if(stristr($text, 'téléchargement') != false) {
$return = 'download';
} else if(stristr($text, 'streaming') != false) {
$return = 'streaming';
}
return $return;
}
}
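
The bridge above makes each item URI unique by appending an MD5 hash of the item title as a URL fragment, because every episode shares the same page URL. A minimal sketch of that idea, assuming a hypothetical helper `uniqueItemUri` (the bridge performs the concatenation inline):

```php
<?php
// Hypothetical helper, not part of ExtremeDownloadBridge: derives a
// distinct URI per item by appending a hash of its title as a fragment.
function uniqueItemUri(string $pageUri, string $title): string {
    // The fragment is ignored by the server, so the link still opens the
    // show page, while feed readers see one GUID per title.
    return $pageUri . '#' . hash('md5', $title);
}
```

Two items with the same title would still collide, so this relies on titles being distinct within one feed.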

@ -17,22 +17,12 @@ class FB2Bridge extends BridgeAbstract {
public function collectData(){
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
return false;
}
//Utility function for cleaning a Facebook link
$unescape_fb_link = function($matches){
if(is_array($matches) && count($matches) > 1) {
$link = $matches[1];
if(strpos($link, '/') === 0)
$link = self::URI . $link . '"';
$link = self::URI . substr($link, 1);
if(strpos($link, 'facebook.com/l.php?u=') !== false)
$link = urldecode(extractFromDelimiters($link, 'facebook.com/l.php?u=', '&'));
return ' href="' . $link . '"';
@ -95,7 +85,7 @@ EOD;
. $pageID
. '&cursor={"card_id"%3A"videos"%2C"has_next_page"%3Atrue}&surface=mobile_page_home&unit_count=8';
$fileContent = file_get_contents($requestString);
$fileContent = getContents($requestString);
$articleIndex = 0;
$maxArticle = 3;
@ -103,23 +93,34 @@ EOD;
$html = $this->buildContent($fileContent);
$author = $this->getInput('u');
foreach($html->find("article") as $content) {
foreach($html->find('article') as $content) {
$item = array();
$item['uri'] = "http://touch.facebook.com"
. $content->find("div[class='_52jc _5qc4 _24u0 _36xo']", 0)->find("a", 0)->getAttribute("href");
preg_match('/publish_time\\\&quot;:([0-9]+),/', $content->getAttribute('data-store', 0), $match);
if(isset($match[1]))
$timestamp = $match[1];
else
$timestamp = 0;
if($content->find("header", 0) !== null) {
$content->find("header", 0)->innertext = "";
$item['uri'] = html_entity_decode('http://touch.facebook.com'
. $content->find("div[class='_52jc _5qc4 _24u0 _36xo']", 0)->find('a', 0)->getAttribute('href'), ENT_QUOTES);
if($content->find('header', 0) !== null) {
$content->find('header', 0)->innertext = '';
}
if($content->find("footer", 0) !== null) {
$content->find("footer", 0)->innertext = "";
if($content->find('footer', 0) !== null) {
$content->find('footer', 0)->innertext = '';
}
// Replace emoticon images by their textual representation (part of the span)
foreach($content->find('span[title*="emoticon"]') as $emoticon) {
$emoticon->innertext = $emoticon->find('span[aria-hidden="true"]', 0)->innertext;
}
//Remove html nodes, keep only img, links, basic formatting
$content = strip_tags($content, '<a><img><i><u><br><p>');
$content = strip_tags($content, '<a><img><i><u><br><p><h3><h4>');
//Adapt link hrefs: convert relative links into absolute links and bypass external link redirection
$content = preg_replace_callback('/ href=\"([^"]+)\"/i', $unescape_fb_link, $content);
@ -145,7 +146,7 @@ EOD;
// "<i><u>smile emoticon</u></i>" back to ASCII emoticons eg ":)"
$content = preg_replace_callback('/<i><u>([^ <>]+) ([^<>]+)<\/u><\/i>/i', $unescape_fb_emote, $content);
$item['content'] = $content;
$item['content'] = html_entity_decode($content, ENT_QUOTES);
$title = $author;
if (strlen($title) > 24)
@ -154,10 +155,12 @@ EOD;
if (strlen($title) > 64)
$title = substr($title, 0, strpos(wordwrap($title, 64), "\n")) . '...';
$item['title'] = $title;
$item['author'] = $author;
$item['title'] = html_entity_decode($title, ENT_QUOTES);
$item['author'] = html_entity_decode($author, ENT_QUOTES);
$item['timestamp'] = html_entity_decode($timestamp, ENT_QUOTES);
array_push($this->items, $item);
if($item['timestamp'] != 0)
array_push($this->items, $item);
}
}
@ -168,7 +171,7 @@ EOD;
$regex = implode(
'',
array(
"/timeline_unit",
'/timeline_unit',
"\\\\\\\\u00253A1",
"\\\\\\\\u00253A([0-9]*)",
"\\\\\\\\u00253A([0-9]*)",
@ -182,29 +185,29 @@ EOD;
return implode(
'',
array(
"https://touch.facebook.com/pages_reaction_units/more/?page_id=",
'https://touch.facebook.com/pages_reaction_units/more/?page_id=',
$pageID,
"&cursor=%7B%22timeline_cursor%22%3A%22timeline_unit%3A1%3A",
'&cursor=%7B%22timeline_cursor%22%3A%22timeline_unit%3A1%3A',
$result[1],
"%3A",
'%3A',
$result[2],
"%3A",
'%3A',
$result[3],
"%3A",
'%3A',
$result[4],
"%22%2C%22timeline_section_cursor%22%3A%7B%7D%2C%22",
"has_next_page%22%3Atrue%7D&surface=mobile_page_home&unit_count=3"
'%22%2C%22timeline_section_cursor%22%3A%7B%7D%2C%22',
'has_next_page%22%3Atrue%7D&surface=mobile_page_home&unit_count=3'
)
);
}
//Builds the HTML from the encoded JS that Facebook provides.
private function buildContent($pageContent){
$regex = "/\\\"html\\\":\\\"(.*?)\\\",\\\"replace/";
// The html ends with:
// /div>","replaceifexists
$regex = '/\\"html\\":(\".+\/div>"),"replace/';
preg_match($regex, $pageContent, $result);
return str_get_html(html_entity_decode(json_decode('"' . $result[1] . '"')));
return str_get_html(json_decode($result[1]));
}
@ -214,7 +217,7 @@ EOD;
$ctx = stream_context_create(array(
'http' => array(
'user_agent' => "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0",
'user_agent' => 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0',
'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
)
)
@ -222,12 +225,12 @@ EOD;
$a = file_get_contents($pageURL, 0, $ctx);
//First request to get the cookie
$cookies = "";
$cookies = '';
foreach($http_response_header as $hdr) {
if(strpos($hdr, "Set-Cookie") !== false) {
$cLine = explode(":", $hdr)[1];
$cLine = explode(";", $cLine)[0];
$cookies .= ";" . $cLine;
if(strpos($hdr, 'Set-Cookie') !== false) {
$cLine = explode(':', $hdr)[1];
$cLine = explode(';', $cLine)[0];
$cookies .= ';' . $cLine;
}
}
@ -239,7 +242,7 @@ EOD;
$context = stream_context_create(array(
'http' => array(
'user_agent' => "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0",
'user_agent' => 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0',
'header' => 'Cookie: ' . $cookies
)
)
@ -247,12 +250,12 @@ EOD;
$pageContent = file_get_contents($page, 0, $context);
if(strpos($pageContent, "signup-button") != false) {
if(strpos($pageContent, 'signup-button') != false) {
return -1;
}
//Get the page ID if we don't have a captcha
$regex = "/page_id=([0-9]*)&/";
$regex = '/page_id=([0-9]*)&/';
preg_match($regex, $pageContent, $matches);
if(count($matches) > 0) {
@ -260,7 +263,7 @@ EOD;
}
//Get the page ID if we do have a captcha
$regex = "/\"pageID\":\"([0-9]*)\"/";
$regex = '/"pageID":"([0-9]*)"/';
preg_match($regex, $pageContent, $matches);
return $matches[1];
@ -275,7 +278,4 @@ EOD;
return 'http://facebook.com';
}
public function getCacheDuration(){
return 60 * 60 * 3; // 5 minutes
}
}
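
The reworked `buildContent()` above captures the quoted JSON string that follows `"html":` and hands it straight to `json_decode()`, which unescapes sequences such as `\/` and `\uXXXX` in one step. The sketch below illustrates that on a small made-up payload; the function name `decodeEmbeddedHtml` and the sample data are assumptions, and real responses are much larger.

```php
<?php
// Hypothetical sample payload shaped like the fragment the bridge matches.
$payload = '{"html":"<div>A \u0026 B<\/div>","replaceifexists":false}';

// Hypothetical helper: extracts and decodes the embedded HTML string,
// returning null when the pattern does not match or decoding fails.
function decodeEmbeddedHtml(string $payload): ?string {
    // Capture the JSON string literal including its surrounding quotes.
    if (!preg_match('/"html":(".+\/div>"),"replace/', $payload, $result)) {
        return null;
    }
    // Decoding the quoted literal resolves all JSON escape sequences.
    $decoded = json_decode($result[1]);
    return is_string($decoded) ? $decoded : null;
}
```

Checking the `preg_match()` result before indexing into it avoids passing an undefined value to the HTML parser when Facebook changes the response format.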

@ -1,171 +1,507 @@
<?php
class FacebookBridge extends BridgeAbstract {
const MAINTAINER = 'teromene';
const NAME = 'Facebook';
const MAINTAINER = 'teromene, logmanoriginal';
const NAME = 'Facebook Bridge';
const URI = 'https://www.facebook.com/';
const CACHE_TIMEOUT = 300; // 5min
const DESCRIPTION = 'Input a page title or a profile log. For a profile log,
please insert the parameter as follows: myExamplePage/132621766841117';
const PARAMETERS = array( array(
'u' => array(
'name' => 'Username',
'required' => true
),
'media_type' => array(
'name' => 'Media type',
'type' => 'list',
'required' => false,
'values' => array(
'All' => 'all',
'Video' => 'video',
'No Video' => 'novideo'
const PARAMETERS = array(
'User' => array(
'u' => array(
'name' => 'Username',
'required' => true
),
'defaultValue' => 'all'
'media_type' => array(
'name' => 'Media type',
'type' => 'list',
'required' => false,
'values' => array(
'All' => 'all',
'Video' => 'video',
'No Video' => 'novideo'
),
'defaultValue' => 'all'
),
'skip_reviews' => array(
'name' => 'Skip reviews',
'type' => 'checkbox',
'required' => false,
'defaultValue' => false,
'title' => 'Feed excludes reviews when checked'
)
),
'Group' => array(
'g' => array(
'name' => 'Group',
'type' => 'text',
'required' => true,
'exampleValue' => 'https://www.facebook.com/groups/743149642484225',
'title' => 'Insert group name or facebook group URL'
)
),
'global' => array(
'limit' => array(
'name' => 'Limit',
'type' => 'number',
'required' => false,
'title' => 'Specify the number of items to return (default: -1)',
'defaultValue' => -1
)
)
));
);
private $authorName = '';
private $groupName = '';
public function collectData(){
public function getName(){
//Extract a string using start and end delimiters
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
switch($this->queriedContext) {
case 'User':
if(!empty($this->authorName)) {
return isset($this->extraInfos['name']) ? $this->extraInfos['name'] : $this->authorName
. ' - ' . static::NAME;
}
break;
case 'Group':
if(!empty($this->groupName)) {
return $this->groupName . ' - ' . static::NAME;
}
break;
return false;
}
//Utility function for cleaning a Facebook link
$unescape_fb_link = function($matches){
return parent::getName();
}
public function getURI() {
$uri = self::URI;
switch($this->queriedContext) {
case 'Group':
// Discover groups via https://www.facebook.com/groups/
// Example group: https://www.facebook.com/groups/sailors.worldwide
$uri .= 'groups/' . $this->sanitizeGroup(filter_var($this->getInput('g'), FILTER_SANITIZE_URL));
break;
case 'User':
// Example user 1: https://www.facebook.com/artetv/
// Example user 2: artetv
$user = $this->sanitizeUser($this->getInput('u'));
if(!strpos($user, '/')) {
$uri .= '/pg/' . urlencode($user) . '/posts';
} else {
$uri .= 'pages/' . $user;
}
break;
}
// Request the mobile version to reduce page size (no javascript)
// More information: https://stackoverflow.com/a/11103592
return $uri .= '?_fb_noscript=1';
}
public function collectData() {
switch($this->queriedContext) {
case 'Group':
$this->collectGroupData();
break;
case 'User':
$this->collectUserData();
break;
default:
returnClientError('Unknown context: "' . $this->queriedContext . '"!');
}
$limit = $this->getInput('limit') ?: -1;
if($limit > 0 && count($this->items) > $limit) {
$this->items = array_slice($this->items, 0, $limit);
}
}
#region Group
private function collectGroupData() {
$header = array('Accept-Language: ' . getEnv('HTTP_ACCEPT_LANGUAGE') . "\r\n");
$html = getSimpleHTMLDOM($this->getURI(), $header)
or returnServerError('Failed loading facebook page: ' . $this->getURI());
if(!$this->isPublicGroup($html)) {
returnClientError('This group is not public! RSS-Bridge only supports public groups!');
}
defaultLinkTo($html, substr(self::URI, 0, strlen(self::URI) - 1));
$this->groupName = $this->extractGroupName($html);
$posts = $html->find('div.userContentWrapper')
or returnServerError('Failed finding posts!');
foreach($posts as $post) {
$item = array();
$item['uri'] = $this->extractGroupURI($post);
$item['title'] = $this->extractGroupTitle($post);
$item['author'] = $this->extractGroupAuthor($post);
$item['content'] = $this->extractGroupContent($post);
$item['timestamp'] = $this->extractGroupTimestamp($post);
$item['enclosures'] = $this->extractGroupEnclosures($post);
$this->items[] = $item;
}
}
private function sanitizeGroup($group) {
if(filter_var(
$group,
FILTER_VALIDATE_URL,
FILTER_FLAG_HOST_REQUIRED | FILTER_FLAG_PATH_REQUIRED)) {
// User provided a URL
$urlparts = parse_url($group);
if($urlparts['host'] !== parse_url(self::URI)['host']
&& 'www.' . $urlparts['host'] !== parse_url(self::URI)['host']) {
returnClientError('The host you provided is invalid! Received "'
. $urlparts['host']
. '", expected "'
. parse_url(self::URI)['host']
. '"!');
}
return explode('/', $urlparts['path'])[2];
} elseif(strpos($group, '/') !== false) {
returnClientError('The group you provided is invalid: ' . $group);
} else {
return $group;
}
}
private function isPublicGroup($html) {
// Facebook redirects to the group's about page for non-public groups
$about = $html->find('#pagelet_group_about', 0);
return !($about);
}
private function extractGroupName($html) {
$ogtitle = $html->find('meta[property="og:title"]', 0)
or returnServerError('Unable to find group title!');
return htmlspecialchars_decode($ogtitle->content, ENT_QUOTES);
}
private function extractGroupURI($post) {
$elements = $post->find('a')
or returnServerError('Unable to find URI!');
foreach($elements as $anchor) {
// Find the one that is a permalink
if(strpos($anchor->href, 'permalink') !== false) {
return $anchor->href;
}
}
return null;
}
private function extractGroupContent($post) {
$content = $post->find('div.userContent', 0)
or returnServerError('Unable to find user content!');
return $content->innertext . $content->next_sibling()->innertext;
}
private function extractGroupTimestamp($post) {
$element = $post->find('abbr[data-utime]', 0)
or returnServerError('Unable to find timestamp!');
return $element->getAttribute('data-utime');
}
private function extractGroupAuthor($post) {
$element = $post->find('img', 0)
or returnServerError('Unable to find author information!');
return $element->{'aria-label'};
}
private function extractGroupEnclosures($post) {
$elements = $post->find('div.userContent', 0)->next_sibling()->find('img');
$enclosures = array();
foreach($elements as $enclosure) {
$enclosures[] = $enclosure->src;
}
return empty($enclosures) ? null : $enclosures;
}
private function extractGroupTitle($post) {
$element = $post->find('h5', 0)
or returnServerError('Unable to find title!');
if(strpos($element->plaintext, 'shared') === false) {
$content = strip_tags($this->extractGroupContent($post));
return $this->extractGroupAuthor($post)
. ' posted: '
. substr(
$content,
0,
strpos(wordwrap($content, 64), "\n")
)
. '...';
}
return $element->plaintext;
}
#endregion (Group)
#region User
/**
* Checks if $user is a valid username or URI and returns the username
*/
private function sanitizeUser($user) {
if (filter_var($user, FILTER_VALIDATE_URL)) {
$urlparts = parse_url($user);
if($urlparts['host'] !== parse_url(self::URI)['host']) {
returnClientError('The host you provided is invalid! Received "'
. $urlparts['host']
. '", expected "'
. parse_url(self::URI)['host']
. '"!');
}
if(!array_key_exists('path', $urlparts)
|| $urlparts['path'] === '/') {
returnClientError('The URL you provided doesn\'t contain the user name!');
}
return explode('/', $urlparts['path'])[1];
} else {
// First character cannot be a forward slash
if(strpos($user, '/') === 0) {
returnClientError('Remove leading slash "/" from the username!');
}
return $user;
}
}
/**
* Bypass external link redirection
*/
private function unescape_fb_link($content){
return preg_replace_callback('/ href=\"([^"]+)\"/i', function($matches){
if(is_array($matches) && count($matches) > 1) {
$link = $matches[1];
if(strpos($link, '/') === 0)
$link = self::URI . $link;
if(strpos($link, 'facebook.com/l.php?u=') !== false)
$link = urldecode(extractFromDelimiters($link, 'facebook.com/l.php?u=', '&'));
return ' href="' . $link . '"';
}
};
}, $content);
}
//Utility function for converting facebook emoticons
$unescape_fb_emote = function($matches){
static $facebook_emoticons = array(
'smile' => ':)',
'frown' => ':(',
'tongue' => ':P',
'grin' => ':D',
'gasp' => ':O',
'wink' => ';)',
'pacman' => ':<',
'grumpy' => '>_<',
'unsure' => ':/',
'cry' => ':\'(',
'kiki' => '^_^',
'glasses' => '8-)',
'sunglasses' => 'B-)',
'heart' => '<3',
'devil' => ']:D',
'angel' => '0:)',
'squint' => '-_-',
'confused' => 'o_O',
'upset' => 'xD',
'colonthree' => ':3',
'like' => '&#x1F44D;');
$len = count($matches);
if ($len > 1)
for ($i = 1; $i < $len; $i++)
foreach ($facebook_emoticons as $name => $emote)
if ($matches[$i] === $name)
return $emote;
return $matches[0];
};
/**
* Convert textual representation of emoticons back to ASCII emoticons.
* i.e. "<i><u>smile emoticon</u></i>" => ":)"
*/
private function unescape_fb_emote($content){
return preg_replace_callback('/<i><u>([^ <>]+) ([^<>]+)<\/u><\/i>/i', function($matches){
static $facebook_emoticons = array(
'smile' => ':)',
'frown' => ':(',
'tongue' => ':P',
'grin' => ':D',
'gasp' => ':O',
'wink' => ';)',
'pacman' => ':<',
'grumpy' => '>_<',
'unsure' => ':/',
'cry' => ':\'(',
'kiki' => '^_^',
'glasses' => '8-)',
'sunglasses' => 'B-)',
'heart' => '<3',
'devil' => ']:D',
'angel' => '0:)',
'squint' => '-_-',
'confused' => 'o_O',
'upset' => 'xD',
'colonthree' => ':3',
'like' => '&#x1F44D;');
$html = null;
$len = count($matches);
//Handle captcha response sent by the viewer
if ($len > 1)
for ($i = 1; $i < $len; $i++)
foreach ($facebook_emoticons as $name => $emote)
if ($matches[$i] === $name)
return $emote;
return $matches[0];
}, $content);
}
/**
* Returns the captcha message for the given captcha
*/
private function returnCaptchaMessage($captcha) {
// Save form for submitting after getting captcha response
if (session_status() == PHP_SESSION_NONE) {
session_start();
}
$captcha_fields = array();
foreach ($captcha->find('input, button') as $input) {
$captcha_fields[$input->name] = $input->value;
}
$_SESSION['captcha_fields'] = $captcha_fields;
$_SESSION['captcha_action'] = $captcha->find('form', 0)->action;
// Show captcha filling form to the viewer, proxying the captcha image
$img = base64_encode(getContents($captcha->find('img', 0)->src));
http_response_code(500);
header('Content-Type: text/html');
$message = <<<EOD
<form method="post" action="?{$_SERVER['QUERY_STRING']}">
<h2>Facebook captcha challenge</h2>
<p>Unfortunately, rss-bridge cannot fetch the requested page.<br />
Facebook wants rss-bridge to resolve the following captcha:</p>
<p><img src="data:image/png;base64,{$img}" /></p>
<p><b>Response:</b> <input name="captcha_response" placeholder="please fill in" />
<input type="submit" value="Submit!" /></p>
</form>
EOD;
die($message);
}
/**
* Checks if a captcha response was received and tries to load the contents
* @return mixed null if no captcha response was received, simplehtmldom document otherwise
*/
private function handleCaptchaResponse() {
if (isset($_POST['captcha_response'])) {
if (session_status() == PHP_SESSION_NONE)
session_start();
if (isset($_SESSION['captcha_fields'], $_SESSION['captcha_action'])) {
$captcha_action = $_SESSION['captcha_action'];
$captcha_fields = $_SESSION['captcha_fields'];
$captcha_fields['captcha_response'] = preg_replace("/[^a-zA-Z0-9]+/", "", $_POST['captcha_response']);
$captcha_fields['captcha_response'] = preg_replace('/[^a-zA-Z0-9]+/', '', $_POST['captcha_response']);
$header = array(
'Content-type: application/x-www-form-urlencoded',
'Referer: ' . $captcha_action,
'Cookie: noscript=1'
);
$header = array("Content-type:
application/x-www-form-urlencoded\r\nReferer: $captcha_action\r\nCookie: noscript=1\r\n");
$opts = array(
CURLOPT_POST => 1,
CURLOPT_POSTFIELDS => http_build_query($captcha_fields)
);
$html = getContents($captcha_action, $header, $opts);
$html = getSimpleHTMLDOM($captcha_action, $header, $opts)
or returnServerError('Failed to submit captcha response back to Facebook');
if($html === false) {
returnServerError('Failed to submit captcha response back to Facebook');
}
unset($_SESSION['captcha_fields']);
$html = str_get_html($html);
return $html;
}
unset($_SESSION['captcha_fields']);
unset($_SESSION['captcha_action']);
}
//Retrieve page contents
return null;
}
private function collectUserData(){
$html = $this->handleCaptchaResponse();
// Retrieve page contents
if(is_null($html)) {
$header = array('Accept-Language: ' . getEnv('HTTP_ACCEPT_LANGUAGE') . "\r\n");
// First character cannot be a forward slash
if(strpos($this->getInput('u'), "/") === 0) {
returnClientError('Remove leading slash "/" from the username!');
}
$header = array('Accept-Language: ' . getEnv('HTTP_ACCEPT_LANGUAGE'));
$html = getSimpleHTMLDOM($this->getURI(), $header)
or returnServerError('No results for this query.');
if(!strpos($this->getInput('u'), "/")) {
$html = getSimpleHTMLDOM(self::URI . urlencode($this->getInput('u')) . '?_fb_noscript=1', $header)
or returnServerError('No results for this query.');
} else {
$html = getSimpleHTMLDOM(self::URI . 'pages/' . $this->getInput('u') . '?_fb_noscript=1', $header)
or returnServerError('No results for this query.');
}
}
//Handle captcha form?
// Handle captcha form?
$captcha = $html->find('div.captcha_interstitial', 0);
if (!is_null($captcha)) {
//Save form for submitting after getting captcha response
if (session_status() == PHP_SESSION_NONE)
session_start();
$captcha_fields = array();
foreach ($captcha->find('input, button') as $input)
$captcha_fields[$input->name] = $input->value;
$_SESSION['captcha_fields'] = $captcha_fields;
$_SESSION['captcha_action'] = $captcha->find('form', 0)->action;
//Show captcha filling form to the viewer, proxying the captcha image
$img = base64_encode(getContents($captcha->find('img', 0)->src));
http_response_code(500);
header('Content-Type: text/html');
$message = <<<EOD
<form method="post" action="?{$_SERVER['QUERY_STRING']}">
<h2>Facebook captcha challenge</h2>
<p>Unfortunately, rss-bridge cannot fetch the requested page.<br />
Facebook wants rss-bridge to resolve the following captcha:</p>
<p><img src="data:image/png;base64,{$img}" /></p>
<p><b>Response:</b> <input name="captcha_response" placeholder="please fill in" />
<input type="submit" value="Submit!" /></p>
</form>
EOD;
die($message);
if (!is_null($captcha)) {
$this->returnCaptchaMessage($captcha);
}
//No captcha? We can carry on retrieving page contents :)
//First, we check whether the page is public or not
// No captcha? We can carry on retrieving page contents :)
// First, we check whether the page is public or not
$loginForm = $html->find('._585r', 0);
if($loginForm != null) {
returnServerError('You must be logged in to view this page. This is not supported by RSS-Bridge.');
}
@ -174,16 +510,14 @@ EOD;
->find('#pagelet_timeline_main_column')[0]
->children(0)
->children(0)
->children(0)
->next_sibling()
->children(0);
if(isset($element)) {
$author = str_replace(' | Facebook', '', $html->find('title#pageTitle', 0)->innertext);
$profilePic = 'https://graph.facebook.com/'
. $this->getInput('u')
. '/picture?width=200&amp;height=200';
$profilePic = $html->find('meta[property="og:image"]', 0)->content;
$this->authorName = $author;
@ -195,6 +529,12 @@ EOD;
$posts = array($cell);
}
// Optionally skip reviews
if($this->getInput('skip_reviews')
&& !is_null($cell->find('#review_composer_container', 0))) {
continue;
}
foreach($posts as $post) {
// Check media type
switch($this->getInput('media_type')) {
@ -233,13 +573,18 @@ EOD;
'',
$content);
//Remove html nodes, keep only img, links, basic formatting
// Remove "SpSonsSoriSsés"
$content = preg_replace(
'/(?iU)<a [^>]+ href="#" role="link" [^>}]+>.+<\/a>/iU',
'',
$content);
// Remove html nodes, keep only img, links, basic formatting
$content = strip_tags($content, '<a><img><i><u><br><p>');
//Adapt link hrefs: convert relative links into absolute links and bypass external link redirection
$content = preg_replace_callback('/ href=\"([^"]+)\"/i', $unescape_fb_link, $content);
$content = $this->unescape_fb_link($content);
//Clean useless html tag properties and fix link closing tags
// Clean useless html tag properties and fix link closing tags
foreach (array(
'onmouseover',
'onclick',
@ -252,35 +597,45 @@ EOD;
'aria-[^=]*',
'role',
'rel',
'id') as $property_name)
$content = preg_replace('/ ' . $property_name . '=\"[^"]*\"/i', '', $content);
'id') as $property_name) {
$content = preg_replace('/ ' . $property_name . '=\"[^"]*\"/i', '', $content);
}
$content = preg_replace('/<\/a [^>]+>/i', '</a>', $content);
//Convert textual representation of emoticons eg
//"<i><u>smile emoticon</u></i>" back to ASCII emoticons eg ":)"
$content = preg_replace_callback(
'/<i><u>([^ <>]+) ([^<>]+)<\/u><\/i>/i',
$unescape_fb_emote,
$content
);
$content = $this->unescape_fb_emote($content);
// Restore links in the post before further parsing
$post = defaultLinkTo($post, self::URI);
// Restore links in the content before adding to the item
$content = defaultLinkTo($content, self::URI);
// Retrieve date of the post
$date = $post->find('abbr')[0];
//Retrieve date of the post
$date = $post->find("abbr")[0];
if(isset($date) && $date->hasAttribute('data-utime')) {
$date = $date->getAttribute('data-utime');
} else {
$date = 0;
}
//Build title from username and content
// Build title from username and content
$title = $author;
if(strlen($title) > 24)
$title = substr($title, 0, strpos(wordwrap($title, 24), "\n")) . '...';
$title = $title . ' | ' . strip_tags($content);
if(strlen($title) > 64)
$title = substr($title, 0, strpos(wordwrap($title, 64), "\n")) . '...';
$uri = self::URI . $post->find('abbr')[0]->parent()->getAttribute('href');
$uri = $post->find('abbr')[0]->parent()->getAttribute('href');
if (false !== strpos($uri, '?')) {
$uri = substr($uri, 0, strpos($uri, '?'));
}
//Build and add final item
$item['uri'] = htmlspecialchars_decode($uri);
@ -288,6 +643,11 @@ EOD;
$item['title'] = $title;
$item['author'] = $author;
$item['timestamp'] = $date;
if(strpos($item['content'], '<img') === false) {
$item['enclosures'] = array($profilePic);
}
$this->items[] = $item;
}
}
@ -295,12 +655,6 @@ EOD;
}
}
public function getName(){
if(!empty($this->authorName)) {
return isset($this->extraInfos['name']) ? $this->extraInfos['name'] : $this->authorName
. ' - Facebook Bridge';
}
#endregion (User)
return parent::getName();
}
}
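
The emoticon conversion that this diff moves into `unescape_fb_emote` can be tried in isolation. The following is a minimal sketch, not the bridge's code: `convertEmoticons` and its shortened lookup table are hypothetical names chosen for illustration, and only the pattern is copied from the diff. As a simplification it checks the first captured word only. Since `preg_replace_callback` returns the converted string rather than modifying its argument, the caller has to assign the result.

```php
<?php
// Hypothetical stand-alone helper; the bridge keeps this logic in a private method.
function convertEmoticons($content) {
	// Shortened lookup table (assumption: only a few entries, for brevity)
	$emoticons = array(
		'smile' => ':)',
		'frown' => ':(',
		'wink' => ';)',
		'heart' => '<3',
	);

	return preg_replace_callback(
		'/<i><u>([^ <>]+) ([^<>]+)<\/u><\/i>/i',
		function($matches) use ($emoticons) {
			// $matches[1] holds the emoticon name, e.g. "smile"
			if(isset($emoticons[$matches[1]])) {
				return $emoticons[$matches[1]];
			}
			// Unknown names are left untouched
			return $matches[0];
		},
		$content
	);
}

// The return value must be assigned; the input string is not changed in place.
$converted = convertEmoticons('Hello <i><u>smile emoticon</u></i>!');
echo $converted, PHP_EOL; // prints "Hello :)!"
```

Passing the table into the closure with `use` keeps the lookup an array access instead of a nested loop over every name.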


@ -8,17 +8,22 @@ class FierPandaBridge extends BridgeAbstract {
const DESCRIPTION = 'Returns latest articles from Fier Panda.';
public function collectData(){
$html = getSimpleHTMLDOM(self::URI)
or returnServerError('Could not request Fier Panda.');
foreach($html->find('div.container-content article') as $element) {
defaultLinkTo($html, static::URI);
foreach($html->find('article') as $article) {
$item = array();
$item['uri'] = $this->getURI() . $element->find('a', 0)->href;
$item['title'] = trim($element->find('h1 a', 0)->innertext);
// Remove the link at the end of the article
$element->find('p a', 0)->outertext = '';
$item['content'] = $element->find('p', 0)->innertext;
$item['uri'] = $article->find('a', 0)->href;
$item['title'] = $article->find('a', 0)->title;
$this->items[] = $item;
}
}
}


@ -6,6 +6,7 @@ class FilterBridge extends FeedExpander {
const NAME = 'Filter';
const CACHE_TIMEOUT = 3600; // 1h
const DESCRIPTION = 'Filters a feed of your choice';
const URI = 'https://github.com/rss-bridge/rss-bridge';
const PARAMETERS = array(array(
'url' => array(
@ -26,11 +27,34 @@ class FilterBridge extends FeedExpander {
),
'defaultValue' => 'permit',
),
'title_from_content' => array(
'name' => 'Generate title from content',
'type' => 'checkbox',
'required' => false,
)
));
protected function parseItem($newItem){
$item = parent::parseItem($newItem);
if($this->getInput('title_from_content') && array_key_exists('content', $item)) {
$content = str_get_html($item['content']);
$pos = strpos($item['content'], ' ', 50);
$item['title'] = substr(
$content->plaintext,
0,
$pos
);
if(strlen($content->plaintext) >= $pos) {
$item['title'] .= '...';
}
}
switch(true) {
case $this->getFilterType() === 'permit':
if (preg_match($this->getFilter(), $item['title'])) {


@ -30,30 +30,76 @@ class FlickrBridge extends BridgeAbstract {
'title' => 'Insert username (as shown in the address bar)',
'exampleValue' => 'flickr'
)
),
)
);
public function collectData(){
switch($this->queriedContext) {
case 'Explore':
$key = 'photos';
$filter = 'photo-lite-models';
$html = getSimpleHTMLDOM(self::URI . 'explore')
or returnServerError('Could not request Flickr.');
break;
case 'By keyword':
$key = 'photos';
$filter = 'photo-lite-models';
$html = getSimpleHTMLDOM(self::URI . 'search/?q=' . urlencode($this->getInput('q')) . '&s=rec')
or returnServerError('No results for this query.');
break;
case 'By username':
$key = 'photoPageList';
$filter = 'photo-models';
$html = getSimpleHTMLDOM(self::URI . 'photos/' . urlencode($this->getInput('u')))
or returnServerError('Requested username can\'t be found.');
break;
default:
returnClientError('Invalid context: ' . $this->queriedContext);
}
$model_json = $this->extractJsonModel($html);
$photo_models = $this->getPhotoModels($model_json, $filter);
foreach($photo_models as $model) {
$item = array();
/* Author name depends on scope. On a keyword search the
* author is part of the picture data. On a username search
* the author is part of the owner data.
*/
if(array_key_exists('username', $model)) {
$item['author'] = $model['username'];
} elseif (array_key_exists('owner', reset($model_json)[0])) {
$item['author'] = reset($model_json)[0]['owner']['username'];
}
$item['title'] = (array_key_exists('title', $model) ? $model['title'] : 'Untitled');
$item['uri'] = self::URI . 'photo.gne?id=' . $model['id'];
$description = (array_key_exists('description', $model) ? $model['description'] : '');
$item['content'] = '<a href="'
. $item['uri']
. '"><img src="'
. $this->extractContentImage($model)
. '" style="max-width: 640px; max-height: 480px;"/></a><br><p>'
. $description
. '</p>';
$item['enclosures'] = $this->extractEnclosures($model);
$this->items[] = $item;
}
}
private function extractJsonModel($html) {
// Find SCRIPT containing JSON data
$model = $html->find('.modelExport', 0);
$model_text = $model->innertext;
@ -62,59 +108,79 @@ class FlickrBridge extends BridgeAbstract {
$start = strpos($model_text, 'modelExport:') + strlen('modelExport:');
$end = strpos($model_text, 'auth:') - strlen('auth:');
// Dissect JSON data and remove trailing comma
// Extract JSON data, remove trailing comma
$model_text = trim(substr($model_text, $start, $end - $start));
$model_text = substr($model_text, 0, strlen($model_text) - 1);
$model_json = json_decode($model_text, true);
return json_decode($model_text, true);
foreach($html->find('.photo-list-photo-view') as $element) {
// Get the styles
$style = explode(';', $element->style);
// Get the background-image style
$backgroundImage = explode(':', end($style));
// URI type : url(//cX.staticflickr.com/X/XXXXX/XXXXXXXXX.jpg)
$imageURI = trim(str_replace(['url(', ')'], '', end($backgroundImage)));
// Get the image ID
$imageURIs = explode('_', basename($imageURI));
$imageID = reset($imageURIs);
// Use JSON data to build items
foreach(reset($model_json)[0][$key]['_data'] as $element) {
if($element['id'] === $imageID) {
$item = array();
/* Author name depends on scope. On a keyword search the
* author is part of the picture data. On a username search
* the author is part of the owner data.
*/
if(array_key_exists('username', $element)) {
$item['author'] = $element['username'];
} elseif (array_key_exists('owner', reset($model_json)[0])) {
$item['author'] = reset($model_json)[0]['owner']['username'];
}
$item['title'] = (array_key_exists('title', $element) ? $element['title'] : 'Untitled');
$item['uri'] = self::URI . 'photo.gne?id=' . $imageID;
$description = (array_key_exists('description', $element) ? $element['description'] : '');
$item['content'] = '<a href="'
. $item['uri']
. '"><img src="'
. $imageURI
. '" /></a><br><p>'
. $description
. '</p>';
$this->items[] = $item;
break;
}
}
}
}
private function getPhotoModels($json, $filter) {
// The JSON model contains a "legend" array, where each element contains
// the path to an element in the "main" object
$photo_models = array();
foreach($json['legend'] as $legend) {
$photo_model = $json['main'];
foreach($legend as $element) { // Traverse tree
$photo_model = $photo_model[$element];
}
// We are only interested in content
if($photo_model['_flickrModelRegistry'] === $filter) {
$photo_models[] = $photo_model;
}
}
return $photo_models;
}
private function extractEnclosures($model) {
$areas = array();
foreach($model['sizes'] as $size) {
$areas[$size['width'] * $size['height']] = $size['url'];
}
return array($this->fixURL($areas[max(array_keys($areas))]));
}
private function extractContentImage($model) {
$areas = array();
$limit = 320 * 240;
foreach($model['sizes'] as $size) {
$image_area = $size['width'] * $size['height'];
if($image_area >= $limit) {
$areas[$image_area] = $size['url'];
}
}
return $this->fixURL($areas[min(array_keys($areas))]);
}
private function fixURL($url) {
// For some reason the image URLs don't include the protocol (https)
if(strpos($url, '//') === 0) {
$url = 'https:' . $url;
}
return $url;
}
}
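
The legend traversal and the size selection in the Flickr bridge can be illustrated on a tiny data set. This is a sketch under stated assumptions: `selectModels`, `largestSizeUrl`, and the sample array are made up for illustration, and real Flickr pages carry many more fields. The size selection here tracks the pixel area explicitly.

```php
<?php
// Resolve every path listed in "legend" against "main" and keep the models
// whose registry name matches $filter (mirrors the traversal in the diff).
function selectModels(array $json, $filter) {
	$models = array();
	foreach($json['legend'] as $path) {
		$node = $json['main'];
		foreach($path as $key) { // Walk down the tree one key at a time
			$node = $node[$key];
		}
		if(isset($node['_flickrModelRegistry'])
		&& $node['_flickrModelRegistry'] === $filter) {
			$models[] = $node;
		}
	}
	return $models;
}

// Return the URL of the size with the largest pixel area, or null if empty.
function largestSizeUrl(array $sizes) {
	$best = null;
	foreach($sizes as $size) {
		$area = $size['width'] * $size['height'];
		if($best === null || $area > $best['area']) {
			$best = array('area' => $area, 'url' => $size['url']);
		}
	}
	return $best === null ? null : $best['url'];
}

// Made-up sample data (assumption: not a real Flickr response)
$json = array(
	'legend' => array(
		array('photos', '_data', 0),
		array('owner'),
	),
	'main' => array(
		'photos' => array('_data' => array(
			array(
				'_flickrModelRegistry' => 'photo-lite-models',
				'id' => '1',
				'sizes' => array(
					array('width' => 100, 'height' => 75, 'url' => '//example.com/1_t.jpg'),
					array('width' => 640, 'height' => 480, 'url' => '//example.com/1_z.jpg'),
				),
			),
		)),
		'owner' => array('_flickrModelRegistry' => 'person-models'),
	),
);

$models = selectModels($json, 'photo-lite-models');
echo count($models), ' ', largestSizeUrl($models[0]['sizes']), PHP_EOL;
```

Comparing areas in the loop avoids relying on `max()` or `min()` over an array, which in PHP compare the array's values rather than its keys.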


@ -15,47 +15,47 @@ class FootitoBridge extends BridgeAbstract {
$content = trim($element->innertext);
$content = str_replace(
"<img",
'<img',
"<img style='float : left;'",
$content );
$content = str_replace(
"class=\"logo\"",
'class="logo"',
"style='float : left;'",
$content );
$content = str_replace(
"class=\"contenu\"",
'class="contenu"',
"style='margin-left : 60px;'",
$content );
$content = str_replace(
"class=\"responsive-comment\"",
'class="responsive-comment"',
"style='border-top : 1px #DDD solid; background-color : white; padding : 10px;'",
$content );
$content = str_replace(
"class=\"jaime\"",
'class="jaime"',
"style='display : none;'",
$content );
$content = str_replace(
"class=\"auteur-event responsive\"",
'class="auteur-event responsive"',
"style='display : none;'",
$content );
$content = str_replace(
"class=\"report-abuse-button\"",
'class="report-abuse-button"',
"style='display : none;'",
$content );
$content = str_replace(
"class=\"reaction clearfix\"",
'class="reaction clearfix"',
"style='margin : 10px 0px; padding : 5px; border-bottom : 1px #DDD solid;'",
$content );
$content = str_replace(
"class=\"infos\"",
'class="infos"',
"style='font-size : 0.7em;'",
$content );

bridges/ForGifsBridge.php Normal file

@ -0,0 +1,41 @@
<?php
class ForGifsBridge extends FeedExpander {
const MAINTAINER = 'logmanoriginal';
const NAME = 'forgifs Bridge';
const URI = 'https://forgifs.com';
const DESCRIPTION = 'Returns the forgifs feed with actual gifs instead of images';
public function collectData() {
$this->collectExpandableDatas('https://forgifs.com/gallery/srss/7');
}
protected function parseItem($feedItem) {
$item = parent::parseItem($feedItem);
$content = str_get_html($item['content']);
$img = $content->find('img', 0);
$poster = $img->src;
// The actual gif is the same path but its id must be decremented by one.
// Example:
// http://forgifs.com/gallery/d/279419-2/Reporter-videobombed-shoulder-checks.gif
// http://forgifs.com/gallery/d/279418-2/Reporter-videobombed-shoulder-checks.gif
// Notice how this changes ----------^
// Now let's extract that number and do some math
// Notice: Technically we could also load the content page but that would
// require unnecessary traffic. As long as it works...
$num = substr($img->src, 29, 6);
$num -= 1;
$img->src = substr_replace($img->src, $num, 29, strlen($num));
$img->width = 'auto';
$img->height = 'auto';
$item['content'] = $content;
return $item;
}
}
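
The id arithmetic described in the comments of `parseItem` can also be expressed with a pattern instead of fixed string offsets. The snippet below is a sketch of that alternative, not the bridge's implementation: `posterToGifUrl` is a hypothetical helper, and it assumes the id is the run of digits between `/gallery/d/` and the following hyphen, as in the example URLs quoted in the comments.

```php
<?php
// Hypothetical helper: decrement the numeric id that follows "/gallery/d/".
function posterToGifUrl($posterUrl) {
	return preg_replace_callback(
		'#(/gallery/d/)(\d+)(-)#',
		function($matches) {
			// $matches[2] is the poster id; the gif uses the id minus one
			return $matches[1] . ((int)$matches[2] - 1) . $matches[3];
		},
		$posterUrl,
		1 // Replace the first occurrence only
	);
}

echo posterToGifUrl(
	'http://forgifs.com/gallery/d/279419-2/Reporter-videobombed-shoulder-checks.gif'
), PHP_EOL;
// prints "http://forgifs.com/gallery/d/279418-2/Reporter-videobombed-shoulder-checks.gif"
```

Matching on the path segment means the result does not depend on the length of the scheme (`http://` versus `https://`) or on the number of digits in the id.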


@ -30,7 +30,7 @@ class FourchanBridge extends BridgeAbstract {
public function collectData(){
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError("Could not request 4chan, thread not found");
or returnServerError('Could not request 4chan, thread not found');
foreach($html->find('div.postContainer') as $element) {
$item = array();


@ -3,7 +3,7 @@ class FuturaSciencesBridge extends FeedExpander {
const MAINTAINER = 'ORelio';
const NAME = 'Futura-Sciences Bridge';
const URI = 'http://www.futura-sciences.com/';
const URI = 'https://www.futura-sciences.com/';
const DESCRIPTION = 'Returns the newest articles.';
const PARAMETERS = array( array(
@ -90,42 +90,11 @@ class FuturaSciencesBridge extends FeedExpander {
or returnServerError('Could not request Futura-Sciences: ' . $item['uri']);
$item['content'] = $this->extractArticleContent($article);
$author = $this->extractAuthor($article);
$item['author'] = empty($author) ? $item['author'] : $author;
if (!empty($author))
$item['author'] = $author;
return $item;
}
private function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
} return $string;
}
private function stripRecursiveHTMLSection($string, $tag_name, $tag_start){
$open_tag = '<' . $tag_name;
$close_tag = '</' . $tag_name . '>';
$close_tag_length = strlen($close_tag);
if(strpos($tag_start, $open_tag) === 0) {
while(strpos($string, $tag_start) !== false) {
$max_recursion = 100;
$section_to_remove = null;
$section_start = strpos($string, $tag_start);
$search_offset = $section_start;
do {
$max_recursion--;
$section_end = strpos($string, $close_tag, $search_offset);
$search_offset = $section_end + $close_tag_length;
$section_to_remove = substr($string, $section_start, $section_end - $section_start + $close_tag_length);
$open_tag_count = substr_count($section_to_remove, $open_tag);
$close_tag_count = substr_count($section_to_remove, $close_tag);
} while ($open_tag_count > $close_tag_count && $max_recursion > 0);
$string = str_replace($section_to_remove, '', $string);
}
}
return $string;
}
private function extractArticleContent($article){
$contents = $article->find('section.article-text-classic', 0)->innertext;
$headline = trim($article->find('p.description', 0)->plaintext);
@ -137,6 +106,7 @@ class FuturaSciencesBridge extends FeedExpander {
'<div class="sharebar2',
'<div class="diaporamafullscreen"',
'<div class="module social-button',
'<div class="module social-share',
'<div style="margin-bottom:10px;" class="noprint"',
'<div class="ficheprevnext',
'<div class="bar noprint',
@ -148,16 +118,17 @@ class FuturaSciencesBridge extends FeedExpander {
'<div id="forumcomments',
'<div ng-if="active"'
) as $div_start) {
$contents = $this->stripRecursiveHTMLSection($contents, 'div', $div_start);
$contents = stripRecursiveHTMLSection($contents, 'div', $div_start);
}
$contents = $this->stripWithDelimiters($contents, '<hr ', '/>');
$contents = $this->stripWithDelimiters($contents, '<p class="content-date', '</p>');
$contents = $this->stripWithDelimiters($contents, '<h1 class="content-title', '</h1>');
$contents = $this->stripWithDelimiters($contents, 'fs:definition="', '"');
$contents = $this->stripWithDelimiters($contents, 'fs:xt:clicktype="', '"');
$contents = $this->stripWithDelimiters($contents, 'fs:xt:clickname="', '"');
$contents = $this->stripWithDelimiters($contents, '<script ', '</script>');
$contents = stripWithDelimiters($contents, '<hr ', '/>');
$contents = stripWithDelimiters($contents, '<p class="content-date', '</p>');
$contents = stripWithDelimiters($contents, '<h1 class="content-title', '</h1>');
$contents = stripWithDelimiters($contents, 'fs:definition="', '"');
$contents = stripWithDelimiters($contents, 'fs:xt:clicktype="', '"');
$contents = stripWithDelimiters($contents, 'fs:xt:clickname="', '"');
$contents = stripWithDelimiters($contents, '<section class="module-toretain module-propal-nl', '</section>');
$contents = stripWithDelimiters($contents, '<script ', '</script>');
return $headline . trim($contents);
}


@ -20,50 +20,58 @@ class GBAtempBridge extends BridgeAbstract {
)
));
private function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
return false;
}
private function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
private function buildItem($uri, $title, $author, $timestamp, $content){
private function buildItem($uri, $title, $author, $timestamp, $thumbnail, $content){
$item = array();
$item['uri'] = $uri;
$item['title'] = $title;
$item['author'] = $author;
$item['timestamp'] = $timestamp;
$item['content'] = $content;
if (!empty($thumbnail)) {
$item['enclosures'] = array($thumbnail);
}
return $item;
}
private function cleanupPostContent($content, $site_url){
$content = str_replace(':arrow:', '&#x27a4;', $content);
$content = str_replace('href="attachments/', 'href="'.$site_url.'attachments/', $content);
$content = $this->stripWithDelimiters($content, '<script', '</script>');
$content = stripWithDelimiters($content, '<script', '</script>');
return $content;
}
private function findItemDate($item){
$time = 0;
$dateField = $item->find('abbr.DateTime', 0);
if (is_object($dateField)) {
$time = intval(
extractFromDelimiters(
$dateField->outertext,
'data-time="',
'"'
)
);
} else {
$dateField = $item->find('span.DateTime', 0);
$time = DateTime::createFromFormat(
'M j, Y \a\t g:i A',
extractFromDelimiters(
$dateField->outertext,
'title="',
'"'
)
)->getTimestamp();
}
return $time;
}
private function fetchPostContent($uri, $site_url){
$html = getSimpleHTMLDOM($uri);
$html = getSimpleHTMLDOMCached($uri);
if(!$html) {
return 'Could not request GBAtemp ' . $uri;
return 'Could not request GBAtemp: ' . $uri;
}
$content = $html->find('div.messageContent', 0)->innertext;
$content = $html->find('div.messageContent, blockquote.baseHtml', 0)->innertext;
return $this->cleanupPostContent($content, $site_url);
}
@ -76,70 +84,56 @@ class GBAtempBridge extends BridgeAbstract {
case 'N':
foreach($html->find('li[class=news_item full]') as $newsItem) {
$url = self::URI . $newsItem->find('a', 0)->href;
$time = intval(
$this->extractFromDelimiters(
$newsItem->find('abbr.DateTime', 0)->outertext,
'data-time="',
'"'
)
);
$img = $this->getURI() . $newsItem->find('img', 0)->src . '#.image';
$time = $this->findItemDate($newsItem);
$author = $newsItem->find('a.username', 0)->plaintext;
$title = $newsItem->find('a', 1)->plaintext;
$content = $this->fetchPostContent($url, self::URI);
$this->items[] = $this->buildItem($url, $title, $author, $time, $content);
$this->items[] = $this->buildItem($url, $title, $author, $time, $img, $content);
unset($newsItem); // Some items are heavy, freeing the item proactively helps save memory
}
break;
case 'R':
foreach($html->find('li.portal_review') as $reviewItem) {
$url = self::URI . $reviewItem->find('a', 0)->href;
$img = $this->getURI() . extractFromDelimiters($reviewItem->find('a', 0)->style, 'image:url(', ')');
$title = $reviewItem->find('span.review_title', 0)->plaintext;
$content = getSimpleHTMLDOM($url)
or returnServerError('Could not request GBAtemp: ' . $url);
$author = $content->find('a.username', 0)->plaintext;
$time = intval(
$this->extractFromDelimiters(
$content->find('abbr.DateTime', 0)->outertext,
'data-time="',
'"'
)
);
$time = $this->findItemDate($content);
$intro = '<p><b>' . ($content->find('div#review_intro', 0)->plaintext) . '</b></p>';
$review = $content->find('div#review_main', 0)->innertext;
$subheader = '<p><b>' . $content->find('div.review_subheader', 0)->plaintext . '</b></p>';
$procons = $content->find('table.review_procons', 0)->outertext;
$scores = $content->find('table.reviewscores', 0)->outertext;
$content = $this->cleanupPostContent($intro . $review . $subheader . $procons . $scores, self::URI);
$this->items[] = $this->buildItem($url, $title, $author, $time, $content);
$this->items[] = $this->buildItem($url, $title, $author, $time, $img, $content);
unset($reviewItem); // Free up memory
}
break;
case 'T':
foreach($html->find('li.portal-tutorial') as $tutorialItem) {
$url = self::URI . $tutorialItem->find('a', 0)->href;
$title = $tutorialItem->find('a', 0)->plaintext;
$time = intval(
$this->extractFromDelimiters(
$tutorialItem->find('abbr.DateTime', 0)->outertext,
'data-time="',
'"'
)
);
$time = $this->findItemDate($tutorialItem);
$author = $tutorialItem->find('a.username', 0)->plaintext;
$content = $this->fetchPostContent($url, self::URI);
$this->items[] = $this->buildItem($url, $title, $author, $time, $content);
$this->items[] = $this->buildItem($url, $title, $author, $time, null, $content);
unset($tutorialItem); // Free up memory
}
break;
case 'F':
foreach($html->find('li.rc_item') as $postItem) {
$url = self::URI . $postItem->find('a', 1)->href;
$title = $postItem->find('a', 1)->plaintext;
$time = intval(
$this->extractFromDelimiters(
$postItem->find('abbr.DateTime', 0)->outertext,
'data-time="',
'"'
)
);
$time = $this->findItemDate($postItem);
$author = $postItem->find('a.username', 0)->plaintext;
$content = $this->fetchPostContent($url, self::URI);
$this->items[] = $this->buildItem($url, $title, $author, $time, $content);
$this->items[] = $this->buildItem($url, $title, $author, $time, null, $content);
unset($postItem); // Free up memory
}
break;
}
}

66
bridges/GOGBridge.php Normal file

@ -0,0 +1,66 @@
<?php
class GOGBridge extends BridgeAbstract {
const NAME = 'GOGBridge';
const MAINTAINER = 'teromene';
const URI = 'https://gog.com';
const DESCRIPTION = 'Returns the latest releases from GOG.com';
public function collectData() {
$values = getContents('https://www.gog.com/games/ajax/filtered?limit=25&sort=new') or
die('Unable to get the news pages from GOG !');
$decodedValues = json_decode($values);
$limit = 0;
foreach($decodedValues->products as $game) {
$item = array();
$item['author'] = $game->developer . ' / ' . $game->publisher;
$item['title'] = $game->title;
$item['id'] = $game->id;
$item['uri'] = self::URI . $game->url;
$item['content'] = $this->buildGameContentPage($game);
$item['timestamp'] = $game->globalReleaseDate;
foreach($game->gallery as $image) {
$item['enclosures'][] = $image . '.jpg';
}
$this->items[] = $item;
$limit += 1;
if($limit == 10) break;
}
}
private function buildGameContentPage($game) {
$gameDescriptionText = getContents('https://api.gog.com/products/' . $game->id . '?expand=description') or
die('Unable to get game description from GOG !');
$gameDescriptionValue = json_decode($gameDescriptionText);
$content = 'Genres: ';
$content .= implode(', ', $game->genres);
$content .= '<br />Supported Platforms: ';
if($game->worksOn->Windows) {
$content .= 'Windows ';
}
if($game->worksOn->Mac) {
$content .= 'Mac ';
}
if($game->worksOn->Linux) {
$content .= 'Linux ';
}
$content .= '<br />' . $gameDescriptionValue->description->full;
return $content;
}
}


@ -0,0 +1,119 @@
<?php
/**
* An extension of the previous SexactuBridge to cover the whole GQMagazine.
* This one takes a page (for example sexe/news or journaliste/maia-mazaurette) which is to be configured,
* reads all the articles visible on that page, and makes a stream out of them.
* @author nicolas-delsaux
*
*/
class GQMagazineBridge extends BridgeAbstract
{
const MAINTAINER = 'Riduidel';
const NAME = 'GQMagazine';
// URI is no longer valid, since we can address the whole GQ galaxy
const URI = 'https://www.gqmagazine.fr';
const CACHE_TIMEOUT = 7200; // 2h
const DESCRIPTION = 'GQMagazine section extractor bridge. This bridge allows you to get only a specific section.';
const PARAMETERS = array( array(
'domain' => array(
'name' => 'Domain to use',
'required' => true,
'values' => array(
'www.gqmagazine.fr' => 'www.gqmagazine.fr'
),
'defaultValue' => 'www.gqmagazine.fr'
),
'page' => array(
'name' => 'Initial page to load',
'required' => true
),
));
const REPLACED_ATTRIBUTES = array(
'href' => 'href',
'src' => 'src',
'data-original' => 'src'
);
private function getDomain() {
return $this->getInput('domain');
}
public function getURI()
{
return $this->getDomain() . '/' . $this->getInput('page');
}
public function collectData()
{
$html = getSimpleHTMLDOM($this->getURI()) or returnServerError('Could not request ' . $this->getURI());
// Since GQ doesn't want simple class scraping, let's do it the hard way and ... discover content!
$main = $html->find('main', 0);
foreach ($main->find('a') as $link) {
$uri = $link->href;
$title = $link->find('h2', 0);
$date = $link->find('time', 0);
$item = array();
$author = $link->find('span[itemprop=name]', 0);
$item['author'] = $author->plaintext;
$item['title'] = $title->plaintext;
if(substr($uri, 0, 1) === 'h') { // absolute uri
$item['uri'] = $uri;
} else if(substr($uri, 0, 1) === '/') { // domain relative url
$item['uri'] = $this->getDomain() . $uri;
} else {
$item['uri'] = $this->getDomain() . '/' . $uri;
}
$article = $this->loadFullArticle($item['uri']);
if($article) {
$item['content'] = $this->replaceUriInHtmlElement($article);
} else {
$item['content'] = "<strong>Article body couldn't be loaded</strong>. It must be a bug!";
}
$short_date = $date->datetime;
$item['timestamp'] = strtotime($short_date);
$this->items[] = $item;
}
}
/**
* Loads the full article and returns the contents
* @param $uri The article URI
* @return The article content
*/
private function loadFullArticle($uri){
$html = getSimpleHTMLDOMCached($uri);
// Once again, the generated CSS class madness is an obstacle ... which I can get around easily
foreach($html->find('div') as $div) {
// List the CSS classes of that div
$classes = $div->class;
// I can't look up that class directly since GQ seems to generate random names like "ArticleBodySection-fkggUW"
if(strpos($classes, 'ArticleBodySection') !== false) {
return $div;
}
}
return null;
}
/**
* Replaces all relative URIs with absolute ones
* @param $element A simplehtmldom element
* @return The $element->innertext with all URIs replaced
*/
private function replaceUriInHtmlElement($element){
$returned = $element->innertext;
foreach (self::REPLACED_ATTRIBUTES as $initial => $final) {
$returned = str_replace($initial . '="/', $final . '="' . self::URI . '/', $returned);
}
return $returned;
}
}


@ -0,0 +1,164 @@
<?php
class GitHubGistBridge extends BridgeAbstract {
const NAME = 'GitHubGist comment bridge';
const URI = 'https://gist.github.com';
const DESCRIPTION = 'Generates feeds for Gist comments';
const MAINTAINER = 'logmanoriginal';
const CACHE_TIMEOUT = 3600;
const PARAMETERS = array(array(
'id' => array(
'name' => 'Gist',
'type' => 'text',
'required' => true,
'title' => 'Insert Gist ID or URI',
'exampleValue' => '2646763, https://gist.github.com/2646763'
)
));
private $filename;
public function getURI() {
$id = $this->getInput('id') ?: '';
$urlpath = parse_url($id, PHP_URL_PATH);
if($urlpath) {
$components = explode('/', $urlpath);
$id = end($components);
}
return static::URI . '/' . $id;
}
public function getName() {
return $this->filename ? $this->filename . ' - ' . static::NAME : static::NAME;
}
public function collectData() {
$html = getSimpleHTMLDOM($this->getURI(),
null,
null,
true,
true,
DEFAULT_TARGET_CHARSET,
false, // Do NOT remove line breaks
DEFAULT_BR_TEXT,
DEFAULT_SPAN_TEXT)
or returnServerError('Could not request ' . $this->getURI());
$html = defaultLinkTo($html, static::URI);
$fileinfo = $html->find('[class="file-info"]', 0)
or returnServerError('Could not find file info!');
$this->filename = $fileinfo->plaintext;
$comments = $html->find('div[class="timeline-comment-wrapper"]');
if(is_null($comments)) { // no comments yet
return;
}
foreach($comments as $comment) {
$uri = $comment->find('a[href^=#gistcomment]', 0)
or returnServerError('Could not find comment anchor!');
$title = $comment->find('div[class="unminimized-comment"] h3[class="timeline-comment-header-text"]', 0)
or returnServerError('Could not find comment header text!');
$datetime = $comment->find('[datetime]', 0)
or returnServerError('Could not find comment datetime!');
$author = $comment->find('a.author', 0)
or returnServerError('Could not find author name!');
$message = $comment->find('[class="comment-body"]', 0)
or returnServerError('Could not find comment body!');
$item = array();
$item['uri'] = $this->getURI() . $uri->href;
$item['title'] = str_replace('commented', 'commented on', $title->plaintext);
$item['timestamp'] = strtotime($datetime->datetime);
$item['author'] = '<a href="' . $author->href . '">' . $author->plaintext . '</a>';
$item['content'] = $this->fixContent($message);
// $item['enclosures'] = array();
// $item['categories'] = array();
$this->items[] = $item;
}
}
/** Removes all unnecessary tags and adds formatting */
private function fixContent($content){
// Restore code (inside <pre />) highlighting
foreach($content->find('pre') as $pre) {
$pre->style = <<<EOD
padding: 16px;
overflow: auto;
font-size: 85%;
line-height: 1.45;
background-color: #f6f8fa;
border-radius: 3px;
word-wrap: normal;
box-sizing: border-box;
margin-bottom: 16px;
EOD;
$code = $pre->find('code', 0);
if($code) {
$code->style = <<<EOD
white-space: pre;
word-break: normal;
EOD;
}
}
// find <code /> not inside <pre /> (`inline-code`)
foreach($content->find('code') as $code) {
if($code->parent()->tag === 'pre') {
continue;
}
$code->style = <<<EOD
background-color: rgba(27,31,35,0.05);
padding: 0.2em 0.4em;
border-radius: 3px;
EOD;
}
// restore text spacing
foreach($content->find('p') as $p) {
$p->style = 'margin-bottom: 16px;';
}
// Remove unnecessary tags
$content = strip_tags(
$content->innertext,
'<p><a><img><ol><ul><li><table><tr><th><td><strong><pre><code><br><hr><h>'
);
return $content;
}
}
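The ID handling in `getURI()` above accepts either a bare Gist ID or a full Gist URI. A minimal standalone sketch of that step, assuming a hypothetical helper name `extractGistId` that is not part of the bridge:

```php
<?php
// Hypothetical helper (illustration only, not part of GitHubGistBridge).
// Mirrors the getURI() logic: take the last path component of the input.
function extractGistId($input) {
    $path = parse_url($input, PHP_URL_PATH);
    if ($path) {
        $components = explode('/', $path);
        return end($components);
    }
    return $input;
}

echo extractGistId('https://gist.github.com/2646763'), PHP_EOL;
echo extractGistId('2646763'), PHP_EOL;
```

Both calls should yield the same ID, since a bare ID is parsed as a path with a single component. An input with a trailing slash would give an empty last component, which this sketch does not handle.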


@ -106,7 +106,7 @@ class GithubIssueBridge extends BridgeAbstract {
$content = $comment->parent()->innertext;
} else {
$title .= ' / ' . trim($comment->firstChild()->plaintext);
$content = "<pre>" . $comment->find('.comment-body', 0)->innertext . "</pre>";
$content = '<pre>' . $comment->find('.comment-body', 0)->innertext . '</pre>';
}
$item = array();


@ -34,13 +34,29 @@ class GithubSearchBridge extends BridgeAbstract {
$title = $element->find('h3', 0)->plaintext;
$item['title'] = $title;
if (count($element->find('p')) == 2) {
$content = $element->find('p', 0)->innertext;
// Description
if (count($element->find('p.d-inline-block')) != 0) {
$content = $element->find('p.d-inline-block', 0)->innertext;
} else{
$content = '';
$content = 'No description';
}
$item['content'] = $content;
// Tags
$content = $content . '<br />';
$tags = $element->find('a.topic-tag');
$tags_array = array();
if (count($tags) != 0) {
$content = $content . 'Tags : ';
foreach($tags as $tag_element) {
$tag_link = 'https://github.com' . $tag_element->href;
$tag_name = trim($tag_element->innertext);
$content = $content . '<a href="' . $tag_link . '">' . $tag_name . '</a> ';
array_push($tags_array, $tag_element->innertext);
}
}
$item['categories'] = $tags_array;
$item['content'] = $content;
$date = $element->find('relative-time', 0)->datetime;
$item['timestamp'] = strtotime($date);

222
bridges/GlassdoorBridge.php Executable file

@ -0,0 +1,222 @@
<?php
class GlassdoorBridge extends BridgeAbstract {
// Contexts
const CONTEXT_BLOG = 'Blogs';
const CONTEXT_REVIEW = 'Company Reviews';
const CONTEXT_GLOBAL = 'global';
// Global context parameters
const PARAM_LIMIT = 'limit';
// Blog context parameters
const PARAM_BLOG_TYPE = 'blog_type';
const PARAM_BLOG_FULL = 'full_article';
const BLOG_TYPE_HOME = 'Home';
const BLOG_TYPE_COMPANIES_HIRING = 'Companies Hiring';
const BLOG_TYPE_CAREER_ADVICE = 'Career Advice';
const BLOG_TYPE_INTERVIEWS = 'Interviews';
const BLOG_TYPE_GUIDE = 'Guides';
// Review context parameters
const PARAM_REVIEW_COMPANY = 'company';
const MAINTAINER = 'logmanoriginal';
const NAME = 'Glassdoor Bridge';
const URI = 'https://www.glassdoor.com/';
const DESCRIPTION = 'Returns feeds for blog posts and company reviews';
const CACHE_TIMEOUT = 86400; // 24 hours
const PARAMETERS = array(
self::CONTEXT_BLOG => array(
self::PARAM_BLOG_TYPE => array(
'name' => 'Blog type',
'type' => 'list',
'title' => 'Select the blog you want to follow',
'values' => array(
self::BLOG_TYPE_HOME => 'blog/',
self::BLOG_TYPE_COMPANIES_HIRING => 'blog/companies-hiring/',
self::BLOG_TYPE_CAREER_ADVICE => 'blog/career-advice/',
self::BLOG_TYPE_INTERVIEWS => 'blog/interviews/',
self::BLOG_TYPE_GUIDE => 'blog/guide/'
)
),
self::PARAM_BLOG_FULL => array(
'name' => 'Full article',
'type' => 'checkbox',
'title' => 'Enable to return the full article for each post'
),
),
self::CONTEXT_REVIEW => array(
self::PARAM_REVIEW_COMPANY => array(
'name' => 'Company URL',
'type' => 'text',
'required' => true,
'title' => 'Paste the company review page URL here!',
'exampleValue' => 'https://www.glassdoor.com/Reviews/GitHub-Reviews-E671945.htm'
)
),
self::CONTEXT_GLOBAL => array(
self::PARAM_LIMIT => array(
'name' => 'Limit',
'type' => 'number',
'defaultValue' => -1,
'title' => 'Specifies the maximum number of items to return (default: All)'
)
)
);
private $host = self::URI; // They redirect without notice :/
private $title = '';
public function getURI() {
switch($this->queriedContext) {
case self::CONTEXT_BLOG:
return self::URI . $this->getInput(self::PARAM_BLOG_TYPE);
case self::CONTEXT_REVIEW:
return $this->filterCompanyURI($this->getInput(self::PARAM_REVIEW_COMPANY));
}
return parent::getURI();
}
public function getName() {
return $this->title ? $this->title . ' - ' . self::NAME : parent::getName();
}
public function collectData() {
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Failed loading contents!');
$this->host = $html->find('link[rel="canonical"]', 0)->href;
$html = defaultLinkTo($html, $this->host);
$this->title = $html->find('meta[property="og:title"]', 0)->content;
$limit = $this->getInput(self::PARAM_LIMIT);
switch($this->queriedContext) {
case self::CONTEXT_BLOG:
$this->collectBlogData($html, $limit);
break;
case self::CONTEXT_REVIEW:
$this->collectReviewData($html, $limit);
break;
}
}
private function collectBlogData($html, $limit) {
$posts = $html->find('section')
or returnServerError('Unable to find blog posts!');
foreach($posts as $post) {
$item = array();
$item['uri'] = $post->find('header a', 0)->href;
$item['title'] = $post->find('header', 0)->plaintext;
$item['content'] = $post->find('div[class="excerpt-content"]', 0)->plaintext;
$item['enclosures'] = array(
$this->getFullSizeImageURI($post->find('div[class="post-thumb"]', 0)->{'data-original'})
);
// optionally load full articles
if($this->getInput(self::PARAM_BLOG_FULL)) {
$full_html = getSimpleHTMLDOMCached($item['uri'])
or returnServerError('Unable to load full article!');
$full_html = defaultLinkTo($full_html, $this->host);
$item['author'] = $full_html->find('a[rel="author"]', 0);
$item['content'] = $full_html->find('article', 0);
$item['timestamp'] = strtotime($full_html->find('time.updated', 0)->datetime);
$item['categories'] = $full_html->find('span[class="post_tag"]');
}
$this->items[] = $item;
if($limit > 0 && count($this->items) >= $limit)
return;
}
}
private function collectReviewData($html, $limit) {
$reviews = $html->find('#EmployerReviews li[id^="empReview"]')
or returnServerError('Unable to find reviews!');
foreach($reviews as $review) {
$item = array();
$item['uri'] = $review->find('a.reviewLink', 0)->href;
$item['title'] = $review->find('[class="summary"]', 0)->plaintext;
$item['author'] = $review->find('div.author span', 0)->plaintext;
$item['timestamp'] = strtotime($review->find('time', 0)->datetime);
$mainText = $review->find('p.mainText', 0)->plaintext;
$description = $review->find('div.prosConsAdvice', 0)->innertext;
$item['content'] = "<p>{$mainText}</p><p>{$description}</p>";
$this->items[] = $item;
if($limit > 0 && count($this->items) >= $limit)
return;
}
}
private function getFullSizeImageURI($uri) {
/* Images are scaled for display on the website. The scaling takes place
* on the host, which provides images in different sizes.
*
* For example:
* https://www.glassdoor.com/blog/app/uploads/sites/2/GettyImages-982402074-e1538092065712-390x193.jpg
*
* By removing the size information we receive the full sized image.
*
* For example:
* https://www.glassdoor.com/blog/app/uploads/sites/2/GettyImages-982402074-e1538092065712.jpg
*/
$uri = filter_var($uri, FILTER_SANITIZE_URL);
return preg_replace('/(.*)(\-\d+x\d+)(\.jpg)/', '$1$3', $uri);
}
private function filterCompanyURI($uri) {
/* Make sure the URI is a valid review page. Unfortunately there is no
* simple way to determine if the URI is valid, because of automagic
* redirection and strange naming conventions.
*/
if(!filter_var($uri,
FILTER_VALIDATE_URL,
FILTER_FLAG_SCHEME_REQUIRED | FILTER_FLAG_HOST_REQUIRED | FILTER_FLAG_PATH_REQUIRED)) {
returnClientError('The specified URL is invalid!');
}
$uri = filter_var($uri, FILTER_SANITIZE_URL);
$path = parse_url($uri, PHP_URL_PATH);
$parts = explode('/', $path);
$allowed_strings = array(
'de-DE' => 'Bewertungen',
'en-AU' => 'Reviews',
'nl-BE' => 'Reviews',
'fr-BE' => 'Avis',
'en-CA' => 'Reviews',
'fr-CA' => 'Avis',
'fr-FR' => 'Avis',
'en-IN' => 'Reviews',
'en-IE' => 'Reviews',
'nl-NL' => 'Reviews',
'de-AT' => 'Bewertungen',
'de-CH' => 'Bewertungen',
'fr-CH' => 'Avis',
'en-GB' => 'Reviews',
'en' => 'Reviews'
);
if(!in_array($parts[1], $allowed_strings)) {
returnClientError('Please specify a URL pointing to the company review page!');
}
return $uri;
}
}
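The size-stripping step described in the `getFullSizeImageURI` comment can be tried in isolation. The sketch below reuses the same regular expression; the function name `stripImageSize` is hypothetical and only serves the illustration:

```php
<?php
// Hypothetical standalone version of the size-stripping step
// (illustration only, not part of GlassdoorBridge).
function stripImageSize($uri) {
    // Drop a "-<width>x<height>" suffix that sits right before ".jpg".
    return preg_replace('/(.*)(\-\d+x\d+)(\.jpg)/', '$1$3', $uri);
}

$scaled = 'https://www.glassdoor.com/blog/app/uploads/sites/2/'
    . 'GettyImages-982402074-e1538092065712-390x193.jpg';

// prints https://www.glassdoor.com/blog/app/uploads/sites/2/GettyImages-982402074-e1538092065712.jpg
echo stripImageSize($scaled), PHP_EOL;
```

URIs without a size suffix pass through unchanged. Note that the pattern only covers `.jpg`, so other extensions would keep their size information.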


@ -19,26 +19,26 @@ class GoComicsBridge extends BridgeAbstract {
or returnServerError('Could not request GoComics: ' . $this->getURI());
//Get info from first page
$author = preg_replace('/By /', '', $html->find(".media-subheading", 0)->plaintext);
$author = preg_replace('/By /', '', $html->find('.media-subheading', 0)->plaintext);
$link = self::URI . $html->find(".gc-deck--cta-0", 0)->find('a', 0)->href;
$link = self::URI . $html->find('.gc-deck--cta-0', 0)->find('a', 0)->href;
for($i = 0; $i < 5; $i++) {
$item = array();
$page = getSimpleHTMLDOM($link)
or returnServerError('Could not request GoComics: ' . $link);
$imagelink = $page->find(".img-fluid", 1)->src;
$date = explode("/", $link);
$imagelink = $page->find('.img-fluid', 1)->src;
$date = explode('/', $link);
$item['id'] = $imagelink;
$item['uri'] = $link;
$item['author'] = $author;
$item['title'] = 'GoComics ' . $this->getInput('comicname');
$item['timestamp'] = DateTime::createFromFormat("Ymd", $date[5] . $date[6] . $date[7])->getTimestamp();
$item['timestamp'] = DateTime::createFromFormat('Ymd', $date[5] . $date[6] . $date[7])->getTimestamp();
$item['content'] = '<img src="' . $imagelink . '" />';
$link = self::URI . $page->find(".js-previous-comic", 0)->href;
$link = self::URI . $page->find('.js-previous-comic', 0)->href;
$this->items[] = $item;
}
}


@ -1,12 +1,12 @@
<?php
class GooglePlusPostBridge extends BridgeAbstract{
protected $_title;
protected $_url;
private $title;
private $url;
const MAINTAINER = 'Grummfy';
const MAINTAINER = 'Grummfy, logmanoriginal';
const NAME = 'Google Plus Post Bridge';
const URI = 'https://plus.google.com/';
const URI = 'https://plus.google.com';
const CACHE_TIMEOUT = 600; //10min
const DESCRIPTION = 'Returns user public post (without API).';
@ -14,10 +14,16 @@ class GooglePlusPostBridge extends BridgeAbstract{
'username' => array(
'name' => 'username or Id',
'required' => true
),
'include_media' => array(
'name' => 'Include media',
'type' => 'checkbox',
'title' => 'Enable to include media in the feed content'
)
));
public function collectData(){
$username = $this->getInput('username');
// Usernames start with a + if it's not an ID
@ -25,22 +31,20 @@ class GooglePlusPostBridge extends BridgeAbstract{
$username = '+' . $username;
}
// get content parsed
$html = getSimpleHTMLDOMCached(self::URI . urlencode($username) . '/posts')
$html = getSimpleHTMLDOM(static::URI . '/' . urlencode($username) . '/posts')
or returnServerError('No results for this query.');
// get title, url, ... there is a lot of interesting stuff in meta
$this->_title = $html->find('meta[property=og:title]', 0)->getAttribute('content');
$this->_url = $html->find('meta[property=og:url]', 0)->getAttribute('content');
$html = defaultLinkTo($html, static::URI);
$this->title = $html->find('meta[property=og:title]', 0)->getAttribute('content');
$this->url = $html->find('meta[property=og:url]', 0)->getAttribute('content');
// I don't even know where to start with this discusting html...
foreach($html->find('div[jsname=WsjYwc]') as $post) {
$item = array();
$item['author'] = $item['fullname'] = $post->find('div div div div a', 0)->innertext;
$item['id'] = $post->find('div div div', 0)->getAttribute('id');
$item['avatar'] = $post->find('div img', 0)->src;
$item['uri'] = self::URI . $post->find('div div div a', 1)->href;
$item['author'] = $post->find('div div div div a', 0)->innertext;
$item['uri'] = $post->find('div div div a', 1)->href;
$timestamp = $post->find('a.qXj2He span', 0);
@ -51,61 +55,151 @@ class GooglePlusPostBridge extends BridgeAbstract{
$timestamp->getAttribute('aria-label')));
}
// hashtag to treat : https://plus.google.com/explore/tag
// $hashtags = array();
// foreach($post->find('a.d-s') as $hashtag){
// $hashtags[trim($hashtag->plaintext)] = self::URI . $hashtag->href;
// }
$message = $post->find('div[jsname=EjRJtf]', 0);
$item['content'] = '';
// Empty messages are not supported right now
if(!$message) {
continue;
}
// avatar display
$item['content'] .= '<div style="float:left; margin: 0 0.5em 0.5em 0;"><a href="'
. self::URI
. urlencode($this->getInput('username'));
$item['content'] .= '"><img align="top" alt="'
$item['content'] = '<div style="float: left; padding: 0 10px 10px 0;"><a href="'
. $this->url
. '"><img align="top" alt="'
. $item['author']
. '" src="'
. $item['avatar']
. '" /></a></div>';
. $post->find('div img', 0)->src
. '" /></a></div><div>'
. trim(strip_tags($message, '<a><p><div><img>'))
. '</div>';
$content = $post->find('div[jsname=EjRJtf]', 0);
// extract plaintext
$item['content_simple'] = $content->plaintext;
$item['title'] = substr($item['content_simple'], 0, 72) . '...';
// XXX ugly but I don't have any idea how to do a better stuff,
// str_replace on link doesn't work as expected and ask too many checks
foreach($content->find('a') as $link) {
$hasHttp = strpos($link->href, 'http');
$hasDoubleSlash = strpos($link->href, '//');
if((!$hasHttp && !$hasDoubleSlash)
|| (false !== $hasHttp && strpos($link->href, 'http') != 0)
|| (false === $hasHttp && false !== $hasDoubleSlash && $hasDoubleSlash != 0)) {
// skipp bad link, for some hashtag or other stuff
if(strpos($link->href, '/') == 0) {
$link->href = substr($link->href, 1);
}
$link->href = self::URI . $link->href;
}
// Make title at least 50 characters long, but don't add '...' if it is shorter!
if(strlen($message->plaintext) > 50) {
$end = strpos($message->plaintext, ' ', 50) ?: strlen($message->plaintext);
} else {
$end = strlen($message->plaintext);
}
$content = $content->innertext;
$item['content'] .= '<div style="margin-top: -1.5em">' . $content . '</div>';
$item['content'] = trim(strip_tags($item['content'], '<a><p><div><img>'));
if(strlen(substr($message->plaintext, 0, $end)) === strlen($message->plaintext)) {
$item['title'] = $message->plaintext;
} else {
$item['title'] = substr($message->plaintext, 0, $end) . '...';
}
$media = $post->find('[jsname="MTOxpb"]', 0);
if($media) {
$item['enclosures'] = array();
foreach($media->find('img') as $img) {
$item['enclosures'][] = $this->fixImage($img)->src;
}
if($this->getInput('include_media') === true && count($item['enclosures']) > 0) {
$item['content'] .= '<div style="clear: both;"><a href="'
. $item['enclosures'][0]
. '"><img src="'
. $item['enclosures'][0]
. '" /></a></div>';
}
}
// Add custom parameters (only useful for JSON or Plaintext)
$item['fullname'] = $item['author'];
$item['avatar'] = $post->find('div img', 0)->src;
$item['id'] = $post->find('div div div', 0)->getAttribute('id');
$item['content_simple'] = $message->plaintext;
$this->items[] = $item;
}
}
public function getName(){
return $this->_title ?: 'Google Plus Post Bridge';
return $this->title ?: 'Google Plus Post Bridge';
}
public function getURI(){
return $this->_url ?: parent::getURI();
return $this->url ?: parent::getURI();
}
private function fixImage($img) {
// There are certain images like .gif which link to a static picture and
// get replaced dynamically via JS in the browser. If we want the "real"
// image we need to account for that.
$urlparts = parse_url($img->src);
if(array_key_exists('host', $urlparts)) {
// For some reason some URIs don't contain the scheme, assume https
if(!array_key_exists('scheme', $urlparts)) {
$urlparts['scheme'] = 'https';
}
$pathelements = explode('/', $urlparts['path']);
switch($urlparts['host']) {
case 'lh3.googleusercontent.com':
if(pathinfo(end($pathelements), PATHINFO_EXTENSION)) {
// The second to last element of the path specifies the
// image format. The URL is still valid if we remove it.
unset($pathelements[count($pathelements) - 2]);
} elseif(strrpos(end($pathelements), '=') !== false) {
// Some images go through a proxy. For those images they
// add size information after an equal sign.
// Example: '=w530-h298-n'. Again this can safely be
// removed to get the original image.
$pathelements[count($pathelements) - 1] = substr(
end($pathelements),
0,
strrpos(end($pathelements), '=')
);
}
break;
}
$urlparts['path'] = implode('/', $pathelements);
}
$img->src = $this->build_url($urlparts);
return $img;
}
/**
* From: https://gist.github.com/Ellrion/f51ba0d40ae1d62eeae44fd1adf7b704
* slightly adjusted to work with PHP < 7.0
* @param array $parts
* @return string
*/
private function build_url(array $parts)
{
$scheme = isset($parts['scheme']) ? ($parts['scheme'] . '://') : '';
$host = isset($parts['host']) ? $parts['host'] : '';
$port = isset($parts['port']) ? (':' . $parts['port']) : '';
$user = isset($parts['user']) ? $parts['user'] : '';
$pass = isset($parts['pass']) ? (':' . $parts['pass']) : '';
$pass = ($user || $pass) ? ($pass . '@') : '';
$path = isset($parts['path']) ? $parts['path'] : '';
$query = isset($parts['query']) ? ('?' . $parts['query']) : '';
$fragment = isset($parts['fragment']) ? ('#' . $parts['fragment']) : '';
return implode('', [$scheme, $user, $pass, $host, $port, $path, $query, $fragment]);
}
}
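The title rule in `collectData()` above (cut at the first space at or after 50 characters, and append `...` only when something was cut) can be sketched as a standalone function. The name `makeTitle` and the `$minLength` parameter are assumptions for this illustration and do not exist in the bridge:

```php
<?php
// Hypothetical helper mirroring the title logic (illustration only).
function makeTitle($text, $minLength = 50) {
    if (strlen($text) > $minLength) {
        // Cut at the first space at or after $minLength; keep the whole
        // text when no such space exists.
        $end = strpos($text, ' ', $minLength) ?: strlen($text);
    } else {
        $end = strlen($text);
    }

    if ($end === strlen($text)) {
        return $text; // nothing was cut, so no ellipsis
    }

    return substr($text, 0, $end) . '...';
}

echo makeTitle('Short post'), PHP_EOL; // prints "Short post"
echo makeTitle(str_repeat('a', 60) . ' and more text'), PHP_EOL;
```

Cutting at a space avoids splitting a word in the middle, at the cost of titles that can be longer than 50 characters.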


@ -17,7 +17,7 @@ class GoogleSearchBridge extends BridgeAbstract {
const PARAMETERS = array(array(
'q' => array(
'name' => "keyword",
'name' => 'keyword',
'required' => true
)
));


@ -0,0 +1,62 @@
<?php
class GrandComicsDatabaseBridge extends BridgeAbstract {
const MAINTAINER = 'corenting';
const NAME = 'Grand Comics Database Bridge';
const URI = 'https://www.comics.org/';
const CACHE_TIMEOUT = 7200; // 2h
const DESCRIPTION = 'Returns the latest comics added to a series timeline';
const PARAMETERS = array( array(
'series' => array(
'name' => 'Series id (from the timeline URL)',
'required' => true,
'exampleValue' => '63051',
),
));
public function collectData(){
$url = self::URI . 'series/' . $this->getInput('series') . '/details/timeline/';
$html = getSimpleHTMLDOM($url)
or returnServerError('Error while downloading the website content');
$table = $html->find('table', 0);
$list = array_reverse($table->find('[class^=row_even]'));
$seriesName = $html->find('span[id=series_name]', 0)->innertext;
// Get row headers
$rowHeaders = $table->find('th');
foreach($list as $article) {
// Skip empty rows
$emptyRow = $article->find('td.empty_month');
if (count($emptyRow) != 0) {
continue;
}
$rows = $article->find('td');
$key_date = $rows[0]->innertext;
// Get URL too
$uri = 'https://www.comics.org' . $article->find('a')[0]->href;
// Build content
$content = '';
for($i = 0; $i < count($rowHeaders); $i++) {
$headerItem = $rowHeaders[$i]->innertext;
$rowItem = $rows[$i]->innertext;
$content = $content . $headerItem . ': ' . $rowItem . '<br/>';
}
// Build final item
$content = str_replace('href="/', 'href="' . static::URI, $content);
$item = array();
$item['title'] = $seriesName . ' - ' . $key_date;
$item['timestamp'] = strtotime($key_date);
$item['content'] = str_get_html($content);
$item['uri'] = $uri;
$this->items[] = $item;
}
}
}

1397
bridges/HotUKDealsBridge.php Normal file

File diff suppressed because it is too large


@ -85,7 +85,7 @@ class InstagramBridge extends BridgeAbstract {
$item['content'] = $data[0];
$item['enclosures'] = $data[1];
} else {
$item['content'] = '<img src="' . htmlentities($media->display_url) . '" alt="'. $item["title"] . '" />';
$item['content'] = '<img src="' . htmlentities($media->display_url) . '" alt="'. $item['title'] . '" />';
$item['enclosures'] = array($media->display_url);
}


@ -0,0 +1,370 @@
<?php
/**
* This class implements a bridge for http://www.instructables.com, supporting
* general feeds and feeds by category. Instructables doesn't support HTTPS as
* of now (23.06.2018), so all connections are insecure!
*
* Remarks:
* - For some reason it is very important to have the category URI end with a
* slash, otherwise the site defaults to the main category (i.e. Technology)!
* If you need to update the categories list, enable the 'listCategories'
* function (see comments below) and run the bridge with format=Html (see page
* source)
*/
class InstructablesBridge extends BridgeAbstract {
const NAME = 'Instructables Bridge';
const URI = 'http://www.instructables.com';
const DESCRIPTION = 'Returns general feeds and feeds by category';
const MAINTAINER = 'logmanoriginal';
const PARAMETERS = array(
'Category' => array(
'category' => array(
'name' => 'Category',
'type' => 'list',
'required' => true,
'values' => array(
'Play' => array(
'All' => '/play/',
'KNEX' => '/play/knex/',
'Offbeat' => '/play/offbeat/',
'Lego' => '/play/lego/',
'Airsoft' => '/play/airsoft/',
'Card Games' => '/play/card-games/',
'Guitars' => '/play/guitars/',
'Instruments' => '/play/instruments/',
'Magic Tricks' => '/play/magic-tricks/',
'Minecraft' => '/play/minecraft/',
'Music' => '/play/music/',
'Nerf' => '/play/nerf/',
'Nintendo' => '/play/nintendo/',
'Office Supplies' => '/play/office-supplies/',
'Paintball' => '/play/paintball/',
'Paper Airplanes' => '/play/paper-airplanes/',
'Party Tricks' => '/play/party-tricks/',
'PlayStation' => '/play/playstation/',
'Pranks and Humor' => '/play/pranks-and-humor/',
'Puzzles' => '/play/puzzles/',
'Siege Engines' => '/play/siege-engines/',
'Sports' => '/play/sports/',
'Table Top' => '/play/table-top/',
'Toys' => '/play/toys/',
'Video Games' => '/play/video-games/',
'Wii' => '/play/wii/',
'Xbox' => '/play/xbox/',
'Yo-Yo' => '/play/yo-yo/',
),
'Craft' => array(
'All' => '/craft/',
'Art' => '/craft/art/',
'Sewing' => '/craft/sewing/',
'Paper' => '/craft/paper/',
'Jewelry' => '/craft/jewelry/',
'Fashion' => '/craft/fashion/',
'Books & Journals' => '/craft/books-and-journals/',
'Cards' => '/craft/cards/',
'Clay' => '/craft/clay/',
'Duct Tape' => '/craft/duct-tape/',
'Embroidery' => '/craft/embroidery/',
'Felt' => '/craft/felt/',
'Fiber Arts' => '/craft/fiber-arts/',
'Gifts & Wrapping' => '/craft/gifts-and-wrapping/',
'Knitting & Crocheting' => '/craft/knitting-and-crocheting/',
'Leather' => '/craft/leather/',
'Mason Jars' => '/craft/mason-jars/',
'No-Sew' => '/craft/no-sew/',
'Parties & Weddings' => '/craft/parties-and-weddings/',
'Print Making' => '/craft/print-making/',
'Soap' => '/craft/soap/',
'Wallets' => '/craft/wallets/',
),
'Technology' => array(
'All' => '/technology/',
'Electronics' => '/technology/electronics/',
'Arduino' => '/technology/arduino/',
'Photography' => '/technology/photography/',
'Leds' => '/technology/leds/',
'Science' => '/technology/science/',
'Reuse' => '/technology/reuse/',
'Apple' => '/technology/apple/',
'Computers' => '/technology/computers/',
'3D Printing' => '/technology/3D-Printing/',
'Robots' => '/technology/robots/',
'Art' => '/technology/art/',
'Assistive Tech' => '/technology/assistive-technology/',
'Audio' => '/technology/audio/',
'Clocks' => '/technology/clocks/',
'CNC' => '/technology/cnc/',
'Digital Graphics' => '/technology/digital-graphics/',
'Gadgets' => '/technology/gadgets/',
'Kits' => '/technology/kits/',
'Laptops' => '/technology/laptops/',
'Lasers' => '/technology/lasers/',
'Linux' => '/technology/linux/',
'Microcontrollers' => '/technology/microcontrollers/',
'Microsoft' => '/technology/microsoft/',
'Mobile' => '/technology/mobile/',
'Raspberry Pi' => '/technology/raspberry-pi/',
'Remote Control' => '/technology/remote-control/',
'Sensors' => '/technology/sensors/',
'Software' => '/technology/software/',
'Soldering' => '/technology/soldering/',
'Speakers' => '/technology/speakers/',
'Steampunk' => '/technology/steampunk/',
'Tools' => '/technology/tools/',
'USB' => '/technology/usb/',
'Wearables' => '/technology/wearables/',
'Websites' => '/technology/websites/',
'Wireless' => '/technology/wireless/',
),
'Workshop' => array(
'All' => '/workshop/',
'Woodworking' => '/workshop/woodworking/',
'Tools' => '/workshop/tools/',
'Gardening' => '/workshop/gardening/',
'Cars' => '/workshop/cars/',
'Metalworking' => '/workshop/metalworking/',
'Cardboard' => '/workshop/cardboard/',
'Electric Vehicles' => '/workshop/electric-vehicles/',
'Energy' => '/workshop/energy/',
'Furniture' => '/workshop/furniture/',
'Home Improvement' => '/workshop/home-improvement/',
'Home Theater' => '/workshop/home-theater/',
'Hydroponics' => '/workshop/hydroponics/',
'Laser Cutting' => '/workshop/laser-cutting/',
'Lighting' => '/workshop/lighting/',
'Molds & Casting' => '/workshop/molds-and-casting/',
'Motorcycles' => '/workshop/motorcycles/',
'Organizing' => '/workshop/organizing/',
'Pallets' => '/workshop/pallets/',
'Repair' => '/workshop/repair/',
'Shelves' => '/workshop/shelves/',
'Solar' => '/workshop/solar/',
'Workbenches' => '/workshop/workbenches/',
),
'Home' => array(
'All' => '/home/',
'Halloween' => '/home/halloween/',
'Decorating' => '/home/decorating/',
'Organizing' => '/home/organizing/',
'Pets' => '/home/pets/',
'Life Hacks' => '/home/life-hacks/',
'Beauty' => '/home/beauty/',
'Christmas' => '/home/christmas/',
'Cleaning' => '/home/cleaning/',
'Education' => '/home/education/',
'Finances' => '/home/finances/',
'Gardening' => '/home/gardening/',
'Green' => '/home/green/',
'Health' => '/home/health/',
'Hiding Places' => '/home/hiding-places/',
'Holidays' => '/home/holidays/',
'Homesteading' => '/home/homesteading/',
'Kids' => '/home/kids/',
'Kitchen' => '/home/kitchen/',
'Life Skills' => '/home/life-skills/',
'Parenting' => '/home/parenting/',
'Pest Control' => '/home/pest-control/',
'Relationships' => '/home/relationships/',
'Reuse' => '/home/reuse/',
'Travel' => '/home/travel/',
),
'Outside' => array(
'All' => '/outside/',
'Bikes' => '/outside/bikes/',
'Survival' => '/outside/survival/',
'Backyard' => '/outside/backyard/',
'Beach' => '/outside/beach/',
'Birding' => '/outside/birding/',
'Boats' => '/outside/boats/',
'Camping' => '/outside/camping/',
'Climbing' => '/outside/climbing/',
'Fire' => '/outside/fire/',
'Fishing' => '/outside/fishing/',
'Hunting' => '/outside/hunting/',
'Kites' => '/outside/kites/',
'Knives' => '/outside/knives/',
'Knots' => '/outside/knots/',
'Paracord' => '/outside/paracord/',
'Rockets' => '/outside/rockets/',
'Skateboarding' => '/outside/skateboarding/',
'Snow' => '/outside/snow/',
'Water' => '/outside/water/',
),
'Food' => array(
'All' => '/food/',
'Dessert' => '/food/dessert/',
'Snacks & Appetizers' => '/food/snacks-and-appetizers/',
'Bacon' => '/food/bacon/',
'BBQ & Grilling' => '/food/bbq-and-grilling/',
'Beverages' => '/food/beverages/',
'Bread' => '/food/bread/',
'Breakfast' => '/food/breakfast/',
'Cake' => '/food/cake/',
'Candy' => '/food/candy/',
'Canning & Preserves' => '/food/canning-and-preserves/',
'Cocktails & Mocktails' => '/food/cocktails-and-mocktails/',
'Coffee' => '/food/coffee/',
'Cookies' => '/food/cookies/',
'Cupcakes' => '/food/cupcakes/',
'Homebrew' => '/food/homebrew/',
'Main Course' => '/food/main-course/',
'Pasta' => '/food/pasta/',
'Pie' => '/food/pie/',
'Pizza' => '/food/pizza/',
'Salad' => '/food/salad/',
'Sandwiches' => '/food/sandwiches/',
'Soups & Stews' => '/food/soups-and-stews/',
'Vegetarian & Vegan' => '/food/vegetarian-and-vegan/',
),
'Costumes' => array(
'All' => '/costumes/',
'Props' => '/costumes/props-and-accessories/',
'Animals' => '/costumes/animals/',
'Comics' => '/costumes/comics/',
'Fantasy' => '/costumes/fantasy/',
'For Kids' => '/costumes/for-kids/',
'For Pets' => '/costumes/for-pets/',
'Funny' => '/costumes/funny/',
'Games' => '/costumes/games/',
'Historic & Futuristic' => '/costumes/historic-and-futuristic/',
'Makeup' => '/costumes/makeup/',
'Masks' => '/costumes/masks/',
'Scary' => '/costumes/scary/',
'TV & Movies' => '/costumes/tv-and-movies/',
'Weapons & Armor' => '/costumes/weapons-and-armor/',
)
),
'title' => 'Select your category (required)',
'defaultValue' => 'Technology'
),
'filter' => array(
'name' => 'Filter',
'type' => 'list',
'required' => true,
'values' => array(
'Featured' => ' ',
'Recent' => 'recent/',
'Popular' => 'popular/',
'Views' => 'views/',
'Contest Winners' => 'winners/'
),
'title' => 'Select a filter',
'defaultValue' => 'Featured'
)
)
);
private $uri;
public function collectData() {
// Enable the following line to get the category list (dev mode)
// $this->listCategories();
$this->uri = static::URI;
switch($this->queriedContext) {
case 'Category': $this->uri .= $this->getInput('category') . $this->getInput('filter');
}
$html = getSimpleHTMLDOM($this->uri)
or returnServerError('Error loading category ' . $this->uri);
foreach($html->find('ul.explore-covers-list li') as $cover) {
$item = array();
$item['uri'] = static::URI . $cover->find('a.cover-image', 0)->href;
$item['title'] = $cover->find('.title', 0)->innertext;
$item['author'] = $this->getCategoryAuthor($cover);
$item['content'] = '<a href='
. $item['uri']
. '><img src='
. $cover->find('a.cover-image img', 0)->src
. '></a>';
$image = str_replace('.RECTANGLE1', '.LARGE', $cover->find('a.cover-image img', 0)->src);
$item['enclosures'] = [$image];
$this->items[] = $item;
}
}
public function getName() {
if(!is_null($this->getInput('category'))
&& !is_null($this->getInput('filter'))) {
foreach(self::PARAMETERS[$this->queriedContext]['category']['values'] as $key => $value) {
$subcategory = array_search($this->getInput('category'), $value);
if($subcategory !== false)
break;
}
$filter = array_search(
$this->getInput('filter'),
self::PARAMETERS[$this->queriedContext]['filter']['values']
);
return $subcategory . ' (' . $filter . ') - ' . static::NAME;
}
return parent::getName();
}
public function getURI() {
if(!is_null($this->getInput('category'))
&& !is_null($this->getInput('filter'))) {
return $this->uri;
}
return parent::getURI();
}
/**
* Returns a list of categories for development purposes (used to build the
* parameters list)
*/
private function listCategories(){
// Use arbitrary category to receive full list
$html = getSimpleHTMLDOM(self::URI . '/technology/');
foreach($html->find('.channel a') as $channel) {
$name = html_entity_decode(trim($channel->innertext));
// Remove unwanted entities
$name = str_replace("'", '', $name);
$name = str_replace('&#39;', '', $name);
$uri = $channel->href;
$category = explode('/', $uri)[1];
if(!isset($categories)
|| !array_key_exists($category, $categories)
|| !in_array($uri, $categories[$category]))
$categories[$category][$name] = $uri;
}
// Build PHP array manually
foreach($categories as $key => $value) {
$name = ucfirst($key);
echo "'{$name}' => array(\n";
echo "\t'All' => '/{$key}/',\n";
foreach($value as $name => $uri) {
echo "\t'{$name}' => '{$uri}',\n";
}
echo "),\n";
}
die;
}
/**
* Returns the author as anchor for a given cover.
*/
private function getCategoryAuthor($cover) {
return '<a href='
. static::URI . $cover->find('span.author a', 0)->href
. '>'
. $cover->find('span.author a', 0)->innertext
. '</a>';
}
}
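
The remark about trailing slashes in the class comment can be illustrated with a small standalone sketch. The helper name `buildInstructablesUri` is hypothetical and not part of the bridge; it only mirrors the concatenation done in `collectData()` (base URI, category, filter). Adding a missing slash and trimming the single-space 'Featured' value are assumptions made for this example; the bridge itself concatenates the values unchanged.

```php
<?php
// Hypothetical helper, not part of the bridge: mirrors the string
// concatenation in collectData() (base URI + category + filter).
function buildInstructablesUri(string $base, string $category, string $filter): string {
    // Category values are expected to end with a slash (see the class
    // remarks); add one defensively if it is missing.
    if (substr($category, -1) !== '/') {
        $category .= '/';
    }
    // The 'Featured' filter is stored as a single space, so trim it away.
    return $base . $category . trim($filter);
}

echo buildInstructablesUri('http://www.instructables.com', '/technology/arduino/', 'recent/'), PHP_EOL;
echo buildInstructablesUri('http://www.instructables.com', '/technology', ' '), PHP_EOL;
```

Keeping the slash handling in one place means a category copied without its trailing slash would still resolve to the intended listing rather than the site's default category.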


@ -1,465 +0,0 @@
<?php
class IsoHuntBridge extends BridgeAbstract {
const MAINTAINER = 'logmanoriginal';
const NAME = 'isoHunt Bridge';
const URI = 'https://isohunt.to/';
const CACHE_TIMEOUT = 300; //5min
const DESCRIPTION = 'Returns the latest results by category or search result';
const PARAMETERS = array(
/*
* Get feeds for one of the "latest" categories
* Notice: The categories "News" and "Top Searches" are received from the main page
* Elements are sorted by name ascending!
*/
'By "Latest" category' => array(
'latest_category' => array(
'name' => 'Latest category',
'type' => 'list',
'required' => true,
'title' => 'Select your category',
'defaultValue' => 'news',
'values' => array(
'Hot Torrents' => 'hot_torrents',
'News' => 'news',
'Releases' => 'releases',
'Torrents' => 'torrents'
)
)
),
/*
* Get feeds for one of the "torrent" categories
* Make sure to add new categories also to get_torrent_category_index($)!
* Elements are sorted by name ascending!
*/
'By "Torrent" category' => array(
'torrent_category' => array(
'name' => 'Torrent category',
'type' => 'list',
'required' => true,
'title' => 'Select your category',
'defaultValue' => 'anime',
'values' => array(
'Adult' => 'adult',
'Anime' => 'anime',
'Books' => 'books',
'Games' => 'games',
'Movies' => 'movies',
'Music' => 'music',
'Other' => 'other',
'Series & TV' => 'series_tv',
'Software' => 'software'
)
),
'torrent_popularity' => array(
'name' => 'Sort by popularity',
'type' => 'checkbox',
'title' => 'Activate to receive results by popularity'
)
),
/*
* Get feeds for a specific search request
*/
'Search torrent by name' => array(
'search_name' => array(
'name' => 'Name',
'required' => true,
'title' => 'Insert your search query',
'exampleValue' => 'Bridge'
),
'search_category' => array(
'name' => 'Category',
'type' => 'list',
'title' => 'Select your category',
'defaultValue' => 'all',
'values' => array(
'Adult' => 'adult',
'All' => 'all',
'Anime' => 'anime',
'Books' => 'books',
'Games' => 'games',
'Movies' => 'movies',
'Music' => 'music',
'Other' => 'other',
'Series & TV' => 'series_tv',
'Software' => 'software'
)
)
)
);
public function getURI(){
$uri = self::URI;
switch($this->queriedContext) {
case 'By "Latest" category':
switch($this->getInput('latest_category')) {
case 'hot_torrents':
$uri .= 'statistic/hot/torrents';
break;
case 'news':
break;
case 'releases':
$uri .= 'releases.php';
break;
case 'torrents':
$uri .= 'latest.php';
break;
}
break;
case 'By "Torrent" category':
$uri .= $this->buildCategoryUri(
$this->getInput('torrent_category'),
$this->getInput('torrent_popularity')
);
break;
case 'Search torrent by name':
$category = $this->getInput('search_category');
$uri .= $this->buildCategoryUri($category);
if($category !== 'movies')
$uri .= '&ihq=' . urlencode($this->getInput('search_name'));
break;
default: return parent::getURI();
}
return $uri;
}
public function getName(){
switch($this->queriedContext) {
case 'By "Latest" category':
$categoryName = array_search(
$this->getInput('latest_category'),
self::PARAMETERS['By "Latest" category']['latest_category']['values']
);
$name = 'Latest ' . $categoryName . ' - ' . self::NAME;
break;
case 'By "Torrent" category':
$categoryName = array_search(
$this->getInput('torrent_category'),
self::PARAMETERS['By "Torrent" category']['torrent_category']['values']
);
$name = 'Category: ' . $categoryName . ' - ' . self::NAME;
break;
case 'Search torrent by name':
$categoryName = array_search(
$this->getInput('search_category'),
self::PARAMETERS['Search torrent by name']['search_category']['values']
);
$name = 'Search: "'
. $this->getInput('search_name')
. '" in category: '
. $categoryName . ' - '
. self::NAME;
break;
default: return parent::getName();
}
return $name;
}
public function collectData(){
$html = $this->loadHtml($this->getURI());
switch($this->queriedContext) {
case 'By "Latest" category':
switch($this->getInput('latest_category')) {
case 'hot_torrents':
$this->getLatestHotTorrents($html);
break;
case 'news':
$this->getLatestNews($html);
break;
case 'releases':
case 'torrents':
$this->getLatestTorrents($html);
break;
}
break;
case 'By "Torrent" category':
if($this->getInput('torrent_category') === 'movies') {
// This one is special (content wise)
$this->getMovieTorrents($html);
} else {
$this->getLatestTorrents($html);
}
break;
case 'Search torrent by name':
if($this->getInput('search_category') === 'movies') {
// This one is special (content wise)
$this->getMovieTorrents($html);
} else {
$this->getLatestTorrents($html);
}
break;
}
}
#region Helper functions for "Movie Torrents"
private function getMovieTorrents($html){
$container = $html->find('div#w0', 0);
if(!$container)
returnServerError('Unable to find torrent container!');
$torrents = $container->find('article');
if(!$torrents)
returnServerError('Unable to find torrents!');
foreach($torrents as $torrent) {
$anchor = $torrent->find('a', 0);
if(!$anchor)
returnServerError('Unable to find anchor!');
$date = $torrent->find('small', 0);
if(!$date)
returnServerError('Unable to find date!');
$item = array();
$item['uri'] = $this->fixRelativeUri($anchor->href);
$item['title'] = $anchor->title;
// $item['author'] =
$item['timestamp'] = strtotime($date->plaintext);
$item['content'] = $this->fixRelativeUri($torrent->innertext);
$this->items[] = $item;
}
}
#endregion
#region Helper functions for "Latest Hot Torrents"
private function getLatestHotTorrents($html){
$container = $html->find('div#serps', 0);
if(!$container)
returnServerError('Unable to find torrent container!');
$torrents = $container->find('tr');
if(!$torrents)
returnServerError('Unable to find torrents!');
// Remove first element (header row)
$torrents = array_slice($torrents, 1);
foreach($torrents as $torrent) {
$cell = $torrent->find('td', 0);
if(!$cell)
returnServerError('Unable to find cell!');
$element = $cell->find('a', 0);
if(!$element)
returnServerError('Unable to find element!');
$item = array();
$item['uri'] = $element->href;
$item['title'] = $element->plaintext;
// $item['author'] =
// $item['timestamp'] =
// $item['content'] =
$this->items[] = $item;
}
}
#endregion
#region Helper functions for "Latest News"
private function getLatestNews($html){
$container = $html->find('div#postcontainer', 0);
if(!$container)
returnServerError('Unable to find post container!');
$posts = $container->find('div.index-post');
if(!$posts)
returnServerError('Unable to find posts!');
foreach($posts as $post) {
$item = array();
$item['uri'] = $this->latestNewsExtractUri($post);
$item['title'] = $this->latestNewsExtractTitle($post);
$item['author'] = $this->latestNewsExtractAuthor($post);
$item['timestamp'] = $this->latestNewsExtractTimestamp($post);
$item['content'] = $this->latestNewsExtractContent($post);
$this->items[] = $item;
}
}
private function latestNewsExtractAuthor($post){
$author = $post->find('small', 0);
if(!$author)
returnServerError('Unable to find author!');
// The author is hidden within a string like: 'Posted by {author} on {date}'
preg_match('/Posted\sby\s(.*)\son/i', $author->innertext, $matches);
return $matches[1];
}
private function latestNewsExtractTimestamp($post){
$date = $post->find('small', 0);
if(!$date)
returnServerError('Unable to find date!');
// The date is hidden within a string like: 'Posted by {author} on {date}'
preg_match('/Posted\sby\s.*\son\s(.*)/i', $date->innertext, $matches);
$timestamp = strtotime($matches[1]);
// Make sure date is not in the future (dates are given like 'Nov. 20' without year)
if($timestamp > time()) {
$timestamp = strtotime('-1 year', $timestamp);
}
return $timestamp;
}
private function latestNewsExtractTitle($post){
$title = $post->find('a', 0);
if(!$title)
returnServerError('Unable to find title!');
return $title->plaintext;
}
private function latestNewsExtractUri($post){
$uri = $post->find('a', 0);
if(!$uri)
returnServerError('Unable to find uri!');
return $uri->href;
}
private function latestNewsExtractContent($post){
$content = $post->find('div', 0);
if(!$content)
returnServerError('Unable to find content!');
// Remove <h2>...</h2> (title)
foreach($content->find('h2') as $element) {
$element->outertext = '';
}
// Remove <small>...</small> (author)
foreach($content->find('small') as $element) {
$element->outertext = '';
}
return $content->innertext;
}
#endregion
#region Helper functions for "Latest Torrents", "Latest Releases" and "Torrent Category"
private function getLatestTorrents($html){
$container = $html->find('div#serps', 0);
if(!$container)
returnServerError('Unable to find torrent container!');
$torrents = $container->find('tr[data-key]');
if(!$torrents)
returnServerError('Unable to find torrents!');
foreach($torrents as $torrent) {
$item = array();
$item['uri'] = $this->latestTorrentsExtractUri($torrent);
$item['title'] = $this->latestTorrentsExtractTitle($torrent);
$item['author'] = $this->latestTorrentsExtractAuthor($torrent);
$item['timestamp'] = $this->latestTorrentsExtractTimestamp($torrent);
$item['content'] = ''; // There is no valuable content
$this->items[] = $item;
}
}
private function latestTorrentsExtractTitle($torrent){
$cell = $torrent->find('td.title-row', 0);
if(!$cell)
returnServerError('Unable to find title cell!');
$title = $cell->find('span', 0);
if(!$title)
returnServerError('Unable to find title!');
return $title->plaintext;
}
private function latestTorrentsExtractUri($torrent){
$cell = $torrent->find('td.title-row', 0);
if(!$cell)
returnServerError('Unable to find title cell!');
$uri = $cell->find('a', 0);
if(!$uri)
returnServerError('Unable to find uri!');
return $this->fixRelativeUri($uri->href);
}
private function latestTorrentsExtractAuthor($torrent){
$cell = $torrent->find('td.user-row', 0);
if(!$cell)
return; // No author
$user = $cell->find('a', 0);
if(!$user)
returnServerError('Unable to find user!');
return $user->plaintext;
}
private function latestTorrentsExtractTimestamp($torrent){
$cell = $torrent->find('td.date-row', 0);
if(!$cell)
returnServerError('Unable to find date cell!');
return strtotime('-' . $cell->plaintext, time());
}
#endregion
#region Generic helper functions
private function loadHtml($uri){
$html = getSimpleHTMLDOM($uri);
if(!$html)
returnServerError('Unable to load ' . $uri . '!');
return $html;
}
private function fixRelativeUri($uri){
return preg_replace('/\//i', self::URI, $uri, 1);
}
private function buildCategoryUri($category, $order_popularity = false){
switch($category) {
case 'anime': $index = 1; break;
case 'software' : $index = 2; break;
case 'games' : $index = 3; break;
case 'adult' : $index = 4; break;
case 'movies' : $index = 5; break;
case 'music' : $index = 6; break;
case 'other' : $index = 7; break;
case 'series_tv' : $index = 8; break;
case 'books': $index = 9; break;
case 'all':
default: $index = 0; break;
}
return 'torrents/?iht=' . $index . '&ihs=' . ($order_popularity ? 1 : 0) . '&age=0';
}
#endregion
}
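
The year correction in `latestNewsExtractTimestamp()` (dates such as 'Nov. 20' carry no year, so a result in the future is moved back one year) can be tried in isolation. The sketch below is a hypothetical standalone version, not the bridge's code: the function name `parseYearlessDate` and the injectable `$now` parameter are assumptions added to make the behaviour reproducible, and the time zone is pinned to UTC for the same reason.

```php
<?php
date_default_timezone_set('UTC'); // fixed zone so the example is reproducible

// Hypothetical helper, not part of the bridge.
// Parses a date without a year relative to $now and shifts it one year
// back if it would otherwise lie in the future.
function parseYearlessDate(string $text, int $now): ?int {
    $timestamp = strtotime($text, $now);
    if ($timestamp === false) {
        return null; // text could not be parsed
    }
    if ($timestamp > $now) {
        $timestamp = strtotime('-1 year', $timestamp);
    }
    return $timestamp;
}

$now = gmmktime(12, 0, 0, 10, 18, 2018); // 18 October 2018, 12:00 UTC
echo date('Y-m-d', parseYearlessDate('Nov. 20', $now)), PHP_EOL;
echo date('Y-m-d', parseYearlessDate('Oct. 1', $now)), PHP_EOL;
```

Passing the reference time as a parameter instead of calling `time()` inside the function is a design choice for testability only; the correction rule itself is the one described in the original comment.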


@ -3,7 +3,7 @@ class JapanExpoBridge extends BridgeAbstract {
const MAINTAINER = 'Ginko';
const NAME = 'Japan Expo Actualités';
const URI = 'http://www.japan-expo-paris.com/fr/actualites';
const URI = 'https://www.japan-expo-paris.com/fr/actualites';
const CACHE_TIMEOUT = 14400; // 4h
const DESCRIPTION = 'Returns most recent entries from Japan Expo actualités.';
const PARAMETERS = array( array(
@ -51,7 +51,7 @@ class JapanExpoBridge extends BridgeAbstract {
foreach($html->find('a._tile2') as $element) {
$url = $element->href;
$thumbnail = 'http://s.japan-expo.com/katana/images/JES049/paris.png';
$thumbnail = 'https://s.japan-expo.com/katana/images/JES049/paris.png';
preg_match('/url\(([^)]+)\)/', $element->find('img.rspvimgset', 0)->style, $img_search_result);
if(count($img_search_result) >= 2)
@ -62,7 +62,8 @@ class JapanExpoBridge extends BridgeAbstract {
break;
}
$article_html = getSimpleHTMLDOMCached('Could not request JapanExpo: ' . $url);
$article_html = getSimpleHTMLDOMCached($url)
or returnServerError('Could not request JapanExpo: ' . $url);
$header = $article_html->find('header.pageHeadBox', 0);
$timestamp = strtotime($header->find('time', 0)->datetime);
$title_html = $header->find('div.section', 0)->next_sibling();
@ -92,6 +93,7 @@ class JapanExpoBridge extends BridgeAbstract {
$item['uri'] = $url;
$item['title'] = $title;
$item['timestamp'] = $timestamp;
$item['enclosures'] = array($thumbnail);
$item['content'] = $content;
$this->items[] = $item;
$count++;

353
bridges/JustETFBridge.php Normal file

@ -0,0 +1,353 @@
<?php
class JustETFBridge extends BridgeAbstract {
const NAME = 'justETF Bridge';
const URI = 'https://www.justetf.com';
const DESCRIPTION = 'Currently only supports the news feed';
const MAINTAINER = 'logmanoriginal';
const PARAMETERS = array(
'News' => array(
'full' => array(
'name' => 'Full Article',
'type' => 'checkbox',
'title' => 'Enable to load full articles'
)
),
'Profile' => array(
'isin' => array(
'name' => 'ISIN',
'type' => 'text',
'required' => true,
'pattern' => '[a-zA-Z]{2}[a-zA-Z0-9]{10}',
'title' => 'ISIN, consisting of 2-letter country code, 9-character identifier, check character'
),
'strategy' => array(
'name' => 'Include Strategy',
'type' => 'checkbox',
'defaultValue' => 'checked'
),
'description' => array(
'name' => 'Include Description',
'type' => 'checkbox',
'defaultValue' => 'checked'
)
),
'global' => array(
'lang' => array(
'name' => 'Language',
'required' => true,
'type' => 'list',
'values' => array(
'English' => 'en',
'Deutsch' => 'de',
'Italiano' => 'it'
),
'defaultValue' => 'English'
)
)
);
public function collectData() {
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Failed loading contents from ' . $this->getURI());
defaultLinkTo($html, static::URI);
switch($this->queriedContext) {
case 'News':
$this->collectNews($html);
break;
case 'Profile':
$this->collectProfile($html);
break;
}
}
public function getURI() {
$uri = static::URI;
if($this->getInput('lang')) {
$uri .= '/' . $this->getInput('lang');
}
switch($this->queriedContext) {
case 'News':
$uri .= '/news';
break;
case 'Profile':
$uri .= '/etf-profile.html?' . http_build_query(array(
'isin' => strtoupper($this->getInput('isin'))
));
break;
}
return $uri;
}
public function getName() {
$name = static::NAME;
$name .= ($this->queriedContext) ? ' - ' . $this->queriedContext : '';
switch($this->queriedContext) {
case 'News': break;
case 'Profile':
if($this->getInput('isin')) {
$name .= ' ISIN ' . strtoupper($this->getInput('isin'));
}
}
if($this->getInput('lang')) {
$name .= ' (' . strtoupper($this->getInput('lang')) . ')';
}
return $name;
}
#region Common
/**
* Fixes dates depending on the chosen language:
*
* de : dd.mm.yy
* en : dd.mm.yy
* it : dd/mm/yy
*
* Basically strtotime doesn't convert dates correctly due to formats
* being hard to interpret. So we use the DateTime object, manually
* fixing dates and times (set to 00:00:00.000).
*
* We don't know the timezone, so just assume +00:00 (or whatever
* DateTime chooses)
*/
private function fixDate($date) {
switch($this->getInput('lang')) {
case 'en':
case 'de':
$df = date_create_from_format('d.m.y', $date);
break;
case 'it':
$df = date_create_from_format('d/m/y', $date);
break;
}
date_time_set($df, 0, 0);
// debugMessage(date_format($df, 'U'));
return date_format($df, 'U');
}
private function extractImages($article) {
// Notice: We can have zero or more images (though it should mostly be 1)
$elements = $article->find('img');
$images = array();
foreach($elements as $img) {
// Skip the logo (mostly provided as part of a hidden div)
if(substr($img->src, strrpos($img->src, '/') + 1) === 'logo.png')
continue;
$images[] = $img->src;
}
return $images;
}
#endregion
#region News
private function collectNews($html) {
$articles = $html->find('div.newsTopArticle')
or returnServerError('No articles found! Layout might have changed!');
foreach($articles as $article) {
$item = array();
// Common data
$item['uri'] = $this->extractNewsUri($article);
$item['timestamp'] = $this->extractNewsDate($article);
$item['title'] = $this->extractNewsTitle($article);
if($this->getInput('full')) {
$uri = $this->extractNewsUri($article);
$html = getSimpleHTMLDOMCached($uri)
or returnServerError('Failed loading full article from ' . $uri);
$fullArticle = $html->find('div.article', 0)
or returnServerError('No content found! Layout might have changed!');
defaultLinkTo($fullArticle, static::URI);
$item['author'] = $this->extractFullArticleAuthor($fullArticle);
$item['content'] = $this->extractFullArticleContent($fullArticle);
$item['enclosures'] = $this->extractImages($fullArticle);
} else {
$item['content'] = $this->extractNewsDescription($article);
$item['enclosures'] = $this->extractImages($article);
}
$this->items[] = $item;
}
}
private function extractNewsUri($article) {
$element = $article->find('a', 0)
or returnServerError('Anchor not found!');
return $element->href;
}
private function extractNewsDate($article) {
$element = $article->find('div.subheadline', 0)
or returnServerError('Date not found!');
// debugMessage($element->plaintext);
$date = trim(explode('|', $element->plaintext)[0]);
return $this->fixDate($date);
}
private function extractNewsDescription($article) {
$element = $article->find('span.newsText', 0)
or returnServerError('Description not found!');
$element->find('a', 0)->onclick = '';
// debugMessage($element->innertext);
return $element->innertext;
}
private function extractNewsTitle($article) {
$element = $article->find('h3', 0)
or returnServerError('Title not found!');
return $element->plaintext;
}
private function extractFullArticleContent($article) {
$element = $article->find('div.article_body', 0)
or returnServerError('Article body not found!');
// Remove teaser image
$element->find('img.teaser-img', 0)->outertext = '';
// Remove self advertisements
foreach($element->find('.call-action') as $adv) {
$adv->outertext = '';
}
// Remove tips
foreach($element->find('.panel-edu') as $tip) {
$tip->outertext = '';
}
// Remove inline scripts (used e.g. for interactive graphs) as they are
// rendered as a long series of strings
foreach($element->find('script') as $script) {
$script->outertext = '[Content removed! Visit site to see full contents!]';
}
return $element->innertext;
}
private function extractFullArticleAuthor($article) {
$element = $article->find('span[itemprop=name]', 0)
or returnServerError('Author not found!');
return $element->plaintext;
}
#endregion
#region Profile
private function collectProfile($html) {
$item = array();
$item['uri'] = $this->getURI();
$item['timestamp'] = $this->extractProfileDate($html);
$item['title'] = $this->extractProfileTitle($html);
$item['author'] = $this->extractProfileAuthor($html);
$item['content'] = $this->extractProfileContent($html);
$this->items[] = $item;
}
private function extractProfileDate($html) {
$element = $html->find('div.infobox div.vallabel', 0)
or returnServerError('Date not found!');
// debugMessage($element->plaintext);
$date = trim(explode("\r\n", $element->plaintext)[1]);
return $this->fixDate($date);
}
private function extractProfileTitle($html) {
$element = $html->find('span.h1', 0)
or returnServerError('Title not found!');
return $element->plaintext;
}
private function extractProfileContent($html) {
// There are a few things we are interested in:
// - Investment Strategy
// - Description
// - Quote
$strategy = $html->find('div.tab-container div.col-sm-6 p', 0)
or returnServerError('Investment Strategy not found!');
// Description requires a bit of cleanup due to lack of proper identification
$description = $html->find('div.headline', 5)
or returnServerError('Description container not found!');
$description = $description->parent();
foreach($description->find('div') as $div) {
$div->outertext = '';
}
$quote = $html->find('div.infobox div.val', 0)
or returnServerError('Quote not found!');
$quote_html = '<strong>Quote</strong><br><p>' . $quote . '</p>';
$strategy_html = '';
$description_html = '';
if($this->getInput('strategy') === true) {
$strategy_html = '<strong>Strategy</strong><br><p>' . $strategy . '</p><br>';
}
if($this->getInput('description') === true) {
$description_html = '<strong>Description</strong><br><p>' . $description . '</p><br>';
}
return $strategy_html . $description_html . $quote_html;
}
private function extractProfileAuthor($html) {
// Use ISIN + WKN as author
// Notice: "identfier" is not a typo [sic]!
$element = $html->find('span.identfier', 0)
or returnServerError('Author not found!');
return $element->plaintext;
}
#endregion
}
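
The date handling described in the `fixDate()` comment (language-dependent formats, time set to 00:00:00, time zone unknown) can be sketched as a standalone function. This is a hypothetical variant, not the bridge's implementation: the name `parseJustEtfDate`, the explicit UTC time zone, and the exceptions for unsupported languages or unparsable input are assumptions made for the example. It uses the `!` prefix of `DateTime::createFromFormat` to reset all fields that are not part of the format, which replaces the separate `date_time_set` call.

```php
<?php
// Hypothetical standalone variant of fixDate(); not the bridge's code.
function parseJustEtfDate(string $lang, string $date): int {
    // Formats taken from the fixDate() comment: de/en use dots, it uses slashes.
    $formats = array(
        'en' => 'd.m.y',
        'de' => 'd.m.y',
        'it' => 'd/m/y'
    );
    if (!isset($formats[$lang])) {
        throw new InvalidArgumentException('Unsupported language: ' . $lang);
    }
    // '!' resets every field not present in the format, so the time
    // becomes 00:00:00. UTC is an assumption; the site's zone is unknown.
    $dt = DateTime::createFromFormat('!' . $formats[$lang], $date, new DateTimeZone('UTC'));
    if ($dt === false) {
        throw new InvalidArgumentException('Unparsable date: ' . $date);
    }
    return $dt->getTimestamp();
}

echo parseJustEtfDate('de', '15.10.18'), PHP_EOL;
echo parseJustEtfDate('it', '15/10/18'), PHP_EOL;
```

Failing loudly on an unknown language or a malformed date is one possible design; it avoids passing an undefined or `false` value on to later date functions.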


@ -58,13 +58,13 @@ class KununuBridge extends BridgeAbstract {
break;
}
return self::URI . $site . '/' . $company . '/' . $section;
return self::URI . $site . '/' . $company . '/' . $section . '?sort=update_time_desc';
}
return parent::getURI();
}
function getName(){
public function getName(){
if(!is_null($this->getInput('company'))) {
$company = $this->fixCompanyName($this->getInput('company'));
return ($this->companyName ?: $company) . ' - ' . self::NAME;
@ -73,52 +73,67 @@ class KununuBridge extends BridgeAbstract {
return parent::getName();
}
public function getIcon() {
return 'https://www.kununu.com/favicon-196x196.png';
}
public function collectData(){
$full = $this->getInput('full');
// Load page
$html = getSimpleHTMLDOMCached($this->getURI());
if(!$html)
returnServerError('Unable to receive data from ' . $this->getURI() . '!');
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Unable to receive data from ' . $this->getURI() . '!');
$html = defaultLinkTo($html, static::URI);
// Update name for this request
$this->companyName = $this->extractCompanyName($html);
$company = $html->find('span[class="company-name"]', 0)
or returnServerError('Cannot find company name!');
$this->companyName = $company->innertext;
// Find the section with all the panels (reviews)
$section = $html->find('section.kununu-scroll-element', 0);
if($section === false)
returnServerError('Unable to find panel section!');
$section = $html->find('section.kununu-scroll-element', 0)
or returnServerError('Unable to find panel section!');
// Find all articles (within the panels)
$articles = $section->find('article');
if($articles === false || empty($articles))
returnServerError('Unable to find articles!');
$articles = $section->find('article')
or returnServerError('Unable to find articles!');
// Go through all articles
foreach($articles as $article) {
$anchor = $article->find('h1.review-title a', 0)
or returnServerError('Cannot find article URI!');
$date = $article->find('meta[itemprop=dateCreated]', 0)
or returnServerError('Cannot find article date!');
$rating = $article->find('span.rating', 0)
or returnServerError('Cannot find article rating!');
$summary = $article->find('[itemprop=name]', 0)
or returnServerError('Cannot find article summary!');
$item = array();
$item['author'] = $this->extractArticleAuthorPosition($article);
$item['timestamp'] = $this->extractArticleDate($article);
$item['title'] = $this->extractArticleRating($article)
$item['timestamp'] = strtotime($date->content);
$item['title'] = $rating->getAttribute('aria-label')
. ' : '
. $this->extractArticleSummary($article);
. strip_tags($summary->innertext);
$item['uri'] = $this->extractArticleUri($article);
$item['uri'] = $anchor->href;
if($full)
if($full) {
$item['content'] = $this->extractFullDescription($item['uri']);
else
} else {
$item['content'] = $this->extractArticleDescription($article);
}
$this->items[] = $item;
}
}
/**
* Fixes relative URLs in the given text
*/
private function fixUrl($text){
return preg_replace('/href=(\'|\")\//i', 'href="'.self::URI, $text);
}
}
/*
@ -128,73 +143,11 @@ class KununuBridge extends BridgeAbstract {
$company = trim($company);
$company = str_replace(' ', '-', $company);
$company = strtolower($company);
return $this->encodeUmlauts($company);
}
/**
* Encodes umlauts in the given text
*/
private function encodeUmlauts($text){
$umlauts = Array("/ä/","/ö/","/ü/","/Ä/","/Ö/","/Ü/","/ß/");
$replace = Array("ae","oe","ue","Ae","Oe","Ue","ss");
$umlauts = Array('/ä/','/ö/','/ü/','/Ä/','/Ö/','/Ü/','/ß/');
$replace = Array('ae','oe','ue','Ae','Oe','Ue','ss');
return preg_replace($umlauts, $replace, $text);
}
/**
* Returns the company name from the review html
*/
private function extractCompanyName($html){
$company_name = $html->find('h1[itemprop=name]', 0);
if(is_null($company_name))
returnServerError('Cannot find company name!');
return $company_name->plaintext;
}
/**
* Returns the date from a given article
*/
private function extractArticleDate($article){
// They conveniently provide a time attribute for us :)
$date = $article->find('meta[itemprop=dateCreated]', 0);
if(is_null($date))
returnServerError('Cannot find article date!');
return strtotime($date->content);
}
/**
* Returns the rating from a given article
*/
private function extractArticleRating($article){
$rating = $article->find('span.rating', 0);
if(is_null($rating))
returnServerError('Cannot find article rating!');
return $rating->getAttribute('aria-label');
}
/**
* Returns the summary from a given article
*/
private function extractArticleSummary($article){
$summary = $article->find('[itemprop=name]', 0);
if(is_null($summary))
returnServerError('Cannot find article summary!');
return strip_tags($summary->innertext);
}
/**
* Returns the URI from a given article
*/
private function extractArticleUri($article){
$anchor = $article->find('ku-company-review-more', 0);
if(is_null($anchor))
returnServerError('Cannot find article URI!');
return self::URI . $anchor->{'review-url'};
return preg_replace($umlauts, $replace, $company);
}
/**
@ -202,9 +155,8 @@ class KununuBridge extends BridgeAbstract {
*/
private function extractArticleAuthorPosition($article){
// We need to parse the user-content manually
$user_content = $article->find('div.user-content', 0);
if(is_null($user_content))
returnServerError('Cannot find user content!');
$user_content = $article->find('div.user-content', 0)
or returnServerError('Cannot find user content!');
// Go through all h2 elements to find index of required span (I know... it's stupid)
$author_position = 'Unknown';
@ -222,11 +174,10 @@ class KununuBridge extends BridgeAbstract {
* Returns the description from a given article
*/
private function extractArticleDescription($article){
$description = $article->find('[itemprop=reviewBody]', 0);
if(is_null($description))
returnServerError('Cannot find article description!');
$description = $article->find('[itemprop=reviewBody]', 0)
or returnServerError('Cannot find article description!');
return $this->fixUrl($description->innertext);
return $description->innertext;
}
/**
@ -234,14 +185,14 @@ class KununuBridge extends BridgeAbstract {
*/
private function extractFullDescription($uri){
// Load full article
$html = getSimpleHTMLDOMCached($uri);
if($html === false)
returnServerError('Could not load full description!');
$html = getSimpleHTMLDOMCached($uri)
or returnServerError('Could not load full description!');
$html = defaultLinkTo($html, static::URI);
// Find the article
$article = $html->find('article', 0);
if(is_null($article))
returnServerError('Cannot find article!');
$article = $html->find('article', 0)
or returnServerError('Cannot find article!');
// Luckily they use the same layout for the review overview and full article pages :)
return $this->extractArticleDescription($article);
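
The company-name normalisation touched by this hunk (trim, hyphenate, lowercase, transliterate umlauts) can be sketched as one standalone function. This is a minimal sketch, not the bridge's code: the name `fixCompanyNameSketch` is hypothetical, UTF-8 input is assumed, and the `/u` pattern flag is an addition that the diff itself does not use.

```php
<?php
// Hypothetical standalone helper; in the bridge this logic is split across
// fixCompanyName() and encodeUmlauts().
function fixCompanyNameSketch(string $company): string {
    $company = trim($company);
    $company = str_replace(' ', '-', $company);
    $company = strtolower($company);

    // Assumption: input is UTF-8, so the /u flag matches whole characters.
    $umlauts = array('/ä/u', '/ö/u', '/ü/u', '/Ä/u', '/Ö/u', '/Ü/u', '/ß/u');
    $replace = array('ae', 'oe', 'ue', 'Ae', 'Oe', 'Ue', 'ss');

    return preg_replace($umlauts, $replace, $company);
}

echo fixCompanyNameSketch(' Müller GmbH '), PHP_EOL;
```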


@ -1,190 +1,538 @@
<?php
class LeBonCoinBridge extends BridgeAbstract {
const MAINTAINER = '16mhz';
const MAINTAINER = 'jacknumber';
const NAME = 'LeBonCoin';
const URI = 'https://www.leboncoin.fr/';
const DESCRIPTION = 'Returns most recent results from LeBonCoin for a
region, and optionally a category and a keyword .';
const DESCRIPTION = 'Returns most recent results from LeBonCoin';
const PARAMETERS = array(
array(
'k' => array('name' => 'Mot Clé'),
'r' => array(
'keywords' => array('name' => 'Mots-Clés'),
'region' => array(
'name' => 'Région',
'type' => 'list',
'values' => array(
'Toute la France' => 'ile_de_france/occasions',
'Alsace' => 'alsace',
'Aquitaine' => 'aquitaine',
'Auvergne' => 'auvergne',
'Basse Normandie' => 'basse_normandie',
'Bourgogne' => 'bourgogne',
'Bretagne' => 'bretagne',
'Centre' => 'centre',
'Champagne Ardenne' => 'champagne_ardenne',
'Corse' => 'corse',
'Franche Comté' => 'franche_comte',
'Haute Normandie' => 'haute_normandie',
'Ile de France' => 'ile_de_france',
'Languedoc Roussillon' => 'languedoc_roussillon',
'Limousin' => 'limousin',
'Lorraine' => 'lorraine',
'Midi Pyrénées' => 'midi_pyrenees',
'Nord Pas De Calais' => 'nord_pas_de_calais',
'Pays de la Loire' => 'pays_de_la_loire',
'Picardie' => 'picardie',
'Poitou Charentes' => 'poitou_charentes',
'Provence Alpes Côte d\'Azur' => 'provence_alpes_cote_d_azur',
'Rhône-Alpes' => 'rhone_alpes',
'Guadeloupe' => 'guadeloupe',
'Martinique' => 'martinique',
'Guyane' => 'guyane',
'Réunion' => 'reunion'
'Toute la France' => '',
'Alsace' => '1',
'Aquitaine' => '2',
'Auvergne' => '3',
'Basse Normandie' => '4',
'Bourgogne' => '5',
'Bretagne' => '6',
'Centre' => '7',
'Champagne Ardenne' => '8',
'Corse' => '9',
'Franche Comté' => '10',
'Haute Normandie' => '11',
'Ile de France' => '12',
'Languedoc Roussillon' => '13',
'Limousin' => '14',
'Lorraine' => '15',
'Midi Pyrénées' => '16',
'Nord Pas De Calais' => '17',
'Pays de la Loire' => '18',
'Picardie' => '19',
'Poitou Charentes' => '20',
'Provence Alpes Côte d\'Azur' => '21',
'Rhône-Alpes' => '22',
'Guadeloupe' => '23',
'Martinique' => '24',
'Guyane' => '25',
'Réunion' => '26'
)
),
'c' => array(
'department' => array(
'name' => 'Département',
'type' => 'list',
'values' => array(
'' => '',
'Ain' => '1',
'Aisne' => '2',
'Allier' => '3',
'Alpes-de-Haute-Provence' => '4',
'Hautes-Alpes' => '5',
'Alpes-Maritimes' => '6',
'Ardèche' => '7',
'Ardennes' => '8',
'Ariège' => '9',
'Aube' => '10',
'Aude' => '11',
'Aveyron' => '12',
'Bouches-du-Rhône' => '13',
'Calvados' => '14',
'Cantal' => '15',
'Charente' => '16',
'Charente-Maritime' => '17',
'Cher' => '18',
'Corrèze' => '19',
'Corse-du-Sud' => '2A',
'Haute-Corse' => '2B',
'Côte-d\'Or' => '21',
'Côtes-d\'Armor' => '22',
'Creuse' => '23',
'Dordogne' => '24',
'Doubs' => '25',
'Drôme' => '26',
'Eure' => '27',
'Eure-et-Loir' => '28',
'Finistère' => '29',
'Gard' => '30',
'Haute-Garonne' => '31',
'Gers' => '32',
'Gironde' => '33',
'Hérault' => '34',
'Ille-et-Vilaine' => '35',
'Indre' => '36',
'Indre-et-Loire' => '37',
'Isère' => '38',
'Jura' => '39',
'Landes' => '40',
'Loir-et-Cher' => '41',
'Loire' => '42',
'Haute-Loire' => '43',
'Loire-Atlantique' => '44',
'Loiret' => '45',
'Lot' => '46',
'Lot-et-Garonne' => '47',
'Lozère' => '48',
'Maine-et-Loire' => '49',
'Manche' => '50',
'Marne' => '51',
'Haute-Marne' => '52',
'Mayenne' => '53',
'Meurthe-et-Moselle' => '54',
'Meuse' => '55',
'Morbihan' => '56',
'Moselle' => '57',
'Nièvre' => '58',
'Nord' => '59',
'Oise' => '60',
'Orne' => '61',
'Pas-de-Calais' => '62',
'Puy-de-Dôme' => '63',
'Pyrénées-Atlantiques' => '64',
'Hautes-Pyrénées' => '65',
'Pyrénées-Orientales' => '66',
'Bas-Rhin' => '67',
'Haut-Rhin' => '68',
'Rhône' => '69',
'Haute-Saône' => '70',
'Saône-et-Loire' => '71',
'Sarthe' => '72',
'Savoie' => '73',
'Haute-Savoie' => '74',
'Paris' => '75',
'Seine-Maritime' => '76',
'Seine-et-Marne' => '77',
'Yvelines' => '78',
'Deux-Sèvres' => '79',
'Somme' => '80',
'Tarn' => '81',
'Tarn-et-Garonne' => '82',
'Var' => '83',
'Vaucluse' => '84',
'Vendée' => '85',
'Vienne' => '86',
'Haute-Vienne' => '87',
'Vosges' => '88',
'Yonne' => '89',
'Territoire de Belfort' => '90',
'Essonne' => '91',
'Hauts-de-Seine' => '92',
'Seine-Saint-Denis' => '93',
'Val-de-Marne' => '94',
'Val-d\'Oise' => '95'
)
),
'cities' => array(
'name' => 'Villes',
'title' => 'Codes postaux séparés par des virgules'
),
'category' => array(
'name' => 'Catégorie',
'type' => 'list',
'values' => array(
'TOUS' => '',
'EMPLOI' => '_emploi_',
'VEHICULES' => array(
'Tous' => '_vehicules_',
'Voitures' => 'voitures',
'Motos' => 'motos',
'Caravaning' => 'caravaning',
'Utilitaires' => 'utilitaires',
'Équipement Auto' => 'equipement_auto',
'Équipement Moto' => 'equipement_moto',
'Équipement Caravaning' => 'equipement_caravaning',
'Nautisme' => 'nautisme',
'Équipement Nautisme' => 'equipement_nautisme'
'Toutes catégories' => '',
'EMPLOI' => array(
'Emploi et recrutement' => '71',
'Offres d\'emploi et jobs' => '33'
),
'VÉHICULES' => array(
'Tous' => '1',
'Voitures' => '2',
'Motos' => '3',
'Caravaning' => '4',
'Utilitaires' => '5',
'Equipement Auto' => '6',
'Equipement Moto' => '44',
'Equipement Caravaning' => '50',
'Nautisme' => '7',
'Equipement Nautisme' => '51'
),
'IMMOBILIER' => array(
'Tous' => '_immobilier_',
'Ventes immobilières' => 'ventes_immobilieres',
'Locations' => 'locations',
'Colocations' => 'colocations',
'Bureaux & Commerces' => 'bureaux_commerces'
'Tous' => '8',
'Ventes immobilières' => '9',
'Locations' => '10',
'Colocations' => '11',
'Bureaux & Commerces' => '13'
),
'VACANCES' => array(
'Tous' => '_vacances_',
'Location gîtes' => 'locations_gites',
'Chambres d\'hôtes' => 'chambres_d_hotes',
'Campings' => 'campings',
'Hôtels' => 'hotels',
'Hébergements insolites' => 'hebergements_insolites'
'Tous' => '66',
'Locations & Gîtes' => '12',
'Chambres d\'hôtes' => '67',
'Campings' => '68',
'Hôtels' => '69',
'Hébergements insolites' => '70'
),
'MULTIMEDIA' => array(
'Tous' => '_multimedia_',
'Informatique' => 'informatique',
'Consoles & Jeux vidéo' => 'consoles_jeux_video',
'Image & Son' => 'image_son',
'Téléphonie' => 'telephonie'
'MULTIMÉDIA' => array(
'Tous' => '14',
'Informatique' => '15',
'Consoles & Jeux vidéo' => '43',
'Image & Son' => '16',
'Téléphonie' => '17'
),
'LOISIRS' => array(
'Tous' => '_loisirs_',
'DVD / Films' => 'dvd_films',
'CD / Musique' => 'cd_musique',
'Livres' => 'livres',
'Animaux' => 'animaux',
'Vélos' => 'velos',
'Sports & Hobbies' => 'sports_hobbies',
'Instruments de musique' => 'instruments_de_musique',
'Collection' => 'collection',
'Jeux & Jouets' => 'jeux_jouets',
'Vins & Gastronomie' => 'vins_gastronomie'
'Tous' => '24',
'DVD / Films' => '25',
'CD / Musique' => '26',
'Livres' => '27',
'Animaux' => '28',
'Vélos' => '55',
'Sports & Hobbies' => '29',
'Instruments de musique' => '30',
'Collection' => '40',
'Jeux & Jouets' => '41',
'Vins & Gastronomie' => '48'
),
'MATÉRIEL PROFESSIONNEL' => array(
'Tous' => '_materiel_professionnel_',
'Matériel Agricole' => 'mateiel_agricole',
'Transport - Manutention' => 'transport_manutention',
'BTP - Chantier - Gros-œuvre' => 'btp_chantier_gros_oeuvre',
'Outillage - Matériaux 2nd-œuvre' => 'outillage_materiaux_2nd_oeuvre',
'Équipements Industriels' => 'equipement_industriels',
'Restauration - Hôtellerie' => 'restauration_hotellerie',
'Fournitures de Bureau' => 'fournitures_de_bureau',
'Commerces & Marchés' => 'commerces_marches',
'Matériel médical' => 'materiel_medical'
'Tous' => '56',
'Matériel Agricole' => '57',
'Transport - Manutention' => '58',
'BTP - Chantier Gros-oeuvre' => '59',
'Outillage - Matériaux 2nd-oeuvre' => '60',
'Équipements Industriels' => '32',
'Restauration - Hôtellerie' => '61',
'Fournitures de Bureau' => '62',
'Commerces & Marchés' => '63',
'Matériel Médical' => '64'
),
'SERVICES' => array(
'Tous' => '_services_',
'Prestations de services' => 'prestations_de_services',
'Billetterie' => 'billetterie',
'Évènements' => 'evenements',
'Cours particuliers' => 'cours_particuliers',
'Covoiturage' => 'covoiturage'
'Tous' => '31',
'Prestations de services' => '34',
'Billetterie' => '35',
'Événements' => '49',
'Cours particuliers' => '36',
'Covoiturage' => '65'
),
'MAISON' => array(
'Tous' => '_maison_',
'Ameublement' => 'ameublement',
'Électroménager' => 'electromenager',
'Arts de la table' => 'arts_de_la_table',
'Décoration' => 'decoration',
'Linge de maison' => 'linge_de_maison',
'Bricolage' => 'bricolage',
'Jardinage' => 'jardinage',
'Vêtements' => 'vetements',
'Chaussures' => 'chaussures',
'Accessoires & Bagagerie' => 'accessoires_bagagerie',
'Montres & Bijoux' => 'montres_bijoux',
'Équipement bébé' => 'equipement_bebe',
'Vêtements bébé' => 'vetements_bebe'
'Tous' => '18',
'Ameublement' => '19',
'Électroménager' => '20',
'Arts de la table' => '45',
'Décoration' => '39',
'Linge de maison' => '46',
'Bricolage' => '21',
'Jardinage' => '52',
'Vêtements' => '22',
'Chaussures' => '53',
'Accessoires & Bagagerie' => '47',
'Montres & Bijoux' => '42',
'Équipement bébé' => '23',
'Vêtements bébé' => '54',
),
'AUTRES' => 'autres'
'AUTRES' => '37'
)
),
'pricemin' => array(
'name' => 'Prix min',
'type' => 'number'
),
'pricemax' => array(
'name' => 'Prix max',
'type' => 'number'
),
'estate' => array(
'name' => 'Type de bien',
'type' => 'list',
'values' => array(
'' => '',
'Maison' => '1',
'Appartement' => '2',
'Terrain' => '3',
'Parking' => '4',
'Autre' => '5'
)
),
'roomsmin' => array(
'name' => 'Pièces min',
'type' => 'number'
),
'roomsmax' => array(
'name' => 'Pièces max',
'type' => 'number'
),
'squaremin' => array(
'name' => 'Surface min',
'type' => 'number'
),
'squaremax' => array(
'name' => 'Surface max',
'type' => 'number'
),
'mileagemin' => array(
'name' => 'Kilométrage min',
'type' => 'number'
),
'mileagemax' => array(
'name' => 'Kilométrage max',
'type' => 'number'
),
'yearmin' => array(
'name' => 'Année min',
'type' => 'number'
),
'yearmax' => array(
'name' => 'Année max',
'type' => 'number'
),
'cubiccapacitymin' => array(
'name' => 'Cylindrée min',
'type' => 'number'
),
'cubiccapacitymax' => array(
'name' => 'Cylindrée max',
'type' => 'number'
),
'fuel' => array(
'name' => 'Énergie',
'type' => 'list',
'values' => array(
'' => '',
'Essence' => '1',
'Diesel' => '2',
'GPL' => '3',
'Électrique' => '4',
'Hybride' => '6',
'Autre' => '5'
)
),
'owner' => array(
'name' => 'Vendeur',
'type' => 'list',
'values' => array(
'Tous' => '',
'Particuliers' => 'private',
'Professionnels' => 'pro'
)
)
)
);
public function collectData(){
public static $LBC_API_KEY = 'ba0c2dad52b3ec';
$category = $this->getInput('c');
if(empty($category)) {
$category = 'annonces';
private function getRange($field, $range_min, $range_max){
if(!is_null($range_min)
&& !is_null($range_max)
&& $range_min > $range_max) {
returnClientError('Min-' . $field . ' must be lower than max-' . $field . '.');
}
$html = getSimpleHTMLDOM(self::URI
. $category
. '/offres/'
. $this->getInput('r')
. '/?f=a&th=1&q='
. urlencode($this->getInput('k')))
or returnServerError('Could not request LeBonCoin.');
if(!is_null($range_min)
&& is_null($range_max)) {
returnClientError('Max-' . $field . ' is needed when min-' . $field . ' is set (range).');
}
$list = $html->find('.tabsContent', 0);
if($list === null) {
return array(
'min' => $range_min,
'max' => $range_max
);
}
public function collectData(){
$url = 'https://api.leboncoin.fr/finder/search/';
$data = $this->buildRequestJson();
$header = array(
'Content-Type: application/json',
'Content-Length: ' . strlen($data),
'api_key: ' . self::$LBC_API_KEY
);
$opts = array(
CURLOPT_CUSTOMREQUEST => 'POST',
CURLOPT_POSTFIELDS => $data
);
$content = getContents($url, $header, $opts)
or returnServerError('Could not request LeBonCoin. Tried: ' . $url);
$json = json_decode($content);
if($json->total === 0) {
return;
}
$tags = $list->find('li');
foreach($json->ads as $element) {
foreach($tags as $element) {
$item['title'] = $element->subject;
$item['content'] = $element->body;
$item['date'] = $element->index_date;
$item['timestamp'] = strtotime($element->index_date);
$item['uri'] = $element->url;
$item['ad_type'] = $element->ad_type;
$item['author'] = $element->owner->name;
$element = $element->find('a', 0);
if(isset($element->location->city)) {
$item = array();
$item['uri'] = $element->href;
$title = html_entity_decode($element->getAttribute('title'));
$content_image = $element->find('div.item_image', 0)->find('.lazyload', 0);
$item['city'] = $element->location->city;
$item['content'] .= ' -- ' . $element->location->city;
if($content_image !== null) {
$content = '<img src="' . $content_image->getAttribute('data-imgsrc') . '" alt="thumbnail">';
} else {
$content = "";
}
$date = $element->find('aside.item_absolute', 0)->find('p.item_sup', 0);
$detailsList = $element->find('section.item_infos', 0);
if(isset($element->location->zipcode)) {
$item['zipcode'] = $element->location->zipcode;
}
for($i = 0; $i <= 1; $i++) $content .= $detailsList->find('p.item_supp', $i)->plaintext;
$price = $detailsList->find('h3.item_price', 0);
$content .= $price === null ? '' : $price->plaintext;
if(isset($element->price)) {
$item['price'] = $element->price[0];
$item['content'] .= ' -- ' . current($element->price) . '€';
}
if(isset($element->images->urls)) {
$item['thumbnail'] = $element->images->thumb_url;
$item['enclosures'] = array();
foreach($element->images->urls as $image) {
$item['enclosures'][] = $image;
}
}
$item['title'] = $title;
$item['content'] = $content . $date;
$this->items[] = $item;
}
}
private function buildRequestJson() {
$requestJson = new StdClass();
$requestJson->owner_type = $this->getInput('owner');
$requestJson->filters = new StdClass();
$requestJson->filters->keywords = array(
'text' => $this->getInput('keywords')
);
if($this->getInput('region') != '') {
$requestJson->filters->location['regions'] = [$this->getInput('region')];
}
if($this->getInput('department') != '') {
$requestJson->filters->location['departments'] = [$this->getInput('department')];
}
if($this->getInput('cities') != '') {
$requestJson->filters->location['city_zipcodes'] = array();
foreach (explode(',', $this->getInput('cities')) as $zipcode) {
$requestJson->filters->location['city_zipcodes'][] = array(
'zipcode' => trim($zipcode)
);
}
}
$requestJson->filters->category = array(
'id' => $this->getInput('category')
);
if($this->getInput('pricemin') != ''
|| $this->getInput('pricemax') != '') {
$requestJson->filters->ranges->price = $this->getRange(
'price',
$this->getInput('pricemin'),
$this->getInput('pricemax')
);
}
if($this->getInput('estate') != '') {
$requestJson->filters->enums['real_estate_type'] = [$this->getInput('estate')];
}
if($this->getInput('roomsmin') != ''
|| $this->getInput('roomsmax') != '') {
$requestJson->filters->ranges->rooms = $this->getRange(
'rooms',
$this->getInput('roomsmin'),
$this->getInput('roomsmax')
);
}
if($this->getInput('squaremin') != ''
|| $this->getInput('squaremax') != '') {
$requestJson->filters->ranges->square = $this->getRange(
'square',
$this->getInput('squaremin'),
$this->getInput('squaremax')
);
}
if($this->getInput('mileagemin') != ''
|| $this->getInput('mileagemax') != '') {
$requestJson->filters->ranges->mileage = $this->getRange(
'mileage',
$this->getInput('mileagemin'),
$this->getInput('mileagemax')
);
}
if($this->getInput('yearmin') != ''
|| $this->getInput('yearmax') != '') {
$requestJson->filters->ranges->regdate = $this->getRange(
'year',
$this->getInput('yearmin'),
$this->getInput('yearmax')
);
}
if($this->getInput('cubiccapacitymin') != ''
|| $this->getInput('cubiccapacitymax') != '') {
$requestJson->filters->ranges->cubic_capacity = $this->getRange(
'cubic_capacity',
$this->getInput('cubiccapacitymin'),
$this->getInput('cubiccapacitymax')
);
}
if($this->getInput('fuel') != '') {
$requestJson->filters->enums['fuel'] = [$this->getInput('fuel')];
}
$requestJson->limit = 30;
return json_encode($requestJson);
}
}
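
The range check and the shape of the JSON request body can be tried outside the bridge with a short sketch. It makes several assumptions: `buildRange` is a hypothetical stand-in for `getRange()`, it throws an exception where the bridge calls `returnClientError()`, and `ranges` is created explicitly as an object before a range is assigned. Only the `price` range is shown.

```php
<?php
// Hypothetical stand-in for getRange(); throws instead of calling returnClientError().
function buildRange(string $field, ?int $min, ?int $max): array {
    // A minimum above the maximum is rejected
    if ($min !== null && $max !== null && $min > $max) {
        throw new InvalidArgumentException("Min-$field must be lower than max-$field.");
    }
    // A minimum without a maximum is rejected as well
    if ($min !== null && $max === null) {
        throw new InvalidArgumentException("Max-$field is needed when min-$field is set (range).");
    }
    return array('min' => $min, 'max' => $max);
}

// Request body with the same nesting as buildRequestJson() (price range only)
$request = new stdClass();
$request->filters = new stdClass();
$request->filters->ranges = new stdClass(); // created explicitly in this sketch
$request->filters->ranges->price = buildRange('price', 100, 500);
$request->limit = 30;

echo json_encode($request), PHP_EOL;
```

Throwing keeps the sketch runnable without the bridge framework; inside a bridge, the framework's own error helpers would be the natural choice.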


@ -3,8 +3,7 @@ class LeMondeInformatiqueBridge extends FeedExpander {
const MAINTAINER = 'ORelio';
const NAME = 'Le Monde Informatique';
const URI = 'http://www.lemondeinformatique.fr/';
const CACHE_TIMEOUT = 1800; // 30min
const URI = 'https://www.lemondeinformatique.fr/';
const DESCRIPTION = 'Returns the newest articles.';
public function collectData(){
@ -15,30 +14,26 @@ class LeMondeInformatiqueBridge extends FeedExpander {
$item = parent::parseItem($newsItem);
$article_html = getSimpleHTMLDOMCached($item['uri'])
or returnServerError('Could not request LeMondeInformatique: ' . $item['uri']);
$item['content'] = $this->cleanArticle($article_html->find('div#article', 0)->innertext);
$item['title'] = $article_html->find('h1.cleanprint-title', 0)->plaintext;
//Deduce thumbnail URL from article image URL
$item['enclosures'] = array(
str_replace(
'/grande/',
'/petite/',
$article_html->find('.article-image', 0)->find('img', 0)->src
)
);
//No response header sets the encoding, explicit conversion is needed or subsequent xml_encode() will fail
$item['content'] = utf8_encode($this->cleanArticle($article_html->find('div.col-primary', 0)->innertext));
$item['author'] = utf8_encode($article_html->find('div.author-infos', 0)->find('b', 0)->plaintext);
return $item;
}
private function stripCDATA($string){
$string = str_replace('<![CDATA[', '', $string);
$string = str_replace(']]>', '', $string);
return $string;
}
private function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
private function cleanArticle($article_html){
$article_html = $this->stripWithDelimiters($article_html, '<script', '</script>');
$article_html = $this->stripWithDelimiters($article_html, '<h1 class="cleanprint-title"', '</h1>');
$article_html = stripWithDelimiters($article_html, '<script', '</script>');
$article_html = explode('<p class="contact-error', $article_html)[0] . '</div>';
return $article_html;
}
}
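
This hunk replaces the private `stripWithDelimiters()` method with a call to a shared function of the same name. A minimal sketch of the idea follows; the name `stripWithDelimitersSketch` is hypothetical, and the guard for a missing end delimiter is an addition that the removed method does not have.

```php
<?php
// Removes every section that begins with $start and ends with $end (inclusive).
function stripWithDelimitersSketch(string $string, string $start, string $end): string {
    while (($from = strpos($string, $start)) !== false) {
        $to = strpos($string, $end, $from + strlen($start));
        if ($to === false) {
            break; // unterminated section: leave the rest untouched (sketch-only guard)
        }
        // Keep the text before the section and the text after its end delimiter
        $string = substr($string, 0, $from) . substr($string, $to + strlen($end));
    }
    return $string;
}

$html = '<p>Hello</p><script type="text/javascript">alert(1);</script><p>World</p>';
echo stripWithDelimitersSketch($html, '<script', '</script>'), PHP_EOL;
```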


@ -3,7 +3,7 @@ class LesJoiesDuCodeBridge extends BridgeAbstract {
const MAINTAINER = 'superbaillot.net';
const NAME = 'Les Joies Du Code';
const URI = 'http://lesjoiesducode.fr/';
const URI = 'https://lesjoiesducode.fr/';
const CACHE_TIMEOUT = 7200; // 2h
const DESCRIPTION = 'LesJoiesDuCode';
@ -22,20 +22,12 @@ class LesJoiesDuCodeBridge extends BridgeAbstract {
// retrieve .gif instead of static .jpg
$images = $temp->find('p img');
foreach($images as $image) {
$img_src = str_replace(".jpg", ".gif", $image->src);
$img_src = str_replace('.jpg', '.gif', $image->src);
$image->src = $img_src;
}
$content = $temp->innertext;
$auteur = $temp->find('i', 0);
$pos = strpos($auteur->innertext, "by");
if($pos > 0) {
$auteur = trim(str_replace("*/", "", substr($auteur->innertext, ($pos + 2))));
$item['author'] = $auteur;
}
$item['content'] .= trim($content);
$item['content'] = trim($content);
$item['uri'] = $url;
$item['title'] = trim($titre);


@ -100,7 +100,7 @@ class MangareaderBridge extends BridgeAbstract {
case 'Get popular mangas':
// Find manga name within "Popular mangas for ..."
$pagetitle = $xpath->query(".//*[@id='bodyalt']/h1")->item(0)->nodeValue;
$this->request = substr($pagetitle, 0, strrpos($pagetitle, " -"));
$this->request = substr($pagetitle, 0, strrpos($pagetitle, ' -'));
$this->getPopularMangas($xpath);
break;
case 'Get manga updates':
@ -120,7 +120,7 @@ class MangareaderBridge extends BridgeAbstract {
// Return some dummy-data if no content available
if(empty($this->items)) {
$item = array();
$item['content'] = "<p>No updates available</p>";
$item['content'] = '<p>No updates available</p>';
$this->items[] = $item;
}
@ -143,18 +143,18 @@ class MangareaderBridge extends BridgeAbstract {
$item['title'] = htmlspecialchars($manga->nodeValue);
// Add each chapter to the feed
$item['content'] = "";
$item['content'] = '';
foreach ($chapters as $chapter) {
if($item['content'] <> "") {
$item['content'] .= "<br>";
if($item['content'] <> '') {
$item['content'] .= '<br>';
}
$item['content'] .= "<a href='"
. self::URI
. htmlspecialchars($chapter->getAttribute('href'))
. "'>"
. htmlspecialchars($chapter->nodeValue)
. "</a>";
. '</a>';
}
$this->items[] = $item;
@ -211,13 +211,13 @@ EOD;
foreach ($chapters as $chapter) {
$item = array();
$item['title'] = htmlspecialchars($xpath->query("td[1]", $chapter)
$item['title'] = htmlspecialchars($xpath->query('td[1]', $chapter)
->item(0)
->nodeValue);
$item['uri'] = self::URI . $xpath->query("td[1]/a", $chapter)
$item['uri'] = self::URI . $xpath->query('td[1]/a', $chapter)
->item(0)
->getAttribute('href');
$item['timestamp'] = strtotime($xpath->query("td[2]", $chapter)
$item['timestamp'] = strtotime($xpath->query('td[2]', $chapter)
->item(0)
->nodeValue);
array_unshift($this->items, $item);
@ -227,12 +227,12 @@ EOD;
public function getURI(){
switch($this->queriedContext) {
case 'Get latest updates':
$path = "latest";
$path = 'latest';
break;
case 'Get popular mangas':
$path = "popular";
if($this->getInput('category') !== "all") {
$path .= "/" . $this->getInput('category');
$path = 'popular';
if($this->getInput('category') !== 'all') {
$path .= '/' . $this->getInput('category');
}
break;
case 'Get manga updates':

bridges/MydealsBridge.php Normal file

@ -0,0 +1,144 @@
<?php
require_once(__DIR__ . '/DealabsBridge.php');
class MydealsBridge extends PepperBridgeAbstract {
const NAME = 'Mydeals bridge';
const URI = 'https://www.mydealz.de/';
const DESCRIPTION = 'Zeigt die Deals von mydealz.de';
const MAINTAINER = 'sysadminstory';
const PARAMETERS = array(
'Suche nach Stichworten' => array (
'q' => array(
'name' => 'Stichworten',
'type' => 'text',
'required' => true
),
'hide_expired' => array(
'name' => 'Abgelaufenes ausblenden',
'type' => 'checkbox',
'required' => 'true'
),
'hide_local' => array(
'name' => 'Lokales ausblenden',
'type' => 'checkbox',
'title' => 'Deals im physischen Geschäft ausblenden',
'required' => 'true'
),
'priceFrom' => array(
'name' => 'Minimaler Preis',
'type' => 'text',
'title' => 'Minimaler Preis in Euro',
'required' => false,
'defaultValue' => ''
),
'priceTo' => array(
'name' => 'Maximaler Preis',
'type' => 'text',
'title' => 'Maximaler Preis in Euro',
'required' => false,
'defaultValue' => ''
),
),
'Deals pro Gruppen' => array(
'group' => array(
'name' => 'Gruppen',
'type' => 'list',
'required' => 'true',
'title' => 'Gruppe, deren Deals angezeigt werden müssen',
'values' => array(
'Elektronik' => 'elektronik',
'Handy & Smartphone' => 'smartphone',
'Gaming' => 'gaming',
'Software' => 'apps-software',
'Fashion Frauen' => 'fashion-frauen',
'Fashion Männer' => 'fashion-accessoires',
'Beauty & Gesundheit' => 'beauty',
'Family & Kids' => 'family-kids',
'Essen & Trinken' => 'food',
'Freizeit & Reisen' => 'reisen',
'Haushalt & Garten' => 'home-living',
'Entertainment' => 'entertainment',
'Verträge & Finanzen' => 'vertraege-finanzen',
'Coupons' => 'coupons',
)
),
'order' => array(
'name' => 'sortieren nach',
'type' => 'list',
'required' => 'true',
'title' => 'Sortierung der deals',
'values' => array(
'Vom heißesten zum kältesten Deal' => '',
'Vom jüngsten Deal zum ältesten' => '-new',
'Vom am meisten kommentierten Deal zum am wenigsten kommentierten Deal' => '-discussed'
)
)
)
);
public $lang = array(
'bridge-uri' => self::URI,
'bridge-name' => self::NAME,
'context-keyword' => 'Suche nach Stichworten',
'context-group' => 'Deals pro Gruppen',
'uri-group' => '/gruppe/',
'request-error' => 'Could not request mydeals',
'no-results' => 'Ups, wir konnten keine Deals zu',
'relative-date-indicator' => array(
'vor',
'seit'
),
'price' => 'Preis',
'shipping' => 'Versand',
'origin' => 'Ursprung',
'discount' => 'Rabatte',
'title-keyword' => 'Suche',
'title-group' => 'Gruppe',
'local-months' => array(
'Jan',
'Feb',
'Mär',
'Apr',
'Mai',
'Jun',
'Jul',
'Aug',
'Sep',
'Okt',
'Nov',
'Dez',
'.'
),
'local-time-relative' => array(
'eingestellt vor ',
'm',
'h,',
'day',
'days',
'month',
'year',
'and '
),
'date-prefixes' => array(
'eingestellt am ',
'lokal ',
'aktualisiert ',
),
'relative-date-alt-prefixes' => array(
'aktualisiert vor ',
'kommentiert vor ',
'heiß seit '
),
'relative-date-ignore-suffix' => array(
'/von.*$/'
),
'localdeal' => array(
'Lokal ',
'Läuft bis '
)
);
}


@ -12,7 +12,7 @@ class NasaApodBridge extends BridgeAbstract {
$html = getSimpleHTMLDOM(self::URI . 'archivepix.html')
or returnServerError('Error while downloading the website content');
$list = explode("<br>", $html->find('b', 0)->innertext);
$list = explode('<br>', $html->find('b', 0)->innertext);
for($i = 0; $i < 3; $i++) {
$line = $list[$i];
@ -32,7 +32,7 @@ class NasaApodBridge extends BridgeAbstract {
$explanation = $picture_html->find('p', 2)->innertext;
//Extract date from the picture page
$date = explode(" ", $picture_html->find('p', 1)->innertext);
$date = explode(' ', $picture_html->find('p', 1)->innertext);
$item['timestamp'] = strtotime($date[4] . $date[3] . $date[2]);
//Other informations


@ -6,16 +6,6 @@ class NeuviemeArtBridge extends FeedExpander {
const URI = 'http://www.9emeart.fr/';
const DESCRIPTION = 'Returns the newest articles.';
private function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
protected function parseItem($item){
$item = parent::parseItem($item);
@ -34,16 +24,16 @@ class NeuviemeArtBridge extends FeedExpander {
}
$article_content = '';
if($article_image) {
if ($article_image) {
$article_content = '<p><img src="' . $article_image . '" /></p>';
}
$article_content .= str_replace(
'src="/', 'src="' . self::URI,
$article_html->find('div.newsGenerique_con', 0)->innertext
);
$article_content = $this->stripWithDelimiters($article_content, '<script', '</script>');
$article_content = $this->stripWithDelimiters($article_content, '<style', '</style>');
$article_content = $this->stripWithDelimiters($article_content, '<link', '>');
$article_content = stripWithDelimiters($article_content, '<script', '</script>');
$article_content = stripWithDelimiters($article_content, '<style', '</style>');
$article_content = stripWithDelimiters($article_content, '<link', '>');
$item['content'] = $article_content;


@ -6,29 +6,105 @@ class NextInpactBridge extends FeedExpander {
const URI = 'https://www.nextinpact.com/';
const DESCRIPTION = 'Returns the newest articles.';
const PARAMETERS = array( array(
'feed' => array(
'name' => 'Feed',
'type' => 'list',
'values' => array(
'Tous nos articles' => 'news',
'Nos contenus en accès libre' => 'acces-libre',
'Blog' => 'blog',
'Bons plans' => 'bonsplans'
)
),
'filter_premium' => array(
'name' => 'Premium',
'type' => 'list',
'values' => array(
'No filter' => '0',
'Hide Premium' => '1',
'Only Premium' => '2'
)
),
'filter_brief' => array(
'name' => 'Brief',
'type' => 'list',
'values' => array(
'No filter' => '0',
'Hide Brief' => '1',
'Only Brief' => '2'
)
)
));
public function collectData(){
$this->collectExpandableDatas(self::URI . 'rss/news.xml', 10);
$feed = $this->getInput('feed');
if (empty($feed))
$feed = 'news';
$this->collectExpandableDatas(self::URI . 'rss/' . $feed . '.xml');
}
protected function parseItem($newsItem){
$item = parent::parseItem($newsItem);
$item['content'] = $this->extractContent($item['uri']);
$item['content'] = $this->extractContent($item, $item['uri']);
if (is_null($item['content']))
return null; //Filtered article
return $item;
}
private function extractContent($url){
$html2 = getSimpleHTMLDOMCached($url);
$text = '<p><em>'
. $html2->find('span.sub_title', 0)->innertext
. '</em></p><p><img src="'
. $html2->find('div.container_main_image_article', 0)->find('img.dedicated', 0)->src
. '" alt="-" /></p><div>'
. $html2->find('div[itemprop=articleBody]', 0)->innertext
. '</div>';
private function extractContent($item, $url){
$html = getSimpleHTMLDOMCached($url);
if (!is_object($html))
return 'Failed to request NextInpact: ' . $url;
foreach(array(
'filter_premium' => 'h2.title_reserve_article',
'filter_brief' => 'div.brief-inner-content'
) as $param_name => $selector) {
$param_val = intval($this->getInput($param_name));
if ($param_val != 0) {
$element_present = is_object($html->find($selector, 0));
$element_wanted = ($param_val == 2);
if ($element_present != $element_wanted) {
return null; //Filter article
}
}
}
if (is_object($html->find('div[itemprop=articleBody], div.brief-inner-content', 0))) {
$subtitle = $html->find('span.sub_title, div.brief-head', 0); // keep the node: trim() would cast it to a string and is_object() below would always fail
if(is_object($subtitle) && $subtitle->plaintext !== $item['title']) {
$subtitle = '<p><em>' . $subtitle->plaintext . '</em></p>';
} else {
$subtitle = '';
}
$postimg = $html->find(
'div.container_main_image_article, div.image-brief-container, div.image-brief-side-container', 0
);
if(is_object($postimg)) {
$postimg = '<p><img src="'
. $postimg->find('img.dedicated', 0)->src
. '" alt="-" /></p>';
} else {
$postimg = '';
}
$text = $subtitle
. $postimg
. $html->find('div[itemprop=articleBody], div.brief-inner-content', 0)->outertext;
} else {
$text = $item['content']
. '<p><em>Failed to retrieve full article content</em></p>';
}
$premium_article = $html->find('h2.title_reserve_article', 0);
if (is_object($premium_article)) {
$text .= '<p><em>' . $premium_article->innertext . '</em></p>';
}
$premium_article = $html2->find('h2.title_reserve_article', 0);
if (is_object($premium_article))
$text = $text . '<p><em>' . $premium_article->innertext . '</em></p>';
return $text;
}
}
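
Note on the filters above: `filter_premium` and `filter_brief` share one tri-state rule, where 0 keeps everything, 1 hides articles that contain the marker element, and 2 keeps only those. A minimal sketch of that rule in isolation, assuming a hypothetical helper `passesFilter()` that is not part of the bridge:

```php
<?php
// Hypothetical helper (not in the bridge): the tri-state rule used by
// filter_premium and filter_brief.
//   0 = no filter, 1 = hide matching articles, 2 = keep only matching articles
function passesFilter(int $mode, bool $elementPresent): bool {
    if ($mode === 0) {
        return true; // filtering disabled
    }
    // Mode 2 wants the marker element to be present, mode 1 wants it absent.
    $elementWanted = ($mode === 2);
    return $elementPresent === $elementWanted;
}
```

In the bridge itself a failed check makes `extractContent()` return null, which `parseItem()` treats as a filtered article.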


@ -32,43 +32,39 @@ class NextgovBridge extends FeedExpander {
protected function parseItem($newsItem){
$item = parent::parseItem($newsItem);
$item['content'] = '';
$article_thumbnail = 'https://cdn.nextgov.com/nextgov/images/logo.png';
$item['content'] = '<p><b>' . $item['content'] . '</b></p>';
$namespaces = $newsItem->getNamespaces(true);
if(isset($namespaces['media'])) {
$media = $newsItem->children($namespaces['media']);
if(isset($media->content)) {
$attributes = $media->content->attributes();
$item['content'] = '<img src="' . $attributes['url'] . '">';
$item['content'] = '<p><img src="' . $attributes['url'] . '"></p>' . $item['content'];
$article_thumbnail = str_replace(
'large.jpg',
'small.jpg',
strval($attributes['url'])
);
}
}
$item['enclosures'] = array($article_thumbnail);
$item['content'] .= $this->extractContent($item['uri']);
return $item;
}
private function stripWithDelimiters($string, $start, $end){
while (strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
private function extractContent($url){
$article = getSimpleHTMLDOMCached($url)
or returnServerError('Could not request Nextgov: ' . $url);
$article = getSimpleHTMLDOMCached($url);
$contents = $article->find('div.wysiwyg', 0)->innertext;
$contents = $this->stripWithDelimiters($contents, '<div class="ad-container">', '</div>');
$contents = $this->stripWithDelimiters($contents, '<div', '</div>'); //ad outer div
return $this->stripWithDelimiters($contents, '<script', '</script>');
$contents = ($article_thumbnail == '' ? '' : '<p><img src="' . $article_thumbnail . '" /></p>')
. '<p><b>'
. $article_subtitle
. '</b></p>'
. trim($contents);
if (!is_object($article))
return 'Could not request Nextgov: ' . $url;
$contents = $article->find('div.wysiwyg', 0);
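$tombstone = $contents->find('svg.content-tombstone', 0);
if (is_object($tombstone))
$tombstone->outertext = ''; // only strip the marker when the page actually has one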
$contents = $contents->innertext;
$contents = stripWithDelimiters($contents, '<div class="ad-container">', '</div>');
$contents = stripWithDelimiters($contents, '<div', '</div>'); //ad outer div
return trim(stripWithDelimiters($contents, '<script', '</script>'));
}
}
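
Note on `stripWithDelimiters`: the removed private method cuts every section that starts with `$start` and ends with `$end`. The sketch below is a variant under a hypothetical name, `stripBetween()`, not the shared helper the new code calls. It assumes the same three-argument shape and additionally stops when a section is never closed, which the original does not handle explicitly.

```php
<?php
// Hypothetical variant of stripWithDelimiters(): removes every section that
// begins with $start and ends with $end, and leaves unterminated sections alone.
function stripBetween(string $string, string $start, string $end): string {
    $offset = strpos($string, $start);
    while ($offset !== false) {
        // Look for the closing delimiter only after the opening one.
        $endPos = strpos($string, $end, $offset + strlen($start));
        if ($endPos === false) {
            break; // no closing delimiter: keep the remainder untouched
        }
        $string = substr($string, 0, $offset) . substr($string, $endPos + strlen($end));
        // Continue searching from the position where the section was cut out.
        $offset = strpos($string, $start, $offset);
    }
    return $string;
}
```

Called as `stripBetween($contents, '<script', '</script>')`, it would drop inline scripts in the same way the bridge does before returning the article body.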

331
bridges/NineGagBridge.php Normal file

@ -0,0 +1,331 @@
<?php
class NineGagBridge extends BridgeAbstract {
const NAME = '9gag Bridge';
const URI = 'https://9gag.com/';
const DESCRIPTION = 'Returns latest quotes from 9gag.';
const MAINTAINER = 'ZeNairolf';
const CACHE_TIMEOUT = 3600;
const PARAMETERS = array(
'Popular' => array(
'd' => array(
'name' => 'Section',
'type' => 'list',
'required' => true,
'values' => array(
'Hot' => 'hot',
'Trending' => 'trending',
'Fresh' => 'fresh',
),
),
'p' => array(
'name' => 'Pages',
'type' => 'number',
'defaultValue' => 3,
),
),
'Sections' => array(
'g' => array(
'name' => 'Section',
'type' => 'list',
'required' => true,
'values' => array(
'Animals' => 'cute',
'Anime & Manga' => 'anime-manga',
'Ask 9GAG' => 'ask9gag',
'Awesome' => 'awesome',
'Basketball' => 'basketball',
'Car' => 'car',
'Classical Art Memes' => 'classicalartmemes',
'Comic' => 'comic',
'Cosplay' => 'cosplay',
'Countryballs' => 'country',
'DIY & Crafts' => 'imadedis',
'Drawing & Illustration' => 'drawing',
'Fan Art' => 'animefanart',
'Food & Drinks' => 'food',
'Football' => 'football',
'Fortnite' => 'fortnite',
'Funny' => 'funny',
'GIF' => 'gif',
'Gaming' => 'gaming',
'Girl' => 'girl',
'Girly Things' => 'girly',
'Guy' => 'guy',
'History' => 'history',
'Home Design' => 'home',
'Horror' => 'horror',
'K-Pop' => 'kpop',
'LEGO' => 'lego',
'League of Legends' => 'leagueoflegends',
'Movie & TV' => 'movie-tv',
'Music' => 'music',
'NFK - Not For Kids' => 'nsfw',
'Overwatch' => 'overwatch',
'PC Master Race' => 'pcmr',
'PUBG' => 'pubg',
'Pic Of The Day' => 'photography',
'Pokémon' => 'pokemon',
'Politics' => 'politics',
'Relationship' => 'relationship',
'Roast Me' => 'roastme',
'Satisfying' => 'satisfying',
'Savage' => 'savage',
'School' => 'school',
'Sci-Tech' => 'science',
'Sport' => 'sport',
'Star Wars' => 'starwars',
'Superhero' => 'superhero',
'Surreal Memes' => 'surrealmemes',
'Timely' => 'timely',
'Travel' => 'travel',
'Video' => 'video',
'WTF' => 'wtf',
'Wallpaper' => 'wallpaper',
'Warhammer' => 'warhammer',
),
),
't' => array(
'name' => 'Type',
'type' => 'list',
'required' => true,
'values' => array(
'Hot' => 'hot',
'Fresh' => 'fresh',
),
),
'p' => array(
'name' => 'Pages',
'type' => 'number',
'defaultValue' => 3,
),
),
);
const MIN_NBR_PAGE = 1;
const MAX_NBR_PAGE = 6;
protected $p = null;
public function collectData() {
$url = sprintf(
'%sv1/group-posts/group/%s/type/%s?',
self::URI,
$this->getGroup(),
$this->getType()
);
$cursor = 'c=10';
$posts = array();
for ($i = 0; $i < $this->getPages(); ++$i) {
$content = getContents($url.$cursor);
$json = json_decode($content, true);
$posts = array_merge($posts, $json['data']['posts']);
$cursor = $json['data']['nextCursor'];
}
foreach ($posts as $post) {
$item['uri'] = $post['url'];
$item['title'] = $post['title'];
$item['content'] = self::getContent($post);
$item['categories'] = self::getCategories($post);
$item['timestamp'] = self::getTimestamp($post);
$this->items[] = $item;
}
}
public function getName() {
if ($this->getInput('d')) {
$name = sprintf('%s - %s', '9GAG', $this->getParameterKey('d'));
} elseif ($this->getInput('g')) {
$name = sprintf('%s - %s', '9GAG', $this->getParameterKey('g'));
if ($this->getInput('t')) {
$name = sprintf('%s [%s]', $name, $this->getParameterKey('t'));
}
}
if (!empty($name)) {
return $name;
}
return self::NAME;
}
public function getURI() {
$uri = $this->getInput('g');
if ($uri === 'default') {
$uri = $this->getInput('t');
}
return self::URI.$uri;
}
protected function getGroup() {
if ($this->getInput('d')) {
return 'default';
}
return $this->getInput('g');
}
protected function getType() {
if ($this->getInput('d')) {
return $this->getInput('d');
}
return $this->getInput('t');
}
protected function getPages() {
if ($this->p === null) {
$value = (int) $this->getInput('p');
$value = ($value < self::MIN_NBR_PAGE) ? self::MIN_NBR_PAGE : $value;
$value = ($value > self::MAX_NBR_PAGE) ? self::MAX_NBR_PAGE : $value;
$this->p = $value;
}
return $this->p;
}
protected function getParameterKey($input = '') {
$params = $this->getParameters();
$tab = 'Sections';
if ($input === 'd') {
$tab = 'Popular';
}
if (!isset($params[$tab][$input])) {
return '';
}
return array_search(
$this->getInput($input),
$params[$tab][$input]['values']
);
}
protected static function getContent($post) {
if ($post['type'] === 'Animated') {
$content = self::getAnimated($post);
} elseif ($post['type'] === 'Article') {
$content = self::getArticle($post);
} else {
$content = self::getPhoto($post);
}
return $content;
}
protected static function getPhoto($post) {
$image = $post['images']['image460'];
$photo = '<picture>';
$photo .= sprintf(
'<source srcset="%s" type="image/webp">',
$image['webpUrl']
);
$photo .= sprintf(
'<img src="%s" alt="%s" %s>',
$image['url'],
$post['title'],
'width="500"'
);
$photo .= '</picture>';
return $photo;
}
protected static function getAnimated($post) {
$poster = $post['images']['image460']['url'];
$sources = $post['images'];
$video = sprintf(
'<video poster="%s" %s>',
$poster,
'preload="auto" loop controls style="min-height: 300px" width="500"'
);
$video .= sprintf(
'<source src="%s" type="video/webm">',
$sources['image460sv']['vp9Url']
);
$video .= sprintf(
'<source src="%s" type="video/mp4">',
$sources['image460sv']['h265Url']
);
$video .= sprintf(
'<source src="%s" type="video/mp4">',
$sources['image460svwm']['url']
);
$video .= '</video>';
return $video;
}
protected static function getArticle($post) {
$blocks = $post['article']['blocks'];
$medias = $post['article']['medias'];
$contents = array();
foreach ($blocks as $block) {
if ('Media' === $block['type']) {
$mediaId = $block['mediaId'];
$contents[] = self::getContent($medias[$mediaId]);
} elseif ('RichText' === $block['type']) {
$contents[] = self::getRichText($block['content']);
}
}
$content = join('</div><div>', $contents);
$content = sprintf(
'<%1$s>%2$s</%1$s>',
'div',
$content
);
return $content;
}
protected static function getRichText($text = '') {
$text = trim($text);
if (preg_match('/^>\s(?<text>.*)/', $text, $matches)) {
$text = sprintf(
'<%1$s>%2$s</%1$s>',
'blockquote',
$matches['text']
);
} else {
$text = sprintf(
'<%1$s>%2$s</%1$s>',
'p',
$text
);
}
return $text;
}
protected static function getCategories($post) {
$params = self::PARAMETERS;
$sections = $params['Sections']['g']['values'];
if(isset($post['sections'])) {
$postSections = $post['sections'];
} elseif (isset($post['postSection'])) {
$postSections = array($post['postSection']);
} else {
$postSections = array();
}
foreach ($postSections as $key => $section) {
$postSections[$key] = array_search($section, $sections);
}
return $postSections;
}
protected static function getTimestamp($post) {
$url = $post['images']['image460']['url'];
$headers = get_headers($url, true);
$date = $headers['Date'];
$time = strtotime($date);
return $time;
}
}
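
Note on pagination: `collectData()` follows the `nextCursor` value returned by each response and bounds the page count between `MIN_NBR_PAGE` and `MAX_NBR_PAGE`. A minimal sketch of that loop with the network call injected, assuming a hypothetical `collectPosts()` function and a `$fetchPage` callable that stands in for `getContents()` plus `json_decode()`. The two early exits are additions for illustration and are not in the bridge.

```php
<?php
// Hypothetical sketch of the cursor loop in NineGagBridge::collectData().
// $fetchPage receives the cursor string and returns the decoded JSON array.
function collectPosts(callable $fetchPage, int $pages, string $cursor = 'c=10'): array {
    $pages = max(1, min(6, $pages)); // same bounds as MIN_NBR_PAGE / MAX_NBR_PAGE
    $posts = array();
    for ($i = 0; $i < $pages; ++$i) {
        $json = $fetchPage($cursor);
        if (!isset($json['data']['posts'])) {
            break; // assumption: treat a malformed response as the end of the feed
        }
        $posts = array_merge($posts, $json['data']['posts']);
        if (empty($json['data']['nextCursor'])) {
            break; // assumption: an empty cursor means there are no more pages
        }
        $cursor = $json['data']['nextCursor'];
    }
    return $posts;
}
```

Injecting the fetcher keeps the loop testable without contacting the site.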


@ -49,7 +49,7 @@ class NotAlwaysBridge extends BridgeAbstract {
public function getURI(){
if(!is_null($this->getInput('filter'))) {
return self::URI . $this->getInput('filter') . "/";
return self::URI . $this->getInput('filter') . '/';
}
return parent::getURI();


@ -0,0 +1,127 @@
<?php
class NyaaTorrentsBridge extends BridgeAbstract {
const MAINTAINER = 'ORelio';
const NAME = 'NyaaTorrents';
const URI = 'https://nyaa.si/';
const DESCRIPTION = 'Returns the newest torrents, with optional search criteria.';
const PARAMETERS = array(
array(
'f' => array(
'name' => 'Filter',
'type' => 'list',
'values' => array(
'No filter' => '0',
'No remakes' => '1',
'Trusted only' => '2'
)
),
'c' => array(
'name' => 'Category',
'type' => 'list',
'values' => array(
'All categories' => '0_0',
'Anime' => '1_0',
'Anime - AMV' => '1_1',
'Anime - English' => '1_2',
'Anime - Non-English' => '1_3',
'Anime - Raw' => '1_4',
'Audio' => '2_0',
'Audio - Lossless' => '2_1',
'Audio - Lossy' => '2_2',
'Literature' => '3_0',
'Literature - English' => '3_1',
'Literature - Non-English' => '3_2',
'Literature - Raw' => '3_3',
'Live Action' => '4_0',
'Live Action - English' => '4_1',
'Live Action - Idol/PV' => '4_2',
'Live Action - Non-English' => '4_3',
'Live Action - Raw' => '4_4',
'Pictures' => '5_0',
'Pictures - Graphics' => '5_1',
'Pictures - Photos' => '5_2',
'Software' => '6_0',
'Software - Apps' => '6_1',
'Software - Games' => '6_2',
)
),
'q' => array(
'name' => 'Keyword',
'description' => 'Keyword(s)',
'type' => 'text'
)
)
);
public function collectData() {
// Build Search URL from user-provided parameters
$search_url = self::URI . '?s=id&o=desc&'
. http_build_query(array(
'f' => $this->getInput('f'),
'c' => $this->getInput('c'),
'q' => $this->getInput('q')
));
// Retrieve torrent listing from search results, which does not contain torrent description
$html = getSimpleHTMLDOM($search_url)
or returnServerError('Could not request Nyaa: ' . $search_url);
$links = $html->find('a');
$results = array();
foreach ($links as $link)
if (strpos($link->href, '/view/') === 0 && !in_array($link->href, $results))
$results[] = $link->href;
if (empty($results) && empty($this->getInput('q')))
returnServerError('No results from Nyaa: ' . $search_url, 500);
//Process each item individually
foreach ($results as $element) {
//Limit total amount of requests
if(count($this->items) >= 20) {
break;
}
$torrent_id = str_replace('/view/', '', $element);
//Ignore entries without valid torrent ID
if ($torrent_id != 0 && ctype_digit($torrent_id)) {
//Retrieve data for this torrent ID
$item_uri = self::URI . 'view/' . $torrent_id;
//Retrieve full description from torrent page
if ($item_html = getSimpleHTMLDOMCached($item_uri)) {
//Retrieve data from page contents
$item_title = str_replace(' :: Nyaa', '', $item_html->find('title', 0)->plaintext);
$item_desc = str_get_html(markdownToHtml($item_html->find('#torrent-description', 0)->innertext));
$item_author = extractFromDelimiters($item_html->outertext, 'href="/user/', '"');
$item_date = intval(extractFromDelimiters($item_html->outertext, 'data-timestamp="', '"'));
//Retrieve image for thumbnail or generic logo fallback
$item_image = $this->getURI() . 'static/img/avatar/default.png';
foreach ($item_desc->find('img') as $img) {
if (strpos($img->src, 'prez') === false) {
$item_image = $img->src;
break;
}
}
//Build and add final item
$item = array();
$item['uri'] = $item_uri;
$item['title'] = $item_title;
$item['author'] = $item_author;
$item['timestamp'] = $item_date;
$item['enclosures'] = array($item_image);
$item['content'] = $item_desc;
$this->items[] = $item;
}
}
$element = null;
}
$results = null;
}
}
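
Note on ID extraction: the bridge keeps only links that start with `/view/`, removes duplicates, ignores entries whose ID is not a non-zero number, and stops after 20 items. A minimal sketch of that selection on plain strings, assuming a hypothetical helper `extractTorrentIds()` that is not part of the bridge:

```php
<?php
// Hypothetical helper: picks unique numeric torrent IDs out of a list of hrefs,
// following the same rules as NyaaTorrentsBridge::collectData().
function extractTorrentIds(array $hrefs, int $limit = 20): array {
    $ids = array();
    foreach ($hrefs as $href) {
        if (strpos($href, '/view/') !== 0) {
            continue; // not a torrent page link
        }
        $id = substr($href, strlen('/view/'));
        // ctype_digit() rejects empty strings and anything with non-digits.
        if (ctype_digit($id) && (int) $id !== 0 && !in_array($id, $ids, true)) {
            $ids[] = $id;
        }
        if (count($ids) >= $limit) {
            break; // limit the number of follow-up requests
        }
    }
    return $ids;
}
```

Each returned ID would then be appended to `self::URI . 'view/'` to build the item URI.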

100
bridges/PikabuBridge.php Normal file

@ -0,0 +1,100 @@
<?php
class PikabuBridge extends BridgeAbstract {
const NAME = 'Пикабу';
const URI = 'https://pikabu.ru';
const DESCRIPTION = 'Выводит посты по тегу';
const MAINTAINER = 'em92';
const PARAMETERS = array(
'По тегу' => array(
'tag' => array(
'name' => 'Тег',
'exampleValue' => 'it',
'required' => true
),
'filter' => array(
'name' => 'Фильтр',
'type' => 'list',
'values' => array(
'Горячее' => 'hot',
'Свежее' => 'new',
),
'defaultValue' => 'hot'
)
)
);
public function getURI() {
if ($this->getInput('tag')) {
return self::URI . '/tag/' . rawurlencode($this->getInput('tag')) . '/' . rawurlencode($this->getInput('filter'));
} else {
return parent::getURI();
}
}
public function getIcon() {
return 'https://cs.pikabu.ru/assets/favicon.ico';
}
public function getName() {
if (is_string($this->getInput('tag'))) {
return $this->getInput('tag') . ' - ' . parent::getName();
} else {
return parent::getName();
}
}
public function collectData(){
$link = $this->getURI();
$text_html = getContents($link) or returnServerError('Could not fetch ' . $link);
$text_html = iconv('windows-1251', 'utf-8', $text_html);
$html = str_get_html($text_html);
foreach($html->find('article.story') as $post) {
$time = $post->find('time.story__datetime', 0);
if (is_null($time)) continue;
$el_to_remove_selectors = array(
'.story__read-more',
'svg.story-image__stretch',
);
foreach($el_to_remove_selectors as $el_to_remove_selector) {
foreach($post->find($el_to_remove_selector) as $el) {
$el->outertext = '';
}
}
foreach($post->find('img') as $img) {
$src = $img->getAttribute('src');
if (!$src) {
$src = $img->getAttribute('data-src');
if (!$src) {
continue;
}
}
$img->outertext = '<img src="'.$src.'">';
}
$categories = array();
foreach($post->find('.tags__tag') as $tag) {
if ($tag->getAttribute('data-tag')) {
$categories[] = $tag->innertext;
}
}
$title = $post->find('.story__title-link', 0);
$item = array();
$item['categories'] = $categories;
$item['author'] = $post->find('.user__nick', 0)->innertext;
$item['title'] = $title->plaintext;
$item['content'] = strip_tags(backgroundToImg($post->find('.story__content-inner', 0)->innertext), '<br><p><img>');
$item['uri'] = $title->href;
$item['timestamp'] = strtotime($time->getAttribute('datetime'));
$this->items[] = $item;
}
}
}


@ -44,7 +44,7 @@ class PinterestBridge extends FeedExpander {
$pattern = '/https\:\/\/i\.pinimg\.com\/[a-zA-Z0-9]*x\//';
foreach($this->items as $item) {
$item["content"] = preg_replace($pattern, 'https://i.pinimg.com/originals/', $item["content"]);
$item['content'] = preg_replace($pattern, 'https://i.pinimg.com/originals/', $item['content']);
$newitems[] = $item;
}
$this->items = $newitems;
@ -64,10 +64,10 @@ class PinterestBridge extends FeedExpander {
// provide even less info. Thus we attempt multiple options.
$item['title'] = trim($result['title']);
if($item['title'] === "")
if($item['title'] === '')
$item['title'] = trim($result['rich_summary']['display_name']);
if($item['title'] === "")
if($item['title'] === '')
$item['title'] = trim($result['grid_description']);
$item['timestamp'] = strtotime($result['created_at']);


@ -33,40 +33,41 @@ class PixivBridge extends BridgeAbstract {
$count++;
$item = array();
$item["id"] = $result["illustId"];
$item["uri"] = "https://www.pixiv.net/member_illust.php?mode=medium&illust_id=" . $result["illustId"];
$item["title"] = $result["illustTitle"];
$item["author"] = $result["userName"];
$item['id'] = $result['illustId'];
$item['uri'] = 'https://www.pixiv.net/member_illust.php?mode=medium&illust_id=' . $result['illustId'];
$item['title'] = $result['illustTitle'];
$item['author'] = $result['userName'];
preg_match_all($timeRegex, $result["url"], $dt, PREG_SET_ORDER, 0);
$elementDate = DateTime::createFromFormat("YmdHis",
$dt[0][1] . $dt[0][2] . $dt[0][3] . $dt[0][4] . $dt[0][5] . $dt[0][6]);
$item["timestamp"] = $elementDate->getTimestamp();
preg_match_all($timeRegex, $result['url'], $dt, PREG_SET_ORDER, 0);
$elementDate = DateTime::createFromFormat('YmdHis',
$dt[0][1] . $dt[0][2] . $dt[0][3] . $dt[0][4] . $dt[0][5] . $dt[0][6],
new DateTimeZone('Asia/Tokyo'));
$item['timestamp'] = $elementDate->getTimestamp();
$item["content"] = "<img src='" . $this->cacheImage($result['url'], $item["id"]) . "' />";
$item['content'] = "<img src='" . $this->cacheImage($result['url'], $item['id']) . "' />";
$this->items[] = $item;
}
}
public function cacheImage($url, $illustId) {
private function cacheImage($url, $illustId) {
$url = str_replace("_master1200", "", $url);
$url = str_replace("c/240x240/img-master/", "img-original/", $url);
$url = str_replace('_master1200', '', $url);
$url = str_replace('c/240x240/img-master/', 'img-original/', $url);
$path = CACHE_DIR . '/pixiv_img';
if(!is_dir($path))
mkdir($path, 0755, true);
if(!is_file($path . '/' . $illustId . '.jpeg')) {
$headers = array("Referer: https://www.pixiv.net/member_illust.php?mode=medium&illust_id=" . $illustId);
$headers = array('Referer: https://www.pixiv.net/member_illust.php?mode=medium&illust_id=' . $illustId);
$illust = getContents($url, $headers);
if(strpos($illust, "404 Not Found") !== false) {
$illust = getContents(str_replace("jpg", "png", $url), $headers);
if(strpos($illust, '404 Not Found') !== false) {
$illust = getContents(str_replace('jpg', 'png', $url), $headers);
}
file_put_contents($path . '/' . $illustId . '.jpeg', $illust);
}
return 'cache/pixiv_img/' . $illustId . ".jpeg";
return 'cache/pixiv_img/' . $illustId . '.jpeg';
}
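
Note on the time zone change: the new code passes `new DateTimeZone('Asia/Tokyo')` to `DateTime::createFromFormat`, so the digits taken from the image URL are read as Japan time rather than the server's default zone. A minimal sketch of the effect, assuming a hypothetical helper `parsePixivTime()` and assuming the digits really are in Japan Standard Time, as the change implies:

```php
<?php
// Hypothetical helper: converts a 'YmdHis' digit string to a Unix timestamp,
// interpreting it as Asia/Tokyo (UTC+9) as the updated bridge does.
function parsePixivTime(string $digits): int {
    $date = DateTime::createFromFormat('YmdHis', $digits, new DateTimeZone('Asia/Tokyo'));
    if ($date === false) {
        throw new InvalidArgumentException('Unexpected timestamp: ' . $digits);
    }
    return $date->getTimestamp();
}
```

With this reading, `'20181015120000'` corresponds to 03:00:00 UTC on the same day. Without the explicit zone the result would depend on the server's configured default time zone.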


@ -8,9 +8,9 @@ class RainbowSixSiegeBridge extends BridgeAbstract {
const DESCRIPTION = 'Latest articles from the Rainbow Six Siege blog';
public function collectData(){
$dlUrl = "https://prod-tridionservice.ubisoft.com/live/v1/News/Latest?templateId=tcm%3A152-7677";
$dlUrl .= "8-32&pageIndex=0&pageSize=10&language=en-US&detailPageId=tcm%3A152-194572-64";
$dlUrl .= "&keywordList=175426&siteId=undefined&useSeoFriendlyUrl=true";
$dlUrl = 'https://prod-tridionservice.ubisoft.com/live/v1/News/Latest?templateId=tcm%3A152-7677';
$dlUrl .= '8-32&pageIndex=0&pageSize=10&language=en-US&detailPageId=tcm%3A152-194572-64';
$dlUrl .= '&keywordList=175426&siteId=undefined&useSeoFriendlyUrl=true';
$jsonString = getContents($dlUrl) or returnServerError('Error while downloading the website content');
$json = json_decode($jsonString, true);


@ -25,7 +25,7 @@ class ReadComicsBridge extends BridgeAbstract {
return $timestamp;
}
$keywordsList = explode(";", $this->getInput('q'));
$keywordsList = explode(';', $this->getInput('q'));
foreach($keywordsList as $keywords) {
$html = $this->getSimpleHTMLDOM(self::URI . 'comic/' . rawurlencode($keywords))
or $this->returnServerError('Could not request readcomics.tv.');


@ -9,16 +9,6 @@ class Releases3DSBridge extends BridgeAbstract {
public function collectData(){
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
return false;
}
function typeToString($type){
switch($type) {
case 1: return '3DS Game';
@ -76,8 +66,8 @@ class Releases3DSBridge extends BridgeAbstract {
$ignDate = time();
$ignCoverArt = '';
$ignSearchUrl = 'http://www.ign.com/search?q=' . urlencode($name);
if($ignResult = getSimpleHTMLDOM($ignSearchUrl)) {
$ignSearchUrl = 'https://www.ign.com/search?q=' . urlencode($name);
if($ignResult = getSimpleHTMLDOMCached($ignSearchUrl)) {
$ignCoverArt = $ignResult->find('div.search-item-media', 0)->find('img', 0)->src;
$ignDesc = $ignResult->find('div.search-item-description', 0)->plaintext;
$ignLink = $ignResult->find('div.search-item-sub-title', 0)->find('a', 1)->href;
@ -127,6 +117,7 @@ class Releases3DSBridge extends BridgeAbstract {
$item['title'] = $name;
$item['author'] = $publisher;
$item['timestamp'] = $ignDate;
$item['enclosures'] = array($ignCoverArt);
$item['uri'] = empty($ignLink) ? $searchLinkDuckDuckGo : $ignLink;
$item['content'] = $ignDescription . $releaseDescription . $releaseSearchLinks;
$this->items[] = $item;


@ -19,7 +19,7 @@ class ReporterreBridge extends BridgeAbstract {
// Replace all relative urls with absolute ones
$text = preg_replace(
'/(href|src)(\=[\"\'])(?!http)([^"\']+)/ims',
"$1$2" . self::URI . "$3",
'$1$2' . self::URI . '$3',
$text
);
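
Note on the replacement string: `$1`, `$2` and `$3` are back-references resolved by `preg_replace`, not PHP variables, so the switch from double to single quotes does not change behavior. A minimal sketch of the same pattern on a standalone string, assuming a hypothetical helper `absolutizeUrls()` whose `$base` argument stands in for the bridge's `self::URI`:

```php
<?php
// Hypothetical helper: prefixes relative href/src values with $base, using the
// same pattern and replacement shape as the bridge.
function absolutizeUrls(string $html, string $base): string {
    return preg_replace(
        // (href|src), then =" or =', then a value that does not start with "http"
        '/(href|src)(\=[\"\'])(?!http)([^"\']+)/ims',
        '$1$2' . $base . '$3',
        $html
    );
}
```

One caveat of this shape: if `$base` began with a digit, `$2` followed by that digit would be read as a different back-reference. Writing `${2}` instead would avoid that ambiguity.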


@ -9,9 +9,9 @@ class Rue89Bridge extends FeedExpander {
protected function parseItem($item){
$item = parent::parseItem($item);
$url = "http://api.rue89.nouvelobs.com/export/mobile2/node/"
. str_replace(" ", "", substr($item['uri'], -8))
. "/full";
$url = 'http://api.rue89.nouvelobs.com/export/mobile2/node/'
. str_replace(' ', '', substr($item['uri'], -8))
. '/full';
$datas = json_decode(getContents($url), true);
$item['content'] = $datas['node']['body'];


@ -1,88 +0,0 @@
<?php
class SexactuBridge extends BridgeAbstract {
const MAINTAINER = 'Riduidel';
const NAME = 'Sexactu';
const AUTHOR = 'Maïa Mazaurette';
const URI = 'http://www.gqmagazine.fr';
const CACHE_TIMEOUT = 7200; // 2h
const DESCRIPTION = 'Sexactu via rss-bridge';
const REPLACED_ATTRIBUTES = array(
'href' => 'href',
'src' => 'src',
'data-original' => 'src'
);
public function getURI(){
return self::URI . '/sexactu';
}
public function collectData(){
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Could not request ' . $this->getURI());
$sexactu = $html->find('.container_sexactu', 0);
$rowList = $sexactu->find('.row');
foreach($rowList as $row) {
// only use first list as second one only contains pages numbers
$title = $row->find('.title', 0);
if($title) {
$item = array();
$item['author'] = self::AUTHOR;
$item['title'] = $title->plaintext;
$urlAttribute = "data-href";
$uri = $title->$urlAttribute;
if($uri === false)
continue;
if(substr($uri, 0, 1) === 'h') { // absolute uri
$item['uri'] = $uri;
} else if(substr($uri, 0, 1) === '/') { // domain relative url
$item['uri'] = self::URI . $uri;
} else {
$item['uri'] = $this->getURI() . $uri;
}
$article = $this->loadFullArticle($item['uri']);
$item['content'] = $this->replaceUriInHtmlElement($article->find('.article_content', 0));
$publicationDate = $article->find('time[itemprop=datePublished]', 0);
$short_date = $publicationDate->datetime;
$item['timestamp'] = strtotime($short_date);
} else {
// Sometimes we get rubbish, ignore.
continue;
}
$this->items[] = $item;
}
}
/**
* Loads the full article and returns the contents
* @param $uri The article URI
* @return The article content
*/
private function loadFullArticle($uri){
$html = getSimpleHTMLDOMCached($uri);
$content = $html->find('#article', 0);
if($content) {
return $content;
}
return null;
}
/**
* Replaces all relative URIs with absolute ones
* @param $element A simplehtmldom element
* @return The $element->innertext with all URIs replaced
*/
private function replaceUriInHtmlElement($element){
$returned = $element->innertext;
foreach (self::REPLACED_ATTRIBUTES as $initial => $final) {
$returned = str_replace($initial . '="/', $final . '="' . self::URI . '/', $returned);
}
return $returned;
}
}


@ -73,7 +73,7 @@ class ShanaprojectBridge extends BridgeAbstract {
// Getting the picture is a little bit tricky as it is part of the style.
// Luckily the style is part of the parent div :)
if(preg_match("/url\(\/\/([^\)]+)\)/i", $anime->parent->style, $matches))
if(preg_match('/url\(\/\/([^\)]+)\)/i', $anime->parent->style, $matches))
return $matches[1];
returnServerError('Could not extract background image!');


@ -21,7 +21,7 @@ class Shimmie2Bridge extends DanbooruBridge {
protected function getItemFromElement($element){
$item = array();
$item['uri'] = $this->getURI() . $element->href;
$item['id'] = (int)preg_replace("/[^0-9]/", '', $element->getAttribute(static::IDATTRIBUTE));
$item['id'] = (int)preg_replace('/[^0-9]/', '', $element->getAttribute(static::IDATTRIBUTE));
$item['timestamp'] = time();
$thumbnailUri = $this->getURI() . $element->find('img', 0)->src;
$item['tags'] = $element->getAttribute('data-tags');

825
bridges/SkimfeedBridge.php Normal file

@ -0,0 +1,825 @@
<?php
class SkimfeedBridge extends BridgeAbstract {
const CONTEXT_NEWS_BOX = 'News box';
const CONTEXT_HOT_TOPICS = 'Hot topics';
const CONTEXT_TECH_NEWS = 'Tech news';
const CONTEXT_CUSTOM = 'Custom feed';
const NAME = 'Skimfeed Bridge';
const URI = 'https://skimfeed.com';
const DESCRIPTION = 'Returns feeds from Skimfeed, also supports custom feeds!';
const MAINTAINER = 'logmanoriginal';
const CACHE_TIMEOUT = 3600;
const PARAMETERS = array(
self::CONTEXT_NEWS_BOX => array( // auto-generated (see below)
'box_channel' => array(
'name' => 'Channel',
'type' => 'list',
'required' => true,
'title' => 'Select your channel',
'values' => array(
'Hacker News' => '/news/hacker-news.html',
'QZ' => '/news/qz.html',
'The Verge' => '/news/the-verge.html',
'Slashdot' => '/news/slashdot.html',
'Lifehacker' => '/news/lifehacker.html',
'Gizmag' => '/news/gizmag.html',
'Fast Company' => '/news/fast-company.html',
'Engadget' => '/news/engadget.html',
'Wired' => '/news/wired.html',
'MakeUseOf' => '/news/makeuseof.html',
'Techcrunch' => '/news/techcrunch.html',
'Apple Insider' => '/news/apple-insider.html',
'ArsTechnica' => '/news/arstechnica.html',
'Tech in Asia' => '/news/tech-in-asia.html',
'FastCoExist' => '/news/fastcoexist.html',
'Digital Trends' => '/news/digital-trends.html',
'AnandTech' => '/news/anandtech.html',
'How to Geek' => '/news/how-to-geek.html',
'Geek' => '/news/geek.html',
'BBC Technology' => '/news/bbc-technology.html',
'Extreme Tech' => '/news/extreme-tech.html',
'Packet Storm Sec' => '/news/packet-storm-sec.html',
'MedGadget' => '/news/medgadget.html',
'Design' => '/news/design.html',
'The Next Web' => '/news/the-next-web.html',
'Bit-Tech' => '/news/bit-tech.html',
'Next Big Future' => '/news/next-big-future.html',
'A VC' => '/news/a-vc.html',
'Copyblogger' => '/news/copyblogger.html',
'Smashing Mag' => '/news/smashing-mag.html',
'Continuations' => '/news/continuations.html',
'Cult of Mac' => '/news/cult-of-mac.html',
'SecuriTeam' => '/news/securiteam.html',
'The Tech Block' => '/news/the-tech-block.html',
'BetaBeat' => '/news/betabeat.html',
'PC Mag' => '/news/pc-mag.html',
'Venture Beat' => '/news/venture-beat.html',
'ReadWriteWeb' => '/news/readwriteweb.html',
'High Scalability' => '/news/high-scalability.html',
)
)
),
self::CONTEXT_HOT_TOPICS => array(),
self::CONTEXT_TECH_NEWS => array( // auto-generated (see below)
'tech_channel' => array(
'name' => 'Tech channel',
'type' => 'list',
'required' => true,
'title' => 'Select your tech channel',
'values' => array(
'Agg' => array(
'Reddit' => '/news/reddit.html',
'Tech Insider' => '/news/tech-insider.html',
'Digg' => '/news/digg.html',
'Meta Filter' => '/news/meta-filter.html',
'Fark' => '/news/fark.html',
'Mashable' => '/news/mashable.html',
'Ad Week' => '/news/ad-week.html',
'The Chive' => '/news/the-chive.html',
'BoingBoing' => '/news/boingboing.html',
'Vice' => '/news/vice.html',
'ClientsFromHell' => '/news/clientsfromhell.html',
'How Stuff Works' => '/news/how-stuff-works.html',
'Buzzfeed' => '/news/buzzfeed.html',
'Cracked' => '/news/cracked.html',
'Weird News' => '/news/weird-news.html',
'ITOTD' => '/news/itotd.html',
'Metafilter' => '/news/metafilter.html',
'TheOnion' => '/news/theonion.html',
),
'Cars' => array(
'Reddit Cars' => '/news/reddit-cars.html',
'NYT Auto' => '/news/nyt-auto.html',
'Truth About Cars' => '/news/truth-about-cars.html',
'AutoBlog' => '/news/autoblog.html',
'AutoSpies' => '/news/autospies.html',
'Autoweek' => '/news/autoweek.html',
'The Garage' => '/news/the-garage.html',
'Car and Driver' => '/news/car-and-driver.html',
'EGM Car Tech' => '/news/egm-car-tech.html',
'Top Gear' => '/news/top-gear.html',
'eGarage' => '/news/egarage.html',
),
'Comics' => array(
'Penny Arcade' => '/news/penny-arcade.html',
'XKCD' => '/news/xkcd.html',
'Channelate' => '/news/channelate.html',
'Savage Chicken' => '/news/savage-chicken.html',
'Dinosaur Comics' => '/news/dinosaur-comics.html',
'Explosm' => '/news/explosm.html',
'PoorlyDLines' => '/news/poorlydlines.html',
'Moonbeard' => '/news/moonbeard.html',
'Nedroid' => '/news/nedroid.html',
),
'Design' => array(
'FastCoCreate' => '/news/fastcocreate.html',
'Dezeen' => '/news/dezeen.html',
'Design Boom' => '/news/design-boom.html',
'Mmminimal' => '/news/mmminimal.html',
'We Heart' => '/news/we-heart.html',
'CreativeBloq' => '/news/creativebloq.html',
'TheDSGNblog' => '/news/thedsgnblog.html',
'Grainedit' => '/news/grainedit.html',
),
'Football' => array(
'Mail Football' => '/news/mail-football.html',
'Yahoo Football' => '/news/yahoo-football.html',
'FourFourTwo' => '/news/fourfourtwo.html',
'Goal' => '/news/goal.html',
'BBC Football' => '/news/bbc-football.html',
'TalkSport' => '/news/talksport.html',
'101 Great Goals' => '/news/101-great-goals.html',
'Who Scored' => '/news/who-scored.html',
'Football365 Champ' => '/news/football365-champ.html',
'Football365 Premier' => '/news/football365-premier.html',
'BleacherReport' => '/news/bleacherreport.html',
),
'Gaming' => array(
'Polygon' => '/news/polygon.html',
'Gamespot' => '/news/gamespot.html',
'RockPaperShotgun' => '/news/rockpapershotgun.html',
'VG247' => '/news/vg247.html',
'IGN' => '/news/ign.html',
'Reddit Games' => '/news/reddit-games.html',
'TouchArcade' => '/news/toucharcade.html',
'GamesRadar' => '/news/gamesradar.html',
'Siliconera' => '/news/siliconera.html',
'Reddit GameDeals' => '/news/reddit-gamedeals.html',
'Joystiq' => '/news/joystiq.html',
'GameInformer' => '/news/gameinformer.html',
'PSN Blog' => '/news/psn-blog.html',
'Reddit GamerNews' => '/news/reddit-gamernews.html',
'Steam' => '/news/steam.html',
'DualShockers' => '/news/dualshockers.html',
'ShackNews' => '/news/shacknews.html',
'CheapAssGamer' => '/news/cheapassgamer.html',
'Eurogamer' => '/news/eurogamer.html',
'Major Nelson' => '/news/major-nelson.html',
'Reddit Truegaming' => '/news/reddit-truegaming.html',
'GameTrailers' => '/news/gametrailers.html',
'GamaSutra' => '/news/gamasutra.html',
'USGamer' => '/news/usgamer.html',
'Shoryuken' => '/news/shoryuken.html',
'Destructoid' => '/news/destructoid.html',
'ArsGaming' => '/news/arsgaming.html',
'XBOX Blog' => '/news/xbox-blog.html',
'GiantBomb' => '/news/giantbomb.html',
'VideoGamer' => '/news/videogamer.html',
'Pocket Tactics' => '/news/pocket-tactics.html',
'WiredGaming' => '/news/wiredgaming.html',
'AllGamesBeta' => '/news/allgamesbeta.html',
'OnGamers' => '/news/ongamers.html',
'Reddit GameBundles' => '/news/reddit-gamebundles.html',
'Kotaku' => '/news/kotaku.html',
'PCGamer' => '/news/pcgamer.html',
),
'Investing' => array(
'Seeking Alpha' => '/news/seeking-alpha.html',
'BBC Business' => '/news/bbc-business.html',
'Harvard Biz' => '/news/harvard-biz.html',
'Market Watch' => '/news/market-watch.html',
'Investor Place' => '/news/investor-place.html',
'Money Week' => '/news/money-week.html',
'Moneybeat' => '/news/moneybeat.html',
'Dealbook' => '/news/dealbook.html',
'Economist Business' => '/news/economist-business.html',
'Economist' => '/news/economist.html',
'Economist CN' => '/news/economist-cn.html',
),
'Long' => array(
'The Atlantic' => '/news/the-atlantic.html',
'Reddit Long' => '/news/reddit-long.html',
'Paris Review' => '/news/paris-review.html',
'New Yorker' => '/news/new-yorker.html',
'LongForm' => '/news/longform.html',
'LongReads' => '/news/longreads.html',
'The Browser' => '/news/the-browser.html',
'The Feature' => '/news/the-feature.html',
),
'MMA' => array(
'MMA Weekly' => '/news/mma-weekly.html',
'MMAFighting' => '/news/mmafighting.html',
'Reddit MMA' => '/news/reddit-mma.html',
'Sherdog Articles' => '/news/sherdog-articles.html',
'FightLand Vice' => '/news/fightland-vice.html',
'Sherdog Forum' => '/news/sherdog-forum.html',
'MMA Junkie' => '/news/mma-junkie.html',
'Sherdog MMA Video' => '/news/sherdog-mma-video.html',
'BloodyElbow' => '/news/bloodyelbow.html',
'CageWriter' => '/news/cagewriter.html',
'Sherdog News' => '/news/sherdog-news.html',
'MMAForum' => '/news/mmaforum.html',
'MMA Junkie Radio' => '/news/mma-junkie-radio.html',
'UFC News' => '/news/ufc-news.html',
'FightLinker' => '/news/fightlinker.html',
'Bodybuilding MMA' => '/news/bodybuilding-mma.html',
'BleacherReport MMA' => '/news/bleacherreport-mma.html',
'FiveOuncesofPain' => '/news/fiveouncesofpain.html',
'Sherdog Pictures' => '/news/sherdog-pictures.html',
'CagePotato' => '/news/cagepotato.html',
'Sherdog Radio' => '/news/sherdog-radio.html',
'ProMMARadio' => '/news/prommaradio.html',
),
'Mobile' => array(
'Macrumors' => '/news/macrumors.html',
'Android Police' => '/news/android-police.html',
'GSM Arena' => '/news/gsm-arena.html',
'DigiTrend Mobile' => '/news/digitrend-mobile.html',
'Mobile Nation' => '/news/mobile-nation.html',
'TechRadar' => '/news/techradar.html',
'ZDNET Mobile' => '/news/zdnet-mobile.html',
'MacWorld' => '/news/macworld.html',
'Android Dev Blog' => '/news/android-dev-blog.html',
),
'News' => array(
'Daily Mail' => '/news/daily-mail.html',
'Business Insider' => '/news/business-insider.html',
'The Guardian' => '/news/the-guardian.html',
'Fox' => '/news/fox.html',
'BBC World' => '/news/bbc-world.html',
'MSNBC' => '/news/msnbc.html',
'ABC News' => '/news/abc-news.html',
'Al Jazeera' => '/news/al-jazeera.html',
'Business Insider India' => '/news/business-insider-india.html',
'Observer' => '/news/observer.html',
'NYT Tech' => '/news/nyt-tech.html',
'NYT World' => '/news/nyt-world.html',
'CNN' => '/news/cnn.html',
'Japan Times' => '/news/japan-times.html',
'WorldCrunch' => '/news/worldcrunch.html',
'Pro publica' => '/news/pro-publica.html',
'OZY' => '/news/ozy.html',
'Times of India' => '/news/times-of-india.html',
'The Australian' => '/news/the-australian.html',
'Harpers' => '/news/harpers.html',
'Moscow Times' => '/news/moscow-times.html',
'The Times' => '/news/the-times.html',
'Reuters Tech' => '/news/reuters-tech.html',
),
'Politics' => array(
'FreeRepublic' => '/news/freerepublic.html',
'Salon' => '/news/salon.html',
'DrudgeReport' => '/news/drudgereport.html',
'TheHill' => '/news/thehill.html',
'TheBlaze' => '/news/theblaze.html',
'InfoWars' => '/news/infowars.html',
'New Republic' => '/news/new-republic.html',
'WashTimes' => '/news/washtimes.html',
'RealCleanPol' => '/news/realcleanpol.html',
'Fact Check' => '/news/fact-check.html',
'DailyKos' => '/news/dailykos.html',
'NewsMax' => '/news/newsmax.html',
'Politico' => '/news/politico.html',
'Michelle Malkin' => '/news/michelle-malkin.html',
),
'Reddit' => array(
'R Movies' => '/news/r-movies.html',
'R News' => '/news/r-news.html',
'Futurology' => '/news/futurology.html',
'R All' => '/news/r-all.html',
'R Music' => '/news/r-music.html',
'R Askscience' => '/news/r-askscience.html',
'R Technology' => '/news/r-technology.html',
'R Bestof' => '/news/r-bestof.html',
'R Askreddit' => '/news/r-askreddit.html',
'R Worldnews' => '/news/r-worldnews.html',
'R Explainlikeimfive' => '/news/r-explainlikeimfive.html',
'R Iama' => '/news/r-iama.html',
),
'Science' => array(
'PhysOrg' => '/news/physorg.html',
'Hack-a-day' => '/news/hack-a-day.html',
'Reddit Science' => '/news/reddit-science.html',
'Stats Blog' => '/news/stats-blog.html',
'Flowing Data' => '/news/flowing-data.html',
'Eureka Alert' => '/news/eureka-alert.html',
'Robotics BizRev' => '/news/robotics-bizrev.html',
'Planet big Data' => '/news/planet-big-data.html',
'Makezine' => '/news/makezine.html',
'MIT Tech' => '/news/mit-tech.html',
'R Bloggers' => '/news/r-bloggers.html',
'DataIsBeautiful' => '/news/dataisbeautiful.html',
'Ted Videos' => '/news/ted-videos.html',
'Advanced Science' => '/news/advanced-science.html',
'Robotiq' => '/news/robotiq.html',
'Science Daily' => '/news/science-daily.html',
'IEEE Robotics' => '/news/ieee-robotics.html',
'PSFK' => '/news/psfk.html',
'Discover Magazine' => '/news/discover-magazine.html',
'DataTau' => '/news/datatau.html',
'RoboHub' => '/news/robohub.html',
'Discovery' => '/news/discovery.html',
'Smart Data' => '/news/smart-data.html',
'Whats Big Data' => '/news/whats-big-data.html',
),
'Tech' => array(
'Hacker News' => '/news/hacker-news.html',
'The Verge' => '/news/the-verge.html',
'Lifehacker' => '/news/lifehacker.html',
'Fast Company' => '/news/fast-company.html',
'ArsTechnica' => '/news/arstechnica.html',
'MakeUseOf' => '/news/makeuseof.html',
'FastCoExist' => '/news/fastcoexist.html',
'How to Geek' => '/news/how-to-geek.html',
'The Next Web' => '/news/the-next-web.html',
'Engadget' => '/news/engadget.html',
'Gizmag' => '/news/gizmag.html',
'QZ' => '/news/qz.html',
'Wired' => '/news/wired.html',
'Techcrunch' => '/news/techcrunch.html',
'Slashdot' => '/news/slashdot.html',
'Extreme Tech' => '/news/extreme-tech.html',
'AnandTech' => '/news/anandtech.html',
'Digital Trends' => '/news/digital-trends.html',
'Next Big Future' => '/news/next-big-future.html',
'Apple Insider' => '/news/apple-insider.html',
'Geek' => '/news/geek.html',
'BBC Technology' => '/news/bbc-technology.html',
'Bit-Tech' => '/news/bit-tech.html',
'Packet Storm Sec' => '/news/packet-storm-sec.html',
'Design' => '/news/design.html',
'High Scalability' => '/news/high-scalability.html',
'Smashing Mag' => '/news/smashing-mag.html',
'The Tech Block' => '/news/the-tech-block.html',
'A VC' => '/news/a-vc.html',
'Tech in Asia' => '/news/tech-in-asia.html',
'ReadWriteWeb' => '/news/readwriteweb.html',
'PC Mag' => '/news/pc-mag.html',
'Continuations' => '/news/continuations.html',
'Copyblogger' => '/news/copyblogger.html',
'Cult of Mac' => '/news/cult-of-mac.html',
'BetaBeat' => '/news/betabeat.html',
'MedGadget' => '/news/medgadget.html',
'SecuriTeam' => '/news/securiteam.html',
'Venture Beat' => '/news/venture-beat.html',
),
'Trend' => array(
'Trend Hunter' => '/news/trend-hunter.html',
'ApartmentT' => '/news/apartmentt.html',
'GQ' => '/news/gq.html',
'Digital Trends' => '/news/digital-trends.html',
'Cool Hunting' => '/news/cool-hunting.html',
'FastCoDesign' => '/news/fastcodesign.html',
'TC Startups' => '/news/tc-startups.html',
'Killer Startups' => '/news/killer-startups.html',
'DigiInfo' => '/news/digiinfo.html',
'New Startups' => '/news/new-startups.html',
'DigiTrends' => '/news/digitrends.html',
),
'Watches' => array(
'Hodinkee' => '/news/hodinkee.html',
'Quill and Pad' => '/news/quill-and-pad.html',
'Monochrome' => '/news/monochrome.html',
'Deployant' => '/news/deployant.html',
'Watches by SJX' => '/news/watches-by-sjx.html',
'Fratello Watches' => '/news/fratello-watches.html',
'A Blog to Watch' => '/news/a-blog-to-watch.html',
'Wound for Life' => '/news/wound-for-life.html',
'Watch Paper' => '/news/watch-paper.html',
'Watch Report' => '/news/watch-report.html',
'Perpetuelle' => '/news/perpetuelle.html',
),
'Youtube' => array(
'LinusTechTips' => '/news/linustechtips.html',
'MetalJesusRocks' => '/news/metaljesusrocks.html',
'TotalBiscuit' => '/news/totalbiscuit.html',
'DexBonus' => '/news/dexbonus.html',
'Lon Siedman' => '/news/lon-siedman.html',
'MKBHD' => '/news/mkbhd.html',
'Terry A Davis' => '/news/terry-a-davis.html',
'HappyConsole' => '/news/happyconsole.html',
'Austin Evans' => '/news/austin-evans.html',
'NCIX' => '/news/ncix.html',
),
)
),
),
self::CONTEXT_CUSTOM => array(
'config' => array(
'name' => 'Configuration',
'type' => 'text',
'required' => true,
'title' => 'Enter feed numbers from Skimfeed!',
'exampleValue' => '5,8,2,l,p,9,23'
)
),
'global' => array(
'limit' => array(
'name' => 'Limit',
'type' => 'number',
'title' => 'Limits the number of returned items in the feed',
'exampleValue' => 10
)
)
);
public function getURI() {
switch($this->queriedContext) {
case self::CONTEXT_NEWS_BOX:
$channel = $this->getInput('box_channel');
if($channel) {
return static::URI . $channel;
}
break;
case self::CONTEXT_HOT_TOPICS:
return static::URI;
case self::CONTEXT_TECH_NEWS:
$channel = $this->getInput('tech_channel');
if($channel) {
return static::URI . $channel;
}
break;
case self::CONTEXT_CUSTOM:
$config = $this->getInput('config');
return static::URI . '/custom.php?f=' . urlencode($config);
}
return parent::getURI();
}
public function getName() {
switch($this->queriedContext) {
case self::CONTEXT_NEWS_BOX:
$channel = $this->getInput('box_channel');
$title = array_search(
$channel,
static::PARAMETERS[self::CONTEXT_NEWS_BOX]['box_channel']['values']
);
return $title . ' - ' . static::NAME;
case self::CONTEXT_HOT_TOPICS:
return 'Hot topics - ' . static::NAME;
case self::CONTEXT_TECH_NEWS:
$channel = $this->getInput('tech_channel');
$titles = array();
foreach(static::PARAMETERS[self::CONTEXT_TECH_NEWS]['tech_channel']['values'] as $ch) {
$titles = array_merge($titles, $ch);
}
$title = array_search($channel, $titles);
return $title . ' - ' . static::NAME;
case self::CONTEXT_CUSTOM:
return 'Custom - ' . static::NAME;
}
return parent::getName();
}
public function collectData() {
// Uncomment one of the following lines to export the parameter lists
// $this->exportBoxChannels(); die;
// $this->exportTechChannels(); die;
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Request to ' . $this->getURI() . ' failed!');
defaultLinkTo($html, static::URI);
switch($this->queriedContext) {
case self::CONTEXT_NEWS_BOX:
$author = array_search(
$this->getInput('box_channel'),
static::PARAMETERS[self::CONTEXT_NEWS_BOX]['box_channel']['values']
);
$author = '<a href="'
. $this->getURI()
. '">'
. $author
. '</a>';
$this->extractFeed($html, $author);
break;
case self::CONTEXT_HOT_TOPICS:
$this->extractHotTopics($html);
break;
case self::CONTEXT_TECH_NEWS:
$authors = array();
foreach(static::PARAMETERS[self::CONTEXT_TECH_NEWS]['tech_channel']['values'] as $ch) {
$authors = array_merge($authors, $ch);
}
$author = '<a href="'
. $this->getURI()
. '">'
. array_search($this->getInput('tech_channel'), $authors)
. '</a>';
$this->extractFeed($html, $author);
break;
case self::CONTEXT_CUSTOM:
$this->extractCustomFeed($html);
break;
}
}
private function extractFeed($html, $author) {
$articles = $html->find('li')
or returnServerError('Could not find articles!');
if(count($articles) === 1
&& stristr($articles[0]->plaintext, 'Nothing new in the last 48 hours')) {
return; // Nothing to show
}
$limit = $this->getInput('limit') ?: -1;
foreach($articles as $article) {
$anchor = $article->find('a', 0)
or returnServerError('Could not find anchor!');
$item = array();
$item['uri'] = $this->getTarget($anchor);
$item['title'] = trim($anchor->plaintext);
// The timestamp is encoded as relative time (max. the last 48 hours)
// like this: "- 7 hours". It should always be at the end of the article:
$age = substr($article->plaintext, strrpos($article->plaintext, '-'));
$item['timestamp'] = strtotime($age);
$item['author'] = $author;
$this->items[] = $item;
if($limit > 0 && count($this->items) >= $limit) {
return;
}
}
}
private function extractHotTopics($html) {
$topics = $html->find('#popbox ul li')
or returnServerError('Could not find topics!');
$limit = $this->getInput('limit') ?: -1;
foreach($topics as $topic) {
$anchor = $topic->find('a', 0)
or returnServerError('Could not find anchor!');
$item = array();
$item['uri'] = $this->getTarget($anchor);
$item['title'] = $anchor->title;
$this->items[] = $item;
if($limit > 0 && count($this->items) >= $limit) {
return;
}
}
}
private function extractCustomFeed($html) {
$boxes = $html->find('#boxx .boxes')
or returnServerError('Could not find boxes!');
foreach($boxes as $box) {
$anchor = $box->find('span.boxtitles a', 0)
or returnServerError('Could not find box anchor!');
$author = '<a href="' . $anchor->href . '">' . trim($anchor->plaintext) . '</a>';
$uri = $anchor->href;
$box_html = getSimpleHTMLDOM($uri)
or returnServerError('Could not load custom feed!');
$this->extractFeed($box_html, $author);
}
}
private function getTarget($anchor) {
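// Anchors link to Skimfeed itself. The target URI is encoded in that
// link as the 'u' query parameter ('&u=<URI>'):
$query = parse_url($anchor->href, PHP_URL_QUERY);
foreach(explode('&', (string)$query) as $parameter) {
// Split once only, and pad, so parameters without '=' don't break list()
list($key, $value) = array_pad(explode('=', $parameter, 2), 2, '');
if($key !== 'u') {
continue;
}
return urldecode($value);
}
// No 'u' parameter found: fall back to the original link
return $anchor->href;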
}
/**
* dev-mode!
* Requires '&format=Html'
*
* Prints the 'box_channel' parameter array built from the source site
*/
private function exportBoxChannels() {
$html = getSimpleHTMLDOMCached(static::URI)
or returnServerError('No contents received from Skimfeed!');
if(!$this->isCompatible($html)) {
returnServerError('Skimfeed version is not compatible!');
}
$boxes = $html->find('#boxx .boxes')
or returnServerError('Could not find boxes!');
// beginning of the 'box_channel' list
$message = <<<EOD
'box_channel' => array(
'name' => 'Channel',
'type' => 'list',
'required' => true,
'title' => 'Select your channel',
'values' => array(
EOD;
foreach($boxes as $box) {
$anchor = $box->find('span.boxtitles a', 0)
or returnServerError('Could not find box anchor!');
$title = trim($anchor->plaintext);
$uri = $anchor->href;
// add value
$message .= "\t\t'{$title}' => '{$uri}', \n";
}
// end of the 'box_channel' list
$message .= <<<EOD
)
),
EOD;
echo <<<EOD
<!DOCTYPE html>
<html>
<body>
<code style="white-space: pre-wrap;">{$message}</code>
</body>
</html>
EOD;
}
/**
* dev-mode!
* Requires '&format=Html'
*
* Prints the 'tech_channel' parameter array built from the source site
*/
private function exportTechChannels() {
$html = getSimpleHTMLDOMCached(static::URI)
or returnServerError('No contents received from Skimfeed!');
if(!$this->isCompatible($html)) {
returnServerError('Skimfeed version is not compatible!');
}
$channels = $html->find('#menubar a')
or returnServerError('Could not find channels!');
// beginning of the 'tech_channel' list
$message = <<<EOD
'tech_channel' => array(
'name' => 'Tech channel',
'type' => 'list',
'required' => true,
'title' => 'Select your tech channel',
'values' => array(
EOD;
foreach($channels as $channel) {
if($channel->href === '#'
|| $channel->class === 'homelink'
|| $channel->plaintext === 'Twitter'
|| $channel->plaintext === 'Weather'
|| $channel->plaintext === '+Custom') {
continue;
}
$title = trim($channel->plaintext);
$uri = '/' . $channel->href;
$message .= "\t\t'{$title}' => array(\n";
$channel_html = getSimpleHTMLDOMCached(static::URI . $uri)
or returnServerError('Could not load tech channel ' . $channel->plaintext . '!');
$boxes = $channel_html->find('#boxx .boxes')
or returnServerError('Could not find boxes!');
foreach($boxes as $box) {
$anchor = $box->find('span.boxtitles a', 0)
or returnServerError('Could not find box anchor!');
$boxtitle = trim($anchor->plaintext);
$boxuri = $anchor->href;
$message .= "\t\t\t'{$boxtitle}' => '{$boxuri}', \n";
}
$message .= "\t\t),\n";
}
// end of the 'tech_channel' list
$message .= <<<EOD
)
),
EOD;
echo <<<EOD
<!DOCTYPE html>
<html>
<body>
<code style="white-space: pre-wrap;">{$message}</code>
</body>
</html>
EOD;
}
/**
* Checks if the reported skimfeed version is compatible
*/
private function isCompatible($html) {
$title = $html->find('title', 0);
if(!$title) {
return false;
}
if($title->plaintext === 'Skimfeed V5.5 - Tech News') {
return true;
}
return false;
}
}
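
The `getTarget()` method above recovers the real article link from the `u` query parameter of a Skimfeed redirect link. A minimal standalone sketch of the same idea, using `parse_str()` instead of splitting the query by hand, could look like this. The helper name `extractTargetUri` and the example URLs are assumptions made for illustration and are not part of the bridge.

```php
<?php
// Hypothetical helper (not part of SkimfeedBridge): returns the value of the
// 'u' query parameter of a redirect-style link, or null if there is none.
function extractTargetUri(string $href): ?string {
    $query = parse_url($href, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string (null) or unparsable URL (false)
    }

    // parse_str() splits on '&' and URL-decodes keys and values.
    parse_str($query, $params);

    if (isset($params['u']) && is_string($params['u'])) {
        return $params['u'];
    }
    return null;
}

// The URLs below are invented examples.
var_dump(extractTargetUri('https://skimfeed.com/r.php?q=1&u=https%3A%2F%2Fexample.org%2Fa%3Fb%3D1'));
var_dump(extractTargetUri('https://example.org/no-query'));
```

Returning `null` when the parameter is missing lets the caller decide whether to fall back to the original link or skip the item.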


@@ -31,7 +31,7 @@ class SupInfoBridge extends BridgeAbstract {
}
}
public function fetchArticle($link) {
private function fetchArticle($link) {
$articleHTML = getSimpleHTMLDOM(self::URI . $link)
or returnServerError('Unable to fetch article !');


@@ -0,0 +1,45 @@
<?php
class SuperSmashBlogBridge extends BridgeAbstract {
const MAINTAINER = 'corenting';
const NAME = 'Super Smash Blog';
const URI = 'https://www.smashbros.com/en_US/blog/index.html';
const CACHE_TIMEOUT = 7200; // 2h
const DESCRIPTION = 'Latest articles from the Super Smash Blog';
public function collectData(){
$dlUrl = 'https://www.smashbros.com/data/bs/en_US/json/en_US.json';
$jsonString = getContents($dlUrl) or returnServerError('Error while downloading the website content');
$json = json_decode($jsonString, true);
foreach($json as $article) {
// Build content
$picture = $article['acf']['image1']['url'];
if (strlen($picture) != 0) {
$picture = str_get_html('<img src="https://www.smashbros.com/' . substr($picture, 8) . '"/>');
} else {
$picture = '';
}
$video = $article['acf']['link_url'];
if (strlen($video) != 0) {
$video = str_get_html('<a href="' . $video .'">Youtube video</a>');
} else {
$video = '';
}
$text = str_get_html($article['acf']['editor']);
$content = $picture . $video . $text;
// Build final item
$item = array();
$item['title'] = $article['title']['rendered'];
$item['timestamp'] = strtotime($article['date']);
$item['content'] = $content;
$item['uri'] = self::URI . '?post=' . $article['id'];
$this->items[] = $item;
}
}
}
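
The bridge above decodes a JSON document and reads nested fields such as `acf.link_url` directly. As a hedged sketch of how the same item-building step could tolerate invalid JSON or missing fields, the following uses the null coalescing operator. `decodeArticles()` and `buildItem()` are hypothetical helpers, the sample JSON is invented, and the picture handling of the original is left out because the expected path layout is not stated in the source.

```php
<?php
// Sketch only: both helpers are hypothetical and not part of the bridge.

function decodeArticles(string $jsonString): array {
    $json = json_decode($jsonString, true);
    // json_decode() returns null for invalid JSON; treat anything that is
    // not an array as "no articles".
    return is_array($json) ? $json : array();
}

function buildItem(array $article, string $baseUri): array {
    // '??' avoids notices when a field is absent from the decoded data.
    $video = $article['acf']['link_url'] ?? '';
    $text = $article['acf']['editor'] ?? '';

    $content = '';
    if ($video !== '') {
        $content .= '<a href="' . htmlspecialchars($video) . '">Youtube video</a>';
    }
    $content .= $text;

    return array(
        'title' => $article['title']['rendered'] ?? '',
        // strtotime() returns false for unparsable input; fall back to 0.
        'timestamp' => strtotime($article['date'] ?? '') ?: 0,
        'content' => $content,
        'uri' => $baseUri . '?post=' . ($article['id'] ?? ''),
    );
}

// Invented sample data that mimics the field names used by the bridge.
$sample = '[{"id": 7, "date": "2018-10-15T12:00:00+00:00",'
    . ' "title": {"rendered": "Hello"},'
    . ' "acf": {"link_url": "", "editor": "<p>Body</p>"}}]';

foreach (decodeArticles($sample) as $article) {
    $item = buildItem($article, 'https://www.smashbros.com/en_US/blog/index.html');
    echo $item['title'], ' => ', $item['uri'], "\n";
}
```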


@@ -8,67 +8,66 @@ class TheHackerNewsBridge extends BridgeAbstract {
public function collectData(){
function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
function stripRecursiveHtmlSection($string, $tag_name, $tag_start){
$open_tag = '<' . $tag_name;
$close_tag = '</' . $tag_name . '>';
$close_tag_length = strlen($close_tag);
if(strpos($tag_start, $open_tag) === 0) {
while(strpos($string, $tag_start) !== false) {
$max_recursion = 100;
$section_to_remove = null;
$section_start = strpos($string, $tag_start);
$search_offset = $section_start;
do {
$max_recursion--;
$section_end = strpos($string, $close_tag, $search_offset);
$search_offset = $section_end + $close_tag_length;
$section_to_remove = substr(
$string,
$section_start,
$section_end - $section_start + $close_tag_length
);
$open_tag_count = substr_count($section_to_remove, $open_tag);
$close_tag_count = substr_count($section_to_remove, $close_tag);
} while($open_tag_count > $close_tag_count && $max_recursion > 0);
$string = str_replace($section_to_remove, '', $string);
}
}
return $string;
}
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Could not request TheHackerNews: ' . $this->getURI());
$limit = 0;
foreach($html->find('article') as $element) {
foreach($html->find('div.body-post') as $element) {
if($limit < 5) {
$article_url = $element->find('a.entry-title', 0)->href;
$article_author = trim($element->find('span.vcard', 0)->plaintext);
$article_title = $element->find('a.entry-title', 0)->plaintext;
$article_timestamp = strtotime($element->find('span.updated', 0)->plaintext);
$article = getSimpleHTMLDOM($article_url)
or returnServerError('Could not request TheHackerNews: ' . $article_url);
$article_url = $element->find('a.story-link', 0)->href;
$article_author = trim($element->find('i.fa-user', 0)->parent()->plaintext);
$article_title = $element->find('h2.home-title', 0)->plaintext;
$contents = $article->find('div.articlebodyonly', 0)->innertext;
$contents = stripRecursiveHtmlSection($contents, 'div', '<div class=\'clear\'');
$contents = stripWithDelimiters($contents, '<script', '</script>');
//Date without time
$article_timestamp = strtotime(
extractFromDelimiters(
$element->find('i.fa-calendar', 0)->parent()->outertext,
'</i>',
'<span>'
)
);
//Article thumbnail in lazy-loading image
if (is_object($element->find('img[data-echo]', 0))) {
$article_thumbnail = array(
extractFromDelimiters(
$element->find('img[data-echo]', 0)->outertext,
"data-echo='",
"'"
)
);
} else {
$article_thumbnail = array();
}
if ($article = getSimpleHTMLDOMCached($article_url)) {
//Article body
$contents = $article->find('div.articlebody', 0)->innertext;
$contents = stripRecursiveHtmlSection($contents, 'div', '<div class="ad_');
$contents = stripWithDelimiters($contents, 'id="google_ads', '</iframe>');
$contents = stripWithDelimiters($contents, '<script', '</script>');
//Date with time
if (is_object($article->find('meta[itemprop=dateModified]', 0))) {
$article_timestamp = strtotime(
extractFromDelimiters(
$article->find('meta[itemprop=dateModified]', 0)->outertext,
"content='",
"'"
)
);
}
} else {
$contents = 'Could not request TheHackerNews: ' . $article_url;
}
$item = array();
$item['uri'] = $article_url;
$item['title'] = $article_title;
$item['author'] = $article_author;
$item['enclosures'] = $article_thumbnail;
$item['timestamp'] = $article_timestamp;
$item['content'] = trim($contents);
$this->items[] = $item;
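
The hunk above uses `stripWithDelimiters()` to remove everything between a start and an end marker, for example `<script` and `</script>`. The following is a minimal sketch of that technique as a standalone function, under the assumption that an unterminated section should be left untouched. The name `stripBetween` is hypothetical and the behaviour is not claimed to match the bridge's helper in every edge case.

```php
<?php
// Hypothetical helper: removes every section that begins with $start and
// ends with $end (both markers included) from $text.
function stripBetween(string $text, string $start, string $end): string {
    if ($start === '' || $end === '') {
        return $text; // empty markers would match everywhere
    }

    $offset = 0;
    while (($from = strpos($text, $start, $offset)) !== false) {
        // Look for the end marker only after the start marker.
        $to = strpos($text, $end, $from + strlen($start));
        if ($to === false) {
            break; // unterminated section: leave the rest untouched
        }
        // Cut out [$from, $to + strlen($end)) and continue at the cut point.
        $text = substr($text, 0, $from) . substr($text, $to + strlen($end));
        $offset = $from;
    }
    return $text;
}

// Invented sample markup.
echo stripBetween('<p>a</p><script>x()</script><p>b</p>', '<script', '</script>'), "\n";
```

Searching from an explicit offset, rather than calling `str_replace()` on the extracted section, removes only the occurrence that was actually matched.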


@@ -0,0 +1,41 @@
<?php
class TheYeteeBridge extends BridgeAbstract {
const MAINTAINER = 'Monsieur Poutounours';
const NAME = 'TheYetee';
const URI = 'https://theyetee.com';
const CACHE_TIMEOUT = 14400; // 4 h
const DESCRIPTION = 'Fetch daily shirts from The Yetee';
public function collectData(){
$html = getSimpleHTMLDOM(self::URI)
or returnServerError('Could not request The Yetee.');
$div = $html->find('.hero-col');
foreach($div as $element) {
$item = array();
$item['enclosures'] = array();
$title = $element->find('h2', 0)->plaintext;
$item['title'] = $title;
$author = trim($element->find('div[class=credit]', 0)->plaintext);
$item['author'] = $author;
$uri = $element->find('div[class=controls] a', 0)->href;
$item['uri'] = static::URI.$uri;
$content = '<p>'.$element->find('section[class=product-listing-info] p', -1)->plaintext.'</p>';
$photos = $element->find('a[class=js-modaal-gallery] img');
foreach($photos as $photo) {
$content = $content."<br /><img src='$photo->src' />";
$item['enclosures'][] = $photo->src;
}
$item['content'] = $content;
$this->items[] = $item;
}
}
}


@@ -1,102 +0,0 @@
<?php
class Torrent9Bridge extends BridgeAbstract {
const MAINTAINER = 'lagaisse';
const NAME = 'Torrent9 Bridge';
const URI = 'http://www.torrent9.pe';
const CACHE_TIMEOUT = 86400; // 24h = 86400s
const DESCRIPTION = 'Returns latest torrents';
const PAGE_SERIES = 'torrents_series';
const PAGE_SERIES_VOSTFR = 'torrents_series_vostfr';
const PAGE_SERIES_FR = 'torrents_series_french';
const PARAMETERS = array(
'From search' => array(
'q' => array(
'name' => 'Search',
'required' => true,
'title' => 'Type your search'
)
),
'By page' => array(
'page' => array(
'name' => 'Page',
'type' => 'list',
'required' => false,
'values' => array(
'Series' => self::PAGE_SERIES,
'Series VOST' => self::PAGE_SERIES_VOSTFR,
'Series FR' => self::PAGE_SERIES_FR,
),
'defaultValue' => self::PAGE_SERIES
)
)
);
public function collectData(){
if($this->queriedContext === 'From search') {
$request = str_replace(' ', '-', trim($this->getInput('q')));
$page = self::URI . '/search_torrent/' . urlencode($request) . '.html';
} else {
$request = $this->getInput('page');
$page = self::URI . '/' . $request . '.html';
}
$html = getSimpleHTMLDOM($page)
or returnServerError('No results for this query.');
foreach($html->find('table', 0)->find('tr') as $episode) {
if($episode->parent->tag == 'tbody') {
$urlepisode = self::URI . $episode->find('a', 0)->getAttribute('href');
//30 years = forever
$htmlepisode = getSimpleHTMLDOMCached($urlepisode, 86400 * 366 * 30);
$item = array();
$item['author'] = $episode->find('a', 0)->text();
$item['title'] = $episode->find('a', 0)->text();
$item['id'] = $episode->find('a', 0)->getAttribute('href');
$item['pubdate'] = $this->getCachedDate($urlepisode);
$textefiche = $htmlepisode->find('.movie-information', 0)->find('p', 1);
if(isset($textefiche)) {
$item['content'] = $textefiche->text();
} else {
$p = $htmlepisode->find('.movie-information', 0)->find('p');
if(!empty($p)) {
$item['content'] = $htmlepisode->find('.movie-information', 0)->find('p', 0)->text();
}
}
$item['id'] = $episode->find('a', 0)->getAttribute('href');
$item['uri'] = self::URI . $htmlepisode->find('.download', 0)->getAttribute('href');
$this->items[] = $item;
}
}
}
public function getName(){
if(!is_null($this->getInput('q'))) {
return $this->getInput('q') . ' : ' . self::NAME;
}
return parent::getName();
}
private function getCachedDate($url){
debugMessage('getting pubdate from url ' . $url . '');
// Initialize cache
$cache = Cache::create('FileCache');
$cache->setPath(CACHE_DIR . '/pages');
$params = [$url];
$cache->setParameters($params);
// Get cachefile timestamp
$time = $cache->getTime();
return ($time !== false ? $time : time());
}
}


@@ -17,8 +17,14 @@ class VkBridge extends BridgeAbstract
)
);
protected $videos = array();
protected $pageName;
protected function getAccessToken()
{
return 'c8071613517c155c6cfbd2a059b2718e9c37b89094c4766834969dda75f657a2c1cbb49bab4c5e649f1db';
}
public function getURI()
{
if (!is_null($this->getInput('u'))) {
@@ -51,11 +57,20 @@ class VkBridge extends BridgeAbstract
$pageName = $pageName->plaintext;
$this->pageName = htmlspecialchars_decode($pageName);
}
foreach ($html->find('div.replies') as $comment_block) {
$comment_block->outertext = '';
}
$html->load($html->save());
$pinned_post_item = null;
$last_post_id = 0;
foreach ($html->find('.post') as $post) {
defaultLinkTo($post, self::URI);
$post_videos = array();
$is_pinned_post = false;
if (strpos($post->getAttribute('class'), 'post_fixed') !== false) {
$is_pinned_post = true;
@@ -66,7 +81,7 @@ class VkBridge extends BridgeAbstract
$post->find('a.wall_post_more', 0)->outertext = '';
}
$content_suffix = "";
$content_suffix = '';
// looking for external links
$external_link_selectors = array(
@@ -81,8 +96,8 @@ class VkBridge extends BridgeAbstract
$innertext = $a->innertext;
$parsed_url = parse_url($a->getAttribute('href'));
if (strpos($parsed_url['path'], '/away.php') !== 0) continue;
parse_str($parsed_url["query"], $parsed_query);
$content_suffix .= "<br>External link: <a href='" . $parsed_query["to"] . "'>$innertext</a>";
parse_str($parsed_url['query'], $parsed_query);
$content_suffix .= "<br>External link: <a href='" . $parsed_query['to'] . "'>$innertext</a>";
}
}
@@ -100,21 +115,21 @@ class VkBridge extends BridgeAbstract
}
// looking for article
$article = $post->find("a.article_snippet", 0);
$article = $post->find('a.article_snippet', 0);
if (is_object($article)) {
if (strpos($article->getAttribute('class'), "article_snippet_mini") !== false) {
$article_title_selector = "div.article_snippet_mini_title";
$article_author_selector = "div.article_snippet_mini_info > .mem_link,
div.article_snippet_mini_info > .group_link";
$article_thumb_selector = "div.article_snippet_mini_thumb";
if (strpos($article->getAttribute('class'), 'article_snippet_mini') !== false) {
$article_title_selector = 'div.article_snippet_mini_title';
$article_author_selector = 'div.article_snippet_mini_info > .mem_link,
div.article_snippet_mini_info > .group_link';
$article_thumb_selector = 'div.article_snippet_mini_thumb';
} else {
$article_title_selector = "div.article_snippet__title";
$article_author_selector = "div.article_snippet__author";
$article_thumb_selector = "div.article_snippet__image";
$article_title_selector = 'div.article_snippet__title';
$article_author_selector = 'div.article_snippet__author';
$article_thumb_selector = 'div.article_snippet__image';
}
$article_title = $article->find($article_title_selector, 0)->innertext;
$article_author = $article->find($article_author_selector, 0)->innertext;
$article_link = self::URI . ltrim($article->getAttribute('href'), '/');
$article_link = $article->getAttribute('href');
$article_img_element_style = $article->find($article_thumb_selector, 0)->getAttribute('style');
preg_match('/background-image: url\((.*)\)/', $article_img_element_style, $matches);
if (count($matches) > 0) {
@@ -126,20 +141,22 @@ class VkBridge extends BridgeAbstract
// get video on post
$video = $post->find('div.post_video_desc', 0);
$main_video_link = '';
if (is_object($video)) {
$video_title = $video->find('div.post_video_title', 0)->plaintext;
$video_link = self::URI . ltrim( $video->find('a.lnk', 0)->getAttribute('href'), '/' );
$content_suffix .= "<br>Video: <a href='$video_link'>$video_title</a>";
$video_link = $video->find('a.lnk', 0)->getAttribute('href');
$this->appendVideo($video_title, $video_link, $content_suffix, $post_videos);
$video->outertext = '';
$main_video_link = $video_link;
}
// get all other videos
foreach($post->find('a.page_post_thumb_video') as $a) {
$video_title = $a->getAttribute('aria-label');
$temp = explode(" ", $video_title, 2);
$video_title = htmlspecialchars_decode($a->getAttribute('aria-label'));
$temp = explode(' ', $video_title, 2);
if (count($temp) > 1) $video_title = $temp[1];
$video_link = self::URI . ltrim( $a->getAttribute('href'), '/' );
$content_suffix .= "<br>Video: <a href='$video_link'>$video_title</a>";
$video_link = $a->getAttribute('href');
if ($video_link != $main_video_link) $this->appendVideo($video_title, $video_link, $content_suffix, $post_videos);
$a->outertext = '';
}
@@ -155,16 +172,16 @@ class VkBridge extends BridgeAbstract
foreach($post->find('.page_album_wrap') as $el) {
$a = $el->find('.page_album_link', 0);
$album_title = $a->find('.page_album_title_text', 0)->getAttribute('title');
$album_link = self::URI . ltrim($a->getAttribute('href'), '/');
$album_link = $a->getAttribute('href');
$el->outertext = '';
$content_suffix .= "<br>Album: <a href='$album_link'>$album_title</a>";
}
// get photo documents
foreach($post->find('a.page_doc_photo_href') as $a) {
$doc_link = self::URI . ltrim($a->getAttribute('href'), '/');
$doc_gif_label_element = $a->find(".page_gif_label", 0);
$doc_title_element = $a->find(".doc_label", 0);
$doc_link = $a->getAttribute('href');
$doc_gif_label_element = $a->find('.page_gif_label', 0);
$doc_title_element = $a->find('.doc_label', 0);
if (is_object($doc_gif_label_element)) {
$gif_preview_img = backgroundToImg($a->find('.page_doc_photo', 0));
@ -184,11 +201,11 @@ class VkBridge extends BridgeAbstract
// get other documents
foreach($post->find('div.page_doc_row') as $div) {
$doc_title_element = $div->find("a.page_doc_title", 0);
$doc_title_element = $div->find('a.page_doc_title', 0);
if (is_object($doc_title_element)) {
$doc_title = $doc_title_element->innertext;
$doc_link = self::URI . ltrim($doc_title_element->getAttribute('href'), '/');
$doc_link = $doc_title_element->getAttribute('href');
$content_suffix .= "<br>Doc: <a href='$doc_link'>$doc_title</a>";
} else {
@ -204,7 +221,7 @@ class VkBridge extends BridgeAbstract
$poll_title = $div->find('.page_media_poll_title', 0)->innertext;
$content_suffix .= "<br>Poll: $poll_title";
foreach($div->find('div.page_poll_text') as $poll_stat_title) {
$content_suffix .= "<br>- " . $poll_stat_title->innertext;
$content_suffix .= '<br>- ' . $poll_stat_title->innertext;
}
$div->outertext = '';
}
@ -228,20 +245,29 @@ class VkBridge extends BridgeAbstract
$item = array();
$item['content'] = strip_tags(backgroundToImg($post->find('div.wall_text', 0)->innertext), '<br><img>');
$item['content'] .= $content_suffix;
$item['categories'] = array();
// get post hashtags
foreach($post->find('a') as $a) {
$href = $a->getAttribute('href');
$prefix = '/feed?section=search&q=%23';
$innertext = $a->innertext;
if ($href && substr($href, 0, strlen($prefix)) === $prefix) {
$item['categories'][] = urldecode(substr($href, strlen($prefix)));
} else if (substr($innertext, 0, 1) == '#') {
$item['categories'][] = $innertext;
}
}
// get post link
$post_link = $post->find('a.post_link', 0)->getAttribute('href');
preg_match("/wall-?\d+_(\d+)/", $post_link, $preg_match_result);
preg_match('/wall-?\d+_(\d+)/', $post_link, $preg_match_result);
$item['post_id'] = intval($preg_match_result[1]);
if (substr(self::URI, -1) == '/') {
$post_link = self::URI . ltrim($post_link, "/");
} else {
$post_link = self::URI . $post_link;
}
$item['uri'] = $post_link;
$item['timestamp'] = $this->getTime($post);
$item['title'] = $this->getTitle($item['content']);
$item['author'] = $post_author;
$item['videos'] = $post_videos;
if ($is_pinned_post) {
// do not append it now
$pinned_post_item = $item;
@ -252,16 +278,18 @@ class VkBridge extends BridgeAbstract
}
if (is_null($pinned_post_item)) {
return;
} else if (count($this->items) == 0) {
$this->items[] = $pinned_post_item;
} else if ($last_post_id < $pinned_post_item['post_id']) {
$this->items[] = $pinned_post_item;
usort($this->items, function ($item1, $item2) {
return $item2['post_id'] - $item1['post_id'];
});
if (!is_null($pinned_post_item)) {
if (count($this->items) == 0) {
$this->items[] = $pinned_post_item;
} else if ($last_post_id < $pinned_post_item['post_id']) {
$this->items[] = $pinned_post_item;
usort($this->items, function ($item1, $item2) {
return $item2['post_id'] - $item1['post_id'];
});
}
}
$this->getCleanVideoLinks();
}
private function getPhoto($a) {
@ -273,17 +301,17 @@ class VkBridge extends BridgeAbstract
$data = json_decode($arg, true);
if ($data == null) return;
$thumb = $data['temp']['base'] . $data['temp']['x_'][0] . ".jpg";
$thumb = $data['temp']['base'] . $data['temp']['x_'][0] . '.jpg';
$original = '';
foreach(array('y_', 'z_', 'w_') as $key) {
if (!isset($data['temp'][$key])) continue;
if (!isset($data['temp'][$key][0])) continue;
if (substr($data['temp'][$key][0], 0, 4) == "http") {
$base = "";
if (substr($data['temp'][$key][0], 0, 4) == 'http') {
$base = '';
} else {
$base = $data['temp']['base'];
}
$original = $base . $data['temp'][$key][0] . ".jpg";
$original = $base . $data['temp'][$key][0] . '.jpg';
}
if ($original) {
@ -296,7 +324,7 @@ class VkBridge extends BridgeAbstract
private function getTitle($content)
{
preg_match('/^["\w\ \p{Cyrillic}\(\)\?#«»-]+/mu', htmlspecialchars_decode($content), $result);
if (count($result) == 0) return "untitled";
if (count($result) == 0) return 'untitled';
return $result[0];
}
@ -326,7 +354,7 @@ class VkBridge extends BridgeAbstract
}
public function getContents()
private function getContents()
{
ini_set('user-agent', 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0');
@ -335,5 +363,51 @@ class VkBridge extends BridgeAbstract
return getContents($this->getURI(), $header);
}
protected function appendVideo($video_title, $video_link, &$content_suffix, array &$post_videos)
{
if (!$video_title) $video_title = '(empty)';
preg_match('/video([0-9-]+_[0-9]+)/', $video_link, $preg_match_result);
if (count($preg_match_result) > 1) {
$video_id = $preg_match_result[1];
$this->videos[ $video_id ] = array(
'url' => $video_link,
'title' => $video_title,
);
$post_videos[] = $video_id;
} else {
$content_suffix .= '<br>Video: <a href="'.htmlspecialchars($video_link).'">'.$video_title.'</a>';
}
}
protected function getCleanVideoLinks() {
$result = $this->api('video.get', array(
'videos' => implode(',', array_keys($this->videos)),
'count' => 200
));
if (isset($result['error'])) return;
foreach($result['response']['items'] as $item) {
$video_id = strval($item['owner_id']).'_'.strval($item['id']);
$this->videos[$video_id]['url'] = $item['player'];
}
foreach($this->items as &$item) {
foreach($item['videos'] as $video_id) {
$video_link = $this->videos[$video_id]['url'];
$video_title = $this->videos[$video_id]['title'];
$item['content'] .= '<br>Video: <a href="'.htmlspecialchars($video_link).'">'.$video_title.'</a>';
}
unset($item['videos']);
}
}
protected function api($method, array $params)
{
$params['v'] = '5.80';
$params['access_token'] = $this->getAccessToken();
return json_decode( getContents('https://api.vk.com/method/'.$method.'?'.http_build_query($params)), true );
}
}
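
The ID extraction performed in `appendVideo()` can be tried in isolation. This is a minimal sketch: the helper name `demoExtractVkVideoId` is hypothetical and not part of the bridge, and only the regular expression is taken from the diff above.

```php
<?php
// Hypothetical stand-alone helper; only the pattern comes from appendVideo().
function demoExtractVkVideoId($video_link)
{
    // Matches "video<owner>_<id>"; the owner part may start with a minus sign.
    if (preg_match('/video([0-9-]+_[0-9]+)/', $video_link, $m) === 1) {
        return $m[1];
    }
    // No ID found: in the bridge this case falls back to a plain link.
    return null;
}
```

A link such as `/video-12345_678?list=abc` yields `-12345_678`, while a link without the `video<owner>_<id>` part yields `null`.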

@ -3,37 +3,24 @@ class WeLiveSecurityBridge extends FeedExpander {
const MAINTAINER = 'ORelio';
const NAME = 'We Live Security';
const URI = 'http://www.welivesecurity.com/';
const URI = 'https://www.welivesecurity.com/';
const DESCRIPTION = 'Returns the newest articles.';
private function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
protected function parseItem($item){
$item = parent::parseItem($item);
$article_html = getSimpleHTMLDOMCached($item['uri']);
if(!$article_html) {
$item['content'] .= '<p>Could not request ' . $this->getName() . ': ' . $item['uri'] . '</p>';
$item['content'] .= '<p><em>Could not request ' . $this->getName() . ': ' . $item['uri'] . '</em></p>';
return $item;
}
$article_content = $article_html->find('div.wlistingsingletext', 0)->innertext;
$article_content = $this->stripWithDelimiters($article_content, '<script', '</script>');
$article_content = '<p><b>'
. $item['content']
. '</b></p>'
. trim($article_content);
$item['content'] = $article_content;
$article_content = $article_html->find('div.formatted', 0)->innertext;
$article_content = stripWithDelimiters($article_content, '<script', '</script>');
$article_content = stripRecursiveHTMLSection($article_content, 'div', '<div class="comments');
$article_content = stripRecursiveHTMLSection($article_content, 'div', '<div class="similar-articles');
$article_content = stripRecursiveHTMLSection($article_content, 'span', '<span class="meta');
$item['content'] = trim($article_content);
return $item;
}
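
Delimiter-based stripping, as used for the `<script` sections above, can be sketched as follows. `demoStripWithDelimiters` is a hypothetical name; working with offsets and stopping on an unterminated section are choices made for this sketch, not behaviour taken from the project's helper.

```php
<?php
// Illustrative only: removes every "$start ... $end" section from $string.
function demoStripWithDelimiters($string, $start, $end)
{
    while (($from = strpos($string, $start)) !== false) {
        $to = strpos($string, $end, $from);
        if ($to === false) {
            break; // unterminated section: leave the rest untouched (sketch-only guard)
        }
        // Keep everything before the start delimiter and after the end delimiter.
        $string = substr($string, 0, $from) . substr($string, $to + strlen($end));
    }
    return $string;
}
```

Called with `'<script'` and `'</script>'`, it drops each script element together with its attributes and body.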

@ -18,10 +18,10 @@ class WhydBridge extends BridgeAbstract {
public function collectData(){
$html = '';
if(strlen(preg_replace("/[^0-9a-f]/", '', $this->getInput('u'))) == 24) {
if(strlen(preg_replace('/[^0-9a-f]/', '', $this->getInput('u'))) == 24) {
// is input the userid ?
$html = getSimpleHTMLDOM(
self::URI . 'u/' . preg_replace("/[^0-9a-f]/", '', $this->getInput('u'))
self::URI . 'u/' . preg_replace('/[^0-9a-f]/', '', $this->getInput('u'))
) or returnServerError('No results for this query.');
} else { // input may be the username
$html = getSimpleHTMLDOM(

@ -3,8 +3,7 @@ class WordPressBridge extends FeedExpander {
const MAINTAINER = 'aledeg';
const NAME = 'Wordpress Bridge';
const URI = 'https://wordpress.org/';
const CACHE_TIMEOUT = 10800; // 3h
const DESCRIPTION = 'Returns the newest full posts of a Wordpress powered website';
const DESCRIPTION = 'Returns the newest full posts of a WordPress powered website';
const PARAMETERS = array( array(
'url' => array(
@ -13,8 +12,8 @@ class WordPressBridge extends FeedExpander {
)
));
private function clearContent($content){
$content = preg_replace('/<script[^>]*>[^<]*<\/script>/', '', $content);
private function cleanContent($content){
$content = stripWithDelimiters($content, '<script', '</script>');
$content = preg_replace('/<div class="wpa".*/', '', $content);
$content = preg_replace('/<form.*\/form>/', '', $content);
return $content;
@ -27,6 +26,10 @@ class WordPressBridge extends FeedExpander {
$article = null;
switch(true) {
case !is_null($article_html->find('[itemprop=articleBody]', 0)):
// highest priority content div
$article = $article_html->find('[itemprop=articleBody]', 0);
break;
case !is_null($article_html->find('article', 0)):
// most common content div
$article = $article_html->find('article', 0);
@ -39,15 +42,37 @@ class WordPressBridge extends FeedExpander {
// another common content div
$article = $article_html->find('.post-content', 0);
break;
case !is_null($article_html->find('.post', 0)):
// for old WordPress themes without HTML5
$article = $article_html->find('.post', 0);
break;
}
foreach ($article->find('h1.entry-title') as $title)
if ($title->plaintext == $item['title'])
$title->outertext = '';
$article_image = $article_html->find('img.wp-post-image', 0);
if(!empty($item['content']) && (!is_object($article_image) || empty($article_image->src))) {
$article_image = str_get_html($item['content'])->find('img.wp-post-image', 0);
}
if(is_object($article_image) && !empty($article_image->src)) {
if(empty($article_image->getAttribute('data-lazy-src'))) {
$article_image = $article_image->src;
} else {
$article_image = $article_image->getAttribute('data-lazy-src');
}
$mime_type = getMimeType($article_image);
if (strpos($mime_type, 'image') === false)
$article_image .= '#.image'; // force image
if (empty($item['enclosures']))
$item['enclosures'] = array($article_image);
else
$item['enclosures'] = array_merge($item['enclosures'], array($article_image));
}
if(!is_null($article)) {
$item['content'] = $this->clearContent($article->innertext);
$item['content'] = $this->cleanContent($article->innertext);
}
return $item;
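
Appending a single image URL to an item's enclosure list can be sketched like this. The helper name `demoAddEnclosure` is hypothetical; the point of the sketch is that `array_merge()` expects arrays on both sides, so a lone URL is wrapped first.

```php
<?php
// Hypothetical helper: returns $item with $url added to its 'enclosures' list.
function demoAddEnclosure(array $item, $url)
{
    if (empty($item['enclosures'])) {
        $item['enclosures'] = array($url);
    } else {
        // Wrap the single URL so that both arguments are arrays.
        $item['enclosures'] = array_merge($item['enclosures'], array($url));
    }
    return $item;
}
```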

@ -12,72 +12,72 @@ class YGGTorrentBridge extends BridgeAbstract {
const PARAMETERS = array(
array(
"cat" => array(
"name" => "category",
"type" => "list",
"values" => array(
"Toute les catégories" => "all.all",
"Film/Vidéo - Toutes les sous-catégories" => "2145.all",
"Film/Vidéo - Animation" => "2145.2178",
"Film/Vidéo - Animation Série" => "2145.2179",
"Film/Vidéo - Concert" => "2145.2180",
"Film/Vidéo - Documentaire" => "2145.2181",
"Film/Vidéo - Émission TV" => "2145.2182",
"Film/Vidéo - Film" => "2145.2183",
"Film/Vidéo - Série TV" => "2145.2184",
"Film/Vidéo - Spectacle" => "2145.2185",
"Film/Vidéo - Sport" => "2145.2186",
"Film/Vidéo - Vidéo-clips" => "2145.2186",
"Audio - Toutes les sous-catégories" => "2139.all",
"Audio - Karaoké" => "2139.2147",
"Audio - Musique" => "2139.2148",
"Audio - Podcast Radio" => "2139.2150",
"Audio - Samples" => "2139.2149",
"Jeu vidéo - Toutes les sous-catégories" => "2142.all",
"Jeu vidéo - Autre" => "2142.2167",
"Jeu vidéo - Linux" => "2142.2159",
"Jeu vidéo - MacOS" => "2142.2160",
"Jeu vidéo - Microsoft" => "2142.2162",
"Jeu vidéo - Nintendo" => "2142.2163",
"Jeu vidéo - Smartphone" => "2142.2165",
"Jeu vidéo - Sony" => "2142.2164",
"Jeu vidéo - Tablette" => "2142.2166",
"Jeu vidéo - Windows" => "2142.2161",
"eBook - Toutes les sous-catégories" => "2140.all",
"eBook - Audio" => "2140.2151",
"eBook - Bds" => "2140.2152",
"eBook - Comics" => "2140.2153",
"eBook - Livres" => "2140.2154",
"eBook - Mangas" => "2140.2155",
"eBook - Presse" => "2140.2156",
"Emulation - Toutes les sous-catégories" => "2141.all",
"Emulation - Emulateurs" => "2141.2157",
"Emulation - Roms" => "2141.2158",
"GPS - Toutes les sous-catégories" => "2141.all",
"GPS - Applications" => "2141.2168",
"GPS - Cartes" => "2141.2169",
"GPS - Divers" => "2141.2170"
'cat' => array(
'name' => 'category',
'type' => 'list',
'values' => array(
'Toute les catégories' => 'all.all',
'Film/Vidéo - Toutes les sous-catégories' => '2145.all',
'Film/Vidéo - Animation' => '2145.2178',
'Film/Vidéo - Animation Série' => '2145.2179',
'Film/Vidéo - Concert' => '2145.2180',
'Film/Vidéo - Documentaire' => '2145.2181',
'Film/Vidéo - Émission TV' => '2145.2182',
'Film/Vidéo - Film' => '2145.2183',
'Film/Vidéo - Série TV' => '2145.2184',
'Film/Vidéo - Spectacle' => '2145.2185',
'Film/Vidéo - Sport' => '2145.2186',
'Film/Vidéo - Vidéo-clips' => '2145.2186',
'Audio - Toutes les sous-catégories' => '2139.all',
'Audio - Karaoké' => '2139.2147',
'Audio - Musique' => '2139.2148',
'Audio - Podcast Radio' => '2139.2150',
'Audio - Samples' => '2139.2149',
'Jeu vidéo - Toutes les sous-catégories' => '2142.all',
'Jeu vidéo - Autre' => '2142.2167',
'Jeu vidéo - Linux' => '2142.2159',
'Jeu vidéo - MacOS' => '2142.2160',
'Jeu vidéo - Microsoft' => '2142.2162',
'Jeu vidéo - Nintendo' => '2142.2163',
'Jeu vidéo - Smartphone' => '2142.2165',
'Jeu vidéo - Sony' => '2142.2164',
'Jeu vidéo - Tablette' => '2142.2166',
'Jeu vidéo - Windows' => '2142.2161',
'eBook - Toutes les sous-catégories' => '2140.all',
'eBook - Audio' => '2140.2151',
'eBook - Bds' => '2140.2152',
'eBook - Comics' => '2140.2153',
'eBook - Livres' => '2140.2154',
'eBook - Mangas' => '2140.2155',
'eBook - Presse' => '2140.2156',
'Emulation - Toutes les sous-catégories' => '2141.all',
'Emulation - Emulateurs' => '2141.2157',
'Emulation - Roms' => '2141.2158',
'GPS - Toutes les sous-catégories' => '2141.all',
'GPS - Applications' => '2141.2168',
'GPS - Cartes' => '2141.2169',
'GPS - Divers' => '2141.2170'
)
),
"nom" => array(
"name" => "Nom",
"description" => "Nom du torrent",
"type" => "text"
'nom' => array(
'name' => 'Nom',
'description' => 'Nom du torrent',
'type' => 'text'
),
"description" => array(
"name" => "Description",
"description" => "Description du torrent",
"type" => "text"
'description' => array(
'name' => 'Description',
'description' => 'Description du torrent',
'type' => 'text'
),
"fichier" => array(
"name" => "Fichier",
"description" => "Fichier du torrent",
"type" => "text"
'fichier' => array(
'name' => 'Fichier',
'description' => 'Fichier du torrent',
'type' => 'text'
),
"uploader" => array(
"name" => "Uploader",
"description" => "Uploader du torrent",
"type" => "text"
'uploader' => array(
'name' => 'Uploader',
'description' => 'Uploader du torrent',
'type' => 'text'
),
)
@ -85,59 +85,59 @@ class YGGTorrentBridge extends BridgeAbstract {
public function collectData() {
$catInfo = explode(".", $this->getInput("cat"));
$catInfo = explode('.', $this->getInput('cat'));
$category = $catInfo[0];
$subcategory = $catInfo[1];
$html = getSimpleHTMLDOM(self::URI . "/engine/search?name="
. $this->getInput("nom")
. "&description="
. $this->getInput("description")
. "&fichier="
. $this->getInput("fichier")
. "&file="
. $this->getInput("uploader")
. "&category="
$html = getSimpleHTMLDOM(self::URI . '/engine/search?name='
. $this->getInput('nom')
. '&description='
. $this->getInput('description')
. '&fichier='
. $this->getInput('fichier')
. '&file='
. $this->getInput('uploader')
. '&category='
. $category
. "&sub_category="
. '&sub_category='
. $subcategory
. "&do=search")
or returnServerError("Unable to query Yggtorrent !");
. '&do=search&order=desc&sort=publish_date')
or returnServerError('Unable to query Yggtorrent !');
$count = 0;
$results = $html->find(".results", 0);
$results = $html->find('.results', 0);
if(!$results) return;
foreach($results->find("tr") as $row) {
foreach($results->find('tr') as $row) {
$count++;
if($count == 1) continue;
if($count == 12) break;
if($count == 1) continue; // Skip table header
if($count == 22) break; // Stop processing after 21 items (20 + 1 table header)
$item = array();
$item["timestamp"] = $row->find(".hidden", 1)->plaintext;
$item["title"] = $row->find("a", 1)->plaintext;
$torrentData = $this->collectTorrentData($row->find("a", 1)->href);
$item["author"] = $torrentData["author"];
$item["content"] = $torrentData["content"];
$item["seeders"] = $row->find("td", 7)->plaintext;
$item["leechers"] = $row->find("td", 8)->plaintext;
$item["size"] = $row->find("td", 5)->plaintext;
$item['timestamp'] = $row->find('.hidden', 1)->plaintext;
$item['title'] = $row->find('a', 1)->plaintext;
$torrentData = $this->collectTorrentData($row->find('a', 1)->href);
$item['author'] = $torrentData['author'];
$item['content'] = $torrentData['content'];
$item['seeders'] = $row->find('td', 7)->plaintext;
$item['leechers'] = $row->find('td', 8)->plaintext;
$item['size'] = $row->find('td', 5)->plaintext;
$this->items[] = $item;
}
}
public function collectTorrentData($url) {
private function collectTorrentData($url) {
// For some reason the link we get can be invalid, so we URL-encode its path segments.
$url_full = explode("/", $url);
$url_full = explode('/', $url);
$url_full[4] = urlencode($url_full[4]);
$url_full[5] = urlencode($url_full[5]);
$url_full[6] = urlencode($url_full[6]);
$url = implode("/", $url_full);
$page = getSimpleHTMLDOM($url) or returnServerError("Unable to query Yggtorrent page !");
$author = $page->find(".informations", 0)->find("a", 4)->plaintext;
$content = $page->find(".default", 1);
return array("author" => $author, "content" => $content);
$url = implode('/', $url_full);
$page = getSimpleHTMLDOMCached($url) or returnServerError('Unable to query Yggtorrent page !');
$author = $page->find('.informations', 0)->find('a', 4)->plaintext;
$content = $page->find('.default', 1);
return array('author' => $author, 'content' => $content);
}
}
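
The segment-wise encoding in `collectTorrentData()` can be tried on its own. A minimal sketch, assuming a hypothetical helper `demoEncodePathSegments`; the `isset()` guard is an addition for the sketch, and the URL in the usage note is made up.

```php
<?php
// Hypothetical helper mirroring the explode / urlencode / implode steps above.
function demoEncodePathSegments($url, array $indexes = array(4, 5, 6))
{
    $parts = explode('/', $url);
    foreach ($indexes as $i) {
        if (isset($parts[$i])) { // guard added for the sketch
            $parts[$i] = urlencode($parts[$i]);
        }
    }
    return implode('/', $parts);
}
```

For `https://example.org/torrent/a b/c d/e f` the segments at positions 4 to 6 are encoded, while the scheme and host parts are left as they are.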

@ -25,14 +25,14 @@ class YoutubeBridge extends BridgeAbstract {
'By channel id' => array(
'c' => array(
'name' => 'channel id',
'exampleValue' => "15",
'exampleValue' => '15',
'required' => true
)
),
'By playlist Id' => array(
'p' => array(
'name' => 'playlist id',
'exampleValue' => "15"
'exampleValue' => '15'
)
),
'Search result' => array(
@ -45,9 +45,25 @@ class YoutubeBridge extends BridgeAbstract {
'type' => 'number',
'exampleValue' => 1
)
),
'global' => array(
'duration_min' => array(
'name' => 'min. duration (minutes)',
'type' => 'number',
'title' => 'Minimum duration for the video in minutes',
'exampleValue' => 5
),
'duration_max' => array(
'name' => 'max. duration (minutes)',
'type' => 'number',
'title' => 'Maximum duration for the video in minutes',
'exampleValue' => 10
)
)
);
private $feedName = '';
private function ytBridgeQueryVideoInfo($vid, &$author, &$desc, &$time){
$html = $this->ytGetSimpleHTMLDOM(self::URI . "watch?v=$vid");
@ -113,6 +129,17 @@ class YoutubeBridge extends BridgeAbstract {
private function ytBridgeParseHtmlListing($html, $element_selector, $title_selector, $add_parsed_items = true) {
$limit = $add_parsed_items ? 10 : INF;
$count = 0;
$duration_min = $this->getInput('duration_min') ?: -1;
$duration_min = $duration_min * 60;
$duration_max = $this->getInput('duration_max') ?: INF;
$duration_max = $duration_max * 60;
if($duration_max < $duration_min) {
returnClientError('Max duration must be greater than min duration!');
}
foreach($html->find($element_selector) as $element) {
if($count < $limit) {
$author = '';
@ -121,6 +148,20 @@ class YoutubeBridge extends BridgeAbstract {
$vid = str_replace('/watch?v=', '', $element->find('a', 0)->href);
$vid = substr($vid, 0, strpos($vid, '&') ?: strlen($vid));
$title = $this->ytBridgeFixTitle($element->find($title_selector, 0)->plaintext);
// The duration comes in one of the formats:
// hh:mm:ss / mm:ss / m:ss
// 01:03:30 / 15:06 / 1:24
$durationText = trim($element->find('span[class="video-time"]', 0)->plaintext);
$durationText = preg_replace('/^([\d]{1,2})\:([\d]{2})$/', '00:$1:$2', $durationText);
sscanf($durationText, '%d:%d:%d', $hours, $minutes, $seconds);
$duration = $hours * 3600 + $minutes * 60 + $seconds;
if($duration < $duration_min || $duration > $duration_max) {
continue;
}
if($title != '[Private Video]' && strpos($vid, 'googleads') === false) {
if ($add_parsed_items) {
$this->ytBridgeQueryVideoInfo($vid, $author, $desc, $time);
@ -168,7 +209,7 @@ class YoutubeBridge extends BridgeAbstract {
}
if(!empty($url_feed) && !empty($url_listing)) {
if($xml = $this->ytGetSimpleHTMLDOM($url_feed)) {
if(!$this->skipFeeds() && $xml = $this->ytGetSimpleHTMLDOM($url_feed)) {
$this->ytBridgeParseXmlFeed($xml);
} elseif($html = $this->ytGetSimpleHTMLDOM($url_listing)) {
$this->ytBridgeParseHtmlListing($html, 'li.channels-content-item', 'h3');
@ -182,7 +223,7 @@ class YoutubeBridge extends BridgeAbstract {
$html = $this->ytGetSimpleHTMLDOM($url_listing)
or returnServerError("Could not request YouTube. Tried:\n - $url_listing");
$item_count = $this->ytBridgeParseHtmlListing($html, 'tr.pl-video', '.pl-video-title a', false);
if ($item_count <= 15 && ($xml = $this->ytGetSimpleHTMLDOM($url_feed))) {
if ($item_count <= 15 && !$this->skipFeeds() && ($xml = $this->ytGetSimpleHTMLDOM($url_feed))) {
$this->ytBridgeParseXmlFeed($xml);
} else {
$this->ytBridgeParseHtmlListing($html, 'tr.pl-video', '.pl-video-title a');
@ -195,7 +236,7 @@ class YoutubeBridge extends BridgeAbstract {
$this->request = $this->getInput('s');
$page = 1;
if($this->getInput('pa'))
$page = (int)preg_replace("/[^0-9]/", '', $this->getInput('pa'));
$page = (int)preg_replace('/[^0-9]/', '', $this->getInput('pa'));
$url_listing = self::URI
. 'results?search_query='
@ -215,6 +256,10 @@ class YoutubeBridge extends BridgeAbstract {
}
}
private function skipFeeds() {
return ($this->getInput('duration_min') || $this->getInput('duration_max'));
}
public function getName(){
// Name depends on queriedContext:
switch($this->queriedContext) {
@ -226,5 +271,5 @@ class YoutubeBridge extends BridgeAbstract {
default:
return parent::getName();
}
}
}
}
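
Duration strings in the `hh:mm:ss`, `mm:ss` and `m:ss` formats can be converted to seconds as sketched below. `demoDurationToSeconds` is a hypothetical name. The pattern is anchored with `^` and `$` so that a value which already has three fields is not padded again, and returning `null` for unparsable input is an assumption of this sketch.

```php
<?php
// Hypothetical helper: "01:03:30" / "15:06" / "1:24" -> seconds, or null.
function demoDurationToSeconds($text)
{
    $text = trim($text);
    // Pad two-field values to three fields; three-field values do not match.
    $text = preg_replace('/^(\d{1,2}):(\d{2})$/', '00:$1:$2', $text);
    // With extra arguments, sscanf() returns the number of assigned values.
    if (sscanf($text, '%d:%d:%d', $hours, $minutes, $seconds) !== 3) {
        return null;
    }
    return $hours * 3600 + $minutes * 60 + $seconds;
}
```

The result can then be compared against minimum and maximum bounds expressed in seconds.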

@ -1,9 +1,9 @@
<?php
class ZDNetBridge extends BridgeAbstract {
class ZDNetBridge extends FeedExpander {
const MAINTAINER = 'ORelio';
const NAME = 'ZDNet Bridge';
const URI = 'http://www.zdnet.com/';
const URI = 'https://www.zdnet.com/';
const DESCRIPTION = 'Technology News, Analysis, Comments and Product Reviews for IT Professionals.';
//http://www.zdnet.com/zdnet.opml
@ -160,143 +160,42 @@ class ZDNetBridge extends BridgeAbstract {
));
public function collectData(){
function stripCdata($string){
$string = str_replace('<![CDATA[', '', $string);
$string = str_replace(']]>', '', $string);
return trim($string);
}
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
return false;
}
function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
function stripRecursiveHtmlSection($string, $tag_name, $tag_start){
$open_tag = '<' . $tag_name;
$close_tag = '</' . $tag_name . '>';
$close_tag_length = strlen($close_tag);
if(strpos($tag_start, $open_tag) === 0) {
while(strpos($string, $tag_start) !== false) {
$max_recursion = 100;
$section_to_remove = null;
$section_start = strpos($string, $tag_start);
$search_offset = $section_start;
do {
$max_recursion--;
$section_end = strpos($string, $close_tag, $search_offset);
$search_offset = $section_end + $close_tag_length;
$section_to_remove = substr(
$string,
$section_start,
$section_end - $section_start + $close_tag_length
);
$open_tag_count = substr_count($section_to_remove, $open_tag);
$close_tag_count = substr_count($section_to_remove, $close_tag);
} while ($open_tag_count > $close_tag_count && $max_recursion > 0);
$string = str_replace($section_to_remove, '', $string);
}
}
return $string;
}
$baseUri = self::URI;
$baseUri = static::URI;
$feed = $this->getInput('feed');
if(strpos($feed, 'downloads!') !== false) {
$feed = str_replace('downloads!', '', $feed);
$baseUri = str_replace('www.', 'downloads.', $baseUri);
}
$url = $baseUri . trim($feed, '/') . '/rss.xml';
$html = getSimpleHTMLDOM($url)
or returnServerError('Could not request ZDNet: ' . $url);
$limit = 0;
$this->collectExpandableDatas($url);
}
foreach($html->find('item') as $element) {
if($limit < 10) {
$article_url = preg_replace(
'/([^#]+)#ftag=.*/',
'$1',
stripCdata(extractFromDelimiters($element->innertext, '<link>', '</link>'))
);
protected function parseItem($item){
$item = parent::parseItem($item);
$article_author = stripCdata(extractFromDelimiters($element->innertext, 'role="author">', '<'));
$article_title = stripCdata($element->find('title', 0)->plaintext);
$article_subtitle = stripCdata($element->find('description', 0)->plaintext);
$article_timestamp = strtotime(stripCdata($element->find('pubDate', 0)->plaintext));
$article = getSimpleHTMLDOM($article_url)
or returnServerError('Could not request ZDNet: ' . $article_url);
$article = getSimpleHTMLDOMCached($item['uri']);
if(!$article)
returnServerError('Could not request ZDNet: ' . $item['uri']);
if(!empty($article_author)) {
$author = $article_author;
} else {
$author = $article->find('meta[name=author]', 0);
if(is_object($author)) {
$author = $author->content;
} else {
$author = 'ZDNet';
}
}
$thumbnail = $article->find('meta[itemprop=image]', 0);
if(is_object($thumbnail)) {
$thumbnail = $thumbnail->content;
} else {
$thumbnail = '';
}
$contents = $article->find('article', 0)->innertext;
foreach(array(
'<div class="shareBar"',
'<div class="shortcodeGalleryWrapper"',
'<div class="relatedContent',
'<div class="downloadNow',
'<div data-shortcode',
'<div id="sharethrough',
'<div id="inpage-video'
) as $div_start) {
$contents = stripRecursiveHtmlSection($contents, 'div', $div_start);
}
$contents = stripWithDelimiters($contents, '<script', '</script>');
$contents = stripWithDelimiters($contents, '<meta itemprop="image"', '>');
$contents = trim(stripWithDelimiters($contents, '<section class="sharethrough-top', '</section>'));
$content_img = strpos($contents, '<img'); //Look for first image
if (($content_img !== false && $content_img < 512) || $thumbnail == '') {
$content_img = ''; //Image already present on article beginning or no thumbnail
} else {
$content_img = '<p><img src="'.$thumbnail.'" /></p>'; //Include thumbnail
}
$contents = $content_img
. '<p><b>'
. $article_subtitle
. '</b></p>'
. $contents;
$item = array();
$item['author'] = $author;
$item['uri'] = $article_url;
$item['title'] = $article_title;
$item['timestamp'] = $article_timestamp;
$item['content'] = $contents;
$this->items[] = $item;
$limit++;
}
$contents = $article->find('article', 0)->innertext;
foreach(array(
'<div class="shareBar"',
'<div class="shortcodeGalleryWrapper"',
'<div class="relatedContent',
'<div class="downloadNow',
'<div data-shortcode',
'<div id="sharethrough',
'<div id="inpage-video'
) as $div_start) {
$contents = stripRecursiveHtmlSection($contents, 'div', $div_start);
}
$contents = stripWithDelimiters($contents, '<script', '</script>');
$contents = stripWithDelimiters($contents, '<meta itemprop="image"', '>');
$contents = stripWithDelimiters($contents, '<svg class="svg-symbol', '</svg>');
$contents = trim(stripWithDelimiters($contents, '<section class="sharethrough-top', '</section>'));
$item['content'] = $contents;
return $item;
}
}
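
How `collectData()` derives the feed URL, including the `downloads!` prefix that switches the host, can be sketched in isolation. `demoZdnetFeedUrl` is a hypothetical helper, and the feed names in the usage note are placeholders rather than entries from the bridge's list.

```php
<?php
// Hypothetical helper reproducing the URL construction in collectData().
function demoZdnetFeedUrl($feed, $baseUri = 'https://www.zdnet.com/')
{
    if (strpos($feed, 'downloads!') !== false) {
        // Feeds flagged with "downloads!" live on the downloads subdomain.
        $feed = str_replace('downloads!', '', $feed);
        $baseUri = str_replace('www.', 'downloads.', $baseUri);
    }
    return $baseUri . trim($feed, '/') . '/rss.xml';
}
```

With these placeholder inputs, `/news/` maps to the `www` host and `downloads!recent` maps to the `downloads` host.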

55
bridges/ZenodoBridge.php Normal file

@ -0,0 +1,55 @@
<?php
class ZenodoBridge extends BridgeAbstract {
const MAINTAINER = 'theradialactive';
const NAME = 'Zenodo';
const URI = 'https://zenodo.org';
const CACHE_TIMEOUT = 10;
const DESCRIPTION = 'Returns the newest content of Zenodo';
public function collectData(){
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('zenodo.org not reachable.');
foreach($html->find('div.record-elem') as $element) {
$item = array();
$item['uri'] = self::URI . $element->find('h4', 0)->find('a', 0)->href;
$item['title'] = trim(
htmlspecialchars_decode($element->find('h4', 0)->find('a', 0)->innertext,
ENT_QUOTES
)
);
foreach($element->find('p', 0)->find('span') as $authors) {
$item['author'] = $item['author'] . $authors . '; ';
}
$content = $element->find('p.hidden-xs', 0)->find('a', 0)->innertext . '<br>';
$type = '<br>Type: ' . $element->find('span.label-default', 0)->innertext;
$raw_date = $element->find('small.text-muted', 0)->innertext;
$clean_date = date_parse(str_replace('Uploaded on ', '', $raw_date));
$content = $content . $raw_date;
$item['timestamp'] = mktime(
$clean_date['hour'],
$clean_date['minute'],
$clean_date['second'],
$clean_date['month'],
$clean_date['day'],
$clean_date['year']
);
$access = '';
if ($element->find('span.label-success', 0)->innertext) {
$access = 'Open Access';
} elseif ($element->find('span.label-warning', 0)->innertext) {
$access = 'Embargoed Access';
} else {
$access = $element->find('span.label-error', 0)->innertext;
}
$access = '<br>Access: ' . $access;
$publication = '<br>Publication Date: ' . $element->find('span.label-info', 0)->innertext;
$item['content'] = $content . $type . $access . $publication;
$this->items[] = $item;
}
}
}
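
Turning an "Uploaded on ..." label into a timestamp with `date_parse()` and `mktime()` can be sketched as follows. `demoZenodoTimestamp` is a hypothetical helper, the date format in the usage note is an assumption about what the page shows, and the integer casts cover fields that `date_parse()` reports as `false` when the text has no time part.

```php
<?php
// Hypothetical helper: "Uploaded on <date>" -> Unix timestamp, or null on failure.
function demoZenodoTimestamp($raw_date)
{
    $parts = date_parse(str_replace('Uploaded on ', '', $raw_date));
    if ($parts['error_count'] > 0 || $parts['year'] === false) {
        return null; // not a parsable date
    }
    // mktime() interprets the fields in the default timezone.
    return mktime(
        (int)$parts['hour'],
        (int)$parts['minute'],
        (int)$parts['second'],
        (int)$parts['month'],
        (int)$parts['day'],
        (int)$parts['year']
    );
}
```

Because `mktime()` depends on the configured timezone, callers who need reproducible values may want to set it explicitly with `date_default_timezone_set()`.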

@ -0,0 +1,85 @@
<?php
class ZoneTelechargementBridge extends BridgeAbstract {
const NAME = 'Zone Telechargement';
const URI = 'https://ww4.zone-telechargement1.org/';
const DESCRIPTION = 'Suivi de série sur Zone Telechargement';
const MAINTAINER = 'sysadminstory';
const PARAMETERS = array(
'Suivre la publication des épisodes d\'une série en cours de diffusion' => array(
'url' => array(
'name' => 'URL de la série',
'type' => 'text',
'required' => true,
'title' => 'URL d\'une série sans le https://ww4.zone-telechargement1.org/',
'exampleValue' => 'telecharger-series/31079-halt-and-catch-fire-saison-4-french-hd720p.html'
)
)
);
public function collectData(){
$html = getSimpleHTMLDOM(self::URI . $this->getInput('url'))
or returnServerError('Could not request Zone Telechargement.');
// Get the TV show title
$qualityselector = 'div[style=font-size: 18px;margin: 10px auto;color:red;font-weight:bold;text-align:center;]';
$show = trim($html->find('div[class=smallsep]', 0)->next_sibling()->plaintext);
$quality = trim(explode("\n", $html->find($qualityselector, 0)->plaintext)[0]);
$this->showTitle = $show . ' ' . $quality;
// Get the post content
$linkshtml = $html->find('div[class=postinfo]', 0);
$episodes = array();
$list = $linkshtml->find('a');
// Construct the table of episodes using the links
foreach($list as $element) {
// Retrieve episode number from link text
$epnumber = explode(' ', $element->plaintext)[1];
$hoster = $this->findLinkHoster($element);
// Format the link and add the link to the corresponding episode table
$episodes[$epnumber][] = '<a href="' . $element->href . '">'. $hoster . ' - '
. $this->showTitle . ' Episode ' . $epnumber . '</a>';
}
// Finally construct the items array
foreach($episodes as $epnum => $episode) {
$item = array();
// Add every link available in the episode table separated by a <br/> tag
$item['content'] = implode('<br/>', $episode);
$item['title'] = $this->showTitle . ' Episode ' . $epnum;
// RSS-Bridge uses the URI as the GUID, so each URI must be unique: appending an MD5 hash of the title
// should generate a unique URI and prevent confusion in RSS readers
$item['uri'] = self::URI . $this->getInput('url') . '#' . hash('md5', $item['title']);
// Insert the episode at the beginning of the item list, to show the newest episode first
array_unshift($this->items, $item);
}
}
public function getName(){
switch($this->queriedContext) {
case 'Suivre la publication des épisodes d\'une série en cours de diffusion':
return $this->showTitle . ' - ' . self::NAME;
break;
default:
return self::NAME;
}
}
private function findLinkHoster($element)
{
// The hoster name is one level higher than the link tag : get the parent element
$element = $element->parent();
// Walk backwards through the siblings until reaching one that contains a div and is not a <br/>
while($element !== null && !($element->find('div', 0) != null && $element->tag != 'br')) {
$element = $element->prev_sibling();
}
// No matching sibling: fall back to an empty hoster name instead of dereferencing null
if($element === null) {
return '';
}
// Return the text of the div: it is the file hoster name
return $element->find('div', 0)->plaintext;
}
}
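
The comment in `collectData()` explains that item URIs double as GUIDs, so every episode of the same show page needs its own URI. The following is a minimal sketch of that idea in isolation, assuming PHP 7 or later. The helper name `buildItemUri` and the `example.org` URL are hypothetical and are not part of the bridge.

```php
<?php
// Hypothetical helper, not part of ZoneTelechargementBridge: builds a
// per-item URI by appending an MD5 hash of the title as a URL fragment.
function buildItemUri(string $pageUrl, string $title): string {
    // The fragment is never sent to the server, so the link still opens
    // the same page, but feed readers see a distinct GUID per title.
    return $pageUrl . '#' . hash('md5', $title);
}

$page = 'https://example.org/series/some-show.html'; // placeholder URL

$first = buildItemUri($page, 'Some Show Episode 1');
$second = buildItemUri($page, 'Some Show Episode 2');

// Different titles yield different URIs...
var_dump($first !== $second);
// ...while both URIs still point at the same page path.
var_dump(parse_url($first, PHP_URL_PATH) === parse_url($second, PHP_URL_PATH));
```

Because the hash depends only on the title, the same episode keeps the same URI across refreshes, which is what lets a reader recognise items it has already seen.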


@ -19,7 +19,7 @@ class FileCache implements CacheInterface {
$writeStream = file_put_contents($this->getCacheFile(), serialize($datas));
if($writeStream === false) {
throw new \Exception("Cannot write the cache... Do you have the right permissions ?");
throw new \Exception('Cannot write the cache... Do you have the right permissions ?');
}
return $this;
@ -27,6 +27,7 @@ class FileCache implements CacheInterface {
public function getTime(){
$cacheFile = $this->getCacheFile();
clearstatcache(false, $cacheFile);
if(file_exists($cacheFile)) {
return filemtime($cacheFile);
}


@ -24,4 +24,21 @@ name = "Hidden proxy name"
; Allow users to disable proxy usage for specific requests.
; true = enabled
; false = disabled (default)
by_bridge = false
by_bridge = false
[authentication]
; Enables authentication for all requests to this RSS-Bridge instance.
;
; Warning: You'll have to upgrade existing feeds after enabling this option!
;
; true = enabled
; false = disabled (default)
enable = false
; The username for authentication. Insert this name when prompted for login.
username = ""
; The password for authentication. Insert this password when prompted for login.
; Use a strong password to prevent others from guessing your login!
password = ""
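
The `[authentication]` section above only declares the settings; this diff does not show how they are enforced. As a rough sketch of how such a section could be checked, the snippet below parses an INI string and compares supplied credentials in constant time. The function `isAuthorized` and the sample credentials are hypothetical, the code assumes PHP 7.1 or later, and it should not be read as RSS-Bridge's actual implementation.

```php
<?php
// Hypothetical sketch: isAuthorized() is not an RSS-Bridge function.
// $config is expected to look like the output of parse_ini_string(..., true).
function isAuthorized(array $config, ?string $user, ?string $password): bool {
    $auth = $config['authentication'] ?? array();

    // Authentication disabled (the default): every request is allowed.
    if (empty($auth['enable'])) {
        return true;
    }

    // Authentication enabled but no credentials were supplied.
    if ($user === null || $password === null) {
        return false;
    }

    // hash_equals() compares in constant time to avoid timing leaks.
    $userOk = hash_equals((string)$auth['username'], $user);
    $passOk = hash_equals((string)$auth['password'], $password);

    return $userOk && $passOk;
}

$ini = <<<INI
[authentication]
enable = true
username = "admin"
password = "correct horse battery staple"
INI;

// The second argument keeps the [authentication] section as a nested array.
$config = parse_ini_string($ini, true);

var_dump(isAuthorized($config, 'admin', 'correct horse battery staple'));
var_dump(isAuthorized($config, 'admin', 'wrong'));
```

In a real request handler the two credential arguments would typically come from the web server, for example `$_SERVER['PHP_AUTH_USER']` and `$_SERVER['PHP_AUTH_PW']` when HTTP Basic authentication is used.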


@ -11,12 +11,19 @@ class AtomFormat extends FormatAbstract{
$httpHost = isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : '';
$httpInfo = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
$serverRequestUri = $this->xml_encode($_SERVER['REQUEST_URI']);
$serverRequestUri = isset($_SERVER['REQUEST_URI']) ? $this->xml_encode($_SERVER['REQUEST_URI']) : '';
$extraInfos = $this->getExtraInfos();
$title = $this->xml_encode($extraInfos['name']);
$uri = !empty($extraInfos['uri']) ? $extraInfos['uri'] : 'https://github.com/RSS-Bridge/rss-bridge';
$icon = $this->xml_encode($uri .'/favicon.ico');
$uriparts = parse_url($uri);
if(!empty($extraInfos['icon'])) {
$icon = $extraInfos['icon'];
} else {
$icon = $this->xml_encode($uriparts['scheme'] . '://' . $uriparts['host'] .'/favicon.ico');
}
$uri = $this->xml_encode($uri);
$entries = '';
@ -32,6 +39,16 @@ class AtomFormat extends FormatAbstract{
foreach($item['enclosures'] as $enclosure) {
$entryEnclosures .= '<link rel="enclosure" href="'
. $this->xml_encode($enclosure)
. '" type="' . getMimeType($enclosure) . '" />'
. PHP_EOL;
}
}
$entryCategories = '';
if(isset($item['categories'])) {
foreach($item['categories'] as $category) {
$entryCategories .= '<category term="'
. $this->xml_encode($category)
. '"/>'
. PHP_EOL;
}
@ -49,6 +66,7 @@ class AtomFormat extends FormatAbstract{
<updated>{$entryTimestamp}</updated>
<content type="html">{$entryContent}</content>
{$entryEnclosures}
{$entryCategories}
</entry>
EOD;
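
The first hunk of this file changes the icon from `$uri . '/favicon.ico'` to a URL built from the scheme and host returned by `parse_url()`, so a bridge URI with a path no longer produces a favicon link under that path. A small standalone sketch of that derivation follows, assuming PHP 7.1 or later. The function name `faviconUrl` and the `null` fallback are illustrative additions and not code from the diff.

```php
<?php
// Hypothetical helper mirroring the scheme/host approach used in the hunk.
// Returns null when the URI has no scheme or host to build from.
function faviconUrl(string $uri): ?string {
    $parts = parse_url($uri);

    // parse_url() returns false on seriously malformed input and omits
    // keys it cannot find, so both cases are checked before use.
    if ($parts === false || empty($parts['scheme']) || empty($parts['host'])) {
        return null;
    }

    // Note: any port in the URI is dropped, as in the diff's version.
    return $parts['scheme'] . '://' . $parts['host'] . '/favicon.ico';
}

echo faviconUrl('https://github.com/RSS-Bridge/rss-bridge'), PHP_EOL;
// prints "https://github.com/favicon.ico"

var_dump(faviconUrl('not a url'));
// prints "NULL"
```

The diff's version reads `$uriparts['scheme']` and `$uriparts['host']` directly, which works for the absolute URIs bridges normally define; the guard here only matters for input that lacks those parts.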


@ -47,6 +47,20 @@ class HtmlFormat extends FormatAbstract {
$entryEnclosures .= '</div>';
}
$entryCategories = '';
if(isset($item['categories']) && count($item['categories']) > 0) {
$entryCategories = '<div class="categories"><p>Categories:</p>';
foreach($item['categories'] as $category) {
$entryCategories .= '<li class="category">'
. $this->sanitizeHtml($category)
. '</li>';
}
$entryCategories .= '</div>';
}
$entries .= <<<EOD
<section class="feeditem">
@ -55,6 +69,7 @@ class HtmlFormat extends FormatAbstract {
{$entryAuthor}
{$entryContent}
{$entryEnclosures}
{$entryCategories}
</section>
EOD;
@ -70,6 +85,8 @@ EOD;
<meta charset="{$charset}">
<title>{$title}</title>
<link href="static/HtmlFormat.css" rel="stylesheet">
<link rel="alternate" type="application/atom+xml" title="Atom" href="./?{$atomquery}" />
<link rel="alternate" type="application/rss+xml" title="RSS" href="./?{$mrssquery}" />
<meta name="robots" content="noindex, follow">
</head>
<body>


@ -10,7 +10,7 @@ class MrssFormat extends FormatAbstract {
$httpHost = isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : '';
$httpInfo = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
$serverRequestUri = $this->xml_encode($_SERVER['REQUEST_URI']);
$serverRequestUri = isset($_SERVER['REQUEST_URI']) ? $this->xml_encode($_SERVER['REQUEST_URI']) : '';
$extraInfos = $this->getExtraInfos();
$title = $this->xml_encode($extraInfos['name']);
@ -21,7 +21,8 @@ class MrssFormat extends FormatAbstract {
$uri = 'https://github.com/RSS-Bridge/rss-bridge';
}
$icon = $this->xml_encode($uri .'/favicon.ico');
$uriparts = parse_url($uri);
$icon = $this->xml_encode($uriparts['scheme'] . '://' . $uriparts['host'] .'/favicon.ico');
$items = '';
foreach($this->getItems() as $item) {
@ -36,7 +37,7 @@ class MrssFormat extends FormatAbstract {
if(isset($item['enclosures'])) {
$entryEnclosures .= '<enclosure url="'
. $this->xml_encode($item['enclosures'][0])
. '"/>';
. '" type="' . getMimeType($item['enclosures'][0]) . '" />';
if(count($item['enclosures']) > 1) {
$entryEnclosures .= PHP_EOL;
@ -44,12 +45,22 @@ class MrssFormat extends FormatAbstract {
Some media files might not be shown to you. Consider using the ATOM format instead!';
foreach($item['enclosures'] as $enclosure) {
$entryEnclosures .= '<atom:link rel="enclosure" href="'
. $enclosure . '" />'
. $enclosure . '" type="' . getMimeType($enclosure) . '" />'
. PHP_EOL;
}
}
}
$entryCategories = '';
if(isset($item['categories'])) {
foreach($item['categories'] as $category) {
$entryCategories .= '<category>'
. $this->xml_encode($category) . '</category>'
. PHP_EOL;
}
}
$items .= <<<EOD
<item>
@ -60,6 +71,7 @@ Some media files might not be shown to you. Consider using the ATOM format inste
<description>{$itemContent}{$entryEnclosuresWarning}</description>
<author>{$itemAuthor}</author>
{$entryEnclosures}
{$entryCategories}
</item>
EOD;
@ -67,6 +79,8 @@ EOD;
$charset = $this->getCharset();
/* xml attributes need to have certain characters escaped to be w3c compliant */
$imageTitle = htmlspecialchars($title, ENT_COMPAT);
/* Data are prepared, now let's begin the "MAGIE !!!" */
$toReturn = <<<EOD
<?xml version="1.0" encoding="{$charset}"?>
@ -78,7 +92,7 @@ xmlns:atom="http://www.w3.org/2005/Atom">
<title>{$title}</title>
<link>http{$https}://{$httpHost}{$httpInfo}/</link>
<description>{$title}</description>
<image url="{$icon}" title="{$title}" link="{$uri}"/>
<image url="{$icon}" title="{$imageTitle}" link="{$uri}"/>
<atom:link rel="alternate" type="text/html" href="{$uri}" />
<atom:link rel="self" href="http{$https}://{$httpHost}{$serverRequestUri}" />
{$items}
