Commit graph

1788 commits

Author SHA1 Message Date
Corentin Garcia
55b36b0455 [DauphineLibereBridge] Use https, fix content parsing (fix issue #780) (#811) 2018-09-09 20:23:59 +01:00
ORelio
de8cee6a1c Catching up | [Main] Debug mode, parse utils, MIME | [Bridges] Add/Improve 20 bridges (#802)
* Debug mode improvements

 - Improve debug warning message
 - Restore error reporting in debug mode
 - Fix 'notice' messages for unset fields

* Add parsing utility functions

html.php
 - extractFromDelimiters
 - stripWithDelimiters
 - stripRecursiveHTMLSection
 - markdownToHtml (partial)

bridges
 - remove now-duplicate functions
 - call functions from html.php instead

* [Anidex] New bridge

Anime torrent tracker

* [Anime-Ultime] Restore thumbnail

* [CNET] Recreate bridge

Full rewrite as the previous one was broken

* [Dilbert] Minor URI fix

Use new self::URI property

* [EstCeQuonMetEnProd] Fix content extraction

Bridge was broken

* [Facebook] Fix "SpSonsSoriSsés" label

... which was taking space in item title

* [Futura-Sciences] Use HTTPS, More cleanup

Use HTTPS as FS now offer HTTPS
Clean additional useless HTML elements

* [GBATemp] Multiple fixes

- Fix categories: missing "break" statements
- Restore thumbnail as enclosure
- Fix date extraction
- Fix user blog post extraction
- Use getSimpleHTMLDOMCached

* [JapanExpo] Fix bridge, HTTPS, thumbnails

- Fix getSimpleHTMLDOMCached call
- Upgrade to HTTPS as JE now offers HTTPS
- Restore thumbnails as enclosures

* [LeMondeInformatique] Fix bridge, HTTPS

- Upgrade to HTTPS as LMI now offers HTTPS
- Restore thumbnails using small images
- Fix content extraction
- Fix text encoding issue

* [Nextgov] Fix content extraction

- Restore thumbnail and use small image
- Field extraction fixes

* [NextInpact] Add categories and filtering by type

- Offer all RSS feeds
- Allow filtering by article type
- Implement extraction for brief articles
- Remove article limit, many brief articles are publied all at once

* [NyaaTorrents] New bridge

Anime torrent tracker

* [Releases3DS] Cache content, restore thumbnail

- Use getSimpleHTMLDOMCached
- Restore thumbnail as enclosure

* [TheHackerNews] Fix bridge

 - Fix content extraction including article body
 - Restore thumbnail as enclosure

* [WeLiveSecurity] HTTPS, Fix content extraction

- Upgrade to HTTPS as WLS now offers HTTPS
- Fix content extraction including article body

* [WordPress] Reduce timeout, more content selectors

- Reduce timeout to use default one (1h)
- Add new content selector (articleBody)
- Find thumbnail and set as enclosure
- Fix <script> cleanup

* [YGGTorrent] Increase limit, use cache

- Increase item limit as uploads are very frequent
- Use getSimpleHTMLDOMCached

* [ZDNet] Rewrite with FeedExpander

- Upgrade to HTTPS as ZD now offers HTTPS
- Use FeedExpander for secondary fields
- Fix content extraction for article body

* [Main] Handle MIME type for enclosures

Many feed readers will ignore enclosures (e.g. thumbnails) with no MIME type. This commit adds automatic MIME type detection based on file extension (which may be inaccurate but is the only way without fetching the content).

One can force enclosure type using #.ext anchor (hacky, needs improving)

* [FeedExpander] Improve field extraction

- Add support for passing enclosures
- Improve author and uri extraction
- Fix 'notice' PHP error messages

* [Pull] Coding style fixes for #802

* [Pull] Implementing changes for #802

 - Fix coding style issues with str append
 - Remove useless CACHE_TIMEOUT
 - Use count() instead of $limit
 - Use defaultLinkTo() + handle strings
 - Use http_build_query()
 - Fix missing </em>
 - Remove error_reporting(0)
 - warning CSS (@LogMANOriginal)
 - Fix typo in FeedExpander comment

* [Main] More documentation for markdownToHtml

See #802 for more details
2018-09-09 20:20:13 +01:00
Quentin Delmas
123fce4394 [ForGifsBridge] Fix permissions of ForGifsBridge 2018-09-09 17:34:36 +01:00
Quentin Delmas
a3f99c9c3f [GOGBridge] Added bridge for GOG.com 2018-09-09 17:32:36 +01:00
Eugene Molotov
bf30ad127c [FacebookBridge] Removes query string from post links
* [FacebookBridge] Removes query string from post links
2018-09-09 16:31:15 +01:00
logmanoriginal
37f84196b7 [GooglePlusPostBridge] Fix title is empty if content is too short
The bridge would generate empty titles if the content is longer than
50 characters, but doesn't have further spaces in it. With this commit
the title is correctly generated based on the contents, taking missing
spaces into account.

References #786
2018-09-08 17:07:57 +02:00
Corentin Garcia
44764f7182 [GrandComicsDatabaseBridge] Fix links in content (#804) 2018-09-08 11:12:27 +01:00
Antoine Cadoret
19f294d71d Add fields to leboncoin bridge (#783)
* [LeBonCoinBridge] Add fields to LeBonCoinBridge
2018-08-31 14:34:41 +01:00
Teromene
b0e33e4e01
Update LeBonCoinBridge to use the site's API (#795)
* Update LeBonCoinBridge to use the site's API
2018-08-28 14:20:02 +01:00
Eugene Molotov
558fa50a2a [core] Enabled debug mode before including core files (#790) 2018-08-25 20:02:47 +01:00
Eugene Molotov
ffb8b82c73 [FileCache] reseting cached file stat result to have correct getTime() result (#792)
* [FileCache] reseting cached file stat result to have correct getTime() result
2018-08-25 20:00:51 +01:00
Eugene Molotov
422c125d8e [core] Returning 304 http code when returning cached data (#793) 2018-08-25 20:00:38 +01:00
Quentin Delmas
059656c370 Fix phpcs. 2018-08-22 16:25:08 +01:00
Quentin Delmas
9fc1e97efe Avoid bot exclusion. 2018-08-22 16:21:39 +01:00
Quentin Delmas
be3620acb7 Add extension check for the "json" extension. 2018-08-22 16:21:20 +01:00
LogMANOriginal
16c0a61232
[README] Add a "Deploy to Cloud" button for Docker
Adds a button to deploy RSS-Bridge to the Docker Cloud as described here: https://docs.docker.com/docker-cloud/apps/deploy-to-cloud-btn/
2018-08-21 18:40:39 +02:00
Walter Barrett
704a87ad97 Icons: Allow Bridge-specified icons (#788) 2018-08-21 17:46:47 +02:00
sysadminstory
c4cccfe0f3 [LesJoiesDuCode] Switch to HTTPS and remove author (#787)
Website offers now HTTPS, therefore the bridge was switched to it.
The post author is not displayed anymore on the homepage, so it has been
removed.
2018-08-21 17:41:56 +02:00
Marcin C
d07deb0930 css: Modern look for RSS-Bridge (#781) 2018-08-21 17:22:46 +02:00
Piranhaplant
e7dab5d351 Fixed timestamp on Pixiv bridge (#785) 2018-08-18 16:54:24 -03:00
logmanoriginal
ad82d50bbd [CNETBridge] Remove bridge
CNET now provides public feeds at https://www.cnet.com/rss/

References #775
2018-08-12 11:02:44 +02:00
logmanoriginal
c305c1ded7 [BlaguesDeMerdeBridge] Adjust to layout changes
References #767
2018-08-10 21:08:47 +02:00
logmanoriginal
f14a5bd771 [CADBridge] Remove bridge
https://cad-comic.com/ now provides feeds at

- https://cad-comic.com/feed (rss)
- https://cad-comic.com/feed/atom (atom)

Thus multiple alternatives are available to choose from, making this
bridge obsolete:

- FilterBridge (using one of the feeds above)
- WordPressBridge (on the main site)
- One of the two available feeds

References #752
2018-08-10 19:53:32 +02:00
logmanoriginal
a20d5f9af0 tests: reuse RssBridge.php instead of implementing a custom solution 2018-08-10 15:33:32 +02:00
logmanoriginal
ee28b124e0 [DanbooruBridge] Fix bridge
This commit fixes an issue caused by self closing tags not supported
by simplehtmldom (<source>).

Adds a monkey patch to extend simplehtmldom with the ability to detect
that particular tag. Most of the code added is copied directly from
simplehtmldom (see vendor/simplehtmldom) with adjustments to account
for RSS-Bridge formatting.

Related to: https://sourceforge.net/p/simplehtmldom/bugs/83/

Notice: The tag itself is valid according to Mozilla:

The HTML <picture> element serves as a container for zero or more
<source> elements and one <img> element to provide versions of an
image for different display device scenarios. The browser will
consider each of the child <source> elements and select one
corresponding to the best match found; if no matches are found
among the <source> elements, the file specified by the <img>
element's src attribute is selected. The selected image is then
presented in the space occupied by the <img> element.

-- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/picture

References #753
2018-08-09 21:55:43 +02:00
LogMANOriginal
7dee3a175a
[index] Add '?action=list' to list bridges (#493)
Adds a new action '?action=list' to return a list of bridges as JSON formatted text. Each bridge brings following information:

- status (active/inactive)
- uri
- name
- parameters
- maintainer
- description

For inactive bridges only the status is returned.
Bridges that cannot be instantiated are considered inactive.
2018-08-09 19:14:10 +02:00
logmanoriginal
5fea9fc1f5 bridges: Fix bridges failing unit test 2018-08-09 17:04:16 +02:00
logmanoriginal
6bceb2b2db [tests] Add unit test for bridge implementation
Adds unit test for bridge implementations:

- Custom functions must be in protected or private scope
- getName() must return a valid string (non-empty)
- getURI() must return a valid URI
- Each bridge must define constants for NAME, URI, DESCRIPTION and
  MAINTAINER. CACHE_TIMEOUT and PARAMETERS are optional.

The unit test is written for PHPUnit 6.x and will automatically be
tested by Travis-CI for PHP 7.0 (see .travis.yml).

Remarks:

Unit tests for bridge data were scrapped in #378 for complexity
reasons (tests would have to be maintained for each bridge). This
unit test, however, is written for testing all bridges without
taking specific implementation details into account.
2018-08-09 17:04:16 +02:00
Eugene Molotov
df81fa62d1 [VkBridge] Video attachment fixes (#766)
* use defaultLinkTo
* remove duplicate video links
* remove line ending before "Reposted" label
* return newline before reposted string
* remove comments
* use video links that won't require login
* set title if video has no title
2018-08-09 17:02:36 +02:00
Eugene Molotov
f8c6400373 [HtmlFormat] Hide "Categories" label, if array of categories is empty (#765) 2018-08-09 16:46:53 +02:00
logmanoriginal
de7622ebbf version: Bump to 2018-08-07 2018-08-07 18:37:38 +02:00
logmanoriginal
09c9d015b4 [ForGifsBridge] Add new bridge 2018-08-04 23:42:58 +02:00
logmanoriginal
3a496e3b18 [FilterBridge] Add option to build title from content
Adds a new option '&title_from_content=on' to build the title for feed
items from the feeds content. The title is generated from the first
whitespace after 50 characters of the content or the entire content if
the total size is lower than 50 characters.

References #587
2018-08-04 20:46:59 +02:00
Eugene Molotov
df58f5bbdb [core] Add urljoin (#756)
Adds php-urljoin from https://github.com/fluffy-critter/php-urljoin to replace the custom implementation of 'defaultLinkTo'
2018-08-02 06:31:56 +02:00
logmanoriginal
9d0452d11b [.travis] Use composer for HHVM
This fixes the HHVM build failing because pear doesn't exist in HHVM.
2018-08-01 19:37:10 +02:00
sublimz
f92ac49947 [LeBonCoinBridge] Add cities support (#751) 2018-08-01 17:25:18 +02:00
Benasse
a574fa15ac [YGGTorrentBridge] Order search result by publish date (#762) 2018-07-31 21:46:10 +02:00
Nemo
8f9a385b4d [AmazonPriceTrackerBridge] Improve Amazon scraper logic (#761)
- Now works on all websites, and even with products
  with multiple prices
- Closes #750
2018-07-31 21:44:37 +02:00
logmanoriginal
53bdfa3bf0 [GooglePlusPostBridge] Skip posts without message 2018-07-31 19:15:09 +02:00
logmanoriginal
53278b2eed [GooglePlusPostBridge] Add option to include image in content
References #600
2018-07-31 19:09:12 +02:00
logmanoriginal
5f3c55b808 [GooglePlusPostBridge] General cleanup 2018-07-31 18:55:35 +02:00
logmanoriginal
fb79a67370 [GooglePlusPostBridge] Normalize static::URI usage
This commit fixes a few things related to static::URI

1) Remove trailing slash from the URI to simplify using 'defaultLinkTo'
2) Use static::URI instead of self::URI for consistency
3) Remove custom implementation of 'defaultLinkTo'
2018-07-31 18:29:14 +02:00
logmanoriginal
3c4e12ceba [GooglePlusPostBridge] Add images to enclosures
Images are collected for each post and added to enclosures. Images or
animtions from lh3.googleusercontent.com are specifically handled in
order to return the animated version of the gif and the original sized
image (this is normally taken care of by JS in the browser).
2018-07-31 18:18:22 +02:00
logmanoriginal
0d1923c52f [GitHubGistBridge] Add new bridge
Adds a new bridge for https://gist.github.com

The bridge generates feeds for comments on a particular gist based on
the gist ID or full URI. For better readability the general behavior
of code sections is manually restored with the original CSS styles
from GitHub.
2018-07-29 16:31:47 +02:00
logmanoriginal
ce896b4247 [SkimfeedBridge] Add new bridge
New bridge for Skimfeed: https://skimfeed.com

Generates feeds for all features of Skimfeed:

- News (the ones displayed on the front page)
- Hot topics ("What's Hot" section on the front page)
- Tech news (preconfigured feeds in the menu bar)
- Custom feeds (using the configuration system of Skimfeed), see
https://skimfeed.com/custom.php

The number of items returned by the bridge can be limited for all
categories ('&limit=...'). This parameter is optional, all categories
are unlimited by default!

Authors are added with HTML anchors in order to allow quick navigation
to source channels.

The bridge ships with developer tools to auto-generate lists in the
future (especially useful for 'Tech news'!)

References #748
2018-07-27 23:18:32 +02:00
sysadminstory
a4b2d88dbe [DealabsBridge] Follow website change (#758) 2018-07-25 20:02:31 +02:00
logmanoriginal
65ec04ea98 [contents] Remove superfluous debug log from getContents
References #757
2018-07-25 19:56:46 +02:00
logmanoriginal
afb4de318b [FlickrBridge] Fix missing scheme for image URLs
References #754
2018-07-23 20:14:46 +02:00
Eugene Molotov
43bb17f995 [VkBridge] Converting hashtags to categories (#755)
* [VkBridge] Converting hashtags to categories
2018-07-22 16:43:00 +02:00
logmanoriginal
bae7a5879f [FlickrBridge] Fixed broken bridge
Following changes in the JSON data and selecting images for the
content (320x240 or bigger) and enclosure (largest version). All of
the data is now extracted from the JSON data instead of parsing the
DOM.

References #754
2018-07-22 14:06:04 +02:00