Commit graph

138 commits

Author SHA1 Message Date
logmanoriginal 89ca42da54 [index] Always write exceptions to error.log
Exceptions are reported to users, but they do not necessarily appear
in the error log on the server. Using 'error_log' we can explicitly
write exceptions and error messages to the log file, using the
standard PHP message format.

For more information see https://stackoverflow.com/a/26867035
2018-10-24 15:58:12 +02:00
Eugene Molotov a508dddb36 [core] Fixed broken caching (#880) 2018-10-22 19:14:49 +02:00
LogMANOriginal b90bcee1fc
Return exceptions in requested feed formats (#841)
* [Exceptions] Don't return header for bridge exceptions
* [Exceptions] Add link to list in exception message

This is an alternative when the button is not rendered
for some reason.

* [index] Don't return bridge exception for formats
* [index] Return feed item for bridge exceptions
* [BridgeAbstract] Rename 'getCacheTime' to 'getModifiedTime'
* [BridgeAbstract] Move caching to index.php to separate concerns

index.php needs more control over caching behavior in order to cache
exceptions. This cannot be done in a bridge, as the bridge might be
broken, thus preventing caching from working.

This also (and more importantly) separates concerns. The bridge should
not even care if caching is involved or not. Its purpose is to collect
and provide data.

Response times should be faster, as more complex bridge functions like
'setDatas' (evaluates all input parameters to predict the current
context) and 'collectData' (collects data from sites) can be skipped
entirely.

Notice: In its current form, index.php takes care of caching. This
could, however, be moved into a separate class (i.e. CacheAbstract)
in order to make implementation details cache specific.

* [index] Add '_error_time' parameter to $item['uri']

This ensures that error messages are recognized by feed readers as
new errors after 24 hours. During that time the same item is returned
no matter how often the cache is cleared.

References https://github.com/RSS-Bridge/rss-bridge/issues/814#issuecomment-420876162

* [index] Include '_error_time' in the title for errors

This prevents feed readers from "updating" feeds based on the title

* [index] Handle "HTTP_IF_MODIFIED_SINCE" client requests

Implementation is based on `BridgeAbstract::dieIfNotModified()`,
introduced in 422c125d8e and
simplified based on https://stackoverflow.com/a/10847262

Basically, before returning cached data we check if the client send
the "HTTP_IF_MODIFIED_SINCE" header. If the modification time is
more recent or equal to the cache time, we reply with "HTTP/1.1 304
 Not Modified" (same as before). Otherwise send the cached data.

* [index] Don't encode exception message with `htmlspecialchars`
* [Exceptions] Include error message in exception
* [index] Show different error message for error code 0
2018-10-15 17:21:43 +02:00
logmanoriginal e3a5a6a170 [index] Update and improve parameter handling for bridge and cache
- Use 'array_diff_key' instead of 'unset'
- Remove parameters for caches

By removing certain parameters for caches, the loading times can be
improved considerably:

* action: It doesn't matter which action the user took to generate
feed items.

* format: This has the biggest impact on performance, because cached
items are now shared between different formats (i.e. try switching
between Atom, Html and Mrss and compare previous vs. now). If a
server handles lots of requests, this may even reduce bandwidth if
the same contents are requested for different formats.

* _noproxy: The proxy behavior has no impact on the produced items,
so it can be ignored.

* _cache_timeout: This is another option which might impact performance
for some servers, especially if 'custom_timeout' has been enabled in
the configuration. Requests with different cache timeouts no longer
result in separate cache files.
2018-09-22 15:44:03 +02:00
ORelio de8cee6a1c Catching up | [Main] Debug mode, parse utils, MIME | [Bridges] Add/Improve 20 bridges (#802)
* Debug mode improvements

 - Improve debug warning message
 - Restore error reporting in debug mode
 - Fix 'notice' messages for unset fields

* Add parsing utility functions

html.php
 - extractFromDelimiters
 - stripWithDelimiters
 - stripRecursiveHTMLSection
 - markdownToHtml (partial)

bridges
 - remove now-duplicate functions
 - call functions from html.php instead

* [Anidex] New bridge

Anime torrent tracker

* [Anime-Ultime] Restore thumbnail

* [CNET] Recreate bridge

Full rewrite as the previous one was broken

* [Dilbert] Minor URI fix

Use new self::URI property

* [EstCeQuonMetEnProd] Fix content extraction

Bridge was broken

* [Facebook] Fix "SpSonsSoriSsés" label

... which was taking space in item title

* [Futura-Sciences] Use HTTPS, More cleanup

Use HTTPS as FS now offer HTTPS
Clean additional useless HTML elements

* [GBATemp] Multiple fixes

- Fix categories: missing "break" statements
- Restore thumbnail as enclosure
- Fix date extraction
- Fix user blog post extraction
- Use getSimpleHTMLDOMCached

* [JapanExpo] Fix bridge, HTTPS, thumbnails

- Fix getSimpleHTMLDOMCached call
- Upgrade to HTTPS as JE now offers HTTPS
- Restore thumbnails as enclosures

* [LeMondeInformatique] Fix bridge, HTTPS

- Upgrade to HTTPS as LMI now offers HTTPS
- Restore thumbnails using small images
- Fix content extraction
- Fix text encoding issue

* [Nextgov] Fix content extraction

- Restore thumbnail and use small image
- Field extraction fixes

* [NextInpact] Add categories and filtering by type

- Offer all RSS feeds
- Allow filtering by article type
- Implement extraction for brief articles
- Remove article limit, many brief articles are publied all at once

* [NyaaTorrents] New bridge

Anime torrent tracker

* [Releases3DS] Cache content, restore thumbnail

- Use getSimpleHTMLDOMCached
- Restore thumbnail as enclosure

* [TheHackerNews] Fix bridge

 - Fix content extraction including article body
 - Restore thumbnail as enclosure

* [WeLiveSecurity] HTTPS, Fix content extraction

- Upgrade to HTTPS as WLS now offers HTTPS
- Fix content extraction including article body

* [WordPress] Reduce timeout, more content selectors

- Reduce timeout to use default one (1h)
- Add new content selector (articleBody)
- Find thumbnail and set as enclosure
- Fix <script> cleanup

* [YGGTorrent] Increase limit, use cache

- Increase item limit as uploads are very frequent
- Use getSimpleHTMLDOMCached

* [ZDNet] Rewrite with FeedExpander

- Upgrade to HTTPS as ZD now offers HTTPS
- Use FeedExpander for secondary fields
- Fix content extraction for article body

* [Main] Handle MIME type for enclosures

Many feed readers will ignore enclosures (e.g. thumbnails) with no MIME type. This commit adds automatic MIME type detection based on file extension (which may be inaccurate but is the only way without fetching the content).

One can force enclosure type using #.ext anchor (hacky, needs improving)

* [FeedExpander] Improve field extraction

- Add support for passing enclosures
- Improve author and uri extraction
- Fix 'notice' PHP error messages

* [Pull] Coding style fixes for #802

* [Pull] Implementing changes for #802

 - Fix coding style issues with str append
 - Remove useless CACHE_TIMEOUT
 - Use count() instead of $limit
 - Use defaultLinkTo() + handle strings
 - Use http_build_query()
 - Fix missing </em>
 - Remove error_reporting(0)
 - warning CSS (@LogMANOriginal)
 - Fix typo in FeedExpander comment

* [Main] More documentation for markdownToHtml

See #802 for more details
2018-09-09 20:20:13 +01:00
Eugene Molotov 558fa50a2a [core] Enabled debug mode before including core files (#790) 2018-08-25 20:02:47 +01:00
Eugene Molotov 422c125d8e [core] Returning 304 http code when returning cached data (#793) 2018-08-25 20:00:38 +01:00
Walter Barrett 704a87ad97 Icons: Allow Bridge-specified icons (#788) 2018-08-21 17:46:47 +02:00
LogMANOriginal 7dee3a175a
[index] Add '?action=list' to list bridges (#493)
Adds a new action '?action=list' to return a list of bridges as JSON formatted text. Each bridge brings following information:

- status (active/inactive)
- uri
- name
- parameters
- maintainer
- description

For inactive bridges only the status is returned.
Bridges that cannot be instantiated are considered inactive.
2018-08-09 19:14:10 +02:00
LogMANOriginal d83f2f285b
Separate index and bridge card generating code into a separate classes (#734)
[html] Generate index and bridge cards using separate clases

Move HTML generating code from 'index.php' to 'Index.php', separating components into static functions.

Move HTML generation code for bridge cards from 'html.php' to 'BridgeCard.php', separating components into static functions.
2018-07-21 18:15:07 +02:00
Teromene da6b98851c Add recuperation of the current version from git if available (#731)
* Add recuperation of the current version from git if available
* Include version when auto-reporting an error
2018-06-30 10:24:22 +02:00
Teromene 937ea49271 Add basic authentication support (#728)
* Move configuration in its own class in order to reduce the verbosity of index.php
* Add authentication mechanism using HTTP auth
* Add a method to get the config parameters
* Remove the installation checks from the index page
* Log all failed authentication attempts
2018-06-27 19:09:41 +02:00
Joe Digilio 50924b9213 Abort on parse error of config.default.ini.php (#714)
If there is an error parsing the default config file, then abort.
2018-06-15 21:02:06 +02:00
logmanoriginal 4c5013bc82 [index] Bump release version to 2018-06-10 2018-06-10 22:14:58 +02:00
LogMANOriginal 8ac8e08abf
Add user config (#653)
Uses the parse_ini_file function to load default settings from the default configuration file 'config.default.ini.php'. Optionally loads custom settings from 'config.ini.php' to replace the default
values.
2018-05-29 11:52:17 +02:00
teromene b0c7a62f74 [index] Bumped version to 2018-04-20 2018-04-20 17:15:25 +02:00
logmanoriginal 178177e787 [index] Push version to 2018-04-06 2018-04-06 22:45:33 +02:00
logmanoriginal 2df2623430 [index] Add 'curl' extension check 2018-04-06 20:42:19 +02:00
logmanoriginal de5f850cdb [index] Fix indentation using tabs 2018-04-06 20:34:44 +02:00
teromene ac6847045c Catch Errors in order to display a message in more cases. We also catch Exceptions to maintain compat with php 5.
Add check for simplexml extension.
2018-04-04 19:02:40 +01:00
LogMANOriginal 8ba817478b
Implement customizable cache timeout (#641)
* [BridgeAbstract] Implement customizable cache timeout

The customizable cache timeout is used instead of the default cache
timeout (CACHE_TIMEOUT) if specified by the caller.

* [index] Add new global parameter '_cache_timeout'

The _cache_timeout parameter is an optional parameter that can be
used to specify a custom cache timeout. This option is enabled by
default.

It can be disabled using the named constant 'CUSTOM_CACHE_TIMEOUT'
which supports two states:

> true: Enabled (default)
> false: Disabled

* [BridgeAbstract] Change scope of 'getCacheTimeout' to public

* [html] Add cache timeout parameter to all bridges

The timeout parameter only shows if CUSTOM_CACHE_TIMEOUT has been set
to true. The default value is automatically set to the value specified
in the bridge.

* [index] Disable custom cache timeout by default
2018-03-14 18:06:36 +01:00
logmanoriginal 29a1c7ac09 [index.php] Add extension check for 'mbstring'
The mbstring extension is required by all formats in order to convert multi-
byte characters to UTF-8. This commit adds an extension check to throw an
error message if the extension is not enabled.
2018-03-07 19:11:47 +01:00
teromene 98b0f0f8ba [Core] Verify the presence of the array keys before accessing them.
Fixes  #588
2017-10-12 17:14:34 +01:00
Teromene e30ad3feb4 Add support for running rss-bridge from the CLI 2017-09-25 19:14:02 +02:00
logmanoriginal b4c6aa41a7 [index] Return error if no format is specified when requesting a bridge 2017-08-28 20:45:06 +02:00
logmanoriginal 2595b5d7d8 [index] Bump version 2017-08-19 21:09:48 +02:00
logmanoriginal f91309c7e4 [index] Use constant WHITELIST_FILE all the way 2017-08-12 19:15:16 +02:00
logmanoriginal cd012e115b [index] Show bridge options when loading with URL fragment
Loading the page with an URL fragement (#bridge-*) should result in
the bridge showing all parameters by default. Unfortunately this is
not possible using PHP, which is why a new JavaScript function is
needed (select.js)

That way, when returning from a bridge ('back to rss-bridge') will
keep the selected bridge active (only works for HTML format).
2017-08-11 19:39:16 +02:00
logmanoriginal df9e3968dc [index] Add GET parameter 'q' for search queries 2017-08-11 17:43:15 +02:00
logmanoriginal 99e7e7876e exception: Use built-in HTTP response codes
PHP >= 5.4 provides a built-in function to generate valid HTTP
error header including the error description: http_response_code()

See: http://php.net/manual/en/function.http-response-code.php
See also: https://stackoverflow.com/a/12018482

This commit removes the '\Http' utility class and replaces all
calls to 'Http::getMessageForCode()' by 'http_response_code()'
2017-08-06 12:55:11 +02:00
logmanoriginal 84d2c02a09 whitelist: Do case-insensitive whitelist matching
Matching whitelisted bridges using a case-insensitive match makes
sense for following reasons:

- Wrong upper/lower case spelling in the whitelist is not easily
discovered. Example: Misspelling 'Youtube' as 'YouTube' will not
show the 'Youtube' bridge (while it is expected to show)

- Two bridges with the same name but different letter casing are
discouraged to prevent confusion and keep the project compatible
with Windows machines
2017-08-06 00:01:32 +02:00
logmanoriginal ccd8af09b9 [index] Use single quotes instead of double quotes 2017-08-05 15:46:16 +02:00
logmanoriginal f2d02a4187 [index] Simplify debug mode detection
This removes superfluous variables and if-statements when checking
whether the debug mode is active or not.
2017-08-05 15:43:48 +02:00
logmanoriginal f19d34a5a1 [index] Check permissions for cache folder and whitelist file
* The cache folder requires write permissions at all times
* The whitelist file requires write permissions if it does not
exist (can be created manually)
2017-08-05 15:23:30 +02:00
logmanoriginal f1534c91e2 [index] Use constant instead of variable for the whitelist file path
Like the cache folder the whitelist file is assumed static and thus
should be defined as constant.
2017-08-05 15:23:08 +02:00
logmanoriginal f7265ca77b [index] Bump version number 2017-08-03 20:51:59 +02:00
logmanoriginal a4b9611e66 [phpcs] Add missing rules
- Do not add spaces after opening or before closing parenthesis

  // Wrong
  if( !is_null($var) ) {
    ...
  }

  // Right
  if(!is_null($var)) {
    ...
  }

- Add space after closing parenthesis

  // Wrong
  if(true){
    ...
  }

  // Right
  if(true) {
    ...
  }

- Add body into new line
- Close body in new line

  // Wrong
  if(true) { ... }

  // Right
  if(true) {
    ...
  }

Notice: Spaces after keywords are not detected:

  // Wrong (not detected)
  // -> space after 'if' and missing space after 'else'
  if (true) {
    ...
  } else{
    ...
  }

  // Right
  if(true) {
    ...
  } else {
    ...
  }
2017-07-29 19:55:12 +02:00
LogMANOriginal 38b56bf23a [index] Improve error handling (#555)
Add additional information to error message:

- Name of the bridge
- Possible solutions
- Error description
- Error code
- Error message

* Output type changed from 'text' to 'html'
* Added styles for the error page
* Added a button to remotely open a GitHub issue

Closes #525
2017-07-29 19:16:16 +02:00
Teromene 1a4c3f4418 Add a search bar to simplify looking for a bridge. (#494)
* Add a search bar to simplify looking for a bridge.

* Fix phpcs line length.

* Change the phpcs config.
2017-03-21 20:31:10 +00:00
Teromene c803396d7e Correct phpcs check. 2017-03-09 22:27:14 +00:00
Teromene ac518ca297 Fix line return in user-agent. 2017-03-08 11:47:55 +00:00
Teromene 1763a1518c Restore the ability to whitelist all the bridges by putting a wildcard into the whitelist file. 2017-03-08 10:56:39 +00:00
logmanoriginal ff83410534 style: Fix coding styles 2017-02-14 17:28:07 +01:00
Badet Aurélien c702a0e69f Bridge getExtraInfos (#432)
* add function getExtraInfos() to BridgeAbstract

* replace call to $bridge->getName() and $bridge->getURI() by $bridge->getExtraInfos()

replace call to $bridge->getName() and $bridge->getURI() by $bridge->getExtraInfos() defined by default in BridgeAbstract.
So we could pass additionals ExtraInfos from custom bridges to custom formats.
2016-11-29 00:48:59 +00:00
logmanoriginal d39c1ed63a [index] Add check for 'allow_url_fopen' 2016-11-05 13:09:20 +01:00
logmanoriginal 04a195361a [index] Add check for 'libxml' extension 2016-11-05 13:05:28 +01:00
logmanoriginal 3bcd98404b [index] Add check for correct PHP version 2016-11-05 13:02:48 +01:00
logmanoriginal 5790ebc6ba [index] Initialize variable before using it 2016-10-20 22:12:54 +02:00
logmanoriginal 85149add61 [index] Fixes a bug where requests could result in same cache file names
Previously the cache file name was only build upon bridge
parameters. If two bridges don't make use of any parameter
this would result into equal file names.
2016-10-20 22:03:23 +02:00
logmanoriginal 8fb4db8914 [index] Introduce CACHE_DIR 2016-10-08 16:21:00 +02:00