Commit graph

409 commits

Author SHA1 Message Date
logmanoriginal 0e30468e0f [rssbridge] Use PATH_ROOT whenever possible 2019-06-07 19:51:06 +02:00
logmanoriginal ccf375e917 config: Use global constant for config files
The configuration files are currently hard-coded in the configuration
classes and error messages. However, the implementation should not
rely on specific details like the file name. Instead, the files should
be part of the global definition.

This commit introduces two global constants for the configuration files

- FILE_CONFIG => 'config.ini.php'
- FILE_CONFIG_DEFAULT => 'config.default.ini.php'
2019-06-07 19:48:29 +02:00
logmanoriginal 946a99d334 config: Add [system] => 'timezone'
RSS-Bridge currently statically sets the timezone to UTC which can
result in incorrect timestamps if the server is hosted in another
region.

This commit adds a new configuration parameter to allow admins to
specify their own timezone for their servers. Invalid values will
result in an error message.

Example:

  [system]
  timezone = "UTC"

For compatibility reasons the default value is set to UTC.

This parameter accepts any of the supported timezones listed at
https://www.php.net/manual/en/timezones.php

Closes #956
References #1001
2019-06-07 19:22:51 +02:00
logmanoriginal e2e0ced055 [Bridge] Improve performance for correctly written whitelist.txt
If the bridge name matches exactly, it is not necessary to perform
a strtolower compare of bridges. In some situations this can lead
to much faster response times (depending on the amount of bridges
in whitelist.txt).
2019-06-06 20:59:33 +02:00
logmanoriginal d4e867f240 core: Move default bridges to whitelist.default.txt
Default bridges are currently statically defined in index.php, which
is not the right place if we want to keep responsibilities separated.

This commit introduces a new file whitelist.default.txt that holds
the default bridges and which is loaded automatically, if whitelist.txt
doesn't exist.

Due to this it is also no longer necessary to have write permission
for the root directory.

References #1001
2019-06-06 20:53:46 +02:00
logmanoriginal 6c4098d655 Revert "all: Use ->remove() instead of ->outertext = ''"
This reverts commit 052844f5e1.

There is a bug in ->remove() that causes the parser to incorrectly
identify elements in the DOM tree that shouldn't exist anymore.

References #1151
2019-06-02 13:06:16 +02:00
logmanoriginal 468d8be72d [Exceptions] Fix GitHub query labels for bug reports
All bug reports now use the Bridge-Broken label by default
2019-06-01 22:35:56 +02:00
logmanoriginal 052844f5e1 all: Use ->remove() instead of ->outertext = ''
simplehtmldom 1.9 introduced new functions to recursively remove
nodes from the DOM. This allows removing elements without the need
to re-load the document by using $html->load($html->save()), which
is very inefficient.

Find more information about remove() at
https://simplehtmldom.sourceforge.io/docs/1.9/api/simple_html_dom_node/remove/
2019-06-01 21:29:57 +02:00
logmanoriginal 014b698f67 [html] Use find('*') over custom solution
find('*') wasn't supported in older versions of simplehtmldom but it
is supported now. Thus, all custom implementations can be replaced
by the correct solution.
2019-06-01 21:05:12 +02:00
fulmeek 66c5b732cf [FeedItem] Avoid repeated UID hashing after loading from cache (#1148)
This fixes the following issue:

1. bridge sets unique ids for the items (ids get hashed)
2. items go to the cache
3. on next run items get loaded from cache
4. these items have different ids because they were hashed again
5. they show up twice in feed reader
2019-06-01 19:36:46 +02:00
Knah Tsaeb 95731eb812 Merge remote-tracking branch 'origin/master' into kt_bridge 2019-05-10 16:05:36 +02:00
Lyra 2cd310c025 Bump version to 2019-05-08 2019-05-08 22:36:22 +02:00
fulmeek 21d3bf3b60 caches: Refactor the API (#1060)
- For consistency, functions should always return null on non-existing data.

- WordPressPluginUpdateBridge appears to have used its own cache instance in the past. Obviously not used anymore.

- Since $key can be anything, the cache implementation must ensure to assign the related data reliably; most commonly by serializing and hashing the key in an appropriate way.

- Even though the default path for storage is perfectly fine, some people may want to use a different location. This is an example how a cache implementation is responsible for its requirements.
2019-04-29 20:12:43 +02:00
Knah Tsaeb d50f246dec Merge remote-tracking branch 'origin/master' into kt_bridge 2019-04-23 10:28:55 +02:00
Roliga 380fdf2e40 [ParameterValidator] Handle missing parameter type (#1057)
* [ParameterValidator] Handle missing parameter type
2019-04-04 22:55:46 +02:00
logmanoriginal 6293c3d33d [FeedItem] Filter duplicate enclosures 2019-03-21 19:42:44 +01:00
logmanoriginal 88aae6fd95 core: Apply changes to fix broken Travis builds
Travis-CI recently got updated, which causes existing builds to fail.
For example: https://travis-ci.org/RSS-Bridge/rss-bridge/builds/507568117

Indenting multi-line arguments of functions fixes it.
2019-03-20 19:23:22 +01:00
Nemo 684558e276 [StockFilingsBridge] Add new bridge (#1011) 2019-03-17 20:40:21 +01:00
logmanoriginal d7094b7feb [Configuration] Bump version to dev.2019-03-17 2019-03-17 20:31:17 +01:00
logmanoriginal ae2c35c18a [Configuration] Bump version to 2019-03-17 2019-03-17 20:28:55 +01:00
logmanoriginal e3588f62bd [Cache] Fix cache types ending on 'cache' are not detected correctly
References #1000
2019-02-24 11:56:43 +01:00
Lyra f9ed934c8c Update contributors and bump version 2019-02-19 22:05:06 +01:00
ORelio ca9c2abb60 [FeedExpander] Fix item href being used as feed uri (#1033) 2019-02-11 19:07:03 +01:00
logmanoriginal 556a417dd6 core: Add support for custom cache types via config.ini.php
This commit adds support for a new parameter which specifies the type
of cache to use for caching. It is specified in config.ini.php:

 [cache]

 type = "..."

Currently only one type of cache is supported (see /caches). All uses
of 'FileCache' were replaced by this configuration option.

Note: Caching currently depends on files and folders (due to FileCache).
Experience may vary depending on the selected cache type. For now always
check if FileCache is working before testing alternative types.

References #1000
2019-02-06 18:52:44 +01:00
LogMANOriginal 51ee541d5a
core: Implement action factory (#1002) 2019-02-06 18:34:51 +01:00
logmanoriginal 32d4da8b76 [Bridge] Fix failed to open stream when reading non-existing whitelist 2019-02-04 17:35:40 +01:00
LogMANOriginal 394149b114
core: Add item uid (#1017)
'uid' represents the unique id for a feed item. This item is null by
default and can be set to any string value. The provided string value
is always hashed to sha1 to make it the same length in all cases.

References #977, #1005
2019-02-03 20:56:41 +01:00
logmanoriginal a29512deee [BridgeCard] Don't warn about the 'required' attribute if it is set to false 2019-01-22 19:12:37 +01:00
logmanoriginal 434c12672f lib: Ignore required attribute on lists an checkboxes
References #1014
2019-01-22 18:11:52 +01:00
Knah Tsaeb 90f7e23e71 Merge branch 'master' into kt_bridge 2019-01-16 09:06:14 +01:00
logmanoriginal bcd7bccc46 vendor: Update PHP Simple HTML DOM Parser to 1.8.1
https://sourceforge.net/projects/simplehtmldom/files/simplehtmldom/1.8.1/

Note: Some bridges may need fixes in their CSS queries if they don't follow
the specification.
2019-01-13 22:02:59 +01:00
logmanoriginal 2def7a04a3 Bump version to dev.2019-01-13 2019-01-13 19:23:59 +01:00
logmanoriginal ef6709c402 Bump version to 2019-01-13 2019-01-13 19:15:06 +01:00
triatic 245af35a60 [contents] improve file_get_contents() reporting (#986)
Suppress any errors from file_get_contents() and include the PHP error in the feed instead.
2019-01-06 20:30:02 +01:00
triatic 81ee15a161 general: Fix PHP 7.3 deprecation warnings (#982)
Fix PHP 7.3 deprecation warnings. FILTER_VALIDATE_URL implies FILTER_FLAG_SCHEME_REQUIRED and FILTER_FLAG_HOST_REQUIRED since PHP 5.2.1

https://bugs.php.net/bug.php?id=75442
2018-12-28 16:13:03 +01:00
LogMANOriginal 988635dcf3
core: Add FeedItem class (#940)
Add transformation from legacy items to FeedItems, before transforming
items to the desired format. This allows using legacy bridges alongside
bridges that return FeedItems.

As discussed in #940, instead of throwing exceptions on invalid
parameters, add messages to the debug log instead

Add support for strings to setTimestamp(). If the provided timestamp
is a string, automatically try to parse it using strtotime().

This allows bridges to simply use `$item['timestamp'] = $timestamp;`
instead of `$item['timestamp'] = strtotime($timestamp);`

Support simple_html_dom_node as input paramter for setURI

Support simple_html_dom_node as input parameter for setContent
2018-12-26 22:41:32 +01:00
triatic 4095cad9b4 lib: Make cURL module requirement optional (#979)
When running in CLI mode without certificates, do not require curl module to be loaded.
2018-12-26 22:31:30 +01:00
logmanoriginal e7d3a006c8 global: Fix code violations 2018-12-26 21:58:07 +01:00
triatic dc83962483 [contents] Use file_get_contents when in CLI mode & no certs (#962)
file_get_contents can natively use system root certificates, so use file_get_contents when in CLI mode with no root certificates for cURL.
2018-12-26 20:04:55 +01:00
logmanoriginal a07ead42a7 Bump version to dev.2018-12-11 2018-12-11 17:07:41 +01:00
logmanoriginal a11ade3442 Bump version to 2018-12-11 2018-12-11 17:01:16 +01:00
triatic 1c58c04271 [contents] Better error reporting for cUrl errors (#958)
References #954
2018-12-10 21:20:13 +01:00
LogMANOriginal 3d301fc4ee
[contents] Skip caching if the remote server requests no caching (#945)
* Skip caching if Cache-Control defines no-cache
* Skip caching if Cache-Control defines no-store
2018-11-28 17:36:28 +01:00
triatic 263e8872ea core: Don't use server variables in CLI mode (#939) 2018-11-26 18:33:51 +01:00
Roliga 49da67cb33 core: Automatically select a bridge based on a URL (#928)
* core: Add bridge parameter auto detection

This adds a new 'detect' action which accepts a URL from which an
appropriate bridge is selected and relevant parameters are extracted.
The user is then automatically redirected to the selected bridge.

For example to get a feed from: https://twitter.com/search?q=%23rss-bridge
we could send a request to:
'/?action=detect&format=Atom&url=twitter.com/search%3Fq%3D%2523rss-bridge'
which would redirect to:
'/?action=display&q=%23rss-bridge&bridge=Twitter&format=Atom'.

This auto detection happens on a per-bridge basis, so a new function
'detectParameters' is added to BridgeInterface which bridges may implement.
It takes a URL for an argument and returns a list of parameters that were
extracted, or null if the URL isn't relevant for the bridge.

* [TwitterBridge] Add parameter auto detection

* [BridgeAbstract] Add generic parameter detection

This adds generic "paramater detection" for bridges that don't have any
parameters defined. If the queried URL matches the URI defined in the
bridge (ignoring https://, www. and trailing /) an emtpy list of parameters is
returned.
2018-11-26 18:05:40 +01:00
LogMANOriginal 7b261d1cc2
[contents] Add server side caching for all requests (If-Modified-Since) (#889)
This commit adds a cache for 'getContents' to '/cache/server'. All
contents are cached by default (even in debug mode). If debug mode
is enabled, the cached data is overwritten on each request.

In normal mode RSS-Bridge adds the 'If-Modified-Since' header with
the timestamp from the previously cached data (if available) to the
request.

Find more information on 'If-Modified-Since' here:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Modified-Since

If the server responds with "304 Not Modified", the cached data is
returned.

If the server responds with "200 OK", the received data is written
to the cache (creates a new cache file if it doesn't exist yet).

No changes were made for all other response codes.

Servers that don't support the 'If-Modified-Since' header, will
respond with "200 OK".

For servers that respond with "304 Not Modified", the required band-
width will decrease and RSS-Bridge will responding faster.

Files in the cache are forcefully removed after 24 hours.

Notice: Only few servers actually do support 'If-Modified-Since'.
Thus, most bridges won't be affected by this change.
2018-11-19 17:53:08 +01:00
logmanoriginal 96a518c9e7 [html] Remove todo as it is already implemented 2018-11-18 17:52:45 +01:00
logmanoriginal 0d2ea9a677 [html] Rename parameters for sanitize() 2018-11-18 17:43:34 +01:00
logmanoriginal 66e82e46db [html] Remove todo tags
It is not feasible to use a single 'substr' in the functions
2018-11-18 17:36:00 +01:00
logmanoriginal 54800fcc8d [html] Clarify meaning of strange find() parameter
simple_html_dom currently doesnt support "->find('*')", which is a
known issue: https://sourceforge.net/p/simplehtmldom/bugs/157/

The solution implemented by RSS-Bridge is to find all nodes WITHOUT
a specific attribute. If the attribute is very unlikely to appear
in the DOM, this is essentially returning all nodes.

This is the meaning behind

"->find('*[!b38fd2b1fe7f4747d6b1c1254ccd055e]')"
2018-11-18 17:32:07 +01:00
logmanoriginal 67004556e6 [BridgeCard] Use self:: instead of BridgeCard:: 2018-11-18 16:59:13 +01:00
logmanoriginal c6a7b9ac64 exception: Remove HttpException class
This class served no particular purpose (other than adding a
layer on top of Exception).
2018-11-18 16:53:21 +01:00
logmanoriginal dbffbd4d4e [FormatAbstract] Check content type before sending header 2018-11-18 16:30:34 +01:00
logmanoriginal 1c17ffb5c4 [FeedExpander] Add constants for feed types 2018-11-18 16:18:40 +01:00
logmanoriginal 326cfb21cf [FeedExpander] Rename $name to $title 2018-11-18 16:11:38 +01:00
logmanoriginal 8ab1fb86a9 [FeedExpander] Let collectExpandableDatas() return self 2018-11-18 16:03:32 +01:00
logmanoriginal a9ec3d0d1f [Configuration] Change scope of $config to private 2018-11-18 15:58:34 +01:00
logmanoriginal ac5bcb62ec [Configuration] Add documentation for defined constants 2018-11-18 15:52:28 +01:00
logmanoriginal f24ab8b51b [Configuration] Rename $category to $section in getConfig() 2018-11-18 15:45:17 +01:00
logmanoriginal 4348119adf [Configuration] Make file paths explicit 2018-11-18 15:41:43 +01:00
logmanoriginal fd4124cda2 [Configuration] Make class final
This class is essential to the core library of RSS-Bridge and must
not be extended. This improves predictability for the behaviour of
this class.
2018-11-18 15:34:16 +01:00
logmanoriginal 91f7405297 [Configuration] Throw exception creating objects of this class
This class only provides static functions.
2018-11-18 15:29:50 +01:00
logmanoriginal 85685b7758 [Authentication] Throw exception creating objects from this class
Callers must use Authentication::showPromptIfNeeded()
2018-11-18 15:20:43 +01:00
logmanoriginal c4550be812 lib: Add API documentation 2018-11-18 09:41:14 +01:00
logmanoriginal 254fe9212a [Debug] Fix debug mode reports indexing error
Error log reports "PHP Notice:  Undefined offset: 2 in /rss-bridge/
lib/Debug.php on line 112" if the array returned by debug_backtrace
does not contain 3 items.

This commit fixes the issue by always using the last element in the
backtrace "end($backtrace)".
2018-11-16 20:19:52 +01:00
logmanoriginal b47a30ecc1 [rssbridge] Improve documentation 2018-11-15 20:52:17 +01:00
logmanoriginal 860b36c1e3 [Debug] Use self:: instead of Debug:: inside the class 2018-11-15 20:28:26 +01:00
logmanoriginal 3d475572c6 [Debug] Improve documentation 2018-11-15 20:27:32 +01:00
logmanoriginal 59f2d755fe format: Refactor searchInformation
- Rename function to getFormatName
- Add documentation
- Rename variables
- Remove unused variables
2018-11-15 20:16:21 +01:00
logmanoriginal d7c374bd8c [Format] Add function isFormatName
Returns true if the provided format name is valid
2018-11-15 20:14:43 +01:00
logmanoriginal 6b6ab6486a [Format] Store real path to working directory 2018-11-15 20:06:45 +01:00
logmanoriginal 6c4e239f64 format: Refactor class Format 2018-11-15 20:06:23 +01:00
logmanoriginal 88b0656954 bridge: Rename listBridge to getBridgeNames 2018-11-15 19:43:23 +01:00
logmanoriginal 66b11b8c41 [Bridge] Fix typo 2018-11-15 19:38:14 +01:00
logmanoriginal 1b34d9860e [Cache] Check if class is instantiable 2018-11-15 19:36:01 +01:00
logmanoriginal 6e70d461e1 [Bridge] Add function isBridgeName
This function returns true if the provided name is a valid
bridge name.
2018-11-15 19:33:56 +01:00
logmanoriginal 0a92b5d29b [Bridge] Refactor Bridge::create to improve readability 2018-11-15 19:31:31 +01:00
logmanoriginal e3849f45ab [Bridge] Use slashes to enclose regex 2018-11-15 19:30:33 +01:00
logmanoriginal 3d9c4a3718 [Bridge] Improve working directory handling
- Initialize with null to prevent leaking configurations
- Check if the working directory is a directory
- Store the real path instead of raw data
- Add final path separator as expected by Bridge::create
2018-11-15 19:28:56 +01:00
logmanoriginal 5f146a257e [Bridge] Change visibility from private to protected 2018-11-15 19:24:43 +01:00
logmanoriginal 936688e08c [Bridge] Fix typos 2018-11-15 19:22:32 +01:00
logmanoriginal 4b5372638c [Bridge] Use self:: instead of Bridge:: inside the class 2018-11-15 19:19:04 +01:00
logmanoriginal 6f4a8f4d03 [Bridge] Rename to in setWorkingDir 2018-11-15 19:17:18 +01:00
logmanoriginal 39652bb050 [Bridge] Rename to 2018-11-15 19:16:37 +01:00
logmanoriginal fcac5b8b92 [Bridge] Cleanup documentation and exception messages 2018-11-15 19:15:08 +01:00
logmanoriginal 6f7b56cba8 bridge: Rename setDir and getDir to setWorkingDir and getWorkingDir 2018-11-15 19:07:33 +01:00
logmanoriginal 86ac0a4866 [Cache] Fix typos 2018-11-15 19:00:48 +01:00
logmanoriginal 4a99c6e630 cache: Rename setDir and getDir
- Rename setDir to setWorkingDir
- Rename getDir to getWorkingDir
- Rename parameter $workingDir to $dir in getWorkingDir
2018-11-14 20:39:45 +01:00
logmanoriginal e8442a3bf8 [Cache] Refactor class
general

- Use self:: instead of Cache:: or static::
- Rename $dirCache to $workingDir
- Initialize $workingDir with null

function setDir

- Clear previous working directory before checking input parameters
- Change wording for the exception messages
- Store realpath instead of raw parameter
- Add path separator
  This ensures the path always ends with the path separator, as assumed
  by Cache::create
- Add check if the provided working directory is a valid directory

function getDir

- Use static parameter instead of function variable
- Change wording for the exception message

function create

- Rename parameter $nameCache to $name
- Rename $pathCache to $filePath
- Change wording for the exception messages

function isValidNameCache

- Rename function to isCacheName
- Rename parameter $nameCache to $name
- Explain in the function documentation the meaning of a 'valid name'
- Ensure Boolean return value (preg_match returns integer)
- Check if $name is a string
- Use slashes to enclose the regex
2018-11-14 20:33:44 +01:00
logmanoriginal 427688fd67 [Cache] Add documentation 2018-11-14 17:06:07 +01:00
logmanoriginal 4a6b3654eb [Bridge] Add and rewrite documentation compatible to phpDocumentor
This is the first step in adding documentation to the core library
of RSS-Bridge. The documentation is not yet extracted by phpdoc,
yet may prove useful to anyone interested in starting with RSS-Bridge.
2018-11-13 20:28:17 +01:00
logmanoriginal c15b25a07d core: Fix PHPCS violations 2018-11-13 18:27:05 +01:00
logmanoriginal 007ee4d858 [Bridge] Fix broken bridge initialization
Commit e26d61e introduced a bug that causes the error message "The
bridge you [sic!] looking for does not exist." if the bridge name
specified in the query ends on "Bridge"
(i.e. '&bridge=SoundcloudBridge'), while other queries work fine
(i.e. '&bridge=Soundcloud').

This commit fixes that issue by sanitizing the bridge name before
creating the class.

References #922
2018-11-13 17:36:06 +01:00
Thomas Dalichow dd95ec6200 core: Fix grammar (#923) 2018-11-13 17:24:36 +01:00
logmanoriginal 3bb3353897 [Bridge] Use static variable in listBridges()
This prevents the function from re-loading the same data over and over
again. Instead the same data is returned on each call, during a single
request.
2018-11-10 22:31:40 +01:00
logmanoriginal e26d61ec0a core: Refactor bridge whitelisting
- Move all whitelisting functionality inside Bridge.php
- Set default whitelist once in index.php using Bridge::setWhitelist()
- Include bridge sanitizing inside Bridge.php
    Bridge::sanitizeBridgeName($name)

Bridge.php now maintains the whitelist internally.
2018-11-10 22:26:58 +01:00
logmanoriginal a0490e3673 core: Add Debug::isEnabled() and Debug::isSecure()
Also adds documentation to Debug.php!

* Debug::isEnabled()

Checks if the DEBUG file exists on disk on the first call (stored in
memory for the duration of the instance). Returns true if debug mode
is enabled for the client.

This function also sets the internal flag for Debug::isSecure()!

* Debug::isSecure()

Returns true if debuging is enabled for specific IP addresses, false
otherwise. This is checked on the first call of Debug::isEnabled().
If you call this function before Debug::isEnabled(), the default value
is false.
2018-11-10 20:50:34 +01:00
logmanoriginal c63af2e7ad core: Add separate Debug class
Replaces 'debugMessage' by specialized debug function 'Debug::log'.
This function takes the same arguments as the previous 'debugMessage'.

A separate Debug class allows for further optimization and separation
of concern.
2018-11-10 20:03:05 +01:00
logmanoriginal 9379854f7a core: Define path to whitelist.txt in rssbridge.php 2018-11-10 19:51:37 +01:00
logmanoriginal ecdac1b089 core: Add path separator to PATH_CACHE 2018-11-10 19:48:05 +01:00