Commit graph

2507 commits

Author SHA1 Message Date
logmanoriginal
fb8a064e3a [simplehtmldom] Increase MAX_FILE_SIZE to 10 MB
This fixes an issue where larger pages could not be loaded
because the size limit is too small
2018-12-11 17:16:35 +01:00
logmanoriginal
b00971b2c3 [simplehtmldom] Update parser to version 1.7
- Update parser to version 1.7
https://sourceforge.net/projects/simplehtmldom/files/simplehtmldom/1.7/

References #959

-------------------- CHANGELOG --------------------

- Added code documentation to improve readability
- Added unit tests for `simple_html_dom::$self_closing_tags`
- Added unit tests for `simple_html_dom::$optional_closing_tags`
- Added unit tests for bug reports
  - Added test for bug [#56](https://sourceforge.net/p/simplehtmldom/bugs/56/)
  - Added test for bug [#97](https://sourceforge.net/p/simplehtmldom/bugs/97/)
  - Added test for bug [#116](https://sourceforge.net/p/simplehtmldom/bugs/116/)
  - Added test for bug [#121](https://sourceforge.net/p/simplehtmldom/bugs/127/)
  - Added test for bug [#127](https://sourceforge.net/p/simplehtmldom/bugs/127/)
  - Added test for bug [#154](https://sourceforge.net/p/simplehtmldom/bugs/154/)
  - Added test for bug [#160](https://sourceforge.net/p/simplehtmldom/bugs/160/)
- Added unit tests for memory management of the parser
- Added bit flags to `simple_html_dom::load()`
  - Added bit flag `HDOM_SMARTY_AS_TEXT` to optionally filter Smarty scripts (#154)\
  **Note**: Smarty scripts are no longer filtered by default!\
- Added build script to automate releases
- Added support for attributes without whitespace to separate them
- Improved documentation and readability for `$self_closing_tags`
- Improved documentation and readability for `$block_tags`
- Improved documentation and readability for `$optional_closing_tags`
- Updated list of `simple_html_dom::$self_closing_tags`
  - Removed 'spacer' (obsolete)
  - Added 'area'
  - Added 'col'
  - Added 'meta'
  - Added 'param'
  - Added 'source'
  - Added 'track'
  - Added 'wbr'
- Updated list of `simple_html_dom::$optional_closing_tags`
  - Removed "nobr" (obsolete)
  - Added 'th' as closable element to 'td'
  - Added 'td' as closable element to 'th'
  - Added 'optgroup' with 'optgroup' and 'option' as closable elements
  - Added 'optgroup' as closable element to 'option'
  - Added 'rp' with 'rp' and 'rt' as closable elements
  - Added 'rt' with 'rt' and 'rp' as closable elements
- Clarified meaning of `simple_html_dom->parent`
- Changed default `$offset` for `file_get_html()` from -1 to 0 (#161)
- Changed `simple_html_dom::load()` to remove script tags before replacing newline characters
- `simple_html_dom_node::text()` no longer adds whitespace to top level span elements (only to sub-elements)
- `simple_html_dom_node::text()` adds blank lines between paragraphs
- Normalized line endings in the repository to LF via `.gitattributes`
- Improved performance of `simple_html_dom::parse_charset()` by approximately 25%
- Improved performance of `simple_html_dom::parse()` by approximately 10%
- `str_get_html()` is deprecated and should be replaced by `new simple_html_dom()`
- Removed protected function `simple_html_dom::copy_until_char_escaped()`
- Fixed compatibility issues with PHP 7.3
- Fixed typo (#147)
- Fixed handling of incorrectly escaped text (#160)
- Restore functionality of `$maxLen` in `file_get_html()`
- Fixed load_file breaks if an error ocurred in another script
2018-12-11 17:15:38 +01:00
logmanoriginal
a07ead42a7 Bump version to dev.2018-12-11 2018-12-11 17:07:41 +01:00
logmanoriginal
a11ade3442 Bump version to 2018-12-11 2018-12-11 17:01:16 +01:00
logmanoriginal
3932e7b8ef [README] Update list of contributors
Fix links pointing to the API instead of HTML pages
2018-12-10 22:21:33 +01:00
disk0x
5305c405f6 [SoundcloudBridge] Improve Author, Date, Description (#955)
1. Author Name now doesn't include Episode Title
2. It now fetches Episode Creation Timestamp, to allow correct sorting in podcatchers
3. Description is now the actual show notes, and not an <audio> tag
2018-12-10 21:35:18 +01:00
triatic
1c58c04271 [contents] Better error reporting for cUrl errors (#958)
References #954
2018-12-10 21:20:13 +01:00
logmanoriginal
89218f1da6 [.travis.yml] Fix broken checks
- Remove "sudo:false"
- Update composer installation paths

The Linux infrastructure migration removed support for "sudo:false"

-- https://changelog.travis-ci.com/deprecation-container-based-linux-build-environment-82037
-- https://blog.travis-ci.com/2018-11-19-required-linux-infrastructure-migration
2018-12-07 18:52:37 +01:00
disk0x
30e2b79c38 [SoundcloudBridge] Add RSS enclosures (#952)
Minimum viable code change to get SoundcloudBridge produce feeds that podcatchers like gPodder can understand.
2018-12-04 16:16:19 +01:00
Nono
2184f523cd [MozillaSecurity] New Bridge (#946)
* [MozillaSecurity] New Bridge

Kudo to @teromene & @ArthurHoaro on this one !
2018-11-30 18:25:02 +01:00
triatic
242b6953ed [FB2Bridge] Adapt to Facebook html change (#950) 2018-11-30 18:23:37 +01:00
Roliga
bdcb7a9829 [index] Fix detect action after listBridges rename (#947)
Commit 88b0656 renamed listBridges function which was not taken into
account when adding the detect action.
2018-11-29 16:44:38 +01:00
Pierre Mazière
f4b46e497e [GithubIssueBridge] Be consistent in avoiding is_null
Signed-off-by: Pierre Mazière <pierre.maziere@gmx.com>
2018-11-29 16:35:49 +01:00
Pierre Mazière
d5085a4116 [GithubIssueBridge] Fix non existing comments count
Signed-off-by: Pierre Mazière <pierre.maziere@gmx.com>
2018-11-29 16:35:45 +01:00
Pierre Mazière
d7cabfca54 [GithubIssueBridge] Fix issue comments and events parsing
Signed-off-by: Pierre Mazière <pierre.maziere@gmx.com>
2018-11-29 16:35:41 +01:00
Pierre Mazière
de575982a1 [GithubIssueBridge] Fix most relevant coding style related issues
Signed-off-by: Pierre Mazière <pierre.maziere@gmx.com>
2018-11-29 16:35:35 +01:00
LogMANOriginal
3d301fc4ee
[contents] Skip caching if the remote server requests no caching (#945)
* Skip caching if Cache-Control defines no-cache
* Skip caching if Cache-Control defines no-store
2018-11-28 17:36:28 +01:00
triatic
263e8872ea core: Don't use server variables in CLI mode (#939) 2018-11-26 18:33:51 +01:00
logmanoriginal
6e9c188a72 [GlassdoorBridge] Fix bridge is marked as executable
References #938
2018-11-26 18:31:25 +01:00
Roliga
49da67cb33 core: Automatically select a bridge based on a URL (#928)
* core: Add bridge parameter auto detection

This adds a new 'detect' action which accepts a URL from which an
appropriate bridge is selected and relevant parameters are extracted.
The user is then automatically redirected to the selected bridge.

For example to get a feed from: https://twitter.com/search?q=%23rss-bridge
we could send a request to:
'/?action=detect&format=Atom&url=twitter.com/search%3Fq%3D%2523rss-bridge'
which would redirect to:
'/?action=display&q=%23rss-bridge&bridge=Twitter&format=Atom'.

This auto detection happens on a per-bridge basis, so a new function
'detectParameters' is added to BridgeInterface which bridges may implement.
It takes a URL for an argument and returns a list of parameters that were
extracted, or null if the URL isn't relevant for the bridge.

* [TwitterBridge] Add parameter auto detection

* [BridgeAbstract] Add generic parameter detection

This adds generic "paramater detection" for bridges that don't have any
parameters defined. If the queried URL matches the URI defined in the
bridge (ignoring https://, www. and trailing /) an emtpy list of parameters is
returned.
2018-11-26 18:05:40 +01:00
sysadminstory
b4dbd191d0 [ZoneTelechargementBridge] Switch to the new Website (#934)
* [ZoneTelechargementBridge] Switch to the new Website

The website zone-telechargement1.org decided that he will be using a new
domain at the end of november :
https://www.annuaire-telechargement.com/

The bridge uses the new domain but still uses the same filename and
class name to keep the existing feed working.
2018-11-20 16:23:17 +01:00
logmanoriginal
e09f452426 [.gitattributes] Exclude files from git archive
Files with the option "export-ignore" are excluded from "git archive"
commands. Release files from GitHub will also ignore those files, so
packages are smaller and don't include unneccessary files.
2018-11-19 18:11:09 +01:00
LogMANOriginal
7b261d1cc2
[contents] Add server side caching for all requests (If-Modified-Since) (#889)
This commit adds a cache for 'getContents' to '/cache/server'. All
contents are cached by default (even in debug mode). If debug mode
is enabled, the cached data is overwritten on each request.

In normal mode RSS-Bridge adds the 'If-Modified-Since' header with
the timestamp from the previously cached data (if available) to the
request.

Find more information on 'If-Modified-Since' here:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Modified-Since

If the server responds with "304 Not Modified", the cached data is
returned.

If the server responds with "200 OK", the received data is written
to the cache (creates a new cache file if it doesn't exist yet).

No changes were made for all other response codes.

Servers that don't support the 'If-Modified-Since' header, will
respond with "200 OK".

For servers that respond with "304 Not Modified", the required band-
width will decrease and RSS-Bridge will responding faster.

Files in the cache are forcefully removed after 24 hours.

Notice: Only few servers actually do support 'If-Modified-Since'.
Thus, most bridges won't be affected by this change.
2018-11-19 17:53:08 +01:00
logmanoriginal
96a518c9e7 [html] Remove todo as it is already implemented 2018-11-18 17:52:45 +01:00
logmanoriginal
0d2ea9a677 [html] Rename parameters for sanitize() 2018-11-18 17:43:34 +01:00
logmanoriginal
66e82e46db [html] Remove todo tags
It is not feasible to use a single 'substr' in the functions
2018-11-18 17:36:00 +01:00
logmanoriginal
54800fcc8d [html] Clarify meaning of strange find() parameter
simple_html_dom currently doesnt support "->find('*')", which is a
known issue: https://sourceforge.net/p/simplehtmldom/bugs/157/

The solution implemented by RSS-Bridge is to find all nodes WITHOUT
a specific attribute. If the attribute is very unlikely to appear
in the DOM, this is essentially returning all nodes.

This is the meaning behind

"->find('*[!b38fd2b1fe7f4747d6b1c1254ccd055e]')"
2018-11-18 17:32:07 +01:00
logmanoriginal
67004556e6 [BridgeCard] Use self:: instead of BridgeCard:: 2018-11-18 16:59:13 +01:00
logmanoriginal
c6a7b9ac64 exception: Remove HttpException class
This class served no particular purpose (other than adding a
layer on top of Exception).
2018-11-18 16:53:21 +01:00
logmanoriginal
dbffbd4d4e [FormatAbstract] Check content type before sending header 2018-11-18 16:30:34 +01:00
logmanoriginal
1c17ffb5c4 [FeedExpander] Add constants for feed types 2018-11-18 16:18:40 +01:00
logmanoriginal
326cfb21cf [FeedExpander] Rename $name to $title 2018-11-18 16:11:38 +01:00
logmanoriginal
8ab1fb86a9 [FeedExpander] Let collectExpandableDatas() return self 2018-11-18 16:03:32 +01:00
logmanoriginal
a9ec3d0d1f [Configuration] Change scope of $config to private 2018-11-18 15:58:34 +01:00
logmanoriginal
ac5bcb62ec [Configuration] Add documentation for defined constants 2018-11-18 15:52:28 +01:00
logmanoriginal
f24ab8b51b [Configuration] Rename $category to $section in getConfig() 2018-11-18 15:45:17 +01:00
logmanoriginal
4348119adf [Configuration] Make file paths explicit 2018-11-18 15:41:43 +01:00
logmanoriginal
fd4124cda2 [Configuration] Make class final
This class is essential to the core library of RSS-Bridge and must
not be extended. This improves predictability for the behaviour of
this class.
2018-11-18 15:34:16 +01:00
logmanoriginal
91f7405297 [Configuration] Throw exception creating objects of this class
This class only provides static functions.
2018-11-18 15:29:50 +01:00
logmanoriginal
85685b7758 [Authentication] Throw exception creating objects from this class
Callers must use Authentication::showPromptIfNeeded()
2018-11-18 15:20:43 +01:00
logmanoriginal
41d02554f3 [YGGTorrentBridge] Add URI to feed items
References #931
2018-11-18 09:41:14 +01:00
logmanoriginal
c4550be812 lib: Add API documentation 2018-11-18 09:41:14 +01:00
Thibault Couraud
b29ba5b973 [CrewbayBridge] Update bridge according to new crewbay.com website (#930) 2018-11-18 09:16:24 +01:00
logmanoriginal
254fe9212a [Debug] Fix debug mode reports indexing error
Error log reports "PHP Notice:  Undefined offset: 2 in /rss-bridge/
lib/Debug.php on line 112" if the array returned by debug_backtrace
does not contain 3 items.

This commit fixes the issue by always using the last element in the
backtrace "end($backtrace)".
2018-11-16 20:19:52 +01:00
triatic
3806895059 [FacebookBridge] Improve titles (#924)
A slightly improved version of #454 and #468 . Build titles from content rather than author + pre-content (which doesn't reflect anything useful).
2018-11-16 15:33:54 +01:00
triatic
599d438a0d [FacebookBridge] Decode all elements in $item (#925) 2018-11-16 15:25:58 +01:00
triatic
e5a6baab96 [TwitterBridge] Decode HTML entities (#926)
Removes duplicate encoding like &amp;quot; (should be &quot;).
2018-11-15 22:00:01 +01:00
logmanoriginal
b47a30ecc1 [rssbridge] Improve documentation 2018-11-15 20:52:17 +01:00
logmanoriginal
860b36c1e3 [Debug] Use self:: instead of Debug:: inside the class 2018-11-15 20:28:26 +01:00
logmanoriginal
3d475572c6 [Debug] Improve documentation 2018-11-15 20:27:32 +01:00