Search plugins fail to parse API from apibay.org due to invalid handling of JSON response #22074

Open
biskweet opened this issue Dec 27, 2024 · 0 comments
Labels
Search engine Issues related to the search engine/search plugins functionality

biskweet commented Dec 27, 2024

qBittorrent & operating system versions

qBittorrent: 5.0.3 (64-bit) Windows and 5.0.2 WebUI (64-bit)
OS: W11 and Arch Linux

What is the problem?

The apibay.org API returns weird JSON that causes the piratebay search engine to crash when handling its response. If some search results contain " (quotation mark) characters, the server escapes them by replacing " with &quot; HTML entities in order to still provide a syntactically valid JSON response. While this is not incorrect, it would be best if apibay.org returned properly escaped quotes, i.e. using backslashes.

When handling the response data, the functions retrieve_url and htmlentitydecode blindly unescape all entities, thereby corrupting previously valid JSON. As a consequence, json.loads crashes. For example:

{
  "title": "Ubuntu 22.04.5 LTS (&quot;Jammy Jellyfish&quot;)"
}

becomes

{
  "title": "Ubuntu 22.04.5 LTS ("Jammy Jellyfish")"
}

I believe this is easily fixable by manually replacing all &quot; with \" before unescaping the remaining HTML entities.
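A minimal sketch of the failure and of the proposed workaround, using only the Python standard library. It assumes the plugin's entity decoding behaves like `html.unescape`; the function names `parse_naive` and `parse_escaping_quotes_first` are hypothetical and do not exist in the plugin code.

```python
import html
import json

# Raw body as described above: quotes inside a value arrive as &quot; entities.
raw = '{"title": "Ubuntu 22.04.5 LTS (&quot;Jammy Jellyfish&quot;)"}'


def parse_naive(text):
    # Unescaping every entity first turns &quot; into a bare ", which
    # terminates the JSON string early and makes json.loads fail.
    return json.loads(html.unescape(text))


def parse_escaping_quotes_first(text):
    # Turn &quot; into a backslash-escaped quote before unescaping the rest,
    # so the quote stays inside the JSON string.
    text = text.replace("&quot;", '\\"')
    return json.loads(html.unescape(text))


try:
    parse_naive(raw)
    naive_failed = False
except json.JSONDecodeError:
    naive_failed = True

fixed = parse_escaping_quotes_first(raw)
print(naive_failed)    # True
print(fixed["title"])  # Ubuntu 22.04.5 LTS ("Jammy Jellyfish")
```

Note that this only handles the named `&quot;` entity. Other encodings of the same character, such as a numeric entity, would still slip through this sketch.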

Related, but not quite the same issue: #15194.

Steps to reproduce

  1. Install the piratebay search plugin.
  2. Search for any torrent whose title contains quotes (e.g. try Dwayne Johnson You re Welcome from Moana 2).
  3. This torrent does exist on the website and is indeed returned by the apibay.org API.
  4. No result is displayed on the qBT user interface due to the script crashing.

Additional context

No response

Log(s) & preferences file(s)

No log is generated when this happens.

@thalieht thalieht added the Search engine Issues related to the search engine/search plugins functionality label Dec 27, 2024
Chocobo1 added a commit to Chocobo1/qBittorrent that referenced this issue Jan 4, 2025
Some plugin needed the raw data for further processing.
Related: qbittorrent#22074.
Chocobo1 added a commit that referenced this issue Jan 6, 2025
Some plugin needed the raw data for further processing.
Related: #22074.

PR #22106.
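To illustrate why access to the raw data helps, here is a hedged sketch of what a plugin could do with an unprocessed response body: parse the JSON first, then unescape entities only inside the resulting string values. This is an assumption about one possible use, not the code from the referenced commit. The helper `parse_then_unescape` and the field names `name` and `seeders` are hypothetical.

```python
import html
import json


def parse_then_unescape(raw_text):
    """Parse untouched JSON first, then unescape entities in string values."""

    def walk(value):
        if isinstance(value, str):
            return html.unescape(value)
        if isinstance(value, list):
            return [walk(item) for item in value]
        if isinstance(value, dict):
            return {key: walk(item) for key, item in value.items()}
        return value  # numbers, booleans and None pass through unchanged

    return walk(json.loads(raw_text))


# Hypothetical sample body with entity-escaped quotes inside a value.
raw = '[{"name": "Ubuntu 22.04.5 LTS (&quot;Jammy Jellyfish&quot;)", "seeders": "12"}]'
results = parse_then_unescape(raw)
print(results[0]["name"])  # Ubuntu 22.04.5 LTS ("Jammy Jellyfish")
```

Because the JSON is parsed before any entity is decoded, the decoded quotes can no longer break the document structure.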
Chocobo1 pushed a commit to qbittorrent/search-plugins that referenced this issue Jan 12, 2025
## Following
- [this issue](qbittorrent/qBittorrent#22074) on the main qBittorrent repository
- and the discussion on [this subsequent pull request](qbittorrent/qBittorrent#22075) 

Here is the fix for the piratebay search engine. A gist of the code is available [here](https://gist.github.com/biskweet/f06ff7b260ef1ce3a31d27ac1a9edcbf) for testing.

## Recalling the problem:
The apibay.org API returns weird JSON that causes the piratebay search engine to crash when handling its response. If some search results contain `"` (quotation mark) characters, the server escapes them by replacing `"` with `&quot;` HTML entities in order to still provide a syntactically valid JSON response. While this is not incorrect, it would be best if apibay.org returned properly escaped quotes, i.e. using backslashes.

When handling the response data, the functions [`retrieve_url`](https://github.com/LightDestory/qBittorrent-Search-Plugins/blob/master/src/helpers.py#L75-L117) and [`htmlentitydecode`](https://github.com/LightDestory/qBittorrent-Search-Plugins/blob/master/src/helpers.py#L75-L117) blindly unescape all entities, thereby corrupting previously valid JSON. As a consequence, `json.loads` crashes. For example:
```json
{
  "title": "Ubuntu 22.04.5 LTS (&quot;Jammy Jellyfish&quot;)"
}
```
becomes
```json
{
  "title": "Ubuntu 22.04.5 LTS ("Jammy Jellyfish")"
}
```

## Solution proposed
We no longer use the shared `retrieve_url` helper. Instead, I created a dedicated `retrieve_url` function for this plugin (almost a copy-paste of the original) that fixes the problem by manually escaping quotes *before* unescaping the rest of the data.

PR #331.