Search query can lead to a 403 Client Error: Forbidden for url #5

Closed
manouchk38 opened this issue Oct 24, 2021 · 4 comments · Fixed by #6
Labels
bug Something isn't working

Comments

@manouchk38

manouchk38 commented Oct 24, 2021

Hi,

The following search query leads to a `403 Client Error: Forbidden for url`, whereas the same search in Joplin returns the correct result.

Steps to reproduce:

```python
from joppy.api import Api

api = Api(token='xxx')

q = '"https://books.google.com.br/books?id=vaZFBgAAQBAJ&pg=PA83&dq=lakatos+copernicus&hl=pt-BR&sa=X&ved=0ahUKEwjewoWZ6q7hAhUNJ7kGHdy5CZUQ6AEIUDAF#v=onepage&q=lakatos%20copernicus&f=false"'

api.search(query=q)
```

I may be missing some specific knowledge, or there may be documentation I should have read that would help me solve this issue.

The context of the problem is the following. My program helps me remove semantically identical notes. As an alternative solution, I was splitting the body at each special character and searching for the longest string in the resulting list. However, this produces short search queries that match more than two notes in many cases, so I cannot finish eliminating the duplicates I have in Joplin. Here is what I was doing to remove special characters:

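```python
import re

# Split the note body at every special character.
# Inside the character class, "-", "[", "]" and "\" are escaped.
notelines = re.split(r"""[`\-=~!@#$%^&*()_+\[\]{};'\\:"|<,./>?]""", note['body'])

# Use the longest fragment as a quoted (exact phrase) search query.
q = '"' + max(notelines, key=len) + '"'

identicalnotes = api.search(query=q)
```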

I reverted to:

```python
notelines = note['body'].replace('(', ' ').replace(')', ' ').replace('[', ' ').replace(']', ' ').replace('"', ' ').split('\n')

q = '"' + max(notelines, key=len) + '"'

identicalnotes = api.search(query=q)
```

The error reported here is based on this "filtering" out of '(', ')', '[', ']' and '"'. If I removed other special characters found in HTTP links, like '%' or '#', I would lose too much information.

(Inserting code does not seem to work properly. I had to add two line breaks to make it readable.)
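The filtering described in this comment can be sketched as a small helper, mainly to show what the resulting query looks like for a sample body. The function name `longest_line_query` and the sample text are hypothetical and not part of joppy or the original program; this is a sketch of the approach, not the author's actual code.

```python
def longest_line_query(body: str) -> str:
    """Return the longest line of `body` as a quoted (exact phrase) query."""
    # Replace the filtered characters ( ) [ ] " with spaces.
    for char in '()[]"':
        body = body.replace(char, " ")
    lines = body.split("\n")
    # Wrap the longest line in double quotes to request a phrase search.
    return '"' + max(lines, key=len) + '"'


# Hypothetical note body, for illustration only.
body = 'Short line\nA [much] longer line with a "quote"\nEnd'
query = longest_line_query(body)
```

Characters such as '&', '#' and '%' pass through unchanged, so a line that contains a URL ends up in the query as-is.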

@marph91
Owner

marph91 commented Oct 24, 2021

Hi @manouchk38,

I can reproduce the error. This looks like an issue with the special characters, as you assumed. I'm not sure whether the problem is in the Python interface or in the Joplin API itself. I will take a look at it tomorrow.

@manouchk38
Author

OK. I ran into another problem and found that replacing "&" with " " removed the error, whereas replacing "?" with " " did not. In the case I tried, though, the error was: 500 Server Error: Internal Server Error for url.
(with q="https://www.facebook.com/photo.php?fbid=621934757910715&set=a.143794752391387&type=3&app=fbl")

@manouchk38
Author

manouchk38 commented Oct 25, 2021

There may also be a problem with "#".
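One plausible reading of these observations, not confirmed in this thread, is that "&" and "#" are URL delimiters: placed unencoded inside a query string, "&" starts a new parameter and "#" starts the fragment, so everything after them is no longer part of the search value. The sketch below uses only Python's standard `urllib.parse` to show what a server would see in both cases. The endpoint, the token value, the sample query and the helper names `build_url` and `seen_params` are assumptions for illustration and do not reflect how joppy builds its requests.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical endpoint and token, for illustration only.
BASE = "http://localhost:41184/search"
TOKEN = "xxx"


def build_url(query: str, encode: bool) -> str:
    """Build a search URL, optionally percent-encoding the query value."""
    value = quote(query, safe="") if encode else query
    return f"{BASE}?query={value}&token={TOKEN}"


def seen_params(url: str) -> dict:
    """Return the query parameters a server would parse from this URL."""
    return parse_qs(urlsplit(url).query)


# Hypothetical search string containing "?", "&" and "#".
q = "https://example.com/page?a=1&b=2#section"

raw = seen_params(build_url(q, encode=False))
encoded = seen_params(build_url(q, encode=True))

print(raw)      # the search value is cut at "&" and the token is missing
print(encoded)  # the full search value and the token are both present
```

In the unencoded case the token lands in the fragment after "#" and never reaches the server, which would be consistent with a `403 Forbidden`, while "?" inside the value is harmless because only the first "?" separates the path from the query.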

@marph91
Owner

marph91 commented Oct 25, 2021

I think the issue is fixed. At least in the test case, the example strings can be used and yield the correct results.

The corresponding PR explains why it happened. In general, the tests for searching are not very detailed, so please report any further issues you encounter.

If you want to try the fix directly, you can do:

```shell
git clone https://github.com/marph91/joppy.git
cd joppy
git checkout fix-search-strings-with-special-chars
pip install .
```

@marph91 marph91 added the bug Something isn't working label Oct 25, 2021