Fix sources and Added new sources #2176

Merged
merged 7 commits into from
Oct 31, 2023

@CryZFix CryZFix commented Oct 21, 2023

Fix issue #2160

  • Added detection of the teaser page and a transition to the real URL of the chapter. Crawling is now slightly slower, to avoid HTTP 503 errors.
  • Added a tag to the cleaner so that information about the book is not collected into the chapter text.
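
The teaser handling and throttling described above could look roughly like the sketch below. It is not the code from this PR: the `Page` tuple, the injected `fetch` callable, the `real_url` field, and the delay and retry values are all assumptions made for illustration.

```python
import time
from typing import Callable, NamedTuple, Optional


class Page(NamedTuple):
    """Hypothetical result of one HTTP request (not an lncrawl type)."""
    status: int
    body: str
    real_url: Optional[str] = None  # set when the page is only a teaser


def fetch_chapter(
    url: str,
    fetch: Callable[[str], Page],
    delay: float = 1.0,
    retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the chapter body, following a teaser hop and backing off on 503."""
    for attempt in range(retries):
        # Pause before every request; the pause grows with each attempt.
        sleep(delay * (attempt + 1))
        page = fetch(url)
        if page.status == 503:
            continue  # server is overloaded: wait longer and retry
        if page.real_url and page.real_url != url:
            url = page.real_url  # teaser page: switch to the real chapter URL
            continue  # note: the teaser hop also uses up one attempt
        return page.body
    raise RuntimeError(f"Could not fetch {url} after {retries} attempts")
```

Passing `fetch` and `sleep` in as arguments keeps the sketch independent of any HTTP library and makes the delay easy to tune, which mirrors the trade-off mentioned above between speed and avoiding 503 responses.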

Fix issues #2073 and #2174
[Screenshot 2023-10-31 at 02 14 11]

  • Because a captcha was added to the login page, the previous login method stopped working. The token now has to be specified manually, as with wuxiaworld.
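
As a rough illustration of token-based login, the sketch below turns a manually supplied token into request headers. The function name and the use of a `Bearer` authorization scheme are assumptions for illustration; the PR text does not say how the affected source or wuxiaworld actually transmits the token.

```python
from typing import Dict


def build_auth_headers(token: str) -> Dict[str, str]:
    """Build request headers from a token the user copied by hand."""
    token = token.strip()
    if token.lower().startswith("bearer "):
        # Accept the value with or without the scheme prefix.
        token = token[7:].strip()
    if not token:
        raise ValueError("An access token is required because password login is blocked by the captcha")
    # Assumption: the site expects a Bearer token in the Authorization header.
    return {"Authorization": f"Bearer {token}"}
```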

Added new sources: #2183 #2182 #2181 #2180

> novelupdates.py - Added detection of the teaser page and a transition to the real URL of the chapter. Crawling is now slightly slower, to avoid HTTP 503 errors.
> relibrary.py - Added a tag to the cleaner so that information about the book is not collected into the chapter text.
@CryZFix CryZFix changed the title from "Fix source [Novelupdates]" to "Fix source [Novelupdates] and Added new sources" on Oct 27, 2023
- URL transformation moved to the parse_chapter_body function
- Selected optimal values for self.workers and time.sleep
@CryZFix CryZFix requested a review from dipu-bd October 30, 2023 13:12
- Due to the addition of a captcha for logging in, the previous method stopped working. Now we have to specify the token manually, by analogy with wuxiaworld.
@CryZFix CryZFix changed the title from "Fix source [Novelupdates] and Added new sources" to "Fix sources and Added new sources" on Oct 30, 2023
@dipu-bd dipu-bd merged commit 2ae2d81 into dipu-bd:dev Oct 31, 2023
4 checks passed
DomID00 commented Oct 31, 2023

@CryZFix @dipu-bd source "trxs.cc" not working


? Enter novel page url or query novel: https://trxs.cc/tongren/8463.html
Retrieving novel info...
Exception in thread Thread-1 (read_novel_info):
Traceback (most recent call last):
  File "C:\Users\Byzz\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Users\Byzz\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Byzz\.lncrawl\sources\zh\trxs.py", line 14, in read_novel_info
    soup = self.get_soup(self.novel_url, encoding='gb2312')
  File "C:\Users\Byzz\Desktop\lightnovel-crawler-3.2.8\lncrawl\core\scraper.py", line 294, in get_soup
    response = self.get_response(url, **kwargs)
  File "C:\Users\Byzz\Desktop\lightnovel-crawler-3.2.8\lncrawl\core\scraper.py", line 199, in get_response
    return self.__process_request(
  File "C:\Users\Byzz\Desktop\lightnovel-crawler-3.2.8\lncrawl\core\scraper.py", line 103, in __process_request
    response: Response = method_call(url, **kwargs)
  File "C:\Users\Byzz\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 600, in get
    return self.request("GET", url, **kwargs)
  File "C:\Users\Byzz\AppData\Local\Programs\Python\Python310\lib\site-packages\cloudscraper\__init__.py", line 257, in request
    self.perform_request(method, url, *args, **kwargs)
  File "C:\Users\Byzz\AppData\Local\Programs\Python\Python310\lib\site-packages\cloudscraper\__init__.py", line 190, in perform_request
    return super(CloudScraper, self).request(method, url, *args, **kwargs)
TypeError: Session.request() got an unexpected keyword argument 'encoding'

! Error: No chapters found
<class 'Exception'>
  File "C:\Users\Byzz\Desktop\lightnovel-crawler-3.2.8\lncrawl\bots\console\integration.py", line 107, in start
    raise e
  File "C:\Users\Byzz\Desktop\lightnovel-crawler-3.2.8\lncrawl\bots\console\integration.py", line 101, in start
    _download_novel()
  File "C:\Users\Byzz\Desktop\lightnovel-crawler-3.2.8\lncrawl\bots\console\integration.py", line 85, in _download_novel
    self.app.get_novel_info()
  File "C:\Users\Byzz\Desktop\lightnovel-crawler-3.2.8\lncrawl\core\app.py", line 137, in get_novel_info
    raise Exception("No chapters found")
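
The `TypeError` in the traceback arises because `encoding='gb2312'` is forwarded all the way to `Session.request()`, which in `requests` has no `encoding` parameter. A minimal sketch of one way to avoid this is shown below: the wrapper consumes `encoding` itself and applies it to the response afterwards. The function name `get_decoded_text` is hypothetical, and this is not taken from the lncrawl scraper; how newer lncrawl releases actually handle the argument is not verified here.

```python
from typing import Any, Optional


def get_decoded_text(
    session: Any,
    url: str,
    encoding: Optional[str] = None,
    **kwargs: Any,
) -> str:
    """Fetch `url` and decode the body with `encoding` when one is given."""
    # `encoding` is a named parameter here, so it never lands in **kwargs
    # and is therefore not forwarded to session.get().
    response = session.get(url, **kwargs)
    if encoding:
        # Setting Response.encoding changes how .text decodes the raw bytes.
        response.encoding = encoding
    return response.text
```

With a `requests.Session`, a call such as `get_decoded_text(session, novel_url, encoding="gb2312")` would pass only the URL and the remaining keyword arguments to the HTTP layer.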

CryZFix commented Oct 31, 2023

@DomID00

I see that you are using version 3.2.8. Please use the latest version, which you can download from the releases section.

DomID00 commented Oct 31, 2023

> @DomID00
>
> I see that you are using version 3.2.8. Please use the latest version, which you can download from the releases section.

indeed, thanks for the help

3 participants