seiso.party support #1635

Closed
thatfuckingbird opened this issue Jun 17, 2021 · 5 comments

@thatfuckingbird
Contributor

A kemono.party fork/rewrite(?). Not sure how much of the kemono code could be reused.

Some info from their admin:

Main difference isn't in the UI, but in the code that runs the backend and the importer. It was almost completely rewritten (90%+) and the importer is much more reliable when it comes to embedded things and weird formats. This leads to higher quality content on the site.

Additionally, the storage that Seiso uses is a lot different from Kemono's storage which means that it won't buckle under load even when it gets to the amount of traffic that Kemono has. Images should always load regardless of traffic, even large uncached ones.

That's the gist of it.

@mikf
Owner

mikf commented Jun 18, 2021

The site doesn't appear to have a convenient API like kemono.party does, so not much of the current code can be reused, I think. Maybe some from the initial kemono commit that manually parsed HTML, but I doubt it.

Do you know where to find the site's code or any form of documentation for an eventual API?

@thatfuckingbird
Contributor Author

https://paywall.party/seiso/catalog.html is all the info I've found. There is a post from the admin that it might be made open source later but right now it is not. No mention of API, looks like we are out of luck for that.

Looking at the source, parsing the HTML of artist galleries shouldn't be too bad. The individual post pages aren't too bad either, looks like all the files we want have URLs beginning with cdn.seiso.party/files/.
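A minimal sketch of that idea, assuming the file links show up in the post HTML as absolute URLs on `cdn.seiso.party` under `/files/`. The function name and the regex are illustrative only and are not taken from gallery-dl or from the site's code:

```python
import re

# Assumption: file links are absolute URLs on cdn.seiso.party under /files/.
# The URL ends at the first quote, angle bracket, or whitespace character.
FILE_URL = re.compile(r'https?://cdn\.seiso\.party/files/[^\s"\'<>]+')


def extract_file_urls(html):
    """Return the unique file URLs found in html, in order of first appearance."""
    seen = set()
    urls = []
    for url in FILE_URL.findall(html):
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Deduplicating matters here because the same file may be referenced more than once on a page, for example once as a thumbnail link and once as a download link.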

Other than those, extracting the post title and text would be nice, especially since the post HTML can contain relevant links (e.g. to Google Drive or other file hosters).

@mikf
Owner

mikf commented Jun 26, 2021

Initial support got added in f74cf52.
It behaves more or less just like the kemono.party extractors as in:

  • it largely provides the same metadata fields
  • it uses the same filename/directory/archive format strings by default
  • it also needs cookies to get around DDOS-Guard (Kemono: 403 Forbidden #1370)

It also always provides username information without enabling a metadata option. This should probably be used instead of the user ID from the user field, since that doesn't reflect the real ID like it does on kemono.

@thatfuckingbird
Contributor Author

Thank you, appreciate your work a lot! Now I can scratch this off my TODO list.

@mikf
Owner

mikf commented Jun 29, 2021

Quick update:

  • files on cdn-2 servers now also get recognized (e4db1ba)
  • changed the default directory names to use usernames instead of IDs (daf821b)
  • added a warning when ddos-guard cookies are missing (344aab3)
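
For illustration, the first and last points could look roughly like the sketch below. It is not the code from the commits above. The `cdn-<number>` host naming, the cookie names `__ddg1` and `__ddg2`, and all function names are assumptions made for the example:

```python
import logging
import re

log = logging.getLogger("seisoparty-sketch")

# Assumption: extra CDN hosts are named "cdn-<number>.seiso.party".
# The optional group keeps plain "cdn.seiso.party" matching as before.
FILE_URL = re.compile(
    r'https?://cdn(?:-\d+)?\.seiso\.party/files/[^\s"\'<>]+')

# Assumption: these are the cookie names set by DDoS-Guard.
DDG_COOKIES = ("__ddg1", "__ddg2")


def missing_ddg_cookies(cookies):
    """Return the expected cookie names that are absent from cookies."""
    return [name for name in DDG_COOKIES if name not in cookies]


def warn_if_cookies_missing(cookies):
    """Log a warning if any expected cookie is missing.

    Returns True when all expected cookies are present.
    """
    missing = missing_ddg_cookies(cookies)
    if missing:
        log.warning("no DDoS-Guard cookies set (%s)", ", ".join(missing))
    return not missing
```

Here `cookies` can be any mapping that supports `in`, such as a plain dict of cookie names to values.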
