Eka's Portal (site request) #390

Closed
kattjevfel opened this issue Aug 19, 2019 · 6 comments

Comments

@kattjevfel
Contributor

Page: https://aryion.com/ (NSFW)

Most files can be downloaded without authentication, but in rare cases a login is required.

Example gallery: https://aryion.com/g4/gallery/jameshoward (NSFW)
Example post: https://aryion.com/g4/view/366689 (SFW)
Download URL: https://aryion.com/g4/data.php?id=366689 (SFW)

All posts can be downloaded like this, by putting the post ID after data.php?id=
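
As a minimal sketch of that mapping, the following turns a post URL into its download URL. The function name `download_url` and the regular expression are illustrative assumptions; only the two URL shapes come from the examples above.

```python
import re

# URL prefix taken from the "Download URL" example above.
DATA_URL = "https://aryion.com/g4/data.php?id="

# Assumption: post URLs always look like .../g4/view/<numeric id>.
_VIEW_ID = re.compile(r"/g4/view/(\d+)")


def download_url(view_url):
    """Return the data.php download URL for a /g4/view/<id> post URL."""
    match = _VIEW_ID.search(view_url)
    if match is None:
        raise ValueError("not a post URL: " + view_url)
    return DATA_URL + match.group(1)


print(download_url("https://aryion.com/g4/view/366689"))
```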

headers:

HTTP/2 200 
date: Mon, 19 Aug 2019 11:17:58 GMT
content-type: image/jpeg
content-length: 132455
set-cookie: __cfduid=d27077d2bc5246ee73a828f0887614ab31566213478; expires=Tue, 18-Aug-20 11:17:58 GMT; path=/; domain=.aryion.com; HttpOnly; Secure
x-powered-by: PHP/5.3.29
last-modified: Mon, 31 Oct 2016 21:0747: GMT
expires: Wed, 18 Sep 2019 11:1758: GMT
content-disposition: attachment; filename="jameshoward-366689-streaming copy.jpg"
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 508bb95ed9bedac4-ARN

Note how the last-modified header is malformed. If this could also be adjusted, I'd be forever in your debt.
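
One way such a value could be repaired is sketched below. It assumes the damage always has the shape seen in the headers above (`HH:MMSS:` instead of `HH:MM:SS`); the helper names `repair_http_date` and `parse_http_date` are hypothetical, and this is not necessarily how any existing tool handles it.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Assumption: the broken time field always looks like "21:0747:" (HH:MMSS:).
_BROKEN_TIME = re.compile(r"\b(\d{2}):(\d{2})(\d{2}):(?=\s)")


def repair_http_date(value):
    """Rewrite 'HH:MMSS:' as 'HH:MM:SS'; well-formed values pass through."""
    return _BROKEN_TIME.sub(r"\1:\2:\3", value)


def parse_http_date(value):
    """Parse a possibly malformed HTTP date into a timezone-aware datetime."""
    return parsedate_to_datetime(repair_http_date(value))


print(repair_http_date("Mon, 31 Oct 2016 21:0747: GMT"))
```

Well-formed dates do not match the pattern, so the repair step is safe to apply to every response.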

@kattjevfel
Contributor Author

As an alternative solution, I've written a shell script that does something similar and might give you some ideas.

It uses the "Latest Updates" page, so it doesn't deal with folders. It also seems to pick up other people's submissions (I think that comes from some kind of retweet-like feature), but that's a small price to pay for salvation.

#!/bin/sh
# Downloads all posts from a user on Eka's Portal (g4)


if [ $# -eq 0 ]
then
    echo "Usage: $0 <artist> [optional cookie file]"
    exit 1
fi


# Get amount of pages
echo "Checking amount of pages..."
pages=$(curl -s "https://aryion.com/g4/latest/$1" | grep -Eo "Page 1 of [[:digit:]]*" | head -n1 | awk 'NF>1{print $NF}')


echo "Getting list of IDs from $pages pages... (this might take a while)"
# Get list of ID
list=\
"$(
# Get all pages
curl -s "https://aryion.com/g4/latest/$1&p=[1-$pages]" | \
# Get all submissions
grep view/ | \
# Get only what's inside of each href
sed -n 's/.*href="\([^"]*\).*/\1/p' | \
# Get rid of everything before last slash, leaving only IDs
grep -o '[^/]*$' | \
# Add download URL to start of each ID
awk '{ print "https://aryion.com/g4/data.php?id=" $0; }' | \
# Newlines to spaces
tr '\n' ' '
)"


# Start downloading!
curl --cookie "$2" --remote-name-all --fail --remote-time --remote-header-name $list
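
For comparison, the ID-extraction step of the pipeline above (the grep/sed/grep stages) could be sketched in Python roughly as follows. The class and function names are hypothetical, and the sample markup in the usage line is an assumption about what a listing page contains, not a copy of the real page.

```python
from html.parser import HTMLParser


class ViewLinkParser(HTMLParser):
    """Collect numeric post IDs from <a href=".../view/<id>"> links."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Keep only what follows the last "/view/", like grep -o '[^/]*$'.
        _, sep, post_id = href.rpartition("/view/")
        if sep and post_id.isdigit() and post_id not in self.ids:
            self.ids.append(post_id)


def extract_post_ids(html):
    """Return post IDs in page order, without duplicates."""
    parser = ViewLinkParser()
    parser.feed(html)
    return parser.ids


sample = '<a href="/g4/view/366689">a</a> <a href="/g4/view/366689">b</a>'
print(extract_post_ids(sample))
```

Unlike the shell version, this drops duplicate IDs, so a post linked twice on one page is only queued once.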

@mikf
Owner

mikf commented Apr 8, 2020

I've written some code adding basic support for user galleries and posts: 6143050
Please test that and, most importantly, tell me what else you'd like to have. It currently only recognizes image posts, but there also appear to be text stories, videos, and so on. It would be helpful to have some sort of list of things that still need to be done here.

@kattjevfel
Contributor Author

Thank you for adding support, however...
It doesn't seem to get the filenames quite right; spaces become +, for example:

gallery-dl https://aryion.com/g4/view/593922 
/mnt/jupiter/Temp/gallery-dl/aryion/kainan/Kainan-593922-eila+with+bg.png

You should be able to get this from the headers here:
content-disposition: attachment; filename="Kainan-593922-eila with bg.PNG"
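
A rough sketch of reading the name from that header is shown below. It assumes the simple quoted `filename="..."` form seen above and ignores other variants such as `filename*=`; the helper names are hypothetical, and lowercasing the extension is an illustrative choice rather than known tool behavior.

```python
import re

# Assumption: the header uses the plain quoted form filename="...".
_FILENAME = re.compile(r'filename="([^"]*)"')


def filename_from_content_disposition(header):
    """Return the quoted filename from a Content-Disposition value, or None."""
    match = _FILENAME.search(header)
    return match.group(1) if match else None


def split_extension(filename):
    """Split 'name.EXT' into ('name', 'ext'); the extension is lowercased."""
    name, dot, ext = filename.rpartition(".")
    if not dot:
        return filename, ""
    return name, ext.lower()


header = 'attachment; filename="Kainan-593922-eila with bg.PNG"'
print(split_extension(filename_from_content_disposition(header)))
```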

It also doesn't support folders. I know this might be complicated, but it would be greatly appreciated: currently it just dumps everything into one big folder, which can get messy when dealing with different comics etc.

The rest seems fine, other than maybe prefixing the default filename with aryion or something. It doesn't bother me too much, but consistency is always neat :P

mikf added a commit that referenced this issue Apr 12, 2020
i.e. /g4/data.php?id=…

- get filename & extension from Content-Disposition header
- handle all downloadable file types (docx, swf, etc)
@mikf
Owner

mikf commented Apr 12, 2020

I hope the last few commits fix the remaining issues
(and, again, let me know if there is something else)

  • 96b78bc includes the folder path in the default directory format, although it can't handle multi-level folders properly and only separates their names with -
  • dc65f7d uses https://aryion.com/g4/data.php?id=… for file downloads. This adds support for any file types, not just images, and it also gets the proper filename from Content-Disposition headers
  • 6c531be should fix malformed Last-Modified headers, but I've seen posts posted in 2017 with a last-modified header from 2019. Can artists update their posts?

@kattjevfel
Contributor Author

Yo, this is perfect. I guess it's time to retire my hackjob of a script, thank you so much for this. <3
Also, do you have a donation page or something? I'd love to send you something for the continuous hard work you're putting in.

@mikf
Owner

mikf commented Apr 14, 2020

I don't have anything set up regarding donations or similar, and I don't think that's going to change, but thank you for the offer.

(If you really want to get rid of your money, you can send it to the Paypal account associated with my email address ... someone actually did that for Christmas ...)
