Downloading deviantart favourites within a range? #3666

Open
MrSeyker opened this issue Feb 17, 2023 · 5 comments

Comments

@MrSeyker

I'm trying to make a backup of my favourite pictures, but my attempts have never crawled through the entire list of favourites. I have around 85000 pictures there, but fewer than 60000 have downloaded (and the latter number includes text files from the comments to catch links, so it's not all images).

I wanted to get around it by using the pagination feature, but a URL such as https://www.deviantart.com/mrseyker/favourites/all?page=2507 is not supported. Can anything be done to achieve this?

I figured I could go through all the pages and capture the individual links, but we're talking about 3500 and change pages here.

@ClosedPort22
Contributor

ClosedPort22 commented Feb 17, 2023

It's currently not possible to start at an arbitrary offset value since gallery-dl goes through each collection separately. However, it's still possible to use --filter X- to skip the first X deviations, but that would be a massive waste of API calls in this case.

@MrSeyker
Author

I don't understand the idea of using the filter. Can you elaborate?

@ClosedPort22
Contributor

ClosedPort22 commented Feb 18, 2023

> I don't understand the idea of using the filter. Can you elaborate?

Sorry, I meant the --range option, which is documented here. --range 60000- tells gallery-dl to skip the first 59999 deviations and download everything after that, but I don't recommend using it since it will waste about 2500 API requests.

I've made a (rough) version that supports URLs like https://www.deviantart.com/mrseyker/favourites/all?page=2507 and thus does not waste any API calls. It goes through your 'All' folder listing, so it works best if other collections don't have deviations not found in 'All'. You can find the executables here: https://github.com/ClosedPort22/gallery-dl/actions/runs/4210840945

For example, gallery-dl-NEW-VERSION https://www.deviantart.com/mrseyker/favourites/all?page=2507 is equivalent to gallery-dl-NEW-VERSION --range 60169- https://www.deviantart.com/mrseyker/favourites/all and gallery-dl --range 60169- https://www.deviantart.com/mrseyker/favourites/all but without unnecessary API calls.

You can even combine ?page= with --range to have more fine-grained control over the offset value. For example,
gallery-dl-NEW-VERSION --range 2- https://www.deviantart.com/mrseyker/favourites/all?page=2507 is equivalent to gallery-dl-NEW-VERSION --range 60170- https://www.deviantart.com/mrseyker/favourites/all.
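
The offset arithmetic behind these two examples can be sketched in Python. The page size of 24 is an assumption: it is not stated anywhere in this thread and is only inferred from the numbers above (60168 skipped deviations for page 2507). The function names are hypothetical and not part of gallery-dl.

```python
# Inferred from the examples in this thread, not from gallery-dl's source:
# ?page=2507 is described as equivalent to --range 60169-,
# i.e. 60168 skipped deviations, which equals 2507 * 24.
ASSUMED_PAGE_SIZE = 24


def skipped_items(page: int, page_size: int = ASSUMED_PAGE_SIZE) -> int:
    """Number of deviations skipped by ?page=N, per the examples above."""
    return page * page_size


def equivalent_range_start(page: int, range_start: int = 1,
                           page_size: int = ASSUMED_PAGE_SIZE) -> int:
    """1-based start index to pass to --range on the URL without ?page=."""
    return skipped_items(page, page_size) + range_start


if __name__ == "__main__":
    print(equivalent_range_start(2507))     # ?page=2507 alone -> 60169
    print(equivalent_range_start(2507, 2))  # ?page=2507 with --range 2- -> 60170
```

If the real page size differs, only `ASSUMED_PAGE_SIZE` needs to change; the relationship between `?page=` and `--range` stays additive either way.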

@mikf
Owner

mikf commented Feb 18, 2023

The general problem with --range and ?page=N and ultimately the offset parameter for API requests is DeviantArt's limit of 50k posts per folder.

With commit 725baed, it now uses the /collections/all endpoint with flat=1, but that would fail to reach more than the latest 50k deviations.

For https://www.deviantart.com/mrseyker/favourites, I'd get a list of all folders

gallery-dl -o flat= -g https://www.deviantart.com/mrseyker/favourites > folders.txt

and then download them individually

gallery-dl -i folders.txt

That way, you can, at least on a folder-by-folder basis, control the offset with --range.
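
As a rough illustration of the folder-by-folder approach, the Python sketch below turns a folders.txt-style listing into one gallery-dl command per folder, adding --range only where a start offset is wanted. The helper names and the placeholder URLs are hypothetical; the sketch only builds and prints the commands and does not run gallery-dl.

```python
import shlex
from typing import Dict, Iterable, List


def parse_folder_list(text: str) -> List[str]:
    """Return the non-empty lines of a folders.txt-style listing."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_commands(urls: Iterable[str],
                   starts: Dict[str, int]) -> List[List[str]]:
    """Build one gallery-dl argument list per folder URL.

    Folders with an entry greater than 1 in `starts` get `--range START-`;
    all others are downloaded from the beginning.
    """
    commands = []
    for url in urls:
        cmd = ["gallery-dl"]
        start = starts.get(url)
        if start is not None and start > 1:
            cmd += ["--range", f"{start}-"]
        cmd.append(url)
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Placeholder URLs; real ones would come from folders.txt.
    listing = """
    https://www.deviantart.com/example/favourites/1/folder-a
    https://www.deviantart.com/example/favourites/2/folder-b
    """
    urls = parse_folder_list(listing)
    # Resume the second folder at its 501st deviation.
    for cmd in build_commands(urls, {urls[1]: 501}):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check the offsets before spending any API requests.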

@MrSeyker
Author

MrSeyker commented Feb 19, 2023

Getting the list of folders doesn't really address my issue, as each folder usually holds under a thousand pictures and I have already backed up their contents (I started the backup process with them).

Now I'm stuck trying to get to the older pictures that were never sorted into folders. I can't even do it manually because they removed a lot of useful tools since the stupid Eclipse update.

Guess I'll have to manually comb the URLs starting from the oldest, then.
