[Deviantart] [Twitter] [Possibly other sites] Downloading a twitter account's main tab sometimes gets the same tweet twice and aborts #1792

Closed
Scripter17 opened this issue Aug 22, 2021 · 3 comments

Comments

@Scripter17
Contributor

Sometimes when downloading a Twitter account, the same tweet gets grabbed twice, causing the extractor to end early. Just now I've seen this happen on a DeviantArt favourites list.
The only solution I can think of is keeping a list of all images the current gallery-dl command processed and, if it encounters any one of them again, don't abort
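
A minimal sketch of this idea, assuming a simplified model in which every file has an ID and a skip happens whenever the file is already on disk. The function `process` and its parameters are hypothetical names for illustration and are not part of gallery-dl:

```python
def process(file_ids, already_on_disk, abort_after):
    """Return the IDs downloaded before the run stops (toy model).

    A file that was already seen earlier in this same run is ignored,
    so it never counts toward the abort limit.
    """
    seen_this_run = set()
    downloaded = []
    consecutive_skips = 0
    for file_id in file_ids:
        if file_id in seen_this_run:
            # Repeat within the current run: neither download nor skip.
            continue
        seen_this_run.add(file_id)
        if file_id in already_on_disk:
            # Present from an earlier run: this is a real skip.
            consecutive_skips += 1
            if consecutive_skips >= abort_after:
                break
        else:
            consecutive_skips = 0
            downloaded.append(file_id)
    return downloaded


# "a" shows up twice in the same run; only "d" exists from a previous run.
result = process(["a", "b", "a", "c", "d"], {"d"}, abort_after=1)
```

In this model the repeated "a" does not stop the run; the run stops only when it reaches "d", which was on disk before it started.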

This has happened before on Twitter, but the only example whose URL I remember is NSFW and takes about 10GB to happen, so you're going to have to take my word on this.

@nisehime

nisehime commented Aug 23, 2021

Most likely it's just a tweet with a quote retweet of the previous one. I've seen this too (here, for example (NSFW)) and it's not really a bug. Either disable quote retweets or increase the abort threshold.

@Doofy420

Doofy420 commented Aug 25, 2021

"quoted": false will result in missing content, because it causes gallery-dl to also skip the actual tweets that have been quote RT'd at any point in the timeline. I forgot why it works this way but it's been discussed here before, I think?

On the flip side, "quoted": true may have gallery-dl downloading self-QRT'd images earlier than it should, causing any --abort value below 5 to trip prematurely when it gets to the original QRT'd tweet.

TL;DR: I find that the only way to get --abort x to play nice with "quoted": true is to set it to 5.
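
The interaction described above can be illustrated with a toy simulation. It is only a sketch under stated assumptions: the timeline is walked newest first, a quoted tweet's files are emitted together with the quoting tweet, a tweet carries at most four images, and the run aborts once the number of consecutive skips reaches the threshold. `simulate` and the tweet IDs are hypothetical, not gallery-dl code:

```python
def simulate(timeline, abort_after):
    """Return (files_downloaded, aborted) for a toy timeline.

    timeline: list of (tweet_id, image_count, quoted_tweet_id or None),
    ordered newest first.
    """
    images = {tweet_id: count for tweet_id, count, _ in timeline}
    on_disk = set()
    consecutive_skips = 0
    for tweet_id, _, quoted_id in timeline:
        # Assumption: the quoted tweet's files come along with the quoting tweet.
        sources = [tweet_id] + ([quoted_id] if quoted_id else [])
        for source in sources:
            for index in range(images[source]):
                file_id = (source, index)
                if file_id in on_disk:
                    consecutive_skips += 1
                    if consecutive_skips >= abort_after:
                        return len(on_disk), True
                else:
                    on_disk.add(file_id)
                    consecutive_skips = 0
    return len(on_disk), False


# t3 quotes the user's own older tweet t1, which has four images.
timeline = [("t3", 1, "t1"), ("t2", 2, None), ("t1", 4, None), ("t0", 1, None)]

for threshold in (1, 4, 5):
    print(threshold, simulate(timeline, threshold))
```

In this model, thresholds 1 through 4 abort at t1 with 7 of the 8 files downloaded, because t1's four images were already fetched through the quote. A threshold of 5 survives the four skips and reaches t0.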

edit: "quoted": false and abort 1 now work properly together after the update, thanks mikf!

mikf added a commit that referenced this issue Aug 26, 2021
When a user quotes his own Tweet and that Tweet gets filtered by
'"quoted": false', it could also get filtered when it appeared later
as regular Tweet.
@mikf
Owner

mikf commented Aug 26, 2021

> The only solution I can think of is keeping a list of all images the current gallery-dl command processed and, if it encounters any one of them again, don't abort

That's actually already implemented via the image-unique option, but I concur with nisehime: just increase the abort threshold.
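
As a rough illustration, the two suggestions could be combined in a configuration like the one generated below. This is a sketch rather than a verified setup: the option names "quoted" and "image-unique" come from this thread, while the "skip": "abort:5" form and the nesting under "extractor" / "twitter" are assumptions about the config file layout that should be checked against the gallery-dl documentation for your version. The threshold 5 is simply the value reported above.

```python
import json

# Assumed layout of a gallery-dl config file; verify against the docs.
config = {
    "extractor": {
        "twitter": {
            "quoted": True,         # keep quote retweets
            "image-unique": True,   # ignore files already seen in this run
            "skip": "abort:5",      # assumed config equivalent of --abort 5
        }
    }
}

config_text = json.dumps(config, indent=4)
print(config_text)
```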

> "quoted": false will result in missing content, because it causes gallery-dl to also skip the actual tweets that have been quote RT'd at any point in the timeline. I forgot why it works this way but it's been discussed here before, I think?

No, that's an actual bug and shouldn't happen. Fixed in ae78d95.

@mikf mikf closed this as completed Sep 4, 2021