Feature Request: Twitter - Download Only Replies to Self #1254

Closed
flaccidbagel opened this issue Jan 18, 2021 · 14 comments

@flaccidbagel

Figure this might be a bit of a stretch, but at the same time figured I'd ask.

Is it possible to have the Twitter extractor download images from replies only when the reply comes from the same username as the originating tweet, and skip replies to other users? Or possibly a reply "whitelist" of sorts, where a file is only downloaded if a specified username is among the users the reply targets?

Noticed several artists would upload large sets as replies to themselves, but they also of course reply to other users with media.

@mikf
Owner

mikf commented Jan 20, 2021

It should be possible with --filter "reply_id and reply_to == author['name']"

Reply tweets have a nonzero reply_id value, which is the tweet id of the tweet it is replying to, and also a reply_to value, which is the screen name of the author of the replied-to tweet.

And you should also use search results rather than the regular or media timeline, i.e. https://twitter.com/search?q=from:USER instead of https://twitter.com/USER or https://twitter.com/USER/media. Maybe the media timeline results work as well, but you are better off doing a search.
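
To make the semantics of reply_id and reply_to concrete, here is a minimal sketch of how the expression behaves. It assumes the filter is a Python expression evaluated with the tweet's metadata fields as variable names. The `matches` helper and the sample dicts are hypothetical and carry only the fields the expression touches; this is not gallery-dl's own code.

```python
# Filter from the comment above: keeps only replies to the tweet's own author.
FILTER = "reply_id and reply_to == author['name']"


def matches(expression, metadata):
    """Evaluate a filter expression with metadata fields as variable names.

    Hypothetical helper for illustration only.
    """
    return bool(eval(expression, {"__builtins__": {}}, dict(metadata)))


# Hypothetical metadata; real tweets carry many more fields.
# The regular tweet leaves out reply_to on purpose: reply_id is 0, so the
# `and` short-circuits before reply_to is ever looked up.
regular_tweet = {"author": {"name": "artist"}, "reply_id": 0}
self_reply = {"author": {"name": "artist"}, "reply_id": 111, "reply_to": "artist"}
reply_to_other = {"author": {"name": "artist"}, "reply_id": 222, "reply_to": "fan"}

for label, tweet in [
    ("regular tweet", regular_tweet),
    ("self-reply", self_reply),
    ("reply to other user", reply_to_other),
]:
    print(f"{label}: {matches(FILTER, tweet)}")
```

Only the self-reply passes. The regular tweet is rejected because its reply_id is zero, and the reply to another user is rejected because reply_to differs from the author's screen name.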

@flaccidbagel
Author

flaccidbagel commented Jan 22, 2021 via email

@arisboch

arisboch commented Aug 3, 2021

How do I put that into the .conf file?

@arisboch

arisboch commented Aug 3, 2021

And --filter "reply_id and reply_to == author['name']" didn't work ;-(

@mikf
Owner

mikf commented Aug 9, 2021

> How do I put that into the .conf file?

With image-filter:

        "twitter":
        {
            "image-filter": "reply_id and reply_to == author['name']"
        }
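
For context, a sketch of where that fragment would sit in a complete config file. It assumes the usual layout in which per-site options are nested under "extractor" and then the site name. The Python wrapper is only there to print well-formed JSON; what gets written to the config file is the printed output.

```python
import json

# Assumed layout: site-specific options live under "extractor" -> "twitter".
config = {
    "extractor": {
        "twitter": {
            # Same expression as the --filter command-line argument.
            "image-filter": "reply_id and reply_to == author['name']",
        }
    }
}

# Serialize to the JSON text that would go into the config file.
config_text = json.dumps(config, indent=4)
print(config_text)
```

Going through json.dumps avoids quoting mistakes, since the filter itself contains single quotes inside a double-quoted JSON string.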

> And --filter "reply_id and reply_to == author['name']" didn't work ;-(

As a command-line option or when writing this into a config file?

@arisboch

@mikf I tried to put that line into my config file, but it didn't work ;-( For example, [this post](https://twitter.com/PrinceCanary/status/1424882701073453056)

yielded this output (and no files were downloaded):

 % gallery-dl https://twitter.com/PrinceCanary/status/1424882701073453056 --verbose
[gallery-dl][debug] Version 1.18.3-dev
[gallery-dl][debug] Python 3.9.5 - Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.33
[gallery-dl][debug] requests 2.25.1 - urllib3 1.26.2
[gallery-dl][debug] Starting DownloadJob for 'https://twitter.com/PrinceCanary/status/1424882701073453056'
[twitter][debug] Using TwitterTweetExtractor for 'https://twitter.com/PrinceCanary/status/1424882701073453056'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): twitter.com:443
[urllib3.connectionpool][debug] https://twitter.com:443 "GET /i/api/2/timeline/conversation/1424882701073453056.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&count=100&ext=mediaStats%2ChighlightedLabel HTTP/1.1" 200 8138
[twitter][debug] Active postprocessor modules: [MetadataPP]

Is that a user error?

mikf added a commit that referenced this issue Aug 10, 2021
Allow setting 'replies' to "self" to only download from self-replies.
@mikf
Owner

mikf commented Aug 10, 2021

@arisboch
The filter expression from #1254 (comment) only matches self-replies and nothing else.
Use not reply_id or reply_to == author['name'] to allow both regular Tweets and replies to the same user.
Alternatively, set the replies option to "self" once the next release comes out. (e5a93e1)
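
The difference between the two expressions is easiest to see side by side. The sketch below restates each one as a plain Python function and applies both to three sample tweets. The function names and the sample data are hypothetical and exist only for this comparison.

```python
def only_self_replies(tweet):
    # Mirrors: reply_id and reply_to == author['name']
    return bool(tweet.get("reply_id")) and tweet.get("reply_to") == tweet["author"]["name"]


def regular_or_self_replies(tweet):
    # Mirrors: not reply_id or reply_to == author['name']
    return not tweet.get("reply_id") or tweet.get("reply_to") == tweet["author"]["name"]


# Hypothetical metadata with only the fields both filters look at.
samples = {
    "regular tweet": {"author": {"name": "artist"}, "reply_id": 0},
    "self-reply": {"author": {"name": "artist"}, "reply_id": 111, "reply_to": "artist"},
    "reply to other user": {"author": {"name": "artist"}, "reply_id": 222, "reply_to": "fan"},
}

results = {
    label: (only_self_replies(tweet), regular_or_self_replies(tweet))
    for label, tweet in samples.items()
}

for label, (strict, relaxed) in results.items():
    print(f"{label}: strict={strict} relaxed={relaxed}")
```

The two filters disagree only on the regular tweet: the first rejects it, the second keeps it. A filter that rejects the originating tweet itself would explain a run that downloads nothing from a single non-reply tweet.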

@arisboch

arisboch commented Aug 11, 2021

@mikf I used both with the newest version of gallery-dl fresh off the repo, and it doesn't download the replies

Example: https://twitter.com/melatonezone/status/1425274586002862083

@mikf
Owner

mikf commented Aug 12, 2021

@arisboch I forgot to mention that you also need to enable the conversations option to get all tweets including replies from a URL to a specific tweet. Sorry about that.
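
As a rough sketch, both options could be passed on the command line with -o KEY=VALUE pairs, the same mechanism as the -o form used elsewhere in this thread. The `build_command` helper is hypothetical, and the sketch assumes -o accepts the values exactly as written here (true for conversations, self for replies). It only assembles and prints the command; it does not run gallery-dl.

```python
import shlex


def build_command(url, options):
    """Build a gallery-dl argument list with one -o KEY=VALUE pair per option.

    Hypothetical helper for illustration only.
    """
    argv = ["gallery-dl"]
    for key, value in options.items():
        argv += ["-o", f"{key}={value}"]
    argv.append(url)
    return argv


url = "https://twitter.com/melatonezone/status/1425274586002862083"
cmd = build_command(url, {"conversations": "true", "replies": "self"})

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

The same two keys could instead be set in the twitter section of the config file, so they apply to every run.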

@arisboch

@mikf Thanks! 🥳

@arisboch

But there's now another problem with some posts: When I try to download this post, it only downloads the post it is a comment to, not the post itself.

@mikf
Owner

mikf commented Aug 13, 2021

Well, that tweet is a reply to a different user and therefore gets filtered by the image-filter settings from #1254 (comment) as well as "replies": "self". I'm not sure how best to work around that. Maybe only have "replies": "self" without image-filter in your config and use -o replies=1 whenever you want to download from a reply tweet.

@arisboch

@mikf The problem with that is that it'll download all replies, which would be a real problem with a sufficiently popular tweet (the "self" option works, though).

@mikf
Owner

mikf commented Dec 3, 2022

With 749802c and 8a70b94, user is now always the user an input URL points to (e.g. berdacs for https://twitter.com/berdacs/status/1425697321036115968). Use that to properly filter out unwanted Tweets instead of relying on "replies": "self".
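
One possible way to use this, sketched under assumptions: the thread does not give a concrete expression, so the filter below is a guess that compares the tweet's author with the user from the input URL. It assumes user carries a name field like author does. The `keep` helper and the sample metadata are hypothetical.

```python
# Assumed expression (not taken from the thread): keep a tweet only if it
# was written by the user the input URL points to.
FILTER = "author['name'] == user['name']"


def keep(expression, metadata):
    """Evaluate a filter expression with metadata fields as variable names.

    Hypothetical helper for illustration only.
    """
    return bool(eval(expression, {"__builtins__": {}}, dict(metadata)))


# The input URL points to berdacs, so `user` is berdacs for every tweet found.
url_user = {"name": "berdacs"}

# Hypothetical tweets from one conversation.
own_reply_to_other = {
    "user": url_user,
    "author": {"name": "berdacs"},
    "reply_id": 1,
    "reply_to": "someone_else",
}
stranger_reply = {
    "user": url_user,
    "author": {"name": "someone_else"},
    "reply_id": 2,
    "reply_to": "berdacs",
}

print("own reply to another user:", keep(FILTER, own_reply_to_other))
print("reply from a stranger:", keep(FILTER, stranger_reply))
```

Under these assumptions, a reply that berdacs wrote to someone else is kept, which the "replies": "self" setting would have dropped, while replies written by other accounts in the same conversation are still skipped.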

@mikf mikf closed this as completed Dec 3, 2022

3 participants