Postprocessors – metadata – how to print to stdout? #2624
Comments
No, that's not possible at the moment.
Also not possible.
I'm guessing this wrapper is not in Python, because if it were, you could access gallery-dl's internals directly and this would be a lot easier.
What do you have in mind? I could possibly implement this directly in gallery-dl itself.
I think it would be the simplest solution to implement. Oh, by the way, how do I print the whole JSON metadata on one line, but prefixed? Suppose I have (for Twitter):
This gives me an error. Also, how do I print all of the metadata as a JSON object, rather than a restricted set of keys? Am I missing some special template like {_filename}?
I think that's fine, as long as they would be each on a new line…
Yup, I use NodeJS. I was already close to releasing my local Twitter viewer (the script recursively walks over
Example right in Elon Musk's profile popup!
In this tweet, note the link target at bottom-left corner of the browser window:
When I found this out, I decided to rework my script to anticipate this nasty Twitter behavior. The hardest part, still unfinished, is the downloader script, which should incrementally add new tweets to the media folder (with metadata) without re-requesting much of what is already stored.
The problem is, I have more than one gallery-dl invocation for a user:
Even though all of those URLs can be specified together in one call, I have to make separate calls anyway:
I cannot use Worse, I cannot use it even because I ask gallery-dl to download the same stuff several times! After Maybe I can set
The main question is: when to stop? A simple solution would be a map (for each URL/call) of already downloaded tweet IDs (as "post" metadata, not per-media) to check against. When there are N consecutive hits against anything in the map, stop. This is probably still susceptible to the retweets problem, at least under some conditions (and the practical workaround would be simply increasing the threshold). So I want to try another approach: store the sequence of tweet IDs in the order they were downloaded, then compare not only the intersection but the order too, optimistically assuming that the same invocation should download the same old tweets in the same sequential order. Yes, I saw your «but this is Twitter we are talking about», but:
I think this approach is too wobbly to implement directly in gallery-dl. Maybe I will test it in my program first (at least I can I planned to make three additional command-line parameters: the filename of the archive/list, the integer threshold, and the format string of post keys: for now I use |
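The two stop heuristics described above (N consecutive hits against already-stored tweet IDs, and the stricter variant that also requires the hits to arrive in the stored order) might be sketched roughly as follows. This is only an illustration under assumptions: the function names, the use of plain string IDs, and the meaning of the return value are hypothetical and are not part of gallery-dl or of the wrapper discussed here.

```python
from typing import Iterable, Optional, Sequence, Set


def stop_index_by_hits(incoming: Iterable[str], known: Set[str],
                       threshold: int) -> Optional[int]:
    """Return the index in `incoming` at which `threshold` consecutive
    already-known IDs have been seen, or None if that never happens."""
    streak = 0
    for i, tweet_id in enumerate(incoming):
        # Any unknown ID resets the streak.
        streak = streak + 1 if tweet_id in known else 0
        if streak >= threshold:
            return i
    return None


def stop_index_by_order(incoming: Iterable[str], stored: Sequence[str],
                        threshold: int) -> Optional[int]:
    """Like stop_index_by_hits, but a hit only extends the streak when it
    directly follows the previous hit in the stored download order."""
    position = {tweet_id: i for i, tweet_id in enumerate(stored)}
    streak = 0
    last_pos: Optional[int] = None
    for i, tweet_id in enumerate(incoming):
        pos = position.get(tweet_id)
        if pos is None:
            # Unknown ID: no streak at all.
            streak, last_pos = 0, None
        elif last_pos is not None and pos == last_pos + 1:
            # Known ID in the expected order: extend the streak.
            streak, last_pos = streak + 1, pos
        else:
            # Known ID, but out of order: start a new streak here.
            streak, last_pos = 1, pos
        if streak >= threshold:
            return i
    return None
```

With a stored sequence of `["50", "40", "30", "20", "10"]`, an out-of-order hit such as an old retweet surfacing early satisfies the plain counter but only restarts the streak in the order-aware variant, which is the behavior the second approach is after.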
Writing metadata to stdout is now possible by setting
This was changed in v1.22.0 (915dba8). You can use
The
This is terrible. I didn't realize Twitter search was that limited. It would be really nice if we could search by user ID, but I haven't found a way to do that with the public website API. According to https://developer.twitter.com/en/docs/twitter-api/tweets/search/integrate/build-a-query#list, this works out-of-the-box with |
Is it possible to set gallery-dl to print metadata to console stdout (or stderr), rather than printing it to a file?
I call gallery-dl from my own program; it would be really nice if I could just capture and reparse its output (grabbing the lines custom-formatted via metadata.content-format and piping everything else through verbatim) instead of making exec postprocessor calls to a tiny utility that just prints its arguments to stdout (so the metadata will appear in my stream).
For example, I tried:
It doesn't work: it creates "-" files with JSON in the target folder, whereas I want it to print to stdout/stderr.
Another possible solution: can gallery-dl optionally append to the metadata file instead of overwriting it? Then I could specify just one constant file path (to metadata.directory?), to which each downloaded item's metadata would be appended. (In that case I would open it in shared mode and watch for changes, reading while gallery-dl writes to it.)
The reason for this is that I want to create a wrapper around gallery-dl that knows "where to stop downloads", using a more sophisticated approach than is currently possible with --abort or --download-archive. So it needs to know what is being processed in real time, even posts without media (as for Twitter with text-tweets).
The option -j prints all metadata, but doesn't download anything…
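The "capture and reparse" idea described above could look roughly like the sketch below on the wrapper side. It assumes that each metadata record arrives as a single line consisting of a marker prefix followed by a JSON object; the `@@META ` prefix and the `split_stream` name are hypothetical, and how gallery-dl is configured to emit such lines is not shown here.

```python
import json
from typing import Any, Iterable, Iterator, Tuple

# Hypothetical marker chosen by the wrapper; not a gallery-dl convention.
META_PREFIX = "@@META "


def split_stream(lines: Iterable[str],
                 prefix: str = META_PREFIX) -> Iterator[Tuple[str, Any]]:
    """Yield ("meta", parsed_json) for prefixed JSON lines and
    ("text", line) for everything else, preserving the input order."""
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(prefix):
            try:
                payload = json.loads(line[len(prefix):])
            except json.JSONDecodeError:
                # Malformed metadata line: pass it through as plain text.
                yield "text", line
            else:
                yield "meta", payload
        else:
            yield "text", line
```

In a real wrapper, `lines` could be the `stdout` of a child process started with `subprocess.Popen(..., stdout=subprocess.PIPE, text=True)`, so that metadata records are handled as they arrive while all other output is forwarded verbatim.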