[Twitter] Download all tweets to .jsons, not tweets with media #2588
Comments
Followed the config syntax in your issue, works like a charm. For the record, I had to specify the config file location with the |
I also wanted to know the width and height of each image/video in the metadata. The only reasonable way I found to achieve that was to use temporary files whose sole job is to hold the media dimensions in their names:
It writes empty files (they actually contain 1 whitespace character) for the sole purpose of letting me parse them later during the directory traversal step, mapping their names to the actual images there. I couldn't find a way to output each image's metadata into the JSON of the post itself; gallery-dl can output media dimensions only to separate jsons.
Since each .json is around 3 KB, while the default NTFS cluster size is 4 KB, there is no point in compressing them whatsoever. But writing just 1 byte to each .tmp file (instead of the entire JSON) results in them taking 0 bytes of disk space, because files that small are kept entirely in the MFT instead of occupying a 4 KB cluster.
Finally, the slowest part of my script was not the directory traversal (with width and height parsing), but opening and reading all of the posts' jsons. So I made a caching mechanism that stores in one big JSON everything I need from all of these small jsons (filtering out the fields I'm not interested in). That was a huge performance win: each subsequent scan reads only the jsons that weren't cached before and drops the cache entries of jsons that no longer exist.
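For anyone who wants to try the same caching idea, here is a minimal Python sketch. It is not the script described above, just one way it could look. The function name `refresh_cache`, the field names in `KEEP_FIELDS`, and the layout (one cache entry per post file name) are all assumptions made for illustration.

```python
import json
from pathlib import Path

# Hypothetical field names; replace with the fields you actually need.
KEEP_FIELDS = ("tweet_id", "date", "content")


def refresh_cache(posts_dir, cache_path, keep_fields=KEEP_FIELDS):
    """Update one big JSON cache from many small per-post JSON files.

    Returns the cache dict and the number of post files read in this run.
    """
    posts_dir = Path(posts_dir)
    cache_path = Path(cache_path)

    # Load the previous cache, if there is one.
    cache = {}
    if cache_path.exists():
        cache = json.loads(cache_path.read_text(encoding="utf-8"))

    # Post files currently on disk (skip the cache file if it lives there too).
    current = {
        p.name: p
        for p in posts_dir.glob("*.json")
        if p.resolve() != cache_path.resolve()
    }

    # Drop entries whose source file no longer exists.
    for name in list(cache):
        if name not in current:
            del cache[name]

    # Read only the files that are not cached yet, keeping selected fields.
    files_read = 0
    for name, path in current.items():
        if name in cache:
            continue
        data = json.loads(path.read_text(encoding="utf-8"))
        cache[name] = {k: data[k] for k in keep_fields if k in data}
        files_read += 1

    cache_path.write_text(json.dumps(cache), encoding="utf-8")
    return cache, files_read
```

Note that this sketch keys the cache on file name only, so a post json that is rewritten in place would not be picked up; storing the modification time alongside each entry would be one way to handle that.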
Create one jsonl file with a valid JSON object on every line; you can tweak the fields: That config (you can extend it) also applies to your question in #2624 (comment):
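Once such a jsonl file exists, reading it back is straightforward. The following is a small Python sketch of a reader, assuming only that every non-empty line is a standalone JSON value; the helper name `iter_jsonl` is made up for this example and is not part of gallery-dl.

```python
import json


def iter_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a jsonl file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                yield json.loads(line)
```

Because it is a generator, the file is processed one line at a time, so a large profile dump does not have to fit in memory all at once.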
Is there a way to download an entire Twitter profile's tweets (including replies) as .json files, similar to what the --write-metadata flag does, but not limited to media tweets? I'm referring to something similar to what a site like https://www.vicinitas.io/free-tools/download-user-tweets does.