|
| 1 | +Archiver |
| 2 | +--- |
| 3 | + |
| 4 | +## ⚠ WARNING |
| 5 | + |
| 6 | +- We cannot guarantee that these features will remain available. |
| 7 | +- When archiving someone else's Twitter account, please obtain their permission first. |
| 8 | +- The structure of the generated data is still being adjusted, so current results may not display correctly in the viewer. |
| 9 | + |
| 10 | + |
| 11 | +## Known issues |
| 12 | +- Unable to crawl most retweets. |
| 13 | +- Unable to crawl tweets marked as sensitive content (TODO: logging in should solve this). |
| 14 | +- Unable to crawl copyrighted media files in some regions. |
| 15 | +- Some downloaded videos are damaged; this is expected. Downloading the corresponding m3u8 instead yields a lower-quality version. |
| 16 | +- Unable to crawl tweets from protected/banned/deleted users. |
| 17 | +- After logging in, the rate limit status will follow the account rather than the guest token (TODO: not implemented yet). |
| 18 | + |
| 19 | +## Features |
| 20 | + |
| 21 | +- User info (not including the authors of quoted tweets). |
| 22 | +- Tweets and replies that can be searched anonymously (most retweets are not included). |
| 23 | +- Polls |
| 24 | +- Avatar, banner, photos and videos. |
| 25 | +- Following and followers list (optional) |
| 26 | +- Keeps raw data for future use. |
| 27 | + |
| 28 | +## TODO |
| 29 | + |
| 30 | +- Space and Broadcast with ffmpeg |
| 31 | +- Login by **COOKIE** |
| 32 | +- Incremental updates for tweets and the followers/following lists |
| 33 | + |
| 34 | +## Init |
| 35 | + |
| 36 | +- Run one of the following commands: |
| 37 | + |
| 38 | + ```shell |
| 39 | + #bash |
| 40 | + bash init.sh <screen_name> # like 'twitter' |
| 41 | + #or powershell |
| 42 | + .\init.ps1 <screen_name> |
| 43 | + ``` |
| 44 | + |
| 45 | + A folder named after the given `screen_name` will be created. If that folder already exists, you will be prompted to delete or rename it. |
| 46 | + |
| 47 | +## Run |
| 48 | + |
| 49 | +### Crawler |
| 50 | + |
| 51 | +```shell |
| 52 | +node archive.mjs [OPTION] |
| 53 | +``` |
| 54 | +|Parameter|Required|Description| |
| 55 | +|:--|:--|:--| |
| 56 | +|--all|Optional|All data (UserInfo, Tweets, Following, Followers)| |
| 57 | +|--followers|Optional|Get Followers| |
| 58 | +|--following|Optional|Get Following| |
| 59 | +|--media|Optional|Get Media| |
| 60 | +|--skip_\<key of argvList \>|Optional|Skips the corresponding job. Valid keys of argvList are `user_info_and_tweets`, `followers`, `following` and `media`.| |
| 61 | + |
| 62 | +### Retry media |
| 63 | + |
| 64 | +```shell |
| 65 | +node retryMedia.mjs |
| 66 | +``` |
| 67 | + |
| 68 | +Attempts to re-download images that failed during crawling. (currently not useful) |
| 69 | + |
| 70 | +## View |
| 71 | + |
| 72 | +The front-end project is still under development; once it is ready, it may be available at <https://github.com/BANKA2017/twitter-archive-viewer>. |
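
As an illustration of the `--skip_<key of argvList>` parameter described under Crawler, the sketch below expands job keys into flag names and assembles a command line. The helper `skipFlags` is hypothetical and not part of the project; only the four keys and the flag pattern come from the parameter table, and how `archive.mjs` actually combines `--all` with skip flags is an assumption here.

```javascript
// Job keys as listed in the parameter table of this README.
const ARGV_LIST_KEYS = ['user_info_and_tweets', 'followers', 'following', 'media']

// Hypothetical helper (not part of the project): map job keys to skip flags.
function skipFlags(keys) {
  return keys.map((key) => {
    if (!ARGV_LIST_KEYS.includes(key)) {
      throw new Error(`unknown argvList key: ${key}`)
    }
    return `--skip_${key}`
  })
}

// Example: request all data but skip the media job (assumed to be a valid combination).
const args = ['archive.mjs', '--all', ...skipFlags(['media'])]
console.log(['node', ...args].join(' '))
// prints "node archive.mjs --all --skip_media"
```

Validating the key before building the flag catches typos such as `--skip_medias` before the crawler is started.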