[Help] gallery-dl output log #3908

Open
cheese529 opened this issue Apr 13, 2023 · 25 comments

Comments

@cheese529

Would it be possible to get gallery-dl to produce some sort of output log of all the links that it downloads? That way I could just search for the word "error" in Notepad and figure out which files failed to download.
The Windows 10 command prompt only keeps so much output. After a few hours of running gallery-dl it runs out of room, so simply doing Ctrl-A, Ctrl-C and Ctrl-V from the cmd window does not work.

@enduser420
Contributor

@cheese529
Author

cheese529 commented Apr 13, 2023

How exactly does this log file work? Is it deleted after every rerun? Also, where is this log file stored?

@skulkexpert

skulkexpert commented Apr 13, 2023

Here's my log file config:

        "logfile": {
            "path": "logs/latest_run.log",
            "mode": "w",
            "format": {
                "debug"  : "[{asctime}][{levelname}][{name}] {message}",
                "info"   : "[{asctime}][{levelname}][{name}]  {message}",
                "warning": "[{asctime}][{levelname}][{name}]  {message} [Source URL: {extractor.url}]",
                "error"  : "[{asctime}][{levelname}][{name}]  {message} [Source URL: {extractor.url}]"
            },
            "format-date": "%Y-%m-%d-%H-%M-%S"
        }

Put this in the "output" section of your gallery-dl .conf file. Since the path is relative, the log file is stored in a directory called "logs" inside the folder from which you run gallery-dl.

It will keep a log of your latest gallery-dl run, but every new run will overwrite this file.

Search for "[error]" in a text editor (I recommend Notepad++, which can give you a nice summary of your search results), and you'll get all of the posts that failed to download.
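
For logs too large to search comfortably in an editor, the same search can be scripted. The following is only a minimal sketch: it assumes the log lines contain the level name in square brackets, as in the format strings above, and the function names `find_problem_lines` and `scan_log` are made up for illustration.

```python
import re

# Matches a bracketed level name such as "[error]" or "[WARNING]".
# Assumption: the log format puts {levelname} in square brackets,
# as in the "format" strings of the config above.
LEVEL_PATTERN = re.compile(r"\[(error|warning)\]", re.IGNORECASE)


def find_problem_lines(lines):
    """Return the log lines whose level is 'error' or 'warning'."""
    return [line.rstrip("\n") for line in lines if LEVEL_PATTERN.search(line)]


def scan_log(path="logs/latest_run.log"):
    """Read a log file and return only its error and warning lines."""
    with open(path, encoding="utf-8", errors="replace") as fp:
        return find_problem_lines(fp)
```

Calling `scan_log()` from the folder in which gallery-dl was run returns the matching lines as a list, which can then be printed or written to a separate file.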

@cheese529
Author

cheese529 commented Apr 13, 2023

@skulkexpert I started using this random one that I found in an old post on here.
What exactly is the difference between yours and mine?

        "logfile": {
            "path": "C:/Logs/gallery-dl/logfile.txt",
            "mode": "a",
            "format": {
                "debug"  : "[{asctime}][{levelname}] {message}",
                "info"   : "[{asctime}][{levelname}] {message}",
                "warning": "[{asctime}][{levelname}] {message} [Source URL: {extractor.url}]",
                "error"  : "[{asctime}][{levelname}] {message} [Source URL: {extractor.url}]"
            },
            "format-date": "%Y-%m-%dT%H:%M:%S",
            "level": "info"
        },

    "output":
    {
        "mode": "auto",
        "progress": true,
        "shorten": true,
        "ansi": false,
        "colors": {
            "success": "1;32",
            "skip"   : "2"
        },
        "skip": true,
        "log": "[{name}][{levelname}] {message}",
        "logfile": null,
        "unsupportedfile": null

@skulkexpert

skulkexpert commented Apr 13, 2023

I started using this random one that I found in an old post on here.
What exactly is the difference between yours and mine?

Here's my full output config:

    "output":
    {
        "mode": "auto",
        "progress": true,
        "shorten": true,
        "logfile": {
            "path": "logs/latest_run.log",
            "mode": "w",
            "format": {
                "debug"  : "[{asctime}][{levelname}][{name}] {message}",
                "info"   : "[{asctime}][{levelname}][{name}]  {message}",
                "warning": "[{asctime}][{levelname}][{name}]  {message} [Source URL: {extractor.url}]",
                "error"  : "[{asctime}][{levelname}][{name}]  {message} [Source URL: {extractor.url}]"
            },
            "format-date": "%Y-%m-%d-%H-%M-%S"
        },
        "unsupportedfile": {
            "path": "unsupported.txt",
            "mode": "a",
            "format": "{asctime} {message}",
            "format-date": "%Y-%m-%d-%H-%M-%S"
        }
    },

The logfile { ... } config needs to be inside the curly brackets of the output config for it to work, as far as I understand it. Your config may not have been working since the logfile setting inside of your output config was set to null instead of containing your actual logfile config.

Other than that, the actual contents of your logfile config are similar to mine. The only real difference is that your logs will be appended to the same file after each run, since you have "mode": "a" (append to file) instead of "mode": "w" (overwrite file). This means that your log file will grow with each run, which can lead to the file size getting out of hand. This may or may not be useful to you. I prefer to overwrite it so that I only have the logs for the latest run.
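
The difference between the two modes can be reproduced outside of gallery-dl. This is a sketch that assumes the "mode" values behave like the file modes of Python's built-in `open()`; the helper `simulate_runs` is hypothetical and only imitates several runs writing to the same file.

```python
import os
import tempfile


def simulate_runs(mode, runs=3):
    """Open the same file once per simulated run using the given mode
    and return the lines the file holds afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "logfile.txt")
        for run in range(1, runs + 1):
            # "a" keeps earlier content, "w" truncates the file first.
            with open(path, mode, encoding="utf-8") as fp:
                fp.write(f"run {run}\n")
        with open(path, encoding="utf-8") as fp:
            return fp.read().splitlines()


print(simulate_runs("a"))  # ['run 1', 'run 2', 'run 3']
print(simulate_runs("w"))  # ['run 3']
```

With "a" every run adds to the file, while with "w" only the output of the last run survives.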

Your timestamp is slightly different from mine. "%Y-%m-%dT%H:%M:%S" will have "T" separating the date and the time, with the units of time being separated by a colon. My timestamp, "%Y-%m-%d-%H-%M-%S", separates all the units with a dash, including the time.
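
To see both timestamp formats side by side, they can be passed to Python's `strftime`, which uses the same `%` directives. The date below is an arbitrary example value, not taken from a real log.

```python
from datetime import datetime

# Arbitrary example moment, chosen only for illustration.
moment = datetime(2023, 4, 13, 18, 5, 9)

iso_like = moment.strftime("%Y-%m-%dT%H:%M:%S")
dashed = moment.strftime("%Y-%m-%d-%H-%M-%S")

print(iso_like)  # 2023-04-13T18:05:09
print(dashed)    # 2023-04-13-18-05-09
```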

The [{name}] part inside my logging format tells me which extractor was being used (like deviantart, etc) for each download.

The path that I have set is going to create a directory inside the folder from which you are calling gallery-dl, while your config has a static path that goes to a predefined location on your C drive. The .log file extension doesn't really change anything; it works like a .txt file.

@GotoConsol

Hi skulkexpert,

I am also interested in the log file.

How do I define its path so that the log file is created in a UserName001_UserId_DateofDownload subfolder inside the Deviantart/UserName001/ folder?

@GotoConsol

GotoConsol commented Apr 14, 2023

Is this OK, or should it be in another format?

"path": "deviantart/{author[username]}_/{author[username]}_[{author[userid]}]/latest_run.log",

@rautamiekka
Contributor

rautamiekka commented Apr 14, 2023

Is this OK, or should it be in another format?

"path": "deviantart/{author[username]}_/{author[username]}_[{author[userid]}]/latest_run.log",

Is the trailing underscore in /{author[username]}_/ supposed to be there? I wouldn't recommend it, because underscores are legitimate in usernames, so the stored name would never match the real username: there would always be an extra character that doesn't exist in it.

I also don't recommend putting the user ID after the username, because the username can be changed, which makes sorting much harder when it does change.

@GotoConsol

GotoConsol commented Apr 14, 2023

With the trailing underscore in /{author[username]}_/ I want to signal that this is a container folder, which contains:

{author[username]} basic gallery folder
{author[username]}_Scraps for Scraps
{author[username]}_Stash for Stash
etc.

and
{author[username]}_Log_[{author[userid]}] for the Log folder

The log folder would mainly exist to carry the user ID in its name. The user ID is quite long, so putting it in the name of the main container folder could make the paths of the actual artwork files too long. Putting it in the name of a rarely used subfolder, such as the log folder, seems practical, so that the user ID is at least recorded in one of the subfolder names.

I plan to sort artists by username. I know that usernames can change, and I find this somewhat confusing too. However, I recognize artists by their usernames, and I like to see the main folders sorted alphabetically by username. Username changes are relatively rare, and I can follow many of them from memory, so user IDs would only serve as a secondary source for tracking possible username changes. I would search for them when I'm curious about a change, and log subfolders with user IDs would serve as good notes/flags for that.

Again, main folder names with user IDs would be too long for me.

@skulkexpert

skulkexpert commented Apr 14, 2023

Is this OK, or should it be in another format?

"path": "deviantart/{author[username]}_/{author[username]}_[{author[userid]}]/latest_run.log",

I am by no means a gallery-dl expert, but I don't think that the logfile can be configured like this. I think I tried to figure something similar out before, but the logfile config is associated with the "output" part of the config, as in my config above, not with individual extractors (deviantart, etc.). You'd have to define a separate log for each of your extractors to make this work, and I'm not sure if this is possible. Maybe something similar to what you want can be done with postprocessors or something? @mikf can you confirm this?

@GotoConsol

Combined with description folders, a folder system like this would be nice.

In a {author[username]}_ main container folder there would be:

{author[username]} basic gallery folder
{author[username]}_Descriptions basic gallery descriptions folder
{author[username]}_Journal for Journal
{author[username]}_Journal_Descriptions Journal descriptions folder
{author[username]}_Scraps for Scraps
{author[username]}_Scraps_Descriptions Scraps descriptions folder
{author[username]}_Stash for Stash
{author[username]}_Stash_Descriptions Stash descriptions folder
{author[username]}_Status for Status
{author[username]}_Status_Descriptions Status descriptions folder
etc

and

{author[username]}_Log_[{author[userid]}] for Log folder

@mikf
Owner

mikf commented Apr 15, 2023

Is this OK, or should it be in another format?
"path": "deviantart/{author[username]}_/{author[username]}_[{author[userid]}]/latest_run.log",

I am by no means a gallery-dl expert, but I don't think that the logfile can be configured like this. I think I tried to figure something similar out before, but the logfile config is associated with the "output" part of the config, as in my config above, not with individual extractors (deviantart, etc.). You'd have to define a separate log for each of your extractors to make this work, and I'm not sure if this is possible. Maybe something similar to what you want can be done with postprocessors or something? @mikf can you confirm this?

This is correct. Logging output can only be configured globally with output.logfile.

To have different settings per site, you could use a wrapper script around gallery-dl that uses different config files depending on the input URL, but this cannot be done with a single config file by setting output options per category.
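
A wrapper along those lines could look like the following sketch. The config file paths and the mapping are hypothetical placeholders; the only gallery-dl feature it relies on is the `--config FILE` command-line option, which loads an additional configuration file, so any default config files are still read as well.

```python
import subprocess
from urllib.parse import urlparse

# Hypothetical mapping from a site's domain to a per-site config file.
SITE_CONFIGS = {
    "deviantart.com": "configs/deviantart.conf",
}
# Hypothetical fallback config for every other site.
DEFAULT_CONFIG = "configs/default.conf"


def pick_config(url):
    """Choose a config file based on the host name of the input URL."""
    host = (urlparse(url).hostname or "").lower()
    for domain, config in SITE_CONFIGS.items():
        if host == domain or host.endswith("." + domain):
            return config
    return DEFAULT_CONFIG


def build_command(url):
    """Build the gallery-dl command line for a single URL."""
    return ["gallery-dl", "--config", pick_config(url), url]


def run(url):
    """Run gallery-dl for the URL and return its exit code."""
    return subprocess.run(build_command(url), check=False).returncode
```

Each per-site config file can then carry its own output.logfile block with a different path.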

@GotoConsol

mikf, could you please code custom configurable output options / logfiles per category into gallery-dl?

@GotoConsol

rautamiekka,

I would like to ask you two things. In issue #2620 you included a config file:

What does the compare postprocessor do?
What does the "action": "enumerate" option do in it?
What other settings exist for "action"?

            {
                "name": "compare",
                "action": "enumerate"
            },

Why do you use the .json extension for posts?
Is it recommended, the default, or just practical?
Which application is best suited for opening a post.json to take a look at it?

            {
                "name": "metadata",
                "mode": "post",
                "extension-format": "post.json"
            }

@cheese529
Author

@skulkexpert I actually downloaded a few things with it written just like this, so I guess the config was working and I don't have to change anything? Or should I still change it? Also, thank you very much for explaining all of that in detail, I appreciate it so much :)

The logfile { ... } config needs to be inside the curly brackets of the output config for it to work, as far as I understand it. Your config may not have been working since the logfile setting inside of your output config was set to null instead of containing your actual logfile config.

@mikf
Owner

mikf commented Apr 17, 2023

@GotoConsol

mikf, could you please code custom configurable output options / logfiles per category into gallery-dl?

No, at least not before v2.0

what does compare postprocessor do?
... etc ...

Reading (and understanding) the docs helps.
https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst

@GotoConsol

GotoConsol commented Apr 17, 2023

@mikf

Why can custom configurable output options / logfiles per category not be added before v2.0?

@skulkexpert

@skulkexpert I actually downloaded a few things with it written just like this so I guess the config was working and I don't have to change anything ? Or should I still change it? Also thank you very very much for explaining all of that in detail, I appreciate it so much :)

The logfile { ... } config needs to be inside the curly brackets of the output config for it to work, as far as I understand it. Your config may not have been working since the logfile setting inside of your output config was set to null instead of containing your actual logfile config.

No problem :)

If it works the way you want it to, you probably don't have to change anything.

Though, if this behavior changes, you may want to look into it. I'm not sure if this is a strict requirement, but that's how most JSON configurations work. You may have run gallery-dl with the --write-log FILE option, or maybe you got some default version of the logs?

Either way, I think that it's best to keep it organized and have the logfile config inside the output config, to ensure that it will always work. Compare your gallery-dl config file to the example given in the repository if you get some error message.
https://github.com/mikf/gallery-dl/blob/master/docs/gallery-dl.conf
or
https://github.com/mikf/gallery-dl/blob/master/docs/gallery-dl-example.conf
Make sure that you don't have a missing comma or a duplicate "output" somewhere.
Always keep a copy of your original config somewhere so you can go back to it if you need to.
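
Both kinds of mistake can also be checked mechanically. The sketch below uses only Python's standard `json` module and treats the config as plain JSON; `load_config_strict` is a hypothetical helper name, not part of gallery-dl.

```python
import json


def load_config_strict(text):
    """Parse JSON text and raise ValueError on syntax errors
    (such as a missing comma) or on duplicate keys in any object."""

    def reject_duplicates(pairs):
        seen = {}
        for key, value in pairs:
            if key in seen:
                raise ValueError(f"duplicate key: {key!r}")
            seen[key] = value
        return seen

    # json.JSONDecodeError is a subclass of ValueError, so syntax
    # errors and duplicate keys can be caught the same way.
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

Reading the config file into a string and passing it to `load_config_strict` either returns the parsed settings or raises an error that names the problem. Python's plain `json.loads` would silently keep only the last of two duplicate "output" blocks, which is why the extra check is useful.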

As mentioned above, you can also read up on what the different settings mean in the documentation (though it can be pretty technical).
https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst

If you are wondering about anything else involving gallery-dl configs, feel free to ask. I may not be the most skilled at using it, but I have messed around with the config a bit. And there are likely more knowledgeable people here who are willing to help, too.

@GotoConsol

I could find the compare postprocessor in
(https://github.com/mikf/gallery-dl/blob/master/docs/gallery-dl.conf)

but I couldn't figure out why the .json extension for posts is recommended or the default.

@rautamiekka
Contributor

What does the compare postprocessor do? What does the "action": "enumerate" option do in it? What other settings exist for "action"?

            {
                "name": "compare",
                "action": "enumerate"
            },

https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst#compareaction

When the same file exists, we add a number to the name to easily see duplicates. The default is "replace".

Why do you use the .json extension for posts? Is it recommended, the default, or just practical? Which application is best suited for opening a post.json to take a look at it?

            {
                "name": "metadata",
                "mode": "post",
                "extension-format": "post.json"
            }

https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst#metadatamode

Although I can't find any mention of the post mode for the metadata postprocessor from current code (GitHub's search feature is stupid and highly useless, almost like a full-text search with or without wildcards is too much to implement), it seems that the else statement in the Python code makes all other modes than the around half a dozen legit ones create a JSON file, if one doesn't exist, of the same JSON data that gdl uses. The JSON contains a lot of valuable info for later when needed, so I write it; it's already saved me a few times since its introduction to my config file.

A .json is pure text just like a .txt (normally), so it doesn't matter what you open it with. The real difference comes from whether and how well the app can highlight the code.
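
If the file was written on a single line, re-indenting it makes it much easier to read in any editor. A small sketch using Python's standard `json` module; the function name `pretty_json` is just an illustrative choice.

```python
import json


def pretty_json(text):
    """Return the JSON text re-indented with sorted keys for easier reading."""
    return json.dumps(json.loads(text), indent=4, ensure_ascii=False, sort_keys=True)


print(pretty_json('{"b": 1, "a": [1, 2]}'))
```

Python also ships a command-line equivalent, `python -m json.tool FILE`, which prints the formatted contents of a JSON file.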

@GotoConsol

GotoConsol commented Apr 19, 2023

@rautamiekka

Thank you for your answer. Could you please give me examples of the information that you wanted to get from post.json?

@cheese529
Author

cheese529 commented May 8, 2023

@skulkexpert So it turns out my log file is emptying itself when it gets too large. Here's the current config:

        },
        "logfile": {
            "path": "C:/Logs/gallery-dl/logfile.txt",
            "mode": "a",
            "format": {
                "debug"  : "[{asctime}][{levelname}] {message}",
                "info"   : "[{asctime}][{levelname}] {message}",
                "warning": "[{asctime}][{levelname}] {message} [Source URL: {extractor.url}]",
                "error"  : "[{asctime}][{levelname}] {message} [Source URL: {extractor.url}]"
            },
            "format-date": "%Y-%m-%dT%H:%M:%S",
            "level": "info"
        },

How do I edit this to make it never empty the log file, no matter how large it gets?

@Hrxn
Contributor

Hrxn commented May 9, 2023

Huh?

I mean, you could always change "level" to something different, like "error" for example, but I have some doubts that this is actually the issue here?

What do you mean, your logfile is emptying itself? And what is too large?
I have a huge logfile myself occasionally, and I've never seen anything like that.

@mikf
Owner

mikf commented May 11, 2023

it turns out my log file is emptying itself when it gets too large

It's not gallery-dl that's emptying your log file depending on its size.
There's no builtin functionality to do something like this.

@cheese529
Author

You are correct: after setting up the folder and path again, there seems to be no issue anymore. SIDENOTE: would it be possible to log the event of gallery-dl coming across a duplicate and skipping the download? Reason for this: #3960 (comment)


7 participants