Skip to content

chore: log failure to run CLI command#12863

Merged
Stebalien merged 1 commit intomasterfrom
masih/log-failure-to-start
Feb 4, 2025
Merged

chore: log failure to run CLI command#12863
Stebalien merged 1 commit intomasterfrom
masih/log-failure-to-start

Conversation

@masih
Copy link
Copy Markdown
Member

@masih masih commented Feb 3, 2025

Related Issues

Proposed Changes

Prior code only conditionally logged the output when LOTUS_DEV env var was present. This doesn't make sense for a number of reasons:

  • In case of an error starting up lotus daemon via systemctl for example the logs would miss the error. One then has to search through system journal to see what happened.

  • There are no other places that I can find where LOTUS_DEV env var is used. The commit that introduced this condition is very old with no clear commit message to shed light into the rationale.

For the first reason alone, the changes here remove that condition, and log the message when the logging output is not standard err or out. This way we at least cover the case where lotus is run in a production environment and would avoid partial logging of errors.

Additional Info

Checklist

Before you mark the PR ready for review, please make sure that:

@masih masih added the skip/changelog This change does not require CHANGELOG.md update label Feb 3, 2025
@masih masih requested review from Stebalien and rvagg February 3, 2025 13:56
@Stebalien
Copy link
Copy Markdown
Member

If I run a random CLI command and it fails, will I now see the error twice? Once as a log line, and once as a message?

@masih
Copy link
Copy Markdown
Member Author

masih commented Feb 3, 2025

If I run a random CLI command and it fails, will I now see the error twice? Once as a log line, and once as a message?

Depends on what the logging configuration is. For example, log could be going to a file. But certainly possible for one to see it twice.

I am happy to acomodate an alternative way to do this, as long as: Logs capture the exit error of the application (in this case lotus daemon). As an operator, discovering that it does not was a big surprise to me.

@Stebalien
Copy link
Copy Markdown
Member

I'm not sure the best way to solve this. Right now, the user gets the error on the console twice where the error log is pretty verbose. E.g., running lotus send f01010101 10 outputs:

2025-02-03T18:22:30.931-0800 ERROR cli cli/helper.go:49 Failed to start application {"err": "could not get API info for FullNode: repo directory does not exist. Make sure your configuration is correct", "errVerbose": "could not get API info for FullNode:\n github.com/filecoin-project/lotus/cli/util.GetRawAPIMulti\n /home/steb/src/github.com/filecoin-project/lotus/cli/util/api.go:152\n - repo directory does not exist. Make sure your configuration is correct"}
ERROR: could not get API info for FullNode: repo directory does not exist. Make sure your configuration is correct

Where it would otherwise output just:

ERROR: could not get API info for FullNode: repo directory does not exist. Make sure your configuration is correct


However, I do get the desire to see failures in lotus daemon etc. show up in the log. Options I can see are:

  1. Log if and only if the log output is also being sent to a file or URL.
  2. Log the error from lotus daemon only.

I'd prefer the first and that should be as simple as calling go-log.GetConfig() and checking if either config.File or config.URL are non-empty.

(IMO, if you're using systemd, you should generally just dump everything to systemd and we should probably have better support for syslog log levels, but that's a different can of worms)

@masih
Copy link
Copy Markdown
Member Author

masih commented Feb 4, 2025

(IMO, if you're using systemd, you should generally just dump everything to systemd and we should probably have better support for syslog log levels, but that's a different can of worms)

Indeed. Fiddling around with GetConfig seems hacky but it's the path of least resistance. The root cause IMHO is the duality of CLI output. I would handle it all with logging, and have a custom formatter settable by global flags with a reasonable default for the common case.

Prior code only conditionally logged the output when `LOTUS_DEV` env
var was present. This doesn't make sense for a number of reasons:

* In case of an error starting up `lotus daemon` via systemctl for
  example the logs would miss the error. One then has to search through
  system journal to see what happened.

* There are no other places that I can find where `LOTUS_DEV` env var is
  used. The commit that introduced this condition is very old with no
  clear commit message to shed light into the rationale.

For the first reason alone, the changes here remove that condition, and
log the message when the logging output is not standard err or out. This
way we at least cover the case where lotus is run in a production
environment and would avoid partial logging of errors.
@masih masih force-pushed the masih/log-failure-to-start branch from fb6a8e8 to 8242aa6 Compare February 4, 2025 10:34
@masih
Copy link
Copy Markdown
Member Author

masih commented Feb 4, 2025

@Stebalien This is ready for another look whenever you get a chance. Thanks

@Stebalien Stebalien merged commit bb70977 into master Feb 4, 2025
@Stebalien Stebalien deleted the masih/log-failure-to-start branch February 4, 2025 16:43
masih added a commit that referenced this pull request Feb 6, 2025
Log failure to run CLI command

Prior code only conditionally logged the output when `LOTUS_DEV` env
var was present. This doesn't make sense for a number of reasons:

* In case of an error starting up `lotus daemon` via systemctl for
  example the logs would miss the error. One then has to search through
  system journal to see what happened.

* There are no other places that I can find where `LOTUS_DEV` env var is
  used. The commit that introduced this condition is very old with no
  clear commit message to shed light into the rationale.

For the first reason alone, the changes here remove that condition, and
log the message when the logging output is not standard err or out. This
way we at least cover the case where lotus is run in a production
environment and would avoid partial logging of errors.

(cherry picked from commit bb70977)
shobit000 pushed a commit to shobit000/lotus that referenced this pull request Feb 7, 2025
Log failure to run CLI command

Prior code only conditionally logged the output when `LOTUS_DEV` env
var was present. This doesn't make sense for a number of reasons:

* In case of an error starting up `lotus daemon` via systemctl for
  example the logs would miss the error. One then has to search through
  system journal to see what happened.

* There are no other places that I can find where `LOTUS_DEV` env var is
  used. The commit that introduced this condition is very old with no
  clear commit message to shed light into the rationale.

For the first reason alone, the changes here remove that condition, and
log the message when the logging output is not standard err or out. This
way we at least cover the case where lotus is run in a production
environment and would avoid partial logging of errors.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skip/changelog This change does not require CHANGELOG.md update

Projects

Status: ☑️ Done (Archive)

Development

Successfully merging this pull request may close these issues.

2 participants