Skip to content

server: add --enable-core-dump flag#15245

Merged
mattklein123 merged 44 commits intoenvoyproxy:mainfrom
rgs1:prctl-set-dumpable
Mar 6, 2021
Merged

server: add --enable-core-dump flag#15245
mattklein123 merged 44 commits intoenvoyproxy:mainfrom
rgs1:prctl-set-dumpable

Conversation

@rgs1
Copy link
Member

@rgs1 rgs1 commented Mar 1, 2021

This ensures Envoy can core dump when the dumpability bit might have
been unset (e.g.: running inside a container with fewer capabilities
than the ones Envoy itself has).

Fixes #15242.

Signed-off-by: Raul Gutierrez Segales rgs@pinterest.com

This ensures Envoy can core dump when the dumpability bit might have
been unset (e.g.: running inside a container with fewer capabilities
than the ones Envoy itself has).

Fixes envoyproxy#15242.

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
API shepherd assignee is @lizan
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #15245 was opened by rgs1.

see: more, trace.

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
@rgs1
Copy link
Member Author

rgs1 commented Mar 1, 2021

I've got --enable-core-dump running internally, works great 👌

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
@davinci26
Copy link
Member

@rgs1 can you open an issue or add in the existing issue a Windows label so that we know that we have to implement this feature on windows

@rgs1
Copy link
Member Author

rgs1 commented Mar 1, 2021

@rgs1 can you open an issue or add in the existing issue a Windows label so that we know that we have to implement this feature on windows

What's the exact problem for Windows? Afaik Windows doesn't have capabilities nor a dumpability bit...

@rgs1
Copy link
Member Author

rgs1 commented Mar 1, 2021

@rgs1 can you open an issue or add in the existing issue a Windows label so that we know that we have to implement this feature on windows

What's the exact problem for Windows? Afaik Windows doesn't have capabilities nor a dumpability bit...

Which makes me wonder if we should hide via ifdefs the --enable-core-dump command line flag for Windows (and other platforms) without capabilities and this core dumping issue...

cc: @mattklein123 for thoughts

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
@davinci26
Copy link
Member

davinci26 commented Mar 1, 2021

What's the exact problem for Windows? Afaik Windows doesn't have capabilities nor a dumpability bit...

We have flag that is not working for one platform so we should probably document it.

Additionally windows can generate memory dumps from failing applications https://docs.microsoft.com/en-us/windows/client-management/generate-kernel-or-complete-crash-dump (the same applies for user mode applications)

edit: I am not sure if it makes sense to push this functionality to envoy codebase or to the docker image ( see docs). IMHO the latter makes sense more to me but either one should be doable

@rgs1
Copy link
Member Author

rgs1 commented Mar 1, 2021

What's the exact problem for Windows? Afaik Windows doesn't have capabilities nor a dumpability bit...

We have flag that is not working for one platform so we should probably document it.

Yeah I agree, I actually wonder if we should hide the flag entirely from platforms for which it isn't implemented.

Additionally windows can generate memory dumps from failing applications https://docs.microsoft.com/en-us/windows/client-management/generate-kernel-or-complete-crash-dump (the same applies for user mode applications)

Ok, but is this already working out of the box for Windows or is there anything else to be done? In fairness, this --enable-core-dump is not the entire truth for Linux given that it still won't be enough if you didn't set ulimit correctly and furthermore it isn't really needed if you didn't do anything that won't unset the dumpable bit of the process. Hmm, maybe it should be named --enable-dumpable-bit to be more precise.

edit: I am not sure if it makes sense to push this functionality to envoy codebase or to the docker image ( see docs). IMHO the latter makes sense more to me but either one should be doable

You are talking about the Windows case not this PR for the Linux case?

@davinci26
Copy link
Member

You are talking about the Windows case not this PR for the Linux case?

Windows case

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree that if an option only applies to certain platforms we should document it as such and only expose it on those platforms, optimally without a ton of different ifdefs (e.g. using platform capabilities). Thanks for working on this.

/wait

* fix/improve cli doc
* introduce linux/platform_impl.cc for Linux specific prctl calls
* enabling core dump now happens in MainCommonBase instead of InstanceImpl,
  because we need access to PlatformImpl
* also, move PlatformImpl from MainCommon to MainCommonBase
* added some bazel helpers to distinguish Linux from the rest of Posix

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Raul Gutierrez Segales added 11 commits March 2, 2021 12:07
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Remove outer calls to envoy_cc_platform_dep, which means we
cannot properly pick the Linux platform.

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
This should fix the coverage issue and catch any regressions
if our kernel friends were to ever break prctl().

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Raul Gutierrez Segales added 3 commits March 4, 2021 16:40
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
* check that enableCoreDump() was called
* check the logs for the expected warn entry

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Raul Gutierrez Segales added 7 commits March 4, 2021 17:38
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
This probably is enough and we don't need to additionally use
`config_settings_group` to group things further.

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Fix
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
@rgs1
Copy link
Member Author

rgs1 commented Mar 5, 2021

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit isn't fully completed, but will still attempt retrying.
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #15245 (comment) was created by @rgs1.

see: more, trace.

@rgs1
Copy link
Member Author

rgs1 commented Mar 5, 2021

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit isn't fully completed, but will still attempt retrying.
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #15245 (comment) was created by @rgs1.

see: more, trace.

Raul Gutierrez Segales added 5 commits March 5, 2021 09:20
Keep ref instead of copying.

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks LGTM with small comment.

/wait

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@repokitteh-read-only repokitteh-read-only bot removed the api label Mar 5, 2021
@mattklein123 mattklein123 merged commit 4f1ec4f into envoyproxy:main Mar 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

supporting core dumps when Envoy has more capabilities

4 participants