Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix panics on failed initialization #165

Merged
merged 6 commits into from
Jan 10, 2024
Merged

Fix panics on failed initialization #165

merged 6 commits into from
Jan 10, 2024

Conversation

sosthene-nitrokey
Copy link
Contributor

@sosthene-nitrokey sosthene-nitrokey commented Jan 4, 2024

This patch

  • adds a Result version of the Device static so that any call can first check that the initialization was successful
  • improves the deserialization of certificate fingerprints to make it fail properly rather than panic
  • Use the CertStore add_parseable_certificates to discard unparseable certificates rather than panic. If no certificate is parsed, the initialization returns an error.

fix #161

@sosthene-nitrokey sosthene-nitrokey marked this pull request as ready for review January 4, 2024 13:52
Copy link

codecov bot commented Jan 4, 2024

Codecov Report

Attention: 34 lines in your changes are missing coverage. Please review.

Comparison is base (43da14b) 90.74% compared to head (4157673) 90.73%.

❗ Current head 4157673 differs from pull request most recent head ff07b5e. Consider uploading reports for the commit ff07b5e to get more accurate results

Files Patch % Lines
pkcs11/src/config/config_file.rs 84.21% 9 Missing ⚠️
pkcs11/src/api/mod.rs 78.26% 5 Missing ⚠️
pkcs11/src/api/object.rs 81.81% 4 Missing ⚠️
pkcs11/src/api/token.rs 88.88% 4 Missing ⚠️
pkcs11/src/config/initialization.rs 94.73% 4 Missing ⚠️
pkcs11/src/backend/events.rs 66.66% 2 Missing ⚠️
pkcs11/src/backend/key.rs 50.00% 2 Missing ⚠️
pkcs11/src/backend/mod.rs 0.00% 2 Missing ⚠️
pkcs11/src/backend/slot.rs 86.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #165      +/-   ##
==========================================
- Coverage   90.74%   90.73%   -0.01%     
==========================================
  Files          31       31              
  Lines        6158     6436     +278     
==========================================
+ Hits         5588     5840     +252     
- Misses        570      596      +26     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@robin-nitrokey robin-nitrokey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a step in the right direction, but I would prefer to combine this with a general improvement of the initialization steps:

Generally, we can expect that all other calls are wrapped in a pair of C_Initialize/C_Finalize calls. (Except for C_GetFunctionList, which can be called at any time.)

As far as I see, this implementation would still allow other calls without C_Initialize, and initialization errors would cause an unclear general error. On the other hand, C_GetFunctionList would unnecessarily fail if there is a problem with the configuration. Also, it is easy to miss an ensure_init() call, then access DEVICE and still cause a panic.

My suggestion would be to only initialize the device in C_Initialize and return CKR_FUNCTION_FAILED if it fails. All other calls should check if initialization has been performed (and not finalized) and return CKR_CRYPTOKI_NOT_INITIALIZED instead. Accessing DEVICE directly should not panic in the error case, but also return CKR_CRYPTOKI_NOT_INITIALIZED. C_Finalize should reset the device.

This would lead to better error messages, consistent behavior if a C_Initialize call is missing, and it would reduce the chance of programming errors when accessing DEVICE.

What do you think?

@sosthene-nitrokey
Copy link
Contributor Author

I would like to change the approach so that the statics are only used in the top level calls, and every thing else takes a &<relevant static> but that probably makes sense in another PR IMHO.

But so that I correctly understand your approach, you would make the Device static a Mutex<Option<...>> that is Some between an Initialize call and a Finialize call.

That could work but I'm not sure how much it would work in a multithreaded context. I'd need to learn how PKCS11 deals with that. Given the weird nginx workarournds and #157 I think this can be approached in another PR. Maybe in this one I can remove the ensure_init! for functions that are expected to be infallible.

@robin-nitrokey
Copy link
Member

An alternative approach would be to use OnceLock and only initialize the device on the first C_Initialize call. I don’t think we are required to free all resources on finalize, or to re-initialize everything on each initialize call.

I’m fine with splitting this up into a separate PR, but I’d still like to change the error codes and the direct DEVICE access in this PR. Specifically:

  • In C_Initialize, return FUNCTION_FAILED in the Err case.
  • In all other functions (i. e. the ensure_init macro), return NOT_INITIALIZED in the Err case.
  • Use DEVICE_RESULT instead of DEVICE (or just replace DEVICE with DEVICE_RESULT). In the Err case, return NOT_INITIALIZED.

That would mean that calls after C_Finalize could still work, but it would handle the other cases without introducing big changes.

@sosthene-nitrokey
Copy link
Contributor Author

Ok, I will do that then, but I think some uses of DEVICE are in infallible functions.

@robin-nitrokey
Copy link
Member

Thanks! In that case, having an explicit panic is IMHO still preferable to an indirect panic through accessing a lazy static.

@sosthene-nitrokey
Copy link
Contributor Author

Done, the non-tested lines are all error handling that should never happen.

@ansiwen
Copy link
Collaborator

ansiwen commented Jan 5, 2024

Can you confirm that this still works with forks, that is, if the process that dlopened this module forks and then calls C_Initialize? (for example the tokio runtime can't handle that, see tokio-rs/tokio#1541, so there are dragons luring). One of the issues is: you can't cleanly fork if there are threads created already.

@sosthene-nitrokey
Copy link
Contributor Author

sosthene-nitrokey commented Jan 5, 2024

Do you have an example test for that from the previous errors you and Nils encountered? It would be nice to have it in CI because that would be broken easily on accident.

This will be especially relevant as I plan on tackling #157 next.

@sosthene-nitrokey
Copy link
Contributor Author

I don't think anything I've done in this PR should change how it interacts with fork(). After a fork() the static should keep the same values, which means that the initialization should be the same.

@ansiwen
Copy link
Collaborator

ansiwen commented Jan 5, 2024

Ok, maybe this is not relevant because I mix it up with the LazyStatic that is mentioned in issue #163. I just wanted to throw it in, in case it's related. The only example I have is running an actual nginx, that's where we saw the issues. I'm not aware of a test, that @nponsard wrote for that, but would certainly be useful. It would require loading the module with dlopen(), then doing a fork, and then testing the functionality of the module in the fork.

@nponsard
Copy link
Contributor

nponsard commented Jan 5, 2024

I did not write any test for that, when I was investigatig the forking issue I was running the nginx example.

@sosthene-nitrokey
Copy link
Contributor Author

Just tested on top of the images in #169 , it works with nginx. Now testing with apache.

@sosthene-nitrokey
Copy link
Contributor Author

Both work, with a slight need for an update of the apache image which is incorrect in #169

Copy link
Member

@robin-nitrokey robin-nitrokey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

This patch adds a `Result` version of the `Device` static
so that any call can first check that the initialization was successful.
This removes panic paths in the initialization of the device.
return CKR_CRYPTOKI_NOT_INITIALIZED  in other calls when not initialized
@sosthene-nitrokey sosthene-nitrokey merged commit 457fc9a into main Jan 10, 2024
3 checks passed
@sosthene-nitrokey sosthene-nitrokey deleted the file-panic branch January 10, 2024 13:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve Handling of incorrect p11nethsm.conf
4 participants