Hasher::new_derive_key accepts a str, why not [u8]? #13
Comments
You could also change the signature to `pub fn new_derive_key<T: AsRef<[u8]>>(context: T) -> Self`, which would allow `Vec`s and slices. |
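To make the suggestion above concrete, here is a minimal sketch of what an `AsRef<[u8]>` bound accepts. The function `context_bytes` is a hypothetical stand-in written for this illustration; it is not part of the blake3 crate and does no hashing.

```rust
// Hypothetical stand-in for a constructor with an `AsRef<[u8]>` bound.
// It only copies the bytes it was given, so the example stays self-contained.
fn context_bytes<T: AsRef<[u8]>>(context: T) -> Vec<u8> {
    context.as_ref().to_vec()
}

fn main() {
    // String slices, owned strings, byte string literals, and vectors all satisfy the bound.
    let from_str = context_bytes("ctx");
    let from_string = context_bytes(String::from("ctx"));
    let from_byte_literal = context_bytes(b"ctx");
    let from_vec = context_bytes(vec![b'c', b't', b'x']);
    assert!(from_str == from_string && from_string == from_byte_literal);
    assert!(from_byte_literal == from_vec);

    // The bound also accepts bytes that are not valid UTF-8.
    let arbitrary = context_bytes(&[0xff_u8, 0xfe][..]);
    println!("{:?} {:?}", from_str, arbitrary);
}
```

Note that the bound is permissive in both directions: it keeps string literals working, but it also admits runtime-built byte buffers, which is the trade-off the rest of the thread discusses.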
@Luro02 That's also almost a non-breaking change, but there are cases which would break (i.e. |
The way
const XYZ_CONTEXT_STRING: &str = "foo bar app 2020-01-10 14:54:14 purpose xyz";
fn purpose_xyz_subkey(root_key: &[u8; 32]) -> [u8; 32] {
    let mut out = [0; 32];
    blake3::derive_key(XYZ_CONTEXT_STRING, root_key, &mut out);
    out
}
Of course not everyone is going to follow that exact format for the context string, I'm not kidding myself, but the main thing is that I've picked something that no one else will pick and that I've hardcoded it. In particular,
fn purpose_xyz_user_subkey(root_key: &[u8; 32], user_id: &str) -> [u8; 32] {
    let mut out = [0; 32];
    let context_string = format!("foo bar app purpose xyz {}", user_id);
    blake3::derive_key(&context_string, root_key, &mut out); // PLEASE DO NOT DO THIS
    out
}
Why not? The simplest reason is that some other programmer in the same application might decide to use
But beyond avoiding "obvious" mistakes,
What we'd rather say is something like "you don't have to worry about it, as long as you and the FooBar team are both using
So anyway, what does this have to do with
One objection might be that not all programming languages represent strings as UTF-8 bytes, and that this is going to be a burden on e.g. C# developers. I agree that it's a burden, but I think we'd have the same problem if we were taking bytes, except the problem would fall on every caller instead of on library/bindings authors. Note that
So, with apologies for asking you to read all that, what use cases do you have in mind for using arbitrary bytes in the context string? |
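As an illustration of why a context string built from input is risky, the sketch below shows one possible collision. It is an assumed scenario, not necessarily the one the comment above had in mind: the constant `ADMIN_CONTEXT` and the helper `user_context` are hypothetical, and the example only builds the strings without calling blake3.

```rust
// Hypothetical: a second programmer hardcodes this context for an unrelated admin key.
const ADMIN_CONTEXT: &str = "foo bar app purpose xyz admin";

// Hypothetical: the per-user context from the "PLEASE DO NOT DO THIS" example,
// reduced to the string-building step.
fn user_context(user_id: &str) -> String {
    format!("foo bar app purpose xyz {}", user_id)
}

fn main() {
    // A user who registers the ID "admin" ends up with the same context string
    // as the hardcoded admin purpose, so the two contexts are no longer distinct.
    let collides = user_context("admin") == ADMIN_CONTEXT;
    println!("collision: {}", collides);

    // Other user IDs happen not to collide, which is what makes the bug easy to miss.
    println!("collision: {}", user_context("alice") == ADMIN_CONTEXT);
}
```

With a hardcoded context, any per-user value would instead go into the key material, so distinct purposes stay distinct regardless of what users supply.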
@oconnor663 https://docs.rs/merlin uses |
fn f(_: &str) {}
fn g(_: impl AsRef<[u8]> + 'static) {}
fn main() {
    f("aaa");
    g("aaa");
    g(b"aaa");
    //g(&"aaa".to_string());
    //---^^^^^^^^^^^^^^^^^-- temporary value is freed at the end of this statement
    // | | |
    // | | creates a temporary which is freed while still in use
    // | argument requires that borrow lasts for `'static`
}
|
Another option would be to use a doc-hidden function as a hidden constructor and use a macro which binds in the crate name or module path.
macro_rules! context {
    ($tt:tt) => {
        $crate::do_not_call_me_manually(
            concat!(env!("CARGO_PKG_NAME"), " or ", module_path!(), ": ", $tt).as_bytes()
        )
    }
}
struct SafeContextStr(&'static [u8]);
#[doc(hidden)]
fn do_not_call_me_manually(s: &'static [u8]) -> SafeContextStr {
    SafeContextStr(s)
}
fn f(_: SafeContextStr) {}
fn main() {
    f(context!("foobar"));
}
|
Returning to your question. I'd rather expose
Edit: I see @Luro02 mentioned vectors; that'd be a bad example but |
I've used
const COMPANY_NAME: &str = "example.com";
const PURPOSE_A: &str = "2020-01-13 16:17:46 purpose A";
const PURPOSE_B: &str = "2020-01-13 16:17:47 purpose B";
let root_key = b"my secret key";
let mut subkey_a = [0; 32];
blake3::derive_key(&format!("{} {}", COMPANY_NAME, PURPOSE_A), root_key, &mut subkey_a);
let mut subkey_b = [0; 32];
blake3::derive_key(&format!("{} {}", COMPANY_NAME, PURPOSE_B), root_key, &mut subkey_b);
Those are "dynamic" context strings in a kind of pedantic sense, but they don't contain any sort of input, which is what really matters. It wouldn't be my first recommendation, mainly because seeing
Another consideration is that the patterns in this library will have a big influence on future implementations of BLAKE3 in other languages. If we use |
Concatenating arrays does work with const-generics.
const SUBKEY_A = StaticVec::new_from_const_array(COMPANY_NAME)
    .concat(StaticVec::new_from_const_array(PURPOSE_A));
The resulting
Alternatively, without const-generics you could use macros for this.
macro_rules! define {
    ($name:ident = $str:tt) => {
        macro_rules! $name {
            () => { $str }
        }
    }
}
macro_rules! context {
    ($($name:ident),*) => {
        concat!("" $(,$name!(),)" "*)
    }
}
define!(COMPANY_NAME = "example.com");
define!(PURPOSE_A = "purpose A");
define!(PURPOSE_B = "purpose B");
const SUBKEY_A: &str = context!(COMPANY_NAME, PURPOSE_A);
const SUBKEY_B: &str = context!(COMPANY_NAME, PURPOSE_B);
Edit: I have previously tried to define macros from inside macros and was surprised to see this |
Regarding other languages. Both |
pub fn derive_key(context: &str, key_material: &[u8], output: &mut [u8])
Is the |
@Luro02 No specified lifetime does not imply |
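For reference, here is a small sketch of how Rust treats a `&str` parameter with no written lifetime, as in the `derive_key` signature quoted earlier. The functions `takes_any` and `takes_static` are hypothetical stand-ins for this illustration and are not blake3 APIs.

```rust
// Elided lifetime: this is shorthand for `fn takes_any<'a>(context: &'a str) -> usize`,
// so the argument may borrow from any value that outlives the call.
fn takes_any(context: &str) -> usize {
    context.len()
}

// Explicit `'static`: only data that lives for the whole program qualifies,
// such as string literals and `const`/`static` items.
fn takes_static(context: &'static str) -> usize {
    context.len()
}

fn main() {
    // A string assembled at runtime can be passed to the elided-lifetime version.
    let dynamic = format!("{} {}", "example.com", "purpose A");
    println!("{}", takes_any(&dynamic));

    // A literal satisfies the `'static` version.
    println!("{}", takes_static("example.com purpose A"));

    // takes_static(&dynamic); // would not compile: `dynamic` does not live long enough
}
```

So a plain `&str` parameter does not by itself stop callers from passing runtime-built strings; only an explicit `'static` bound (or a wrapper type) narrows that.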
You can do a fully dynamic (but also
use once_cell::sync::Lazy;
use std::io::prelude::*;
static CONTEXT: Lazy<String> = Lazy::new(|| {
    let mut s = String::new();
    std::io::stdin().read_to_string(&mut s).unwrap();
    s
});
fn main() {
    let s: &'static str = &*CONTEXT;
    println!("{}", s);
}
|
In layout yes, but what I meant to suggest was that each language should use its natural string type for the context parameter. As in, whatever type you get from the literal
(There is a performance downside to accepting a
All of the above comments are good points, and I expect this thread to grow over the coming year. My thoughts about this API are probably going to change over time. The crate is v0.1.1 now, and I'm not at all opposed to changing this API for v0.2, v0.3, etc. Incidentally, I also expect the stabilization of const generics to have an influence here: I would rather return an |
The type you get with Side note: |
Tangent: My understanding is that a |
Hmm. Okay. Still the best matching without hunting through |
Going to close this one for now. We will definitely revisit this API when const generics land, if not before. |
I should keep this issue open and tag this with a |
@oconnor663 Almost closed with |
@Luro02 I prefer the current API to the alternatives we've proposed here so far. A caller requiring arbitrary bytes in the context field is odd, and I still suspect that the substantial majority of such callers are actually violating the security requirements and would be better served by an API that did not let them do that. The main thing that would make me reconsider is a caller in the real world with a specific requirement. For example, "we're exposing this crate to a large legacy C++ codebase, and our convention for all string constants is UCS-2, and we cannot afford to pay the cost of converting to ASCII/UTF-8 every time a key is derived." There are several reasons I think such a caller is unlikely to appear:
So anyway, it could happen, but I want to wait until it does happen. |
See #42 for my stream-of-consciousness thoughts about what callers should be doing with arbitrary dynamic bytes. I'd love to have other folks chime in there with ideas. |
This can be enabled with the unstable feature flag, `i_know_what_i_am_doing`.
Developers attempting to deploy blake3 via the Rust crate often encounter `derive_key(&str, ...)` which forces the INFO parameter to be a valid UTF-8 string. Please see the discussion in issue [13](BLAKE3-team#13), in particular, this [comment](BLAKE3-team#13 (comment)).
The recommended course of action for those with non-UTF-8 INFO is to use `hash` and `keyed_hash` and open code their own `derive_key` which takes an `&[u8]`. This is not good for two reasons:
First, it is quickly seen that this forces deviation from the Blake3 paper for how `derive_key` should be performed, as `hash` doesn't let you set the `flags` field. Attempting to use the underlying `hash_all_at_once` fails because it's not exported.
Second, the developer is now forced into the position of maintaining their own patches on top of the blake3 repo. This is a burden on the developer, and makes following blake3 upstream releases *much* less likely.
This patch proposes a reasonable compromise. For developers who require `&[u8]` INFO field, they can enable the rust feature flag `i_know_what_i_am_doing` to expose the API `derive_key_u8ref`. This enables developers to use upstream blake3 directly, while still discouraging sloppy behavior in the default case.
Signed-off-by: Jason Cooper <[email protected]>
There is no reason for UTF-8 here. Please `s/str/[u8]/` here. Alternatively, provide another function to avoid the breaking change. Though, as I've mentioned in #11, I'm not against a breaking change for consistency with other crates, especially this early.
https://docs.rs/blake3/0.1.0/blake3/struct.Hasher.html#method.new_derive_key