Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bindings for ClientSession, plus demo program #1

Merged
merged 23 commits into from
Dec 11, 2020
Merged

Add bindings for ClientSession, plus demo program #1

merged 23 commits into from
Dec 11, 2020

Conversation

jsha
Copy link
Collaborator

@jsha jsha commented Dec 9, 2020

The demo program here connects via HTTPS to a host, makes an HTTP request, and prints the response to stdout.

Copy link

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One meta note: It's my understanding that the end goal here is for curl to be able to use rustls as a TLS backend.

The approach here seems to be to write C binings to rustls so that curl can contain C code which calls them.

An alternate approach would be to have the curl backend be written in Rust, and have it expose the struct of function pointers that curl wants.


static mut RUSTLS_CONFIG: Option<Arc<rustls::ClientConfig>> = None;

#[no_mangle]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd strongly encourage you not to include global state, it makes things incredibly painful when multiple dependencies want to use this in single project.

slice::from_raw_parts(buf, count as usize)
};
let mut cursor = Cursor::new(input_buf);
let n_read: usize = match session.read_tls(&mut cursor) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me that this is sound -- the caller is quite likely to provide a buf which contains uninitialized memory, I believe turning those until a slice is technically not sound.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any tips on where I can read about the soundness of turning uninitialized memory into a slice? Do I just need to initialize it? Before or after from_raw_parts?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a great citation (boo me!), it's only a soundness issue if read_tls ever reads from &mut cursor, as long as it writes to it only, then it's find.

This is, classically, a big issue in the design of io::Read, which led to this whole thing rust-lang/rust#42002

@jsha
Copy link
Collaborator Author

jsha commented Dec 9, 2020

One meta note: It's my understanding that the end goal here is for curl to be able to use rustls as a TLS backend.

An alternate approach would be to have the curl backend be written in Rust, and have it expose the struct of function pointers that curl wants.

Your understanding is correct, and I like your proposed alternate approach. Note that we now have an additional goal: We'd like for Apache (and someday Nginx?) to be able to use rustls. To me, that seems to suggest writing a more general "C API for rustls" rather than "Curl API for rustls." But if we do "Curl API for rustls," we could similarly do "Apache API for rustls" (etc).

@alex
Copy link

alex commented Dec 9, 2020 via email

Makefile Outdated
cbindgen --lang C --output src/lib.h

target/crustls-demo: target/main.o target/debug/libcrustls.a
$(CC) -Werror -o $@ $^ $(LDFLAGS)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're starting out a brand new C project, you could turn on -Wall and/or -Wextra and/or -Wpedantic, at the risk of having builds break when C compilers add new warnings. This page is a decent discussion of different warning options in gcc and clang. Additionally, you could turn on gcc or clang's static analyzers for more checks. I'm not sure if there are common flags across those compilers for static analysis so you might have to check what $CC is.

.root_store
.add_server_trust_anchors(&webpki_roots::TLS_SERVER_ROOTS);
env_logger::init();
Arc::into_raw(Arc::new(config)) as *const c_void
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't done and Rust/C interop so I'm not familiar with the idioms and best practices, but out of curiosity: it doesn't look like you're using the thread safety features of Arc here (since the Arc values never escape either this function or rustls_client_config_free, below). And even if they did, I would guess that none of the guarantees provided by Arc would hold while the pointer is loaned out to C-land. Is there a reason not to use an Rc instead, and maybe avoid taking a lock in the Arc?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wait, I think I misunderstood how into_raw and from_raw work: the Arc and its reference count aren't thrown away, they are still in the buffer passed back to C-land. But suppose you have two threads, and one of them calls rustls_client_config_free at the same time that the other one calls rustls_client_session_new with the same pointer. The two functions will separately use Arc::from_raw to recreate an Arc. Will those two distinct Arcs give you the runtime mutual exclusivity you want? That is, is calling Arc::from_raw on the same raw buffer twice the same as doing std::clone::clone() on an Arc?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct that we're not getting any of the semantics of Arc while the pointer is owned by C-land. We don't get the reference counting, let alone the locking. The reason I'm using an Arc rather than an Rc or a Box is that rustls' ClientSession::new requires an Arc as part of its type signature. Also, we need C's copy of the pointer to remain valid long-term, so we can't do Arc::new(ptr) before each call to ClientSession::new, because that would move the value pointed to by ptr, invalidating C's copy.

Also a small note: When we do Arc::into_raw, we actually get a pointer to the inner value (i.e. ClientConfig). The strong and weak counters stay in memory right where they are, a few bytes lower than ClientConfig. When we do from_raw, it does a negative offset to reconstruct the Arc, including those strong and weak counters that were hiding out at a lower memory location.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so presumably the same goes for the mutex protecting those counters, so two separate Arcs that were from_rawed into existence will be using the same lock and incrementing/decrementing the same reference count(s). Wow, that's pretty cool!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, though to be pedantic, there's no lock, just atomic::AtomicUsize counts: https://doc.rust-lang.org/src/alloc/sync.rs.html#281-294

src/lib.rs Outdated
//
// Unsafety:
//
// v must be a non-null pointer that resulted from previously calling `Arc::into_raw`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are meant to be doccomments, then they need ///

#include <errno.h>
#include <arpa/inet.h>

#include "lib.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the makefile makes it clear this header gets generated from lib.rs, but it'd be nice to have a comment to that effect here in the C program

src/main.c Outdated
m = write(fd, buf, n);
if(m < 0) {
perror("writing to stdout");
exit(1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: calling exit(3) from a function isn't great style but it's fine for a demo/test program like this.

Makefile Outdated
@@ -0,0 +1,26 @@
ifeq ($(shell uname),Darwin)
LDFLAGS := -Wl,-dead_strip
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you get epoll(7) on Darwin so you won't be able to compile there anyway.

src/main.c Outdated
struct epoll_event ev, events[MAX_EVENTS];
int conn_sock, nfds, epollfd;

epollfd = epoll_create1(0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

epollfd is leaked

unsafe {
match (session as *const ClientSession).as_ref() {
Some(cs) => cs.wants_read(),
None => false,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case would we see None here? if session were NULL?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, as_ref turns a raw pointer into an Option<&T>, which is None if the pointer is NULL.

src/lib.rs Outdated
n_written as ssize_t
}

// Read plaintext bytes from the ClientSession. This acts like
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: doccomment should explain that plaintext is written into buf.

type CrustlsResult = c_int;

pub const CRUSTLS_OK: c_int = 0;
pub const CRUSTLS_ERROR: c_int = 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be a Rust enum that has an appropriate Into implementation to convert into c_int?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started on this and could not get cbindgen to generate the definitions right. I'm sure it's possible, but I'd like to punt to another PR.

@jsha
Copy link
Collaborator Author

jsha commented Dec 11, 2020

Thanks for the detailed comments. They led to some big improvements. This is ready for re-review.

Copy link
Collaborator

@tgeoghegan tgeoghegan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just make sure to initialize getaddrinfo_output to NULL in make_conn.

src/main.c Outdated
int
make_conn(const char *hostname)
{
struct addrinfo *getaddrinfo_output;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to initialize this to NULL

@jsha jsha merged commit baafb0e into main Dec 11, 2020
@jsha jsha deleted the first branch December 11, 2020 23:48
// and the inner ClientConfig will be dropped.
let arc: Arc<ClientConfig> = Arc::from_raw(c);
let strong_count = Arc::strong_count(&arc);
if strong_count < 1 {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I'm pretty sure this is unreachable -- if it was 0 then the value would already have been freed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, good point. My goal here was to catch a double free. And indeed, I can produce unexpected results by double freeing - but they are unsound. I.e. the Arc gets freed on the first call; on the second call, the Arc (including strong_count) is uninitialized. For instance in a test run it winds up as 94281166761504. I guess there really isn't anything I can do to guard against unsound usage on the C side and I should just delete this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out this unreachable code. Looks like I neglected to delete it in this PR. Followed up in #32.

jsha pushed a commit that referenced this pull request Mar 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants