New dispatcher infrastructure. #2785
I've have
I don't even need to run a test on master to tell you that this is worse than a 60% performance loss.
@@ -204,7 +193,7 @@ impl Isolate {

     let shared = SharedQueue::new(RECOMMENDED_SIZE);

-    let needs_init = true;
+    let needs_init = AtomicBool::new(true);
Why does this need to be atomic now?
So I can run init without a mutable reference, but I just realized that this getting checked every time we run poll is going to slow things way down.
It should be possible to revert this entirely, but I'm going to wait for the time being just in case I need it somewhere else.
Got it sorted. Calling the op id accessor for every dispatch was slowing things down the most, but that atomic also accounted for about 15%.
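The trade-off described in this review thread can be sketched as follows. This is a minimal illustration and not the PR's code: `Runtime`, `poll`, and `init_count` are hypothetical names. An `AtomicBool` lets one-time initialization run through `&self`, at the price of an atomic operation on every poll.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical stand-in for an isolate; not the real `Isolate` type.
pub struct Runtime {
    needs_init: AtomicBool,
    init_count: AtomicUsize,
}

impl Runtime {
    pub fn new() -> Self {
        Runtime {
            needs_init: AtomicBool::new(true),
            init_count: AtomicUsize::new(0),
        }
    }

    /// One-time setup that only needs a shared reference.
    fn init(&self) {
        self.init_count.fetch_add(1, Ordering::SeqCst);
    }

    /// Called on every event-loop turn. The swap below is the per-poll
    /// cost discussed in the thread: it runs even after init is done.
    pub fn poll(&self) {
        if self.needs_init.swap(false, Ordering::SeqCst) {
            self.init();
        }
        // ... real polling work would go here ...
    }

    /// How many times `init` has run (for demonstration only).
    pub fn init_count(&self) -> usize {
        self.init_count.load(Ordering::SeqCst)
    }
}
```

With a plain `bool` the check is cheaper, but flipping it requires `&mut self`, which is the constraint the atomic was introduced to avoid.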
Master
Mostly within margin of error as far as I can tell. Maybe slightly in favor of master.
Now that performance is sorted, this is coming along well. Here are just a few nice features this will add (mainly for embedders):
I'm working on a better way to notify js of the current op id for any given op. Maybe something like this?
This should also allow async ops to
I'm going to be focusing on getting cli working for now though. Also I'm not sure if you care much, but I had to drop the whole
This is really a mouthful: Deno.core.ids.builtins.OpListen. Can this be shortened to Deno.ops.listen?
I still want to include top-level namespacing (I did remove the
Also, this might be a void complaint, since it now supports namespace aliasing:
It also still supports directly reading op ids:
This will still cost some time to evaluate the if, so it's not ideal. This feature is mainly designed for debug/testing.
@afinch7 why is this bit needed?
Can't it be accessed directly via
You can, but if you care about performance at all you really shouldn't. This is implemented as a
I always thought that V8 can optimize away such loops... Is there a big difference in performance using the first approach?
In our case for op ids it's a huge difference: about 1/4 to 1/5 of the speed, if not slower. If you look at the benchmarks above, the first one at the top is
(plus another smaller optimization, <10%)
To be honest, it sounds really strange to have a 20% perf hit on object lookup. I wonder if that might be related to #2758. I still need to drill down a bit more into this PR.
Don't merge this yet. I think I reintroduced the bug from #2776. I will see if I can get that one figured out tomorrow. It seems to only occur on Linux during
Benchmarks from the last passing build
My theory is that the bug is present in master, we just see it rarely. AppVeyor builds fail probably 50% of the time, I think with a segfault, and it may be this bug.
I just ran
This bug doesn't apply to master, which means we know more specifically that it's something I included in this PR.
Never mind about this being specific to this branch; I'm not so sure about that anymore. I just added some debug prints and rebuilt it, and now it works fine. Either I just got really lucky, or it's a problem where the same code will result in good and bad binaries randomly. I only added debug prints to
My main problem at this point is getting another "bad" binary. I've reverted all my debug prints and everything, and rebuilt it about a dozen times, and it's worked fine every time. This one seems rare enough that it shouldn't be a problem (fewer than 1 in 10 builds is faulty), but I still really want to figure it out at some point.
There is really only one slight issue still left in this commit. We can't safely make context-sensitive calls from op id notifiers:
or
I think this can be solved, but I don't think it's really considered critical. It's not an intended use case, so I think we should punt this for now.
Finally got it to compile a faulty binary again. Here is a comparison of
I'm now pretty sure this weird segfault has something to do with the addition of
@afinch7 I want to add at least one JSON op before doing this refactor (see #2799), so we can ensure the abstractions are good for that use case. Hopefully this rebase won't be too bad. I'll try to land the JSON op today. I've recently implemented JSON ops in my dev repo and have a good idea of how they should look:
@afinch7 It's probably more work, but I want to let the JSON ops settle in a bit before prescribing a new dispatch interface. I think we can find a lot of mini cleanups on that front, and it won't be hard to then later modify the ops to use a dispatcher trait as you've outlined above. I've tried to leave the code very dumb and ripe for abstraction. I apologize because it massively conflicts with this patch. I'll help get this rebased after #2799 (and any other conversions that are ready to land at the time).
I don't think I'm going to bother keeping anything outside of core (just going to start fresh). I had to rebase most of my changes to
I figure now is a good time for a preliminary review of my changes to core, since I don't plan on changing much in core anymore. I've also been working on a more in-depth embedding example for core. It's a multi-worker TCP server using work stealing. I have a feeling that the main speed limit in deno right now is the speed of a single v8 isolate. This should give us a better understanding of the speed limiters on the Rust side.
+4,344 −2,049 :<
@ry I'm a little bothered by that too, but I think the ability to split our ops into standalone crates is worth this cost. I'm going to work on the cli implementation a bit here, and maybe come up with a way to reduce the boilerplate.
This latest commit removed a lot of the complexity in the cli implementation. Op implementations looked like this:

pub struct OpMetrics {
  state: ThreadSafeState,
}

impl OpMetrics {
  pub fn new(state: ThreadSafeState) -> Self {
    Self { state }
  }
}

impl OpDispatcher for OpMetrics {
  fn dispatch(&self, control: &[u8], buf: Option<PinnedBuf>) -> CoreOp {
    wrap_json_op(
      move |_args, _zero_copy| {
        let m = &self.state.metrics;
        Ok(JsonOp::Sync(json!({
          "opsDispatched": m.ops_dispatched.load(Ordering::SeqCst) as u64,
          "opsCompleted": m.ops_completed.load(Ordering::SeqCst) as u64,
          "bytesSentControl": m.bytes_sent_control.load(Ordering::SeqCst) as u64,
          "bytesSentData": m.bytes_sent_data.load(Ordering::SeqCst) as u64,
          "bytesReceived": m.bytes_received.load(Ordering::SeqCst) as u64
        })))
      },
      &self.state,
      control,
      buf,
    )
  }
}

impl Named for OpMetrics {
  const NAME: &'static str = "metrics";
}

Now the same op looks like this:

pub struct OpMetrics;

impl DenoOpDispatcher for OpMetrics {
  fn dispatch(
    &self,
    state: &ThreadSafeState,
    control: &[u8],
    buf: Option<PinnedBuf>,
  ) -> CoreOp {
    wrap_json_op(
      move |_args, _zero_copy| {
        let m = &state.metrics;
        Ok(JsonOp::Sync(json!({
          "opsDispatched": m.ops_dispatched.load(Ordering::SeqCst) as u64,
          "opsCompleted": m.ops_completed.load(Ordering::SeqCst) as u64,
          "bytesSentControl": m.bytes_sent_control.load(Ordering::SeqCst) as u64,
          "bytesSentData": m.bytes_sent_data.load(Ordering::SeqCst) as u64,
          "bytesReceived": m.bytes_received.load(Ordering::SeqCst) as u64
        })))
      },
      control,
      buf,
    )
  }
}

impl Named for OpMetrics {
  const NAME: &'static str = "metrics";
}
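The shape of this simplification can be sketched in isolation. The following is a self-contained illustration under stated assumptions, not the PR's implementation: `State`, `CoreDispatcher`, `StateDispatcher`, `WithState`, and `OpCount` are hypothetical names, and byte vectors stand in for `CoreOp`. The idea is that a generic adapter, written once, owns the state handle, so each op can be a unit struct with no constructor.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical shared state; stands in for `ThreadSafeState`.
#[derive(Default)]
pub struct State {
    pub ops_dispatched: AtomicUsize,
}

/// Core-style dispatcher: knows nothing about `State`.
pub trait CoreDispatcher {
    fn dispatch(&self, control: &[u8]) -> Vec<u8>;
}

/// CLI-style dispatcher: receives the state as an argument, so
/// implementors do not have to store it themselves.
pub trait StateDispatcher {
    fn dispatch(&self, state: &State, control: &[u8]) -> Vec<u8>;
}

/// Generic adapter written once; it holds the state handle that each
/// op previously carried as a field.
pub struct WithState<D> {
    state: Arc<State>,
    inner: D,
}

impl<D> WithState<D> {
    pub fn new(state: Arc<State>, inner: D) -> Self {
        WithState { state, inner }
    }
}

impl<D: StateDispatcher> CoreDispatcher for WithState<D> {
    fn dispatch(&self, control: &[u8]) -> Vec<u8> {
        // Forward to the wrapped op, supplying the shared state.
        self.inner.dispatch(&self.state, control)
    }
}

/// Example op: reports the dispatched-op counter as decimal text.
pub struct OpCount;

impl StateDispatcher for OpCount {
    fn dispatch(&self, state: &State, _control: &[u8]) -> Vec<u8> {
        let n = state.ops_dispatched.load(Ordering::SeqCst);
        n.to_string().into_bytes()
    }
}
```

An op is then registered as `WithState::new(state.clone(), OpCount)`, and the per-op `new` function and `state` field disappear.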
Perhaps a silly question, but why is Named a separate trait? Could NAME be moved to DenoOpDispatcher?
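The thread does not answer this question, so the following is only one possible consideration and an assumption about the design. In Rust, a trait that declares an associated const cannot be used as a trait object, so keeping `NAME` in its own trait leaves the dispatch trait usable as `Box<dyn ...>`. A minimal sketch with hypothetical names (`Dispatcher`, `Registry`, `Echo`):

```rust
/// Dispatch trait kept free of associated constants so that it can be
/// used as a trait object (`Box<dyn Dispatcher>`).
pub trait Dispatcher {
    fn dispatch(&self, control: &[u8]) -> Vec<u8>;
}

/// Compile-time name, kept in its own trait. A trait that declares an
/// associated const cannot be turned into `dyn Trait`.
pub trait Named {
    const NAME: &'static str;
}

/// Hypothetical registry storing type-erased dispatchers with names.
#[derive(Default)]
pub struct Registry {
    ops: Vec<(&'static str, Box<dyn Dispatcher>)>,
}

impl Registry {
    /// The name is read from the concrete type at registration time,
    /// before the dispatcher is boxed and its type is erased.
    pub fn register<D: Dispatcher + Named + 'static>(&mut self, d: D) -> usize {
        self.ops.push((D::NAME, Box::new(d)));
        self.ops.len() - 1
    }

    pub fn name_of(&self, id: usize) -> Option<&'static str> {
        self.ops.get(id).map(|(name, _)| *name)
    }

    pub fn dispatch(&self, id: usize, control: &[u8]) -> Option<Vec<u8>> {
        self.ops.get(id).map(|(_, d)| d.dispatch(control))
    }
}

/// Example op that returns its control bytes unchanged.
pub struct Echo;

impl Dispatcher for Echo {
    fn dispatch(&self, control: &[u8]) -> Vec<u8> {
        control.to_vec()
    }
}

impl Named for Echo {
    const NAME: &'static str = "echo";
}
```

An alternative that avoids the second trait would be a method such as `fn name(&self) -> &'static str` on the dispatch trait, which keeps it usable as a trait object.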
@afinch7 I'm fine with a little extra code, but this is +2000 LoC without any significant functionality changes. I think we need to consider a different approach. For example, rather than traits, what if we register ops using closures? cc @piscisaureus advice would be appreciated
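To make the closure suggestion concrete, here is a rough sketch of what closure-based registration could look like. It is an assumption for illustration only: `OpRegistry`, `OpId`, `register`, and `id_of` are hypothetical names, and plain byte vectors stand in for the real op result type.

```rust
use std::collections::HashMap;

pub type OpId = usize;

/// A type-erased op: takes the control buffer, returns response bytes.
type OpFn = Box<dyn Fn(&[u8]) -> Vec<u8>>;

/// Hypothetical registry: ops are closures stored in a Vec, and the op
/// id is simply the index into that Vec.
#[derive(Default)]
pub struct OpRegistry {
    ops: Vec<OpFn>,
    ids: HashMap<String, OpId>,
}

impl OpRegistry {
    /// Registers a closure under `name` and returns its id. Registering
    /// the same name twice makes the name point at the newer id.
    pub fn register<F>(&mut self, name: &str, op: F) -> OpId
    where
        F: Fn(&[u8]) -> Vec<u8> + 'static,
    {
        let id = self.ops.len();
        self.ops.push(Box::new(op));
        self.ids.insert(name.to_string(), id);
        id
    }

    /// Name lookup; intended to be done once, not on every dispatch.
    pub fn id_of(&self, name: &str) -> Option<OpId> {
        self.ids.get(name).copied()
    }

    /// Hot path: dispatch by integer id, no string lookup involved.
    pub fn dispatch(&self, id: OpId, control: &[u8]) -> Option<Vec<u8>> {
        self.ops.get(id).map(|op| op(control))
    }
}
```

Any state an op needs is captured by the closure at registration time, so no per-op struct, constructor, or trait impl is required.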
@ry I'm going to punt implementing this in json ops for now. I did at least implement it for dispatch minimal ops. This should help keep the size a lot more manageable.
ref #2730