-
Notifications
You must be signed in to change notification settings - Fork 9.8k
[camera]Fix crash due to calling engine APIs from background thread #4608
Conversation
82b933d
to
64d3f57
Compare
|
||
- (void)setUp { | ||
[super setUp]; | ||
id mock = OCMClassMock([FlutterEventChannel class]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nits:
id mock = OCMClassMock([FlutterEventChannel class]); | |
id mockEventChannel = OCMClassMock([FlutterEventChannel class]); |
/** | ||
* Registers a `FlutterTexture` for usage in Flutter and returns an id that can be used to reference | ||
* that texture when calling into Flutter with channels. Textures must be registered on the | ||
* platform thread. On success returns the pointer to the registered texture, else returns 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use the word "main thread" here as plugin is not part of engine, "platform thread" might not make much sense to plugin devs or users.
ditto other places
__block int64_t textureId; | ||
// Use dispatch_sync to keep FlutterTextureRegistry's sychronous API, so that we don't introduce new potential race conditions. | ||
// We do not break priority inversion here since it's the background thread waiting for main thread. | ||
dispatch_sync(dispatch_get_main_queue(), ^{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dispatch_sync made this method wait for main thread. It is not obvious to the caller.
Maybe we introduce a completion handler and use dispatch_async instead so the wait is more obivous?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess what I wanted to say in my comment is that, if we do not wait for main thread to complete (use dispatch_async
), we effectively change the semantic of the wrapped function (from a sync API to an async one), which may introduce a potential race condition. But looks like my comment was not clearly phrased.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case miscommunication, consider this dummy code:
var _foo = 0
func setFooTo2() {
_foo = 2
}
func methodA() {
setFooTo2()
}
func methodB() {
let bar = 1 / foo
}
methodA()
// since `setFooTo2()` is synchronous API, the subsequent code can safely assume `_foo` is updated
methodB()
However, if we create a wrapper and change the API to asynchronous one, we can't make that assumption anymore:
func methodA() {
setFooTo2Asynchronously(completion: {...}) // async API
}
func methodB() {
let bar = 1 / foo
}
methodA()
methodB()
So back to our example, after calling this texture API, there could be a potential code execution assuming that this API is already taken effect.
Since we have like 10+ registered methods, there could be exponential amount of combination that we need to check for race, which can easily get out of hand.
We have like 10+ methods, and race could happen in exponential amount of combinations.
// For example, we may crash in this order:
method2()
method8()
method7()
// or in this:
method1()
method9()
// or in this
method4()
method3()
That's why it's important to keep the sync API here, and changing it to async may introduce new hard-to-find races. Why hard-to-find? Consider this bug. That race condition happens with a combination of the following methods, and with additional requirement that create()
is called right after dispose()
:
dispose()
create()
startVideoRecording()
This is just one execution order out of potentially thousands of combinations.
If we were to re-write this whole plugin from the beginning, i would probably be fine with either approach, but since we are "patching" the code, I'd prefer to stick with sync behavior.
Hope this explanation makes sense (though it's kind of hard to phrase it in a few sentences in my comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your comment is fine.
The race condition could happen on textureId
. if we use a completion block, the textureId
would only be available inside the block. So the caller should be aware that the textureId
is only accessible after the block is executed.
Your point makes sense too, the caller in this plugin always need to wait for the textureId to continue anyway. Maybe we can rename the method to make it clearer to the caller that this method will block the current thread and wait on main thread? Maybe something like:
/**
* Registers a `FlutterTexture` for usage in Flutter and returns an id that can be used to reference
* that texture when calling into Flutter with channels. Textures must be registered on the
* platform thread. On success returns the pointer to the registered texture, else returns 0.
*
* The registration happens on Main Thread. If the calling thread is not Main Thread, the current thread will be blocked until the registration is successful.
*/
- (int64_t)registerTextureSync:(NSObject<FlutterTexture> *)texture;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we use a completion block, the textureId would only be available inside the block. So the caller should be aware that the textureId is only accessible after the block is executed.
We can use completion block for the code in the same function for sure, but we can't use it for separate functions. (e.g. dispose
and create
functions are separate). The issue here is that, it's hard to tell if the developers had made implicit assumption of the synchronous nature when implementing this plugin. If we were to rewrite this plugin from the beginning, I would be fine with either approaches.
I am cool with the new name as you suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this one should be done with a completion block instead of blocking within queue hops. Something like:
typedef void (^FLTTextureRegistration)(int64_t textureId);
- (void)registerTexture:(NSObject<FlutterTexture> *)texture completionBlock:(FLTTextureRegistration)completionBlock;
- (void)registerTexture:(NSObject<FlutterTexture> *)texture completionBlock:(FLTTextureRegistration)completionBlock {
if (!NSThread.isMainThread) {
[NSOperationQueue.mainQueue addOperationWithBlock:^{
completionBlock([self->_registry registerTexture:texture]);
}];
} else {
completionBlock([_registry registerTexture:texture]);
}
}
[self.registry registerTexture:cam completionBlock:^(int64_t textureId) {
[result sendSuccessWithData:@{
@"cameraId" : @(textureId),
}];
}];
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand with the create
dispose
example the data should be sent before create
returns, but the caller in handleMethodCallAsync
can make it synchronous in that case and wait to return in the completion block, instead of enforcing its synchronicity in the -registerTexture:
API.
You should rebase onto #4592 to pick up the new credentials. |
[self waitForExpectations:@[ _mainThreadExpectation ] timeout:1]; | ||
} | ||
|
||
@end |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
packages/camera/camera/pubspec.yaml
Outdated
@@ -4,7 +4,7 @@ description: A Flutter plugin for controlling the camera. Supports previewing | |||
Dart. | |||
repository: https://github.com/flutter/plugins/tree/master/packages/camera/camera | |||
issue_tracker: https://github.com/flutter/flutter/issues?q=is%3Aissue+is%3Aopen+label%3A%22p%3A+camera%22 | |||
version: 0.9.4+5 | |||
version: 0.9.4+6 |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmagman I removed this change but it's giving me error saying changelog and yaml versions does not match. Maybe i should not update the changelog in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
synced with @cyanglaz offline that we can use ## NEXT
in the changelog
64d3f57
to
4f53674
Compare
4f53674
to
519a890
Compare
packages/camera/camera/CHANGELOG.md
Outdated
@@ -1,3 +1,7 @@ | |||
## 0.9.4+6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
## 0.9.4+6 | |
## NEXT |
If you want to publish this change in a later PR,
return self; | ||
} | ||
|
||
- (int64_t)registerTexture:(NSObject<FlutterTexture> *)texture { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminder to rename.
519a890
to
05bf4e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
05bf4e4
to
fdb1c80
Compare
I know it's too late now, but I only just saw this: Why? We explicitly don't want to batch changes in general (see the linked docs), and you also have no control in general over what it will be batched with—there's no guarantee that the next What was the advantage of doing this? |
I hadn't reviewed this yet! |
@hellohuanlin let's revert this at #4629 until Stuart and I can review it. |
@stuartmorgan I wanted to combine the 2 fixes because I wanted to avoid too frequent publishes, so that developers don't have to bump the version twice. But from your comment it seems this is fine. Then I am ok with bumping the version. |
@jmagman Ah sorry. I did not intend to leave you out. I thought I just needed 1 stamp iirc. The land button turned green though, maybe some bug with the CI landing requirement check? |
By policy you generally do only need 1 LGTM from a flutter-hacker, I had just mentioned that I would get a chance to review on Monday. 🙂 You should probably add @stuartmorgan as a reviewer to non-trivial plugin changes, particularly when he has a lot of context for particular patterns (or failures to fix those patterns, in this case). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding stylistic nits since I know you're putting this out for review again.
FLTThreadSafeEventChannel *_channel; | ||
XCTestExpectation *_mainThreadExpectation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally prefer properties to ivars (unless you have a good reason, like not exposing mutable collections), and then once a property, use dot notation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very interesting. My preference was the opposite (use ivar unless I need extra functionalities in property). But not a strong preference so I'm fine changing it to use property.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{ | ||
[self->_channel invokeMethod:@"foo" arguments:nil]; | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer higher level abstraction Foundation APIs to gcd C APIs.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{ | |
[self->_channel invokeMethod:@"foo" arguments:nil]; | |
}); | |
NSOperationQueue *backgroundQueue = [[NSOperationQueue alloc] init]; | |
[backgroundQueue addOperationWithBlock:^{ | |
[self.channel invokeMethod:@"foo" arguments:nil]; | |
}]; |
I know gcd C APIs are used all over this plugin, but if it were written from scratch I would want to use operation queues instead, and this new code doesn't need to interact with the older code using C APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am cool with either, though are we concerned about inconsistency in the same plugin?
if (!NSThread.isMainThread) { | ||
dispatch_async(dispatch_get_main_queue(), ^{ | ||
[self->_channel setStreamHandler:handler]; | ||
}); | ||
} else { | ||
[_channel setStreamHandler:handler]; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer higher level abstraction Foundation APIs to gcd C APIs.
if (!NSThread.isMainThread) { | |
dispatch_async(dispatch_get_main_queue(), ^{ | |
[self->_channel setStreamHandler:handler]; | |
}); | |
} else { | |
[_channel setStreamHandler:handler]; | |
} | |
if (!NSThread.isMainThread) { | |
[NSOperationQueue.mainQueue addOperationWithBlock:^{ | |
[self->_channel setStreamHandler:handler]; | |
}]; | |
} else { | |
[_channel setStreamHandler:handler]; | |
} |
Since this doesn't set the stream handler synchronously, might be better to have a callback so _isStreamingImages = YES
isn't set before the handler is set up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call
FLTThreadSafeMethodChannel *_channel; | ||
XCTestExpectation *_mainThreadExpectation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer properties to ivars.
FLTThreadSafeTextureRegistry *_registry; | ||
XCTestExpectation *_registerTextureExpectation; | ||
XCTestExpectation *_unregisterTextureExpectation; | ||
XCTestExpectation *_textureFrameAvailableExpectation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer properties to ivars.
} | ||
|
||
- (void)testShouldDispatchToMainThreadIfCalledFromBackgroundThread { | ||
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NSOperationQueue
/** | ||
* Wrapper for FlutterMethodChannel that always invokes messages on the main thread | ||
*/ | ||
@interface FLTThreadSafeMethodChannel : NSObject |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really that it's "thread-safe" so much as it executes operations on the main queue.
How about FLTMainQueueMethodChannel
?
/** | ||
* Wrapper for FlutterTextureRegistry that always sends events on the main thread | ||
*/ | ||
@interface FLTThreadSafeTextureRegistry : NSObject |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, it's not just "thread-safe", it's main queue-confined. FLTMainQueueTextureRegistry
? Although I guess this matches FLTThreadSafeFlutterResult
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I was following the convention for consistency. But I can change both if you prefer.
__block int64_t textureId; | ||
// Use dispatch_sync to keep FlutterTextureRegistry's sychronous API, so that we don't introduce new potential race conditions. | ||
// We do not break priority inversion here since it's the background thread waiting for main thread. | ||
dispatch_sync(dispatch_get_main_queue(), ^{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this one should be done with a completion block instead of blocking within queue hops. Something like:
typedef void (^FLTTextureRegistration)(int64_t textureId);
- (void)registerTexture:(NSObject<FlutterTexture> *)texture completionBlock:(FLTTextureRegistration)completionBlock;
- (void)registerTexture:(NSObject<FlutterTexture> *)texture completionBlock:(FLTTextureRegistration)completionBlock {
if (!NSThread.isMainThread) {
[NSOperationQueue.mainQueue addOperationWithBlock:^{
completionBlock([self->_registry registerTexture:texture]);
}];
} else {
completionBlock([_registry registerTexture:texture]);
}
}
[self.registry registerTexture:cam completionBlock:^(int64_t textureId) {
[result sendSuccessWithData:@{
@"cameraId" : @(textureId),
}];
}];
Developers don't have to bump the version twice; they can update it however often they want to, regardless of how often we release. Releasing more often gives people more options, not fewer. (Also, manually updating versions of individual is not generally how most projects get new bugfix versions of plugins anyway, because of the way the constraint system works.) |
The engine APIs used in camera plugin are required to be run on platform thread. Some of these APIs have explicit asserts (e.g.
MethodChannel
) and some have warning instructions in documentation (e.g.TextureRegistry
). However, in camera plugin we are currently calling these APIs in background thread, causing the crash.There's already a
FLTThreadSafeFlutterResult
wrapper to invokeFlutterResult
in main thread. This PR created similar "thread safe wrappers" forEventChannel
,MethodChannel
andTextureRegistry
.This PR should fix the crash, but we should start looking into simplifying the threading logic in this plugin when we get time. I also wrote up some ideas here.
Note that there's a separate crash due to race condition, which is fixed here.
No version change:
I want to update the pubspec.yaml file in next PR so that we can batch the change.
Issues:
flutter/flutter#94723
flutter/flutter#52578
Pre-launch Checklist
dart format
.)[shared_preferences]
pubspec.yaml
with an appropriate new version according to the pub versioning philosophy, or this PR is exempt from version changes.CHANGELOG.md
to add a description of the change, following repository CHANGELOG style.///
).If you need help, consider asking for advice on the #hackers-new channel on Discord.