Google Meet background segmentation model #4177
Comments
This would be useful for us.
I'll pass this on to our PM.
Note: I'd also be happy if just the raw model (https://meet.google.com/_/rtcvidproc/release/336842817/segm_lite_v509.tflite) was released under a permissive license - I can figure out the model structure and JavaScript wiring :-)
+1 to this! Would love to see this as part of the model repos for TFJS. A lot of people are making Chrome Extensions to do great things in video calls, and this would make those experiences even more efficient, running at higher FPS.
+1 to this, it would be a great, faster alternative to body-pix. Really impressed by the performance in Google Meet :)
Very desirable to have! I just linked to this issue from the Jitsi Meet repository; I think it would be very cool to have for other projects that need this functionality but don't have the capability to develop an in-house model.
The blog post about this model links to this Model Card describing the model, which lists a license.
The Model Card also links to this paper describing Model Cards in general, which says that Model Cards can describe a license that the model is released under. So I believe that license applies to the described model itself (rather than to the Model Card document), and it seems like the raw .tflite model here is already Apache-licensed! @jasonmayes would you agree with this / is this Google's position? (Thanks to @blaueente for originally noting this license in the Model Card!)
@jameshfisher I have successfully deployed the raw tflite model (BTW, many thanks for the link!) within a desktop app using MediaPipe. But I failed to do so for a web app, since MediaPipe doesn't have any documentation for it yet (just some JS APIs for specific examples, but not for custom models). But it looks like you're saying that you did it. How? Did you extract the layers of the model + weights, "manually" recreate the same TF model, and then convert it to TFJS? Or did you manage to compile the tflite to wasm and use MediaPipe?
@stanhrivnak I found this while looking into it myself: https://gist.github.com/tworuler/bd7bd4c6cd9a8fbbeb060e7b64cfa008 Unfortunately, I'm not familiar with tensorflow (sad AMD GPU gang), so I have no idea how it works or how to modify it. PINTO0309 uses modified versions of that script for his tflite -> pb scripts.
I have generated and committed models in .pb, .tflite float32/float16, INT8, EdgeTPU, TFJS, TF-TRT, CoreML, and OpenVINO IR formats for testing. However, I was so exhausted that I did not create a test program. I would be very happy if you could help test them. 😃 If there are any licensing issues, I will delete them.
Amazing work!
A Japanese engineer implemented it in TFJS. There still seems to be a small problem with the conversion: the mask gets shifted to the left. Also, there is no smoothing post-processing ("light wrapping"), so the border is jagged. EqCOpUxU8AA9G2Z.mp4
Is the shifting fixable?
I'm using my own tricks in the optimization phase, so that may be affecting the results. Please give me some time to try this out.
That's unfortunate, but nonetheless amazing work, man!
Ah wait, I think that is intentional, to reduce the computational requirements of the model. The bilateral filter mentioned in the blog further refines the mask, and it might be that the model works best with bright colours. All things considered, the model does its job fairly well. By the way, mind sharing your test setup for the model?
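The refinement step mentioned above can be roughly sketched in plain NumPy: a softmax over the model's two output channels yields a soft person-probability mask, and a separable box blur stands in for the joint bilateral filter described in the blog post. The function name, blur choice, and toy logits below are illustrative assumptions, not Meet's actual pipeline:

```python
import numpy as np

def refine_mask(logits, blur_radius=2):
    """Turn a two-channel (background, person) logit map into a smooth
    alpha mask. A plain box blur stands in here for the joint bilateral
    filter described in the blog post (a deliberate simplification)."""
    # Softmax over the channel axis -> per-pixel person probability
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    person = e[..., 1] / e.sum(axis=-1)
    # Separable box blur as a cheap smoothing step
    k = 2 * blur_radius + 1
    kernel = np.ones(k) / k
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode='same'), 1, person)
    blurred = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode='same'), 0, blurred)
    return blurred

# Toy logits: strong background everywhere, a square "person" region
logits = np.zeros((128, 128, 2), dtype=np.float32)
logits[..., 0] = 4.0
logits[32:96, 32:96, 1] = 8.0
alpha = refine_mask(logits)
print(alpha.shape)  # → (128, 128)
```

A real implementation would use an edge-aware filter guided by the input frame, but the softmax-then-smooth shape of the computation is the same.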
@kirawi

```shell
### Download test.jpg
$ sudo gdown --id 1Tyv6P2zshOCqTgYBLoa0aC3Co8W-9JPG
### Download segm_lite_v509_128x128_float32.tflite
$ sudo gdown --id 1qOlcK8iKki_aAi_OrxE2YLaw5EZvQn1S
```

```python
import numpy as np
from PIL import Image
try:
    from tflite_runtime.interpreter import Interpreter
except ImportError:
    from tensorflow.lite.python.interpreter import Interpreter

# Load and preprocess the test image to the model's 128x128 float input
img = Image.open('test.jpg')
h = img.size[1]
w = img.size[0]
img = img.resize((128, 128))
img = np.asarray(img)
img = img / 255.
img = img.astype(np.float32)
img = img[np.newaxis, :, :, :]

# TensorFlow Lite inference
interpreter = Interpreter(model_path='segm_lite_v509_128x128_float32.tflite',
                          num_threads=4)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]['index']
output_details = interpreter.get_output_details()[0]['index']
interpreter.set_tensor(input_details, img)
interpreter.invoke()
output = interpreter.get_tensor(output_details)
print(output.shape)

# Split the two output channels, threshold, resize back, and save
out1 = output[0][:, :, 0]
out2 = output[0][:, :, 1]
out1 = (out1 > 0.5) * 255
out2 = (out2 > 0.5) * 255
print('out1:', out1.shape)
print('out2:', out2.shape)
out1 = Image.fromarray(np.uint8(out1)).resize((w, h))
out2 = Image.fromarray(np.uint8(out2)).resize((w, h))
out1.save('out1.jpg')
out2.save('out2.jpg')
```
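Once a mask like the ones saved above is in hand, the background-replacement effect this thread is after is just a per-pixel alpha composite of the frame over a substitute (or blurred) background. A minimal sketch, with hypothetical array and function names:

```python
import numpy as np

def composite(frame, background, alpha):
    """Alpha-composite frame over background using a per-pixel mask.
    frame, background: HxWx3 float arrays in [0, 1]; alpha: HxW in [0, 1]."""
    a = alpha[..., np.newaxis]  # broadcast the mask over the colour channels
    return a * frame + (1.0 - a) * background

# Toy 4x4 example: "person" (white) on the left half, background elsewhere
frame = np.ones((4, 4, 3))
background = np.zeros((4, 4, 3))
alpha = np.zeros((4, 4))
alpha[:, :2] = 1.0
out = composite(frame, background, alpha)
print(out[0, 0, 0], out[0, 3, 0])  # → 1.0 0.0
```

For background blur rather than replacement, `background` would simply be a blurred copy of `frame`.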
I created a demo page that uses PINTO's model converted to TensorFlow.js: https://flect-lab-web.s3-us-west-2.amazonaws.com/P01_wokers/t11_googlemeet-segmentation/index.html You can change the input device with the control panel on the right side; if you want to use your own camera, please try it. By default this page uses the new version of PINTO's model, but it still seems to shift to the left a little. You can switch to the old version of PINTO's model with the control panel as well.
I overlaid the image with the tflite implementation I have at hand. Does it shift when I apply the filter? Screencast.2020-12-26.10.03.33.mp4
I don't think it's shifting; it looks more like the one with the white background is capturing more of the background than the other one.
Hmm, I spent a lot of time trying to solve the "shifting" problem yesterday, but I couldn't.
Hi guys, first of all, many thanks to @PINTO0309, @w-okada, and others for putting effort into this! Great work so far! I would really love to have this great model from Google in my web app (currently I have BodyPix with custom improvements, but it still falls short). Here are my 2 cents. The implications are:
I think the best approach would be to compare the outputs of the original tflite model and the converted TFJS model (or h5/tflite) layer by layer, to see where it deviates, and focus on fixing that part. So that's my plan. I can work on it only ~2 hours a day, so if you're faster, go for it and let me know! :) Or if you have any other ideas, please share them!
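The layer-by-layer comparison proposed above can be sketched as a small helper: given two dicts mapping layer names to activation arrays in the same order (how to dump them from tflite or TFJS is model-specific and not shown here), report the first layer whose outputs diverge. Names and the tolerance are illustrative:

```python
import numpy as np

def first_divergent_layer(ref_acts, test_acts, atol=1e-3):
    """Compare two ordered dicts of layer name -> activation array and
    return (name, max_abs_error) for the first layer that diverges,
    or (None, 0.0) if all layers match within atol."""
    for name, ref in ref_acts.items():
        test = test_acts.get(name)
        if test is None or ref.shape != test.shape:
            return name, float('inf')  # missing layer or shape mismatch
        err = np.abs(ref - test).max()
        if err > atol:
            return name, float(err)
    return None, 0.0

# Toy example: conv2 deviates between the two "models"
ref = {'conv1': np.zeros((2, 2)), 'conv2': np.ones((2, 2))}
test = {'conv1': np.zeros((2, 2)), 'conv2': np.ones((2, 2)) + 0.1}
name, err = first_divergent_layer(ref, test)
print(name)  # → conv2
```

With the tflite side, `tf.lite.Interpreter` can retain intermediate tensors via `experimental_preserve_all_tensors=True` in recent TF versions, which would supply `ref_acts`.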
That's the one! Cheers!
Wow!!
Note that although Google did release the Meet model under the Apache 2.0 licence with that model card pasted above, they no longer have it available for download and there is now a different card with a different licence. |
Yep. The new model is called "Xeno" meet segmentation or something. This is the Apache-released model: OneDrive link. Also, if you tinker around a bit with the Google Meet webpage you can still download the models directly from Google; you just need to find the right URL in the JS script. At least that was still working as of February.
Hi, I am the product manager for MediaPipe. Please note that only the MediaPipe Selfie Segmentation model is open sourced and licensed under Apache 2.0 for external use. Other versions, including those used in the Google Meet product, are licensed under Google's Terms and Conditions and are not intended for open source use.
@jasonmayes Why was this closed?
Closed as the folks from MediaPipe clarified the T&C for the models they released.
Reopening to track the segmentation model release through the tfjs API.
From https://meet.google.com/, https://meet.google.com/_/rtcvidproc/release/hashed/segm_full_sparse_v1008_0bda82336d236e21e52f2b74129b9883.dat It looks like the latest model is hashed and can no longer be downloaded.
@jimmy7799 Are we doomed then? Or have we found some way to get the model?
Even if you can get it, you are not allowed to use it.
@saghul I know, I just want to try it out locally. No intention to use it in an open source or commercial project.
JFYI, the MediaPipe Selfie Segmentation model is a) properly Apache licensed and b) can just be downloaded as an Android AAR archive. See https://drive.google.com/file/d/1dCfozqknMa068vVsO2j_1FgZkW_e3VWv/preview.
I tried the model in MediaPipe before, but it looks like the performance is not as good as the Google Meet one.
MediaPipe segmentation seems to be coming to tfjs: https://github.com/tensorflow/tfjs-models/tree/master/body-segmentation/src
It's the same, I believe; they are porting it for use from within the tfjs ecosystem.
The MediaPipe segmentation model has been deployed here. Please verify.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you.
Closing as stale. Please @mention us if this needs more attention.
I was wondering if there is any HD model (512 px) available?
System information
Describe the feature and the current behavior/state.
This Google AI blog post describes the background segmentation model used in Google Meet. This model would be an excellent complement to the models in the tfjs-models collection. (The existing BodyPix model can be (ab)used for background segmentation, but has quality and performance issues for this use-case. I expect the Google Meet model improves on this.)
Will this change the current api? How?
No, it would be an addition to tfjs-models.
Who will benefit with this feature?
Apps consuming and/or displaying a user-facing camera feed. WebRTC video chat apps are the most obvious, where background blur/replacement is becoming expected. I also expect it could be a useful preprocessing step before applying e.g. PoseNet. It can also be used creatively on images as a pre-processing step -- for example, this recent app to enhance profile pictures integrates a background segmentation solution.