Adding ClipDetection component code. #326
Conversation
Reviewed 15 of 25 files at r1, all commit messages.
Reviewable status: 15 of 25 files reviewed, 18 unresolved discussions (waiting on @ZachCafego)
python/ClipDetection/clip_component/data/seven_templates.txt
line 3 at r1 (raw file):
itap of a {}. a bad photo of the {}. an origami {}.
origami? That's interesting. Not saying we need to change anything.
python/ClipDetection/clip_component/triton_server/Dockerfile
line 45 at r1 (raw file):
LABEL org.label-schema.license="Apache 2.0" \
      org.label-schema.name="OpenMPF CLIP Detection" \
Call this "OpenMPF CLIP Detection Triton Server".
python/ClipDetection/clip_component/triton_server/models/ip_clip_512/config.pbtxt
line 4 at r1 (raw file):
https://github.com/openmpf/openmpf-components/blob/master/cpp/OcvYoloDetection/triton_server/models/yolo-608.config.pbtxt doesn't have an instance_group.
The Triton docs say:
By default, a single execution instance of the model is created for each GPU available in the system.
That is the default behavior we want. If a user wants to change it they can modify this file with their own instance_group, or groups.
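For reference, an explicit instance_group stanza a user could add to config.pbtxt looks roughly like this (the count and kind values are illustrative only, not a recommendation):

```protobuf
instance_group [
  {
    # Run two execution instances of the model on the GPU.
    count: 2
    kind: KIND_GPU
  }
]
```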
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 3 at r1 (raw file):
{ "componentName": "ClipDetection", "componentVersion": "6.3",
Change to 7.1.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 4 at r1 (raw file):
"componentName": "ClipDetection", "componentVersion": "6.3", "middlewareVersion": "6.3",
Change to 7.1.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 37 at r1 (raw file):
{ "name": "CLASSIFICATION_LIST", "description": "Specifies the classification list that will be tokenized for the text encoder (supports ImageNet and COCO). By default, the ImageNet classifications will be used.",
Say:
(supports "imagenet" and "coco")
to be clear exactly which strings the user needs to type in.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 42 at r1 (raw file):
}, { "name": "CLASSFICATION_PATH",
Spelling: "CLASSIFICATION"
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 43 at r1 (raw file):
{ "name": "CLASSFICATION_PATH", "description": "Specifies a path to a csv file containing two names for each classification: one is the full name to display and the other to enter into the CLIP text encoding.",
Say "Optionally specifies a path to a custom csv file ..."
That will make it clear that this is optional, and the "custom" part makes it clearer why there is no default value.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 49 at r1 (raw file):
{ "name": "TEMPLATE_PATH", "description": "Specifies a path to a text file containing custom templates for use in the CLIP model. Include a single {} where each classification is to be inserted.",
Say "Optionally specifies a path to a custom text file ..."
That will make it clear that this is optional, and the "custom" part makes it clearer why there is no default value.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 76 at r1 (raw file):
"actions": [ { "name": "CLIP CLASSIFICATION ACTION",
Call this "CLIP COCO CLASSIFICATION ACTION".
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 77 at r1 (raw file):
{ "name": "CLIP CLASSIFICATION ACTION", "description": "Runs CLIP classification on the COCO dataset classes, or with a specified set of classifications and templates.",
I think we should remove the "or with a specified set of classifications and templates" part from here and elsewhere in this file. We should assume the user is going to use this action without additional job properties, so they won't be providing a custom set of classifications or templates.
More generally, we shouldn't bother describing the pipeline behavior if the user is going to modify it by providing their own properties that override the defaults.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 121 at r1 (raw file):
"tasks": [ { "name": "CLIP CLASSIFICATION TASK",
Call this "CLIP COCO CLASSIFICATION TASK".
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 128 at r1 (raw file):
}, { "name": "CLIP TRITON CLASSIFICATION TASK",
Call this "CLIP TRITON COCO CLASSIFICATION TASK".
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 151 at r1 (raw file):
"pipelines": [ { "name": "CLIP CLASSIFICATION PIPELINE",
Call this "CLIP COCO CLASSIFICATION PIPELINE".
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 158 at r1 (raw file):
}, { "name": "CLIP TRITON CLASSIFICATION PIPELINE",
Call this "CLIP TRITON COCO CLASSIFICATION PIPELINE".
python/ClipDetection/tests/test_clip.py
line 59 at r1 (raw file):
print(result.detection_properties["CLASSIFICATION LIST"])
print(result.detection_properties["CLASSIFICATION CONFIDENCE LIST"])
# self.assertEqual(job.job_properties["NUMBER_OF_CLASSIFICATIONS"], len(self._output_to_list(result.detection_properties["CLASSIFICATION LIST"])))
Uncomment asserts.
python/ClipDetection/tests/test_clip_generic.py
line 42 at r1 (raw file):
class TestClip(unittest.TestCase):
    def test_image_file(self):
Combine this unit test with test_clip.py.
python/ClipDetection/tests/test_clip_generic.py
line 57 at r1 (raw file):
print(result.detection_properties["CLASSIFICATION LIST"])
print(result.detection_properties["CLASSIFICATION CONFIDENCE LIST"])
# self.assertEqual(job.job_properties["NUMBER_OF_CLASSIFICATIONS"], len(self._output_to_list(result.detection_properties["CLASSIFICATION LIST"])))
Uncomment asserts.
python/ClipDetection/tests/test_clip_triton.py
line 60 at r1 (raw file):
print(result.detection_properties["CLASSIFICATION LIST"])
print(result.detection_properties["CLASSIFICATION CONFIDENCE LIST"])
# self.assertEqual(job.job_properties["NUMBER_OF_CLASSIFICATIONS"], len(self._output_to_list(result.detection_properties["CLASSIFICATION LIST"])))
Uncomment asserts.
Reviewable status: 15 of 25 files reviewed, 21 unresolved discussions (waiting on @ZachCafego)
python/ClipDetection/setup.cfg
line 29 at r1 (raw file):
[metadata]
name = ClipDetection
version = 6.3
Set to 7.1.
python/ClipDetection/setup.cfg
line 34 at r1 (raw file):
packages = clip_component
install_requires =
    mpf_component_api>=6.3
Set to 7.1.
python/ClipDetection/setup.cfg
line 35 at r1 (raw file):
install_requires =
    mpf_component_api>=6.3
    mpf_component_util>=6.3
Set to 7.1.
Reviewable status: 11 of 25 files reviewed, 21 unresolved discussions (waiting on @jrobble and @ZachCafego)
python/ClipDetection/setup.cfg
line 29 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Set to 7.1.
Done.
python/ClipDetection/setup.cfg
line 34 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Set to 7.1.
Done.
python/ClipDetection/setup.cfg
line 35 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Set to 7.1.
Done.
python/ClipDetection/clip_component/triton_server/Dockerfile
line 45 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Call this "OpenMPF CLIP Detection Triton Server".
Done.
python/ClipDetection/clip_component/triton_server/models/ip_clip_512/config.pbtxt
line 4 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
https://github.com/openmpf/openmpf-components/blob/master/cpp/OcvYoloDetection/triton_server/models/yolo-608.config.pbtxt doesn't have an instance_group.
The Triton docs say:
By default, a single execution instance of the model is created for each GPU available in the system.
That is the default behavior we want. If a user wants to change it they can modify this file with their own instance_group, or groups.
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 3 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Change to 7.1.
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 4 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Change to 7.1.
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 37 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Say:
(supports "imagenet" and "coco")
to be clear exactly which strings the user needs to type in.
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 42 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Spelling: "CLASSIFICATION"
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 43 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Say "Optionally specifies a path to a custom csv file ..."
That will make it clear that this is optional, and the "custom" part makes it clearer why there is no default value.
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 49 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Say "Optionally specifies a path to a custom text file ..."
That will make it clear that this is optional, and the "custom" part makes it clearer why there is no default value.
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 76 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Call this "CLIP COCO CLASSIFICATION ACTION".
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 77 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
I think we should remove the "or with a specified set of classifications and templates" part from here and elsewhere in this file. We should assume the user is going to use this action without additional job properties, so they won't be providing a custom set of classifications or templates.
More generally, we shouldn't bother describing the pipeline behavior if the user is going to modify it by providing their own properties that override the defaults.
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 121 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Call this "CLIP COCO CLASSIFICATION TASK".
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 128 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Call this "CLIP TRITON COCO CLASSIFICATION TASK".
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 151 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Call this "CLIP COCO CLASSIFICATION PIPELINE".
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 158 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Call this "CLIP TRITON COCO CLASSIFICATION PIPELINE".
Done.
python/ClipDetection/tests/test_clip.py
line 59 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Uncomment asserts.
Done.
python/ClipDetection/tests/test_clip_triton.py
line 60 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Uncomment asserts.
Done.
python/ClipDetection/tests/test_clip_generic.py
line 42 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Combine this unit test with test_clip.py.
Done.
python/ClipDetection/tests/test_clip_generic.py
line 57 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Uncomment asserts.
Done.
Reviewed 8 of 25 files at r1.
Reviewable status: 17 of 25 files reviewed, 35 unresolved discussions (waiting on @ZachCafego)
a discussion (no related file):
As discussed, please create an openmpf-docker PR where you add an entry for the CLIP client and server, and link that PR to this one, and vice versa.
a discussion (no related file):
Generating the 80 text embeddings takes too long to make running the unit tests as part of the monolithic build feasible. Please update the unit tests to use NUMBER_OF_TEMPLATES=1.
Additionally, we should set ENABLE_CROPPING=false for the unit tests to further speed them up.
Also, it's quicker to generate the embeddings for COCO than ImageNet, so let's change the sturgeon unit test + image to detect a COCO class.
a discussion (no related file):
A note to myself to run the Jenkins build for you once the existing comments are addressed.
a discussion (no related file):
Please note in the README that the following warning is expected if building on a host without a GPU:
(venv) mpf@openmpf-dev-ubuntu:~/git/openmpf-projects/openmpf-components/python/ClipDetection/tmp/openai-CLIP-a9b1bf5$ python3 -c 'import clip; clip.load("ViT-B/32")'
/home/mpf/git/openmpf-projects/openmpf-components/python/ClipDetection/venv/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
return torch._C._cuda_getDeviceCount() > 0
python/ClipDetection/README.md
line 23 at r1 (raw file):
- `ENABLE_TRITON`: A boolean toggle to specify whether the component should use a Triton inference server to process the image job. By default this is set to false.
- `TRITON_SERVER`: Specifies the Triton server <host>:<port> to use for inferencing. By default, this is set to 'clip-detection-server:8001'.
You need to use backticks around `<host>:<port>` for the Markdown to render properly:
`TRITON_SERVER`: Specifies the Triton server `<host>:<port>` to use for inferencing. By default, this is set to 'clip-detection-server:8001'.
python/ClipDetection/README.md
line 45 at r1 (raw file):
# Custom Classifications
The need for custom classifications arose when training on the ImageNet classifications, where any different class can have many equivalent names. For example, one of the classes is "great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias". We found the model to be most performant when given a single representative class title. For this case, 'great white shark' makes the most sense but the 'imagenet_classification_list.csv' file gives representative titles for each class, adapted from .ipynb files on the CLIP GitHub page.
Let's rephrase this as:
For this case, 'great white shark' makes the most sense. The `imagenet_classification_list.csv` file gives representative titles for each class, adapted from .ipynb files on the CLIP GitHub page.
First, I removed the "but" since it seemed misleading. Second, I used backticks around imagenet_classification_list.csv since it's a file name / path.
python/ClipDetection/setup.cfg
line 42 at r1 (raw file):
component = clip_component.clip_component:ClipComponent

[options.package_data]
You need to prefix all of these with data/ as I did here:
[options.package_data]
clip_component = data/imagenet_classification_list.csv, data/coco_classification_list.csv, data/eighty_templates.txt, data/seven_templates.txt, data/one_template.txt
Or else the following command will fail:
DOCKER_BUILDKIT=1 docker build . --progress=plain --build-arg RUN_TESTS=true -t openmpf_clip_detection:jrobble
<snip>
#13 19.71 FileNotFoundError: [Errno 2] No such file or directory: '/opt/mpf/plugin-venv/lib/python3.8/site-packages/clip_component/data/one_template.txt'
python/ClipDetection/clip_component/clip_component.py
line 36 at r1 (raw file):
import clip
import sys
Pylance in VScode says this isn't used. Please remove it.
python/ClipDetection/clip_component/clip_component.py
line 43 at r1 (raw file):
import tritonclient.grpc as grpcclient
from tritonclient.utils import InferenceServerException, triton_to_np_dtype
import tritonclient.utils
Pylance in VScode says this isn't used. Please remove it.
python/ClipDetection/clip_component/clip_component.py
line 60 at r1 (raw file):
image = image_reader.get_image()
return JobRunner().get_classifications(image, image_job.job_properties)
We should always log some kind of "job complete" message. It usually takes the form:
logger.info(f'Job complete. Found {len(some_detections)} detections.')
python/ClipDetection/clip_component/clip_component.py
line 73 at r1 (raw file):
self.model = model
self.classification_path = ''
Class variables should start with an underscore, like this.
Also, functions that are private (as in not invoked by anything but the class itself), should start with an underscore as well.
python/ClipDetection/clip_component/clip_component.py
line 78 at r1 (raw file):
self.templates = None
self.mapping = None
Please give this variable a more descriptive name.
python/ClipDetection/clip_component/clip_component.py
line 121 at r1 (raw file):
width = image_width,
height = image_height,
confidence = classification_confidence_list.split('; ')[0],
Pylance says you should not try to convert directly from str to float. Do:
confidence = float(classification_confidence_list.split('; ')[0]),
python/ClipDetection/clip_component/clip_component.py
line 152 at r1 (raw file):
@staticmethod
def _get_prop(job_properties, key, default_value, accep_values=[]):
👍
python/ClipDetection/clip_component/clip_component.py
line 285 at r1 (raw file):
for inputs, outputs in self.get_inputs_outputs():
    responses.append(self.triton_client.infer(model_name=self.model_name, inputs=inputs, outputs=outputs))
except InferenceServerException as e:
Log this as an error. Throw an MPF DetectionException instead of logging the stack trace. Should not need to import traceback.
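A sketch of that error-handling pattern. The DetectionException class below is a placeholder for the MPF SDK's exception named in the comment, and RuntimeError stands in for tritonclient's InferenceServerException; the helper and its arguments are illustrative, not the component's actual code.

```python
import logging

logger = logging.getLogger('ClipComponent')

class DetectionException(Exception):
    """Placeholder for the MPF DetectionException named in the review comment."""

def run_inference(infer, batches):
    # Collect one response per (inputs, outputs) batch. On failure, log the
    # error and re-raise as a component-level exception instead of printing a
    # stack trace with the traceback module.
    responses = []
    try:
        for inputs, outputs in batches:
            responses.append(infer(inputs, outputs))
    except RuntimeError as e:
        logger.error(f'Triton inference failed: {e}')
        raise DetectionException(f'Triton inference failed: {e}') from e
    return responses
```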
Reviewable status: 15 of 26 files reviewed, 35 unresolved discussions (waiting on @jrobble)
a discussion (no related file):
Previously, jrobble (Jeff Robble) wrote…
As discussed, please create an openmpf-docker PR where you add an entry for the CLIP client and server, and link that PR to this one, and vice versa.
Done.
a discussion (no related file):
Previously, jrobble (Jeff Robble) wrote…
Generating the 80 text embeddings takes too long to make running the unit tests as part of the monolithic build feasible. Please update the unit tests to use NUMBER_OF_TEMPLATES=1.
Additionally, we should set ENABLE_CROPPING=false for the unit tests to further speed them up.
Also, it's quicker to generate the embeddings for COCO than ImageNet, so let's change the sturgeon unit test + image to detect a COCO class.
Done.
a discussion (no related file):
Previously, jrobble (Jeff Robble) wrote…
Please note in the README that the following warning is expected if building on a host without a GPU:
(venv) mpf@openmpf-dev-ubuntu:~/git/openmpf-projects/openmpf-components/python/ClipDetection/tmp/openai-CLIP-a9b1bf5$ python3 -c 'import clip; clip.load("ViT-B/32")'
/home/mpf/git/openmpf-projects/openmpf-components/python/ClipDetection/venv/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
return torch._C._cuda_getDeviceCount() > 0
Done.
python/ClipDetection/README.md
line 23 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
You need to use backticks around `<host>:<port>` for the Markdown to render properly:
`TRITON_SERVER`: Specifies the Triton server `<host>:<port>` to use for inferencing. By default, this is set to 'clip-detection-server:8001'.
Done.
python/ClipDetection/README.md
line 45 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Let's rephrase this as:
For this case, 'great white shark' makes the most sense. The `imagenet_classification_list.csv` file gives representative titles for each class, adapted from .ipynb files on the CLIP GitHub page.
First, I removed the "but" since it seemed misleading. Second, I used backticks around imagenet_classification_list.csv since it's a file name / path.
Done.
python/ClipDetection/setup.cfg
line 42 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
You need to prefix all of these with data/ as I did here:
[options.package_data]
clip_component = data/imagenet_classification_list.csv, data/coco_classification_list.csv, data/eighty_templates.txt, data/seven_templates.txt, data/one_template.txt
Or else the following command will fail:
DOCKER_BUILDKIT=1 docker build . --progress=plain --build-arg RUN_TESTS=true -t openmpf_clip_detection:jrobble
<snip>
#13 19.71 FileNotFoundError: [Errno 2] No such file or directory: '/opt/mpf/plugin-venv/lib/python3.8/site-packages/clip_component/data/one_template.txt'
Done.
python/ClipDetection/clip_component/clip_component.py
line 36 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Pylance in VScode says this isn't used. Please remove it.
Done.
python/ClipDetection/clip_component/clip_component.py
line 43 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Pylance in VScode says this isn't used. Please remove it.
Done.
python/ClipDetection/clip_component/clip_component.py
line 60 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
We should always log some kind of "job complete" message. It usually takes the form:
logger.info(f'Job complete. Found {len(some_detections)} detections.')
Done.
python/ClipDetection/clip_component/clip_component.py
line 73 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Class variables should start with an underscore, like this.
Also, functions that are private (as in not invoked by anything but the class itself), should start with an underscore as well.
Done.
python/ClipDetection/clip_component/clip_component.py
line 78 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Please give this variable a more descriptive name.
Done.
python/ClipDetection/clip_component/clip_component.py
line 121 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Pylance says you should not try to convert directly from str to float. Do:
confidence = float(classification_confidence_list.split('; ')[0]),
Done.
python/ClipDetection/clip_component/clip_component.py
line 285 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Log this as an error. Throw an MPF DetectionException instead of logging the stack trace. Should not need to import traceback.
Done.
Reviewed 3 of 7 files at r2, 1 of 1 files at r3, 6 of 6 files at r4, all commit messages.
Reviewable status: 25 of 26 files reviewed, 6 unresolved discussions (waiting on @ZachCafego)
python/ClipDetection/README.md
line 5 at r4 (raw file):
This repository contains source code for the OpenMPF CLIP detection component.
CLIP (Contrastive Language-Image Pre-Training) was developed by OpenAI and published in 2021. https://arxiv.org/abs/2103.00020
When building on a host without a GPU, the following warning is expected:
This is too low level to be in the Overview section. Please move to the bottom of the README in a # Known Issues
section.
python/ClipDetection/clip_component/clip_component.py
line 60 at r1 (raw file):
image = image_reader.get_image() return JobRunner().get_classifications(image, image_job.job_properties)
Creating a new instance of JobRunner for every job does not allow you to cache the text embeddings between jobs. To see if the text embeddings were being cached I modified your unit test:
component = ClipComponent()
job = mpf.ImageJob(
    job_name='test-image',
    data_uri=self._get_test_file('sturgeon.jpg'),
    job_properties=dict(
        NUMBER_OF_CLASSIFICATIONS = 1,
        NUMBER_OF_TEMPLATES = 1,
        CLASSIFICATION_LIST = 'coco',
        ENABLE_CROPPING='false'
    ),
    media_properties={},
    feed_forward_location=None
)
result = list(component.get_detections_from_image(job))[0]
result = list(component.get_detections_from_image(job))[0]
result = list(component.get_detections_from_image(job))[0]
I saw it generate the embeddings for each get_detections_from_image() call.
I'm not saying you should permanently update your unit test that way. I made the change temporarily just to test this out.
In retrospect, the reason that only the Azure components may be using a JobRunner is because they don't need to cache anything between jobs. That's because they are simple clients to the Azure Cognitive Services REST endpoints.
With this in mind, it's probably best to refactor your code not to use a JobRunner. Creating a new JobRunner each time also does clip.load() each time.
You may want to consider refactoring it into a wrapper class like this. Alternatively, EAST just caches its model directly here.
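A minimal sketch of that caching idea. The class name ClipWrapper and the load/encode hooks are illustrative stand-ins (not the real clip API): the model is loaded once on first use, and text embeddings are cached keyed on the classification list, so repeated jobs skip both clip.load() and re-encoding.

```python
class ClipWrapper:
    """Illustrative wrapper that caches the model and text embeddings
    across jobs, instead of rebuilding them per job as JobRunner does."""

    def __init__(self, load_model):
        # load_model is a callable standing in for clip.load(...).
        self._load_model = load_model
        self._model = None
        self._text_embedding_cache = {}

    @property
    def model(self):
        # Load the model once, on first use, not once per job.
        if self._model is None:
            self._model = self._load_model()
        return self._model

    def get_text_embeddings(self, class_names):
        # Key the cache on the (ordered) class list so repeated jobs with
        # the same classifications reuse the embeddings.
        key = tuple(class_names)
        if key not in self._text_embedding_cache:
            self._text_embedding_cache[key] = self.model.encode(class_names)
        return self._text_embedding_cache[key]
```

With this in place, running the same image job three times should load the model and encode the class list only once.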
python/ClipDetection/clip_component/triton_server/Dockerfile
line 32 at r4 (raw file):
ARG BUILD_TAG=latest
FROM ${MODELS_REGISTRY}openmpf_clip_detection_triton_models:${BUILD_TAG} as models
We need to push this to DockerHub, like we do with the OcvYoloDetection model files here. Let's talk about that in our next meeting. Right now I can't do the build since I don't have access to this Docker image.
python/ClipDetection/tests/test_clip.py
line 74 at r4 (raw file):
This fails for me:
>>> "violent" in ['violent scene', 'dangerous scene', 'safe scene', 'peaceful scene']
False
I think you want
self.assertTrue("violent scene" in self._output_to_list(result.detection_properties["CLASSIFICATION LIST"]))
self.assertEqual("violent scene", result.detection_properties["CLASSIFICATION"])
Also, the latter assert failed because result.detection_properties["CLASSIFICATION"] was " violent scene" (with the space as the first character). I left a different comment on the CSV file to address this.
python/ClipDetection/tests/data/violence_classes.csv
line 1 at r4 (raw file):
peaceful, peaceful scene
Remove the space after the commas here.
Also, I recommend trimming the whitespace off of the tokens in each line when you parse the CSV to prevent users from making the same mistake.
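A minimal sketch of that trimming suggestion (the function name and two-column CSV layout are illustrative, not the component's actual parser): strip each token so a stray space after a comma never ends up in the class name.

```python
import csv

def load_classification_map(path):
    # Parse a two-column classification CSV, trimming whitespace around
    # each token so entries like "peaceful, peaceful scene" do not produce
    # a leading-space display name such as " peaceful scene".
    mapping = {}
    with open(path, newline='') as f:
        for row in csv.reader(f):
            if len(row) < 2:
                continue
            short_name, display_name = (token.strip() for token in row[:2])
            mapping[short_name] = display_name
    return mapping
```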
Previously, jrobble (Jeff Robble) wrote…
I meant to post this comment with the first round of review, so that's why it's using sturgeon.
Reviewable status: 25 of 26 files reviewed, 8 unresolved discussions (waiting on @ZachCafego)
a discussion (no related file):
Move "triton_server" up one directory in your code.
a discussion (no related file):
Also, I've been wondering how concerned we are that the model we're downloading on the client side during the build may get updated and no longer be compatible with the model we're running in Triton. The Triton model is static. It's never going to change unless we change it, whereas the client model is whatever is downloaded using the clip framework.
More specifically, since we're generating embeddings for the prompts on the client side, is there a chance that those will one day no longer be compatible with image embeddings from the server?
If so, one option would be to generate the embeddings for the prompts using the server, but as a stop-gap measure can we specify a specific model version when downloading the model on the client side when using the following command?
clip.load("ViT-B/32")
Reviewable status: 25 of 26 files reviewed, 12 unresolved discussions (waiting on @ZachCafego)
python/ClipDetection/setup.cfg
line 29 at r4 (raw file):
[metadata]
name = ClipDetection
version = 7.1
Change to 7.2.
python/ClipDetection/setup.cfg
line 34 at r4 (raw file):
packages = clip_component
install_requires =
    mpf_component_api>=7.1
Change the api and util dependencies to 7.2.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 3 at r4 (raw file):
{ "componentName": "ClipDetection", "componentVersion": "7.1",
Update componentVersion and middlewareVersion to 7.2.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 8 at r4 (raw file):
"batchLibrary": "ClipDetection", "environmentVariables": [], "algorithm": {
Add outputChangedCounter like it's done here.
Reviewable status: 25 of 26 files reviewed, 13 unresolved discussions (waiting on @ZachCafego)
a discussion (no related file):
Please update the component to optionally return a detection property representing the whole-image feature embedding. Call the property FEATURE. This will be similar to the TRTIS component, which has this in its README:
each detection will have a FEATURE detection property set to the value of the base64-encoded version of the feature vector
Only include the FEATURE property when the INCLUDE_FEATURES property is set to true (false by default).
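One way the optional FEATURE property could be produced. The base64-over-float32-bytes encoding here is an assumption modeled on the TRTIS description, not a confirmed spec, and the helper name is illustrative:

```python
import base64

import numpy as np

def feature_properties(feature_vector, include_features):
    # Return the optional FEATURE detection property: the base64-encoded
    # raw float32 bytes of the whole-image embedding. When INCLUDE_FEATURES
    # is false (the default), contribute no properties at all.
    if not include_features:
        return {}
    raw = np.asarray(feature_vector, dtype=np.float32).tobytes()
    return {'FEATURE': base64.b64encode(raw).decode('ascii')}
```

A consumer can recover the vector with np.frombuffer(base64.b64decode(value), dtype=np.float32).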
Reviewable status: 16 of 26 files reviewed, 13 unresolved discussions (waiting on @jrobble)
a discussion (no related file):
Previously, jrobble (Jeff Robble) wrote…
Move "triton_server" up one directory in your code.
Done.
a discussion (no related file):
Previously, jrobble (Jeff Robble) wrote…
Also, I've been wondering how concerned we are that the model we're downloading on the client side during the build may get updated and no longer be compatible with the model we're running in Triton. The Triton model is static. It's never going to change unless we change it, whereas the client model is whatever is downloaded using the clip framework.
More specifically, since we're generating embeddings for the prompts on the client side, is there a chance that those will one day no longer be compatible with image embeddings from the server?
If so, one option would be to generate the embeddings for the prompts using the server, but as a stop-gap measure can we specify a specific model version when downloading the model on the client side when using the following command?
clip.load("ViT-B/32")
Done.
a discussion (no related file):
Previously, jrobble (Jeff Robble) wrote…
Please update the component to optionally return a detection property representing the whole-image feature embedding. Call the property FEATURE. This will be similar to the TRTIS component, which has this in its README:
each detection will have a FEATURE detection property set to the value of the base64-encoded version of the feature vector
Only include the FEATURE property when the INCLUDE_FEATURES property is set to true (false by default).
Working, added to descriptor file. Still need to implement detection property if True.
python/ClipDetection/README.md
line 5 at r4 (raw file):
Previously, jrobble (Jeff Robble) wrote…
This is too low level to be in the Overview section. Please move to the bottom of the README in a # Known Issues section.
Done.
python/ClipDetection/setup.cfg
line 29 at r4 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Change to 7.2.
Done.
python/ClipDetection/setup.cfg
line 34 at r4 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Change the api and util dependencies to 7.2.
Done.
python/ClipDetection/clip_component/clip_component.py
line 60 at r1 (raw file):
Previously, jrobble (Jeff Robble) wrote…
I meant to post this comment with the first round of review, so that's why it's using sturgeon.
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 3 at r4 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Update componentVersion and middlewareVersion to 7.2.
Done.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 8 at r4 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Add outputChangedCounter like it's done here.
Done.
python/ClipDetection/tests/test_clip.py
line 74 at r4 (raw file):
Previously, jrobble (Jeff Robble) wrote…
This fails for me:
>>> "violent" in ['violent scene', 'dangerous scene', 'safe scene', 'peaceful scene']
False
I think you want
self.assertTrue("violent scene" in self._output_to_list(result.detection_properties["CLASSIFICATION LIST"]))
self.assertEqual("violent scene", result.detection_properties["CLASSIFICATION"])
Also, the latter assert failed because result.detection_properties["CLASSIFICATION"] was " violent scene" (with the space as the first character). I left a different comment on the CSV file to address this.
Done.
python/ClipDetection/tests/data/violence_classes.csv
line 1 at r4 (raw file):
Previously, jrobble (Jeff Robble) wrote…
Remove the space after the commas here.
Also, I recommend trimming the whitespace off of the tokens in each line when you parse the CSV to prevent users from making the same mistake.
Done.
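The trimming suggested above could look like this minimal sketch, assuming each line of the classes CSV holds comma-separated class names (the file layout is an assumption):

```python
import csv

def read_class_lines(path: str) -> list:
    """Parse a classes CSV, stripping stray whitespace around each token."""
    classes = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            # Strip spaces so "violent scene, dangerous scene" still parses cleanly.
            tokens = [token.strip() for token in row if token.strip()]
            if tokens:
                classes.append(tokens)
    return classes
```

Trimming at parse time means a user-supplied file with spaces after commas no longer produces class names like " violent scene".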
python/ClipDetection/clip_component/triton_server/Dockerfile
line 32 at r4 (raw file):
Previously, jrobble (Jeff Robble) wrote…
We need to push this to DockerHub, like we do with the OcvYoloDetection model files here. Let's talk about that in our next meeting. Right now I can't do the build since I don't have access to this Docker image.
Done.
Update tests.
Create ClipWrapper in init.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 9 files at r5, 4 of 6 files at r6.
Reviewable status: 19 of 27 files reviewed, 3 unresolved discussions (waiting on @ZachCafego)
python/ClipDetection/clip_component/clip_component.py
line 64 at r6 (raw file):
if not self.initialized:
    self.wrapper = ClipWrapper()
I addressed the following issue and merged in the changes. Just leaving the comment for posterity:
We never have a reason to change the wrapper. It will be configured the same no matter the job. For that reason we should create it in __init__.
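A toy sketch of the __init__ change described above; ClipWrapper here is a stand-in that just counts constructions, to show the wrapper is built once and reused across jobs:

```python
class ClipWrapper:
    """Stand-in for the real wrapper; counts how many times it is constructed."""
    instances = 0

    def __init__(self):
        ClipWrapper.instances += 1

class ClipComponent:
    def __init__(self):
        # Created exactly once, shared by every job this component handles.
        self.wrapper = ClipWrapper()

    def get_detections_from_image(self, job):
        # Reuse self.wrapper; no lazy "if not self.initialized" check needed.
        return []
```

Constructing the wrapper eagerly also surfaces model-loading failures at component startup rather than on the first job.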
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 131 at r6 (raw file):
"description": "Runs CLIP classification on the COCO dataset classes.",
"actions": [
    "CLIP CLASSIFICATION ACTION"
I addressed the following issue and merged in the changes. Just leaving the comment for posterity:
This needs to be CLIP COCO CLASSIFICATION ACTION.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 138 at r6 (raw file):
"description": "Runs CLIP classification on the COCO dataset classes using a Triton Inferencing Server.",
"actions": [
    "CLIP TRITON CLASSIFICATION ACTION"
I addressed the following issue and merged in the changes. Just leaving the comment for posterity:
This needs to be CLIP TRITON COCO CLASSIFICATION ACTION.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 161 at r6 (raw file):
"description": "Runs CLIP classification on the COCO dataset classes.",
"tasks": [
    "CLIP CLASSIFICATION TASK"
I addressed the following issue and merged in the changes. Just leaving the comment for posterity:
This needs to be CLIP COCO CLASSIFICATION TASK.
python/ClipDetection/plugin-files/descriptor/descriptor.json
line 168 at r6 (raw file):
"description": "Runs CLIP classification on the COCO dataset classes using a Triton Inferencing Server.",
"tasks": [
    "CLIP TRITON CLASSIFICATION TASK"
I addressed the following issue and merged in the changes. Just leaving the comment for posterity:
This needs to be CLIP TRITON COCO CLASSIFICATION TASK.
python/ClipDetection/tests/test_clip.py
line 38 at r6 (raw file):
import mpf_component_api as mpf expected_feature = '/dJWPEgftjycQt28dfsEPcTnuDpHFmG9XWPQPHTyLz0vWE89H/b5PJHe37ytpLK8SB82PVODs7wu5pg9DCxXPGYdELwuT/q6x+YkPeaYpbtCnX08QvsAvBqPQD1g1HI7PrHPvJfSzjzG3U89Qp39OwQ5fLyxooq8MVe7PP5N4jzQSza8+eYoO8yYI73weEI7feWKPFo+BzzNQ3W9fZBcO92lSr1OCtA874uAPICPyLzI3Ls751UhvWPcs7zjJ4M6Q3YMPQTQGjvKJwG7AtqDPGpCWTtzsL886wi0u3mb2bvVLfs8y6IMPTDcr7v8V8s6VP6+PPgpLbvhKBc9aBMnu3OwP70dlwG9FZFoOwiMAj2sm928xHUCvTjGtbyHcPm6vvwePXwV0bvddQS9hc+QvKPElTytpLK8Go/AOdHGQTu6/kY6JXiyu5VqAbzxwwe9p7kYPHavqzsWmr09sQvsvAe8SLxfWWe8zFYzO15sJb3brzO/YeYcPSqvpbzUQLk8OL3gvPtOdrxDdgy9W2RkPSTEC7wKgpm73vAPPWxBRTwaj8A8OQgmvWFYUztgaxE9uYwQuu+LALyksde8F9wtvTFXOzykuiy9KyqxO+r/3rx9Hia9xdT6vNK8WD3o0Ky6AE1OugY46DyT3Us9jelcvLeEzzxtjIo6kd7fu1xtOTw6Spa7pfwcvI13Jj3P0Kq8t//auwJMuj1yLN88MNyvPLv0XTxKFc29+igZvQ4+gTw+qHo8EHaIPIu6Kj0IxR08x62JuyPrfD0dl4E88cOHPVODszw8B5I70I2mvAq7NDxGKZ898fNNvbjPFLyGEQE9NJiXPLAeKroahuu7b7u8PLzElzte3ts8NxIPPf5WN7yQ8Z07fx2SvLSFY7xpx0097P7KPCnf6zzG3U89LygJvAJMOj2yHZY8PrHPOwuo9rymsMM8vP0yPOQLcD2gvFS6jm29O/gygjtQ0CA9lFhXvL66LrpaPge8MKMUPMjT5ry+w4O8if0uPbWOODyx26W8CDfUu8etibzMHRi9qyBSPI92Ejx0uRQ73zIAvlMIqLubHIA8ZdLKPOr/3jwi/jo9kd5fvBoBd7y1VZ087waMOrv03TzMXwg9K97XvPR2mj0Mscs7piL6O0rcsbyHjAy960qkutWCqTx3pcI9Mcnxu+l7fj36WN88dyq3vIvzRT2NuZa8I0mAPFh4Njy59XG8yVdHPGYdEDzFYkS89t7nPKybXTyS5zQ8CIwCPOgJyLuR3t+77kkQvTxArTxEKjO9hoO3vK9qgzuUWFc9Yd3HvD6xT7z0+w48BDl8PIMJQL37TvY8jPwaPZ1LsjzTxS28kTMOu1aCHzx0uZQ8AR0IvM1Ddbwt8AG9CsQJvn8Uvbyc0Ka8ifTZPDJEfTzHWFu9Ehdxu2u9ZDw6g7G8nlSHOzlBQTzrzxg74SgXPHsoDz11NCA97IM/PK2kMjyiLXc8AkPlu+tKJD3gH8K8azhwPDhLqrtEKrO8wPI1vXsojzs7N1g8SSgLPdWCqb2FCKw8zJijvIHajTzvb+25orvAOg8rQzuEhEs8OMa1OTNWp7pjTmq7Uo0cPYaDt7sPK8O8uREFPAdBvbxh3ce8XG05PI92krw6vMy8z0LhvAbGMTyYogg9hnpiPHyjGr3Io6A9ex+6PbIUwbxU9ek6uAiwvCNJgD0pbTU9kue0vAJDZb1ebKU9wujMPRhgDr1mHRA9VXB1vN0zlDsWmj27H40YvW+7vLyiu8A8XWPQvN4g1ryR3l+8l1dDvdFBTTx08i+926+zO+MerjxjTmo7obLrvO6Cq7tQCTy8bYwKvZTmID1Veco845Bku2xBxTshxR89QjScvHuR8LwvWE+9ehblO9W7xLqfs/+8FigHPGdfgDwt8AE9PTbEPH/bIT2jxJU8Pbu4PAs2QLyoNKQ8TgpQPRB2CL2Y
2yM884CDvGoSk7yObb28a73kvOWGe7ypKrs8wGTsPOAfQjwMgQW9H//OuqXDAT4ouQ49kTMOvH+G8zwr3lc817owvYz8GryUWNc8G9EwPaGya7xlmS+94B/CPFcGgDx8FdG8NNGyvDlBwb3oCcg8mtE6vWJhqDgDx0U8QjScu68V1bzkoo69PECtvV/nMDpzsD+8+d3TPOnZAb3uSZC8orvAvN4pK7yG9W07aL74PB2XAb0l6mg9QD4FveYK3Lwr3lc8nb3ovN2lSjycx1G8qpzxPG7FJbzugiu9X66VvPHDBz0u5hi8jfKxvXqtAz0Qpk48+9PqPIh5zrw5QcG9QD6FO3thqjugN2A8JYGHO4Z64rvDY1g8QD4FvP2ikL3ugiu8gF+CvAJMujtc3++7Vz+bPBWR6Lznjjw8+DICPRMgxrwZFLU8TQF7vDSYl7zJ5ZC8qpzxPC5rDb3MmKM8kmJAPKoX/bwZ25m7sJBgPGZN1jtNHY48EmyfPIETqbze8I+8I0mAvPd1hrwE0Jo9XHaOPGlVlzyX0k48xVlvvUabVTvjYJ49glUZPSfp1LypKju98neuO/LpZLwC0S68UADnPBxVETxkVz+8Us+MPAaNFr2P6Mg7qSq7ulcGgLw12oe826beuxJsn7sHgy29zB0YOHGxUztgYjy7A8fFPLjPFL3weEI8UAm8vKqlxrpZZXg9jfIxvdCNpjuL80W9Go9AvU4TJbxcdg48AcjZPCtsoTwx5YS9F9wtPAy6IL3+HRw8j3aSvGdfALw='
I addressed the following issue and merged in the changes. Just leaving the comment for posterity:
Remove this.
python/ClipDetection/tests/test_clip_triton.py
line 52 at r6 (raw file):
ENABLE_CROPPING='True', ENABLE_TRITON='True', TRITON_SERVER='localhost:8001'
I addressed the following issue and merged in the changes. Just leaving the comment for posterity:
YOLO does:
std::string TRITON_SERVER = "ocv-yolo-detection-server:8001"
python/ClipDetection/tests/test_clip_triton.py
line 59 at r6 (raw file):
result = list(ClipComponent().get_detections_from_image(job))[0]
self.assertEqual(job.job_properties["NUMBER_OF_CLASSIFICATIONS"], len(self._output_to_list(result.detection_properties["CLASSIFICATION LIST"])))
self.assertTrue("sturgeon" in self._output_to_list(result.detection_properties["CLASSIFICATION LIST"]))
When I run this I'm not getting sturgeon. I set the number of classes to 10 and got these results:
{
'CLASSIFICATION': 'common newt, Triturus vulgaris',
'CLASSIFICATION CONFIDENCE LIST': '0.06746961921453476; 0.04325991868972778; 0.03234070539474487; 0.02576291561126709; 0.02522958070039749; 0.018444359302520752; 0.016724808141589165; 0.01640925742685795; 0.015537785366177559; 0.014883051626384258',
'CLASSIFICATION LIST': 'common newt, Triturus vulgaris; spotted salamander, Ambystoma maculatum; eel; banded gecko; European fire salamander, Salamandra salamandra; eft; king snake, kingsnake; sports car, sport car; gar, garfish, garpike, billfish, Lepisosteus osseus; water snake'
}
What confidence do you get for sturgeon in your tests? I assume you set the number of classes to 5 since sturgeon wasn't the top class.
The top class I got had a pretty low confidence. To make this test more reliable, consider choosing an image such that the top class has a relatively high confidence. It doesn't have to be a sturgeon.
Reviewed 1 of 6 files at r6, 7 of 7 files at r7, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @ZachCafego)
Reviewed all commit messages.
Reviewable status: 26 of 27 files reviewed, 2 unresolved discussions (waiting on @ZachCafego)
python/ClipDetection/clip_component/clip_component.py
line 285 at r7 (raw file):
    raise

try:
I removed these and some other data members that weren't being used.
Reviewed 1 of 1 files at r8.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ZachCafego)
Reviewed 5 of 5 files at r9, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ZachCafego)
python/ClipDetection/clip_component/clip_component.py
line 334 at r8 (raw file):
    self.preprocess = self.crop
elif enable_triton:
    self.preprocess = lambda image: TF.to_tensor(image).unsqueeze(0)
What's the reason for removing this?
python/ClipDetection/tests/test_clip_triton.py
line 59 at r9 (raw file):
result = list(ClipComponent().get_detections_from_image(job))[0]
self.assertEqual(job.job_properties["NUMBER_OF_CLASSIFICATIONS"], len(self._output_to_list(result.detection_properties["CLASSIFICATION LIST"])))
self.assertTrue("collie" in self._output_to_list(result.detection_properties["CLASSIFICATION LIST"]))
I'm not getting "collie" in the top 5, but it does show up in slot 10. Here's what I get:
toy terrier; wire-haired fox terrier; Ibizan hound, Ibizan Podenco; Chihuahua; papillon; Border collie; Scotch terrier, Scottish terrier, Scottie; Australian terrier; basenji; collie
0.11867158859968185; 0.11635984480381012; 0.08234529942274094; 0.05162452161312103; 0.049181509763002396; 0.047043319791555405; 0.043919168412685394; 0.04185420647263527; 0.0391409806907177; 0.03007649816572666
Please set NUMBER_OF_CLASSIFICATIONS to 10 and check for "collie" or "Border collie". Either class is acceptable. That should work in most cases.
Please capitalize the "b" since the class is "Border collie".
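The relaxed assertion could be sketched like this, assuming the tests' _output_to_list helper splits the CLASSIFICATION LIST on semicolons (an assumption about its implementation):

```python
def _output_to_list(output: str) -> list:
    """Split a semicolon-delimited classification list into clean class names."""
    return [token.strip() for token in output.split(";")]

def top_classes_contain_collie(classification_list: str) -> bool:
    """Accept either "collie" or "Border collie" anywhere in the top-N list."""
    classes = _output_to_list(classification_list)
    return "collie" in classes or "Border collie" in classes
```

Checking for either class makes the test robust to small rank shuffles between CLIP runs while still verifying the dog is recognized.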
Reviewed 1 of 1 files at r10, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ZachCafego)
So basically, the reason for removing this is that the resize_pad function does what that lambda was doing, except that it resizes and pads the entire image into a 224x224 frame before converting it to a tensor. We could compare the performance of the two approaches, but I felt it made the most sense for resize_pad to be the default preprocessing.
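A sketch of that resize-and-pad preprocessing, assuming Pillow; the function name and the black padding follow the description above, but the details are illustrative rather than the component's exact implementation:

```python
from PIL import Image

def resize_pad(image: Image.Image, size: int = 224) -> Image.Image:
    """Scale so the longer side is `size`, then pad to a square `size`x`size`."""
    scale = size / max(image.width, image.height)
    new_w, new_h = round(image.width * scale), round(image.height * scale)
    resized = image.resize((new_w, new_h))
    # Center the resized image on a square black canvas so nothing is cropped.
    canvas = Image.new(image.mode, (size, size))
    canvas.paste(resized, ((size - new_w) // 2, (size - new_h) // 2))
    return canvas
```

Unlike a center crop, this keeps the whole frame visible, which matters when the subject isn't centered in the image.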
Reviewed 1 of 1 files at r11, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ZachCafego)
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ZachCafego)
Dismissed @ZachCafego from a discussion.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @ZachCafego)