Multiple source images #20
Thanks for your interest! We experimentally support passing multiple source images with corresponding boxes for inference in this branch: https://github.com/THU-MIG/yoloe/tree/multi-source-predict-vp. Could you please try it? Thanks!
Thank you for the reply, James. I see that in the branch you mentioned, the reference is also for a single image. Does that mean I should apply the predictor method/class in series over multiple images? (yoloe/predict_visual_prompt.py, line 33 in e26f071)
Hi, it actually predicts for multiple images (source_image and target_image) at once. (yoloe/predict_visual_prompt.py, line 36 in e26f071)
Not sure if I get it right, but do you mean I pass it in like this? Since I have multiple images with different bboxes, I would expect the logic to take in a list of dictionaries, where each dictionary has the list of bboxes for a particular image.
The format is like this.
Each image can have a list of Box and Cls.
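Based on the format described above, a minimal sketch of the prompts dict for two source images might look like this. The box coordinates, class indices, and the `check_prompts` helper are illustrative assumptions, not the branch's actual API:

```python
import numpy as np

# Hypothetical prompts dict for two source images (coordinates are made up).
# bboxes[i] is an (N_i, 4) array of xyxy boxes for source image i; cls[i] is
# the matching (N_i,) array of class indices, so each image carries its own
# list of boxes and classes.
visuals = dict(
    bboxes=[
        np.array([[78.0, 202.0, 130.0, 333.0],
                  [10.0, 20.0, 50.0, 80.0]]),     # image 0: two boxes
        np.array([[240.0, 240.0, 268.0, 283.0]]),  # image 1: one box
    ],
    cls=[
        np.array([0, 1]),  # one class index per box in image 0
        np.array([0]),     # one class index per box in image 1
    ],
)

def check_prompts(prompts):
    """Sanity-check that boxes and classes line up per image."""
    assert len(prompts["bboxes"]) == len(prompts["cls"]), "one entry per image"
    for boxes, classes in zip(prompts["bboxes"], prompts["cls"]):
        assert boxes.ndim == 2 and boxes.shape[1] == 4, "boxes must be (N, 4)"
        assert classes.shape == (boxes.shape[0],), "one class per box"
    return True

check_prompts(visuals)
```

The key invariant is that `bboxes` and `cls` are parallel lists, one entry per source image, with matching per-image lengths.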
Thanks!
This was very insightful, thank you! Also, the current validation expects the target image to have at least one bbox/class in the dict. Can this be excluded?
The target image in yoloe/predict_visual_prompt.py (lines 38 to 45 in e26f071)
Would you mind sharing more details about the hampered detection? ;)
The observed behavior was that when I added multiple source images, it happened to miss some objects that it had detected with only one image as source. Note: I am under the assumption that with more images/annotations the predictions on the target image will get better, which I hope is the case?
This is what I am trying to do:

```python
source_image = 'testing_data/source_image.png'
model.predict([source_image, source_image_2], save=True, prompts=visuals, predictor=YOLOEVPSegPredictor)
model.set_classes(["object0", "object1", "object3"], model.predictor.vpe)
```

I am basically looking for cross-image prompting with multiple source images.
Yes, it is expected that the predictions will get better with more images/annotations.
Could you please try to set
Hi @jameslahm, I have the same question too. Here is my code:

```python
visuals = dict(
    bboxes=[
        np.array([[78.0, 202.0, 130.0, 333.0]]),
        np.array([[240.0, 240.0, 268.0, 283.0]]),
    ],
    cls=[
        np.array([0]),
        np.array([0]),
    ],
)

source_image0 = "source00.jpg"
source_image1 = "source01.jpg"
target_image = "target.jpg"

# model.predictor = None  # remove VPPredictor
model.predict([source_image0, source_image1], save=True, prompts=visuals,
              predictor=YOLOEVPSegPredictor, return_vpe=True)
model.set_classes(["object0"], model.predictor.vpe)
model.predictor = None  # remove VPPredictor
model.predict(target_image, save=True)
```

And here is the error message:

```
0: 640x640 1 object0, 105.6ms
1: 640x640 1 object0, 105.6ms
Speed: 5.1ms preprocess, 105.6ms inference, 424.7ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs/segment/predict8
Traceback (most recent call last):
  File "/media/ian/disk/Ian/playground/yoloe/predict_visual_prompt_test.py", line 36, in <module>
    model.predict(target_image, save=True)
  File "/media/ian/disk/Ian/playground/yoloe/ultralytics/engine/model.py", line 551, in predict
    self.predictor.setup_model(model=self.model, verbose=is_cli)
  File "/media/ian/disk/Ian/playground/yoloe/ultralytics/engine/predictor.py", line 307, in setup_model
    self.model = AutoBackend(
  File "/media/ian/disk/Ian/playground/yoloe_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/media/ian/disk/Ian/playground/yoloe/ultralytics/nn/autobackend.py", line 148, in __init__
    model = model.fuse(verbose=verbose)
  File "/media/ian/disk/Ian/playground/yoloe/ultralytics/nn/tasks.py", line 233, in fuse
    m.fuse(self.pe.to(device))
  File "/media/ian/disk/Ian/playground/yoloe_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/media/ian/disk/Ian/playground/yoloe/ultralytics/nn/modules/head.py", line 443, in fuse
    conv.weight.data.copy_(w.unsqueeze(-1).unsqueeze(-1))
RuntimeError: output with shape [2, 256, 1, 1] doesn't match the broadcast shape [2, 2, 256, 1, 1]
```
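For context on the RuntimeError above: with two source images the visual prompt embedding keeps a leading per-image axis, so the head's fuse step receives a [2, 2, 256, 1, 1] tensor where it expects [2, 256, 1, 1]. A minimal numpy illustration of the same broadcast failure (shapes are taken from the traceback; everything else is hypothetical, and numpy raises ValueError where torch raises RuntimeError):

```python
import numpy as np

num_classes, embed_dim = 2, 256

# What the fuse step expects: one embedding per class.
dst = np.zeros((num_classes, embed_dim, 1, 1))
fused = np.ones((num_classes, embed_dim, 1, 1))

# What it receives with two source images: an extra leading image axis.
per_image = np.ones((2, num_classes, embed_dim, 1, 1))

np.copyto(dst, fused)  # fine: shapes match exactly

try:
    # A 5-D source cannot broadcast into a 4-D destination.
    np.copyto(dst, per_image)
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
```

Collapsing the leading per-image axis (e.g. by averaging) before fusing makes the shapes line up again.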
@Ian-Work-AI Thanks for your interest! Because there are two source images,
Could you please try it? Thanks!
```python
model.set_classes(["object0"], model.predictor.vpe.mean(dim=0, keepdim=True).normalize(dim=-1, p=2))
```

```
AttributeError: 'Tensor' object has no attribute 'normalize'. Did you mean: 'normal_'?
```

Should I change to np.array() before using normalize?
Sorry, could you please try to use
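PyTorch tensors have no `.normalize` method (hence the AttributeError above); `torch.nn.functional.normalize` is the usual function-form equivalent. A numpy sketch of the averaging-then-renormalizing math, under the assumption that `vpe` is shaped (num_source_images, num_classes, embed_dim):

```python
import numpy as np

num_images, num_classes, embed_dim = 2, 1, 256

# Stand-in for model.predictor.vpe: one embedding per source image.
rng = np.random.default_rng(0)
vpe = rng.normal(size=(num_images, num_classes, embed_dim))

# Average over the source-image axis, keeping a leading batch dim
# (vpe.mean(dim=0, keepdim=True) in torch terms) ...
mean_vpe = vpe.mean(axis=0, keepdims=True)  # (1, num_classes, embed_dim)

# ... then L2-normalize along the embedding axis, i.e. what
# torch.nn.functional.normalize(mean_vpe, dim=-1, p=2) would do.
unit_vpe = mean_vpe / np.linalg.norm(mean_vpe, axis=-1, keepdims=True)
```

In the torch code this would correspond to something like `F.normalize(model.predictor.vpe.mean(dim=0, keepdim=True), dim=-1, p=2)`; that exact form is an assumption on our part, not necessarily the maintainers' fix.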
@jameslahm Fantastic!! Here is the result of one image:
Thanks for the reply. Actually, the data I am testing is sensitive and hence can't be shared, but after a round of testing I think it's a limitation of the kind of data I am testing with: the objects I am looking to detect are not the most common ones, and there is a lot more to detect in table-like structures.
Source with ground truth
Target (same image as source) with prediction
I am now wondering if training the pretrained net on similar images will improve the results?
Good to see that it works for common objects pretty well 👍 I am planning to write an article on the module, where these results would be helpful in illustrating YOLOE's strengths.
Also, is it possible to substitute the pretrained model with a YOLOv8 object detection model instead of a segmentation one? I get:

```
AttributeError: 'DetectionModel' object has no attribute 'get_visual_pe'
```
Hi @Akhp888, thanks!
Yes, we think that training the pretrained net on similar images can improve the results.
For a detection-only model, you could drop the segmentation part from the model like this:

```python
model = YOLOE("yoloe-v8l.yaml")
model.load("yoloe-v8l-seg.pt")
```

Then you could use YOLOEVPDetectPredictor for this model with detection only.
Hi, below is the code for your reference:
@jameslahm, it would be very helpful if the multi visual prompt feature (available in the multi_source_predict_vp branch) were integrated into the main branch. Zero-shot detection by providing multiple visual prompts is feasible in the multi_source_predict_vp branch, but transferring the pretrained model to a custom dataset causes issues in this branch, since the multi_source_predict_vp branch has various checks, like requiring save_json to be True, which are not in the main branch. I also tried training with save_json = True, but I could not get the confusion matrix. It would be useful to have multiple visual prompts and issue-free training in one branch.
@Akhp888 Hi, did you train
@shataxiDubey Merged in c21bc24 ;)
Yes, I trained a YOLOv8 model, as it said YOLOE can load YOLOv8 models out of the box.
Hi @Akhp888, we think that you need to train the model based on
Is it possible to have multiple source images with their corresponding boxes used for inference on multiple images?