
[train on my own dataset] point cloud version works but ff version's results are all zero #37

Open
Promethe-us opened this issue Mar 12, 2025 · 2 comments

Promethe-us commented Mar 12, 2025

Thanks for your great work. I made my data in SUNRGBD format, and it trains successfully in point cloud mode:
+---------+---------+---------+---------+---------+
| classes | AP_0.25 | AR_0.25 | AP_0.50 | AR_0.50 |
+---------+---------+---------+---------+---------+
| chair   | 1.0000  | 1.0000  | 0.9994  | 0.9994  |
| table   | 1.0000  | 1.0000  | 1.0000  | 1.0000  |
| bottle  | 1.0000  | 1.0000  | 1.0000  | 1.0000  |
| cup     | 1.0000  | 1.0000  | 0.9700  | 1.0000  |
+---------+---------+---------+---------+---------+
| Overall | 0.9999  | 1.0000  | 0.9810  | 0.9925  |
+---------+---------+---------+---------+---------+

But when I try to train a ff model, the results during training are all zero:
+---------+---------+---------+---------+---------+
| classes | AP_0.25 | AR_0.25 | AP_0.50 | AR_0.50 |
+---------+---------+---------+---------+---------+
| chair | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| table | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| bottle | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| cup | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
+---------+---------+---------+---------+---------+
| Overall | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
+---------+---------+---------+---------+---------+

I have checked my *.pkl info files; one entry looks like this:

{'point_cloud': {'num_features': 6, 'lidar_idx': 0},
 'pts_path': 'points/000000.bin',
 'image': {'image_idx': 0,
  'image_shape': array([512, 512], dtype=int32),
  'image_path': 'image/000000.jpg'},
 'calib': {'K': array([[238.84535,   0.     , 256.     ],
         [  0.     , 238.84535, 256.     ],
         [  0.     ,   0.     ,   1.     ]], dtype=float32),
  'Rt': array([[1., 0., 0.],
         [0., 1., 0.],
         [0., 0., 1.]], dtype=float32)},
 'annos': {'gt_num': 4,
  'name': array(['chair', 'table', 'bottle', 'cup'], dtype='<U6'),
  'bbox': array([[  0., 0., 512., 512.],
         [  0., 0., 512., 512.],
         [  0., 0., 512., 512.],
         [  0., 0., 512., 512.],]),
  'location': array([[ 2.13502165,  4.05805589, -0.08512209],
         [ 0.84577456,  5.20489811, -0.31459103],
         [-0.18873514,  1.05541911, -0.12484555],
         [ 0.7586191 ,  5.18078147, -0.04048699]]),
  'dimensions': array([[0.68855083, 0.58342147, 1.18504596],
         [0.6000011 , 1.20000041, 0.45000005],
         [0.18035316, 0.18032134, 0.70449841],
         [0.27000153, 0.35957909, 0.30000052]]),
  'rotation_y': array([ 2.37364756e+00,  6.80678119e-01, -2.73684540e-07, -2.73684540e-07]),
  'index': array([0, 1, 2, 3], dtype=int32),
  'class': array([0, 1, 2, 3]),
  'gt_boxes_upright_depth': array([[ 2.13502165e+00,  4.05805589e+00, -8.51220912e-02,
           6.88550830e-01,  5.83421469e-01,  1.18504596e+00,
           2.37364756e+00],
         [ 8.45774562e-01,  5.20489811e+00, -3.14591028e-01,
           6.00001097e-01,  1.20000041e+00,  4.50000048e-01,
           6.80678119e-01],
         [-1.88735143e-01,  1.05541911e+00, -1.24845549e-01,
           1.80353165e-01,  1.80321336e-01,  7.04498414e-01,
          -2.73684540e-07],
         [ 7.58619104e-01,  5.18078147e+00, -4.04869902e-02,
           2.70001531e-01,  3.59579086e-01,  3.00000519e-01,
          -2.73684540e-07]])}}
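As a quick consistency check on the dump above: `gt_boxes_upright_depth` should be exactly the column-wise concatenation of `location`, `dimensions`, and `rotation_y` into a 7-dof box (x, y, z, dx, dy, dz, yaw). A small sketch verifying that for the first object:

```python
import numpy as np

# First object from the info dict above.
location = np.array([[2.13502165, 4.05805589, -0.08512209]])
dimensions = np.array([[0.68855083, 0.58342147, 1.18504596]])
rotation_y = np.array([2.37364756])
gt_box = np.array([[2.13502165, 4.05805589, -0.08512209,
                    0.68855083, 0.58342147, 1.18504596,
                    2.37364756]])

# Rebuild the 7-dof box by concatenation and compare.
rebuilt = np.hstack([location, dimensions, rotation_y[:, None]])
assert rebuilt.shape == (1, 7)
assert np.allclose(rebuilt, gt_box)
```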

I want to ask 2 questions:
(1) I set every 2D box to [0, 0, 512, 512]. Since the loss is calculated from the 3D boxes, I assumed the 2D box values can be arbitrary. Does that work?
(2) I set 'Rt' to np.eye(3). Will this influence the result?

Promethe-us (Author):

I noticed that the config file contains load_from = 'https://download.openmmlab.com/mmdetection3d/v0.1.0_models/imvotenet/imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class/imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class_20210323_173222-cad62aeb.pth' # noqa. If I train on my own dataset, should I change this to another checkpoint or train a 2D detector myself?
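For reference, in an MMDetection3D-style config the pretrained checkpoint is controlled by this single top-level variable; a hedged sketch of the two options (the local path is hypothetical):

```python
# Option A: point at your own 2D detector checkpoint
load_from = 'work_dirs/my_faster_rcnn/latest.pth'  # hypothetical path

# Option B: drop the SUNRGBD-pretrained weights entirely
# load_from = None
```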

filaPro (Contributor) commented Mar 12, 2025

(1) Yes, I think we simply ignore the 2D bboxes.
(2) I believe this is a serious issue. Rt should be correct, i.e. each 3D box should project to the correct position of its object in 2D after applying K and Rt.
(3) Probably yes; you can even try an ImageNet-pretrained ResNet from mmcv, but I don't think it is important. Just be careful with image normalization, to make sure it matches what was used during pre-training.
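That projection check can be sketched in a few lines. This is not the repo's own code: the function name is illustrative, and the axis convention is an assumption (upright-depth frame x-right/y-forward/z-up mapped to a camera frame x-right/y-down/z-forward, as in SUNRGBD-style setups); the K, Rt, and box center are the values from the info dict above.

```python
import numpy as np

def project_centers(centers, K, Rt):
    """Project Nx3 box centers (upright-depth frame) to pixel coords."""
    cam = centers @ Rt.T                      # depth -> camera rotation
    # Assumed axis swap: depth (x-right, y-forward, z-up)
    # to camera (x-right, y-down, z-forward).
    cam = np.stack([cam[:, 0], -cam[:, 2], cam[:, 1]], axis=1)
    uvw = cam @ K.T                           # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]           # perspective divide

# Values from the info dict above.
K = np.array([[238.84535, 0., 256.],
              [0., 238.84535, 256.],
              [0., 0., 1.]])
Rt = np.eye(3)
centers = np.array([[2.13502165, 4.05805589, -0.08512209]])

uv = project_centers(centers, K, Rt)
in_image = ((uv >= 0) & (uv < 512)).all(axis=1)
```

If Rt is wrong, the projected centers will land away from (or outside) the objects in the image, which is exactly the failure mode that can zero out the fused model's metrics.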
