
Industrial Plenoptic Dataset


Paper accepted at CVPR 2024!

Abstract

6DoF pose estimation has been gaining importance in vision for over a decade; however, it does not yet meet the reliability and accuracy standards for mass deployment in industrial robotics. To this end, we present the Industrial Plenoptic Dataset (IPD): the first dataset and evaluation method for the co-evaluation of cameras, HDR, and algorithms targeted at reliable, high-accuracy industrial automation. Specifically, we capture 2,300 physical scenes of 22 industrial parts covering a $1m\times 1m\times 0.5m$ working volume, resulting in over 100,000 distinct object views. Each scene is captured with 13 well-calibrated multi-modal cameras, including polarization and high-resolution structured light. In terms of lighting, we capture each scene at 4 exposures and in 3 challenging lighting conditions ranging from 100 lux to 100,000 lux. We also present, validate, and analyze robot consistency, an evaluation method targeted at scalable, high-accuracy evaluation. We hope that vision systems that succeed on this dataset will have direct industry impact.

Bibtex

@InProceedings{Kalra_2024_CVPR,
    author    = {Kalra, Agastya and Stoppi, Guy and Marin, Dmitrii and Taamazyan, Vage and Shandilya, Aarrushi and Agarwal, Rishav and Boykov, Anton and Chong, Tze Hao and Stark, Michael},
    title     = {Towards Co-Evaluation of Cameras HDR and Algorithms for Industrial-Grade 6DoF Pose Estimation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {22691-22701}
}

This Repo

  • March 29th: Dataset Released!
  • July 3rd: Scanned CAD Available
  • July 3rd: Code for downloading and visualizing the data
  • August 15th: Code for Robot Consistency Evaluation Method
  • February 2025: Leaderboard for submitting results on test images

Dataset


In this repo you can find the evaluation dataset as well as links to the relevant CAD models.

Please use the scripts in scripts/dataset to download and extract the datasets and CAD models.

Dataset and CAD model descriptions, along with download links, are available here.

The dataset is in BOP format.

DATASET_NAME/
--- dataset_info.json
--- test/
------ SCENE_ID/
--------- CAM_ID/
------------ scene_camera.json
------------ scene_gt.json
------------ scene_pose.json
------------ rgb/
--------------- 000000.png  # for Photoneo only, a single 16-bit image; we don't provide separate exposures for Photoneo
--------------- 0_EXPOSURE_1.png
--------------- 0_EXPOSURE_2.png
--------------- 0_EXPOSURE_3.png
--------------- 0_EXPOSURE_4.png
------------ depth/  # for Photoneo only
--------------- 000000.png
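
As an illustration, here is a minimal sketch of walking this layout with standard Python tooling. The folder name DATASET_NAME and the EXPOSURE placeholders are stand-ins for the actual names on disk; this is an assumption-laden example, not part of the official toolkit.

import json
from pathlib import Path

import cv2  # OpenCV, for reading the 8-bit and 16-bit PNGs

dataset_root = Path("DATASET_NAME")  # replace with the extracted dataset folder

for scene_dir in sorted((dataset_root / "test").iterdir()):
    for cam_dir in sorted(scene_dir.iterdir()):
        # Per-camera metadata files from the layout above
        scene_camera = json.loads((cam_dir / "scene_camera.json").read_text())
        scene_gt = json.loads((cam_dir / "scene_gt.json").read_text())

        # All captured exposures for this camera (a single 16-bit image for Photoneo)
        for img_path in sorted((cam_dir / "rgb").glob("*.png")):
            img = cv2.imread(str(img_path), cv2.IMREAD_UNCHANGED)
            print(scene_dir.name, cam_dir.name, img_path.name, img.shape, img.dtype)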

scene_gt.json contains the part poses in the respective camera coordinate frame.
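
Because the dataset follows the BOP format, the entries in scene_gt.json should carry the usual cam_R_m2c / cam_t_m2c / obj_id fields. The sketch below turns one such entry into a 4x4 model-to-camera matrix; the field names and millimetre units are assumed from the BOP convention rather than stated here.

import json
import numpy as np

def gt_entry_to_matrix(entry):
    # Build a 4x4 model-to-camera transform from one BOP-style GT entry.
    T = np.eye(4)
    T[:3, :3] = np.asarray(entry["cam_R_m2c"], dtype=float).reshape(3, 3)
    T[:3, 3] = np.asarray(entry["cam_t_m2c"], dtype=float)  # BOP stores translations in mm
    return T

with open("scene_gt.json") as f:  # path as in the layout above
    scene_gt = json.load(f)

for image_id, entries in scene_gt.items():
    for entry in entries:
        T_cam_obj = gt_entry_to_matrix(entry)
        print(image_id, entry["obj_id"], T_cam_obj[:3, 3])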

scene_pose.json contains the hand-eye calibration (robot base in the camera coordinate frame) and the gripper pose in the robot base coordinate frame.
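
These two transforms chain directly: composing the hand-eye calibration (robot base in the camera frame) with the gripper pose (gripper in the robot base frame) gives the gripper pose in the camera frame. Below is a minimal sketch with 4x4 homogeneous matrices; the exact encoding inside scene_pose.json is not assumed here, so T_cam_base and T_base_gripper are placeholders.

import numpy as np

# Placeholder 4x4 homogeneous transforms, to be built from scene_pose.json:
T_cam_base = np.eye(4)      # hand-eye calibration: robot base expressed in the camera frame
T_base_gripper = np.eye(4)  # gripper expressed in the robot base frame

# Chain the frames: camera <- base <- gripper
T_cam_gripper = T_cam_base @ T_base_gripper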

For FLIR_polar we include the originally captured (distorted) images and provide the distortion parameters in scene_camera.json. Undistorting FLIR_polar images before computing AOLP and DOLP can introduce artifacts.
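
For reference, AOLP and DOLP are typically computed from the four polarizer-angle intensities (0°, 45°, 90°, 135°) via the linear Stokes parameters. The sketch below shows that standard computation; how the four channels are extracted from the FLIR_polar files is not specified here, and per the note above it should be run on the original distorted images.

import numpy as np

def aolp_dolp(i0, i45, i90, i135):
    # Standard Stokes-parameter computation from four polarizer-angle
    # intensity images (float arrays of equal shape).
    s0 = i0 + i90                    # total intensity
    s1 = i0 - i90
    s2 = i45 - i135
    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-6)
    aolp = 0.5 * np.arctan2(s2, s1)  # radians, in (-pi/2, pi/2]
    return aolp, dolp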

Toolkit and Demo Notebooks

Please see the demo notebooks for using the IPD Toolkit to download, read, and render the data, and to match and evaluate predictions using Robot Consistency as described in the paper.

Sample visualizations: Basler-LR, Basler-HR, FLIR_polar, and Photoneo.

BOP compatibility (EXPERIMENTAL)

A notebook with documentation for converting to BOP format is provided in convert_to_bop.ipynb. You do not need to clone the bop_toolkit submodule unless you are using this notebook.

Parts used

We purchased all physical parts from McMaster-Carr's website (see the links to the parts here). We provide the recreated 3D models of the parts here.

License

All datasets, code, and models available in this repository are released under the CC BY-NC-SA license and are intended for non-commercial use only.