Generates an extrinsics matrix from two required inputs: an intrinsics matrix and a set of person detection boxes. The resulting matrix allows finding screen-to-world correspondences. The person detection boxes are used to estimate each person's distance to the camera. With the camera bearing known, world-to-screen point correspondences are then found automatically, and the extrinsics matrix is calculated from them.
- To find screen-to-world correspondences (which point in world space corresponds to a given point on the screen), another approach is suggested. Since an extrinsics matrix is required, it is calculated from a set of world-to-screen point correspondences. See the full calculation here:
- The previous applications require an intrinsics matrix (also called calibration matrix, camera matrix or K matrix), which can be calculated using
- To find screen-to-world correspondences in a simple manner, another approach is proposed. See the full calculation here: In that example, a homography matrix is calculated using only a set of world-to-screen point correspondences. Therefore, the intrinsics matrix (the one calculated with this application) is not needed.
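
As a rough illustration of the homography approach in the last item, the sketch below estimates a 3x3 homography from exactly four ground-plane correspondences by solving the usual 8x8 linear system, then maps a world point to the screen. It is a minimal sketch, not the linked calculation: the function names and the four correspondences are made up for illustration, and a real setup would typically use more points and a least-squares or robust fit (for example OpenCV's findHomography).

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def find_homography(world_pts, screen_pts):
    """Homography (h33 fixed to 1) from exactly four (x, y) -> (u, v) pairs."""
    a, b = [], []
    for (x, y), (u, v) in zip(world_pts, screen_pts):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(hm, x, y):
    """Map a ground-plane point (x, y) to screen coordinates (u, v)."""
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    u = (hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w
    v = (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w
    return u, v


# Hypothetical correspondences: ground-plane metres -> pixels.
WORLD = [(0, 0), (10, 0), (10, 20), (0, 20)]
SCREEN = [(200, 700), (1100, 690), (800, 300), (450, 310)]
H = find_homography(WORLD, SCREEN)
```

Calling `apply_homography(H, 5, 10)` then gives the pixel where the centre of that hypothetical ground rectangle would appear.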
Generating a set of screen-to-world point correspondences can be a tedious process. It usually implies identifying points on a map tile service (e.g. OSM or Google Maps). The application could help, but the process is costly anyway, especially when there are a hundred cameras to calibrate.
If we have the camera pole height, the camera bearing (relative to north, in degrees) and some person detection boxes, the distances from the camera can be calculated. With those distances, screen-to-world correspondences can be generated, and all the elements needed to generate the extrinsics matrix are available. This is, in a way, an automatic camera calibration.
See the draft here with a complete description of the process.
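
The idea above can be sketched with a plain pinhole model. This is a minimal sketch, not the code in autocalib.cpp: it assumes every detected person is about 1.7 m tall, that a box is given as (x, y, width, height) in pixels, that the similar-triangles estimate can be treated as the slant range from the camera to the person, and that the optical axis points along the camera bearing. The function name and its parameters are hypothetical.

```python
import math


def detection_to_world_offset(box, fx, fy, cx, pole_height_m, bearing_deg,
                              person_height_m=1.7):
    """Return the (east, north) offset in metres of a detected person
    relative to the foot of the camera pole.

    box is (x, y, width, height) in pixels (assumed layout).
    """
    x, _y, w, h = box
    # Similar triangles: an object of known height that spans h pixels.
    slant = fy * person_height_m / h
    # Remove the vertical component given by the pole height.
    ground = math.sqrt(max(slant ** 2 - pole_height_m ** 2, 0.0))
    # Horizontal angle of the box centre relative to the optical axis.
    offset = math.atan2(x + w / 2.0 - cx, fx)
    azimuth = math.radians(bearing_deg) + offset
    return ground * math.sin(azimuth), ground * math.cos(azimuth)
```

Each detection then yields one screen point (for instance the bottom centre of the box) paired with one world point, which is the kind of correspondence the extrinsics calculation needs.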
The relevant code (autocalib.cpp) takes 20 lines: it standardizes the detection behavior with linear regressions, generates new sets from the original one and generates the matrices.
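
A sketch of what standardizing detections with a linear regression could look like: fit the box height as a linear function of the box's bottom edge (for a typical downward-looking camera, people lower in the image are closer and appear taller), then replace each noisy height with the fitted value. Which variables autocalib.cpp actually regresses is not stated here, so this pairing and the function names are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def standardize_heights(detections):
    """detections: list of (bottom_y, height) pairs in pixels.

    Returns the same bottoms with heights replaced by the fitted line.
    """
    bottoms = [b for b, _ in detections]
    heights = [h for _, h in detections]
    slope, intercept = fit_line(bottoms, heights)
    return [(b, slope * b + intercept) for b in bottoms]
```

Smoothing the heights this way would reduce the effect of partially occluded or badly cropped boxes before any distance is derived from them.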
mkdir -p build && pushd $_;
cmake ..
$ head test/detections.csv
This can be obtained using
$ cat test/camera_matrix_1280x720.yaml
camera_matrix: !!opencv-matrix
   rows: 3
   cols: 3
   dt: d
   data: [ 9.8658267922418236e+02, 0., 6.5644063029267375e+02, 0.,
       9.2143502631862248e+02, 3.5621735537789453e+02, 0., 0., 1. ]
distortion_coefficients: !!opencv-matrix
   rows: 5
   cols: 1
   dt: d
   data: [ 2.6029460512088087e-01, -1.8101009855718500e+00,
       -3.8176576250647945e-04, 1.4483555874521570e-02,
       5.9783794886029265e+00 ]
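
For reference, the 9 values in camera_matrix are the row-major entries of K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]. The sketch below shows how K maps a point in camera coordinates to a pixel. The helper name is hypothetical and lens distortion is ignored.

```python
# Values copied from camera_matrix in test/camera_matrix_1280x720.yaml.
FX, FY = 9.8658267922418236e+02, 9.2143502631862248e+02
CX, CY = 6.5644063029267375e+02, 3.5621735537789453e+02


def camera_to_pixel(x, y, z):
    """Project a point in camera coordinates (z pointing forward)
    onto the image plane, ignoring lens distortion."""
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return FX * x / z + CX, FY * y / z + CY


# A point on the optical axis lands on the principal point (CX, CY).
u, v = camera_to_pixel(0.0, 0.0, 5.0)
```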
$ ./gen-points test/detections.csv test/camera_matrix_1280x720.yaml
Processing test/detections.csv...
Done. Generating matrix...
The generated matrix is called rotation_translation_matrix.yaml. Test it with some points.
Incidentally, the application has generated a points_matrix.yaml, which you can also test with some points.
$ cat rotation_translation_matrix.yaml
rotation_matrix: !!opencv-matrix
   rows: 3
   cols: 3
   dt: d
   data: [ -4.8359770760309317e-01, 8.7471580995178844e-01,
       3.1709762241613038e-02, -2.2331512014778204e-01,
       -1.5832934118414987e-01, 9.6180152673697417e-01,
       8.4632358723741985e-01, 4.5804374413431237e-01,
       2.7190497263751068e-01 ]
rotation_vector: !!opencv-matrix
   rows: 3
   cols: 1
   dt: d
   data: [ -8.0397627454137988e-01, -1.3000894692412277e+00,
       -1.7524112718452212e+00 ]
translation_vector: !!opencv-matrix
   rows: 3
   cols: 1
   dt: d
   data: [ 1.9824397141331261e+01, 9.9559567883918714e+00,
       -3.7528364626343134e+01 ]
camera_position: !!opencv-matrix
   rows: 3
   cols: 1
   dt: d
   data: [ 4.3571488872267587e+01, 1.4252391215663509e+00,
       -1.3240228121778462e-04 ]
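
The values above can be sanity-checked without OpenCV. Under the usual convention x_cam = R * x_world + t, the camera centre in world coordinates is -R^T * t, which reproduces the camera_position entry. The sketch below hard-codes the matrices printed above together with the focal lengths and principal point from test/camera_matrix_1280x720.yaml. It ignores lens distortion, and the helper names are hypothetical.

```python
# rotation_matrix and translation_vector from rotation_translation_matrix.yaml
R = [
    [-4.8359770760309317e-01, 8.7471580995178844e-01, 3.1709762241613038e-02],
    [-2.2331512014778204e-01, -1.5832934118414987e-01, 9.6180152673697417e-01],
    [8.4632358723741985e-01, 4.5804374413431237e-01, 2.7190497263751068e-01],
]
T = [1.9824397141331261e+01, 9.9559567883918714e+00, -3.7528364626343134e+01]

# Intrinsics from test/camera_matrix_1280x720.yaml
FX, FY = 9.8658267922418236e+02, 9.2143502631862248e+02
CX, CY = 6.5644063029267375e+02, 3.5621735537789453e+02


def camera_position(rot, trans):
    """Camera centre in world coordinates: -R^T * t."""
    return [-sum(rot[r][c] * trans[r] for r in range(3)) for c in range(3)]


def world_to_screen(point):
    """Project a world point to pixel coordinates (no lens distortion)."""
    xc = [sum(R[r][c] * point[c] for c in range(3)) + T[r] for r in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    return FX * xc[0] / xc[2] + CX, FY * xc[1] / xc[2] + CY


# Should be close to the camera_position entry printed above.
position = camera_position(R, T)
```

`world_to_screen` is one way to test the matrix with some points: pick a world point whose screen location is known and compare it with the projected pixel.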