Gaze Estimation Round 2! #456

saeidn95 · 2024-04-23T03:11:52Z

saeidn95
Apr 23, 2024

As we discussed before I made some progress on measuring the gaze at movement Iris center on the eyeball sphere.

Here is what I have got so far:

a. Assuming the eyeball to be a sphere, with 4 mesh points that are used in calculateGaze, I used the opencv linear solver,cv::solve(), to find the center and the radius of the eyeball sphere.

b. Next I computed the x, y, and z coordinates of the center of the eye and the center of iris with reference to the center of the eyeball sphere, from their x and y coordinates and information obtained in step a. The numbers I obtain make sense. but I am not sure how accurate they are. The accuracy depends on the accuracy of the scaling factor between the x and z coordinates.

c. Having the points on the sphere is my next step to compute the distance on the curved sphere surface between the center of the eye and the iris center to obtain the two angles.

Before I go through step c. I like to pick on your insight and experience.
I need to get a better handle of the scaling of x and y with respect to z, that arises from the bug in mediapipe network. Without that it is hard to get the right value for the z coordinate.

I wonder if you can explain a bit more as to how you discovered that distances x and y need to be scaled by the factor of 1.5, and how you came up factor 1,5.

From what I can see the scaling has some dependency on the face detection network input size. Humans use a 256 x 256 network. I use a 128x128 network.
My second observation is that the scalining depends on the camera resolution. I noticed that with a higher resolution image of 1600x1300 a scaling 1.5 is good. For a low resolution image of 640x480 I need to increase the scaling by 2.5. With 1.5 the mesh only covers the center of the face, and I had to increase the box size by a factor of 2.5 to stretch the mesh to cover the whole face.

It makes me wonder why the scaling depends on the face detection network input share, and the resolution of the camera changes the scale factor.
Any insight that you can provide in finding the right scale factor for z, would be appreciated.

vladmandic · 2024-04-24T15:20:22Z

vladmandic
Apr 24, 2024
Maintainer

I wonder if you can explain a bit more as to how you discovered that distances x and y need to be scaled by the factor of 1.5, and how you came up factor 1,5.

I wish i could tell you - honest answer is that there was a loooot of experimentation and scaling in different places and i'm still not perfectly happy with that part of the code, but i haven't touched it in a long time and i wish i kept better notes - sometimes i'm too aggressive with removing old code and would be nice to look through it and looking through git history is a nightmare.

From what I can see the scaling has some dependency on the face detection network input size. Humans use a 256 x 256 network. I use a 128x128 network.

those are two differently trained networks. human supports both, i just choose 256x256 one as default as 128x128 network is trained primarily on close-up images so its not that good detecting faces on images with full-body visible.

My second observation is that the scalining depends on the camera resolution. I noticed that with a higher resolution image of 1600x1300 a scaling 1.5 is good. For a low resolution image of 640x480 I need to increase the scaling by 2.5. With 1.5 the mesh only covers the center of the face, and I had to increase the box size by a factor of 2.5 to stretch the mesh to cover the whole face.

it shouldn't, that sounds like issue with scaling of cropped face when you pass it to mesh model.

2 replies

saeidn95 Apr 24, 2024
Author

Thanks! I am going through the same process; "a lot of experimentation"

a. "it shouldn't, that sounds like issue with scaling of cropped face when you pass it to mesh model." I agree, it makes no sense. But that is what I see. I will have a closer look.

b. Do you recall what you communicated with mediapipe? That probably should give us sufficient clue about the issues you faced.

c. What is the camera(s) resolution(s) that you have used in your examples?

d. I have finished coding my proposed technique; should be able to report it soon.

vladmandic Apr 24, 2024
Maintainer

b. Do you recall what you communicated with mediapipe? That probably should give us sufficient clue about the issues you faced.

i'll dig it up

c. What is the camera(s) resolution(s) that you have used in your examples?

mostly fullhd, but i did test with different resolutions and also with input videos/images, not just webcam.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Gaze Estimation Round 2! #456

{{title}}

Replies: 1 comment 2 replies

{{title}}

{{title}}

{{title}}

Select a reply

Gaze Estimation Round 2! #456

saeidn95 Apr 23, 2024

Replies: 1 comment · 2 replies

vladmandic Apr 24, 2024 Maintainer

saeidn95 Apr 24, 2024 Author

vladmandic Apr 24, 2024 Maintainer

saeidn95
Apr 23, 2024

Replies: 1 comment 2 replies

vladmandic
Apr 24, 2024
Maintainer

saeidn95 Apr 24, 2024
Author

vladmandic Apr 24, 2024
Maintainer