-
I wish I could tell you - the honest answer is that there was a lot of experimentation and scaling in different places, and I'm still not perfectly happy with that part of the code. I haven't touched it in a long time and I wish I had kept better notes - sometimes I'm too aggressive about removing old code, and it would be nice to be able to look back through it, but digging through git history is a nightmare.
Those are two differently trained networks.
It shouldn't - that sounds like an issue with the scaling of the cropped face when you pass it to the mesh model.
-
As we discussed before, I made some progress on measuring gaze from the movement of the iris center on the eyeball sphere.
Here is what I have so far:
a. Assuming the eyeball is a sphere, I took the 4 mesh points used in calculateGaze and used the OpenCV linear solver, cv::solve(), to find the center and radius of the eyeball sphere (see the sketch after this list).
b. Next I computed the x, y, and z coordinates of the eye center and the iris center relative to the center of the eyeball sphere, from their x and y coordinates and the information obtained in step a. The numbers I obtain make sense, but I am not sure how accurate they are; the accuracy depends on the accuracy of the scaling factor between the x and z coordinates.
c. With those points on the sphere, my next step is to compute the distance along the curved sphere surface between the eye center and the iris center to obtain the two gaze angles.
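For reference, here is roughly what the fit in step a looks like. This is a minimal sketch of my approach rather than the exact code, and the Sphere struct and fitSphere name are just mine. It uses the algebraic sphere equation x^2 + y^2 + z^2 + Dx + Ey + Fz + G = 0, which is linear in D, E, F, G, so 4 points give an exactly determined system for cv::solve():

```cpp
#include <opencv2/core.hpp>
#include <cmath>

struct Sphere { cv::Point3d center; double radius; };

// Fit a sphere through exactly 4 points by solving the linear system
// [x y z 1] * [D E F G]^T = -(x^2 + y^2 + z^2) for each point, then
// center = (-D/2, -E/2, -F/2) and radius = sqrt((D^2+E^2+F^2)/4 - G).
Sphere fitSphere(const cv::Point3d pts[4]) {
    cv::Mat A(4, 4, CV_64F), b(4, 1, CV_64F);
    for (int i = 0; i < 4; ++i) {
        A.at<double>(i, 0) = pts[i].x;
        A.at<double>(i, 1) = pts[i].y;
        A.at<double>(i, 2) = pts[i].z;
        A.at<double>(i, 3) = 1.0;
        b.at<double>(i, 0) = -(pts[i].x * pts[i].x +
                               pts[i].y * pts[i].y +
                               pts[i].z * pts[i].z);
    }
    cv::Mat s;
    cv::solve(A, b, s, cv::DECOMP_SVD);  // SVD stays stable if the 4 points are near-coplanar
    const double D = s.at<double>(0), E = s.at<double>(1);
    const double F = s.at<double>(2), G = s.at<double>(3);
    return { cv::Point3d(-D / 2, -E / 2, -F / 2),
             std::sqrt((D * D + E * E + F * F) / 4 - G) };
}
```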
Before I go through step c, I would like to draw on your insight and experience; the sketch below shows roughly what I am planning.
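Instead of computing the arc length along the surface and dividing by the radius, the same two angles fall out of atan2 on the coordinates relative to the eyeball center, assuming the resting gaze direction points along the camera axis. A sketch under that assumption, reusing the Sphere struct from the fit above; the sign convention for z is also an assumption on my part:

```cpp
#include <opencv2/core.hpp>
#include <algorithm>
#include <cmath>

// Lift a 2D mesh point onto the fitted sphere: relative to the eyeball
// center, z = -sqrt(r^2 - dx^2 - dy^2) on the camera-facing hemisphere
// (flip the sign if your z axis grows toward the camera instead).
cv::Point3d liftToSphere(const cv::Point2d& p, const Sphere& sp) {
    const double dx = p.x - sp.center.x;
    const double dy = p.y - sp.center.y;
    const double dz = std::sqrt(std::max(0.0, sp.radius * sp.radius - dx * dx - dy * dy));
    return cv::Point3d(dx, dy, -dz);
}

// Decompose the gaze direction into the two angles: yaw in the xz-plane
// and pitch in the yz-plane, measured from the camera axis; this replaces
// the explicit surface-distance computation (s = r * theta) with atan2.
void gazeAngles(const cv::Point3d& irisRel, double& yaw, double& pitch) {
    yaw   = std::atan2(irisRel.x, -irisRel.z);  // left/right
    pitch = std::atan2(irisRel.y, -irisRel.z);  // up/down
}
```

So step c would reduce to gazeAngles(liftToSphere(irisCenter, sp), yaw, pitch), with the accuracy still dominated by the x/z scaling question below.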
I need to get a better handle on the scaling of x and y with respect to z that arises from the bug in the MediaPipe network. Without that, it is hard to get the right value for the z coordinate.
I wonder if you can explain a bit more how you discovered that the x and y distances need to be scaled by a factor of 1.5, and how you came up with that factor.
From what I can see, the scaling has some dependency on the face detection network input size: Human uses a 256x256 network, while I use a 128x128 network.
My second observation is that the scaling depends on the camera resolution. With a higher-resolution image of 1600x1300, a scaling of 1.5 is good; for a low-resolution image of 640x480, I need to increase the scaling to 2.5. With 1.5 the mesh only covers the center of the face, and I had to increase the box size by a factor of 2.5 to stretch the mesh to cover the whole face.
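For context, this is how I am enlarging the detected box before cropping for the mesh model. A minimal sketch of my pipeline, with boxScale being the factor I am tuning (1.5 vs 2.5) and meshInput the mesh model's input size; both names are mine:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <algorithm>

// Enlarge the detector's face box by boxScale around its center, clamp it
// to the frame, and resize the crop to the mesh model's square input.
cv::Mat cropForMesh(const cv::Mat& frame, const cv::Rect& faceBox,
                    double boxScale, int meshInput) {
    const double cx = faceBox.x + faceBox.width / 2.0;
    const double cy = faceBox.y + faceBox.height / 2.0;
    const double half = 0.5 * boxScale * std::max(faceBox.width, faceBox.height);
    cv::Rect scaled(cvRound(cx - half), cvRound(cy - half),
                    cvRound(2 * half), cvRound(2 * half));
    scaled &= cv::Rect(0, 0, frame.cols, frame.rows);  // clamping may change the aspect ratio
    cv::Mat crop;
    cv::resize(frame(scaled), crop, cv::Size(meshInput, meshInput));
    return crop;
}
```

With this, switching the 640x480 stream from 1.5 to 2.5 is a one-parameter change, which makes it easier to check whether the needed factor really tracks the detector's input size or the camera resolution.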
It makes me wonder why the scaling depends on the face detection network input shape, and why the camera resolution changes the scale factor.
Any insight you can provide on finding the right scale factor for z would be appreciated.