Wrt upping the frame rate, the main problem is that the phone may run a bit hot. Newer iPhones/iPads should be able to handle it just fine, but older ones based on, say, the A10 might have trouble keeping up, especially with multiple remote parties connected.
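Roughly what I have in mind is to watch the OS thermal state and back the capture rate off on older hardware. A minimal Swift sketch (the frame-rate values and the `capVideoFrameRate` helper are illustrative assumptions, not the actual app code):

```swift
import Foundation
import AVFoundation

// Sketch: pick a target frame rate from the current thermal state and apply it
// to the capture device. The specific fps values are placeholders.
func capVideoFrameRate(for device: AVCaptureDevice) {
    let targetFPS: Int32
    switch ProcessInfo.processInfo.thermalState {
    case .nominal, .fair:
        targetFPS = 30          // newer A-series chips keep up fine
    case .serious:
        targetFPS = 24          // back off before the device gets hot
    case .critical:
        targetFPS = 15          // older (A10-era) chips under heavy load
    @unknown default:
        targetFPS = 24
    }
    do {
        try device.lockForConfiguration()
        device.activeVideoMinFrameDuration = CMTime(value: 1, timescale: targetFPS)
        device.activeVideoMaxFrameDuration = CMTime(value: 1, timescale: targetFPS)
        device.unlockForConfiguration()
    } catch {
        print("Could not lock capture device: \(error)")
    }
}
```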
* The framing depends on a transformation derived from the face landmarks, and the amount of padding is somewhat flexible (there's a rough sketch of the idea after this list). Distance from the camera seems to affect it, so it could be that my landmarks model needs some tweaking to work better when you are sitting very close to the camera.
* This is closer to being a general video codec than a face-generating GAN, so there is not a lot of "cheating" in that respect. It is optimized for transmission of faces, but other images will pass through if you let them (which I currently don't).
* I built the AI engine, the face recognizer, etc. from scratch, though with the help of a former co-founder who was originally the one training our models (in PyTorch). The vertigo.ai home page has some demo videos. We initially targeted Raspberry Pi-style devices, NVIDIA Jetsons, etc., but have since ported to iOS and macOS. Our initial customers were startups, mostly in the US, and a large Danish university that uses us for auditorium head counting.
* It empirically does seem to work on diverse faces, both in real life and when testing on, for example, the "Coded Bias" trailer. Ideally I would like to test more systematically on something like Facebook/Meta's "Casual Conversations" dataset.
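On the framing point above, the basic idea is just: take the landmark points, compute their bounding box, pad it, and clamp it to the frame. A rough Swift sketch (the 0.35 padding factor and the function name are made up for illustration, not the production values):

```swift
import CoreGraphics

// Sketch: derive a crop rectangle from detected landmark points with padding.
func cropRect(for landmarks: [CGPoint], in frame: CGRect, padding: CGFloat = 0.35) -> CGRect {
    guard let first = landmarks.first else { return frame }
    var minX = first.x, maxX = first.x, minY = first.y, maxY = first.y
    for p in landmarks {
        minX = min(minX, p.x); maxX = max(maxX, p.x)
        minY = min(minY, p.y); maxY = max(maxY, p.y)
    }
    let box = CGRect(x: minX, y: minY, width: maxX - minX, height: maxY - minY)
    // Pad the tight landmark box; when you sit very close, the box is large
    // and the padded crop runs out of the frame, which is when the framing
    // starts to look tight.
    let padded = box.insetBy(dx: -box.width * padding, dy: -box.height * padding)
    return padded.intersection(frame)
}
```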
What are you saying, that the Danes are selling the technology on to the Chinese? As if the Chinese government didn’t already have massively deployed facial recognition tech?