Makes me think of that chapter in Infinite Jest where videoconferencing gets popular, until people start using "optimized" computer-rendered images instead of showing their actual faces, at which point everyone goes back to audio-only.
I wouldn't mind video conferencing with computer generated avatars. I don't video conference to know what the other person looks like, and in fact knowing what they look like just creates lots of unnecessary bias. I do it for the cues from their gestures, facial expressions, the direction they are looking, etc. With a good tracking setup that works perfectly well today with digital avatars.
Anyway, your comment is bullshit. I'm a white guy (actually from Spain, so I don't know if I qualify as "white" from the USA point of view) living in Japan, so I'm a discriminated-against minority here (and yes, I have experienced minor racism), and I agree with GP. We humans evolved for face-to-face conversation; if racism is the problem, hiding behind a mask is not the solution.
Because racism exists and is a real problem for lots of people.
You don't fix that by pretending race doesn't matter, because in the lived experience of all of those people, it matters, a lot. They would love it if it didn't matter, but that is not what happens in reality.
It doesn't prevent discrimination, it just allows me to do what I want to do in the short term, e.g. raise funding for startups or get dream jobs or whatever.
I mean if a VC hands me a term sheet or a hiring manager hands me a job and the ONLY thing I misrepresented is my face, I don't think they have any ethical grounds to retract their offer.
Solving the discrimination problem is another matter, and will take years if not decades, my dreams cannot wait for that.
Sincerely, I'm sorry you feel that you haven't achieved certain things because of skin colour/accent/etc. But let me challenge that belief, only because if your assumption (that your lack of desired success is due to things outside your control) isn't true, then holding it may keep you from making the adjustments actually needed to achieve those things. If your dreams truly cannot wait (as you said, and they shouldn't!), then imaginary AI solutions to not-catastrophic, not-life-ending barriers shouldn't be seen as relevant to your eventual success.
None of us can control skin colour, accent, well-connected families, or other advantages that others may have. But we can outwork the privileged, and today's world, with just a computer and an internet connection, gives us all far more options than were previously imaginable.
I do sincerely wish you great luck and success, but I really think you shouldn't get stuck on things you can't control, and devote all your energy to things you can.
Oh yes of course. I just think it would be a super cool technical project to actually make this work real time in a videoconference, and there would be lots of use cases for it. Eliminating survey biases during interview-style surveys is another.
Good luck to you. I am not convinced people would be more likely to give money or a job to someone when they have only seen an avatar, but what do I know.
If you deceive someone by wearing a different face during an interview, I expect that you would not be hired. How do you prove you were the person interviewed?
There are plenty of ways to prove that, including stating a public key during the interview, interviewer signs it with their key, vice versa is done with the interviewer's key, and then the same is demonstrated in person with the same keys.
Hell, how to do that can even be an interview question.
That's a terrible answer to the hypothetical interview question, because your solution only works if the person who took the interview is not colluding with the person who wants the job. The interviewee could just email their private key to the job seeker.
The key signing only protects against some very uncommon threats.
That defeats the entire purpose of using facial and body expressions that only video provides.
We already have video filters that remove wrinkles and blemishes in videoconferencing to make you look better.
Even if we replace ourselves entirely with computer-rendered images, they're still going to be reproducing our expressions, movements and gestures, which is what matters.
My phone already has a video call beautification setting built into the OS at the camera level.
We've been skirting the line for a while.
If I could, right now I would absolutely prefer to send a synthesized avatar rather than the real me. My desktop setup doesn't allow ideal camera placement with large monitors, but for maximum impact I'd ideally want to send my face making direct eye contact with the camera.
Is there a filter available that just fixes the apparent direction of the pupil, so the image is looking at the camera, without doing any other edits? That would be really useful.
When you upload, submit, store, send or receive User Content to or through the NVIDIA Research AI Playground, you give NVIDIA (and parties NVIDIA works with, including its affiliates, suppliers and customers) a worldwide license to use (including without limitation for neural network training), host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes), communicate, publish, publicly perform, publicly display and distribute such User Content. The rights you grant in this license are for the limited purpose of operating, promoting, and improving the NVIDIA Research AI Playground and content available to all users, and to develop new NVIDIA offerings. This license continues even if you stop using the NVIDIA Research AI Playground. The NVIDIA Research AI Playground may offer you ways to access, download, and remove content that has been provided, but make sure to keep your own back-up copies of your User Content. Also, the scope of services is limited and not all content in all formats can be loaded in the NVIDIA Research AI Playground.
The reason it's usual is that all sites with upload features need the right to host what you upload. Facebook, for example, has the exact same clause.
“Permission to use content you create and share. [...] when you share, post, or upload content that is covered by intellectual property rights on or in connection with our Products, you grant us a non-exclusive, transferable, sub-licensable, royalty-free, and worldwide license to host, use, distribute, modify, run, copy, publicly perform or display, translate, and create derivative works of your content”
It doesn't sound like they are trying to sell it or otherwise profit from it directly? I guess "develop new NVIDIA offerings" is vaguely suspicious, but the rest doesn't seem onerous to me, unless I'm missing something.
Those license terms seem to me like they'd allow using someone's picture in unlimited advertising, for free. They also sound like they'll definitely be using them to train models. I wouldn't be surprised if some of that means they can sell them to advertisers for whatever nonsense.
My take is that they are limiting their liability in case anyone is overly litigious, or in case they mishandle the data. Yes, they could do the things you describe, but my guess is they don't want to get sued by people they are providing a free service to.
Interesting that they're running plaintext HTTP on the HTTPS port. Are some networks filtering port 80 on egress now, and this is how people get around it? Or are the devs just using some cloud setup that tries to force HTTPS, which the builders of the service don't want?
Surprisingly flexible algorithm. I threw a headshot illustration of a comic villain into it and it was able to successfully do all the rotations and even manipulate the eyes despite it not being photorealistic.
There was an app going around a few months ago called "wombo.ai" that would make a headshot sing. It had the same outcome when given non-human pictures as a source. I easily killed an afternoon trying out different things.
That was using something similar to the First Order Motion Model, not the same model Nvidia has been using for Maxine or the subsequent improvements to it shown in the linked demo.
If you'll allow me a plug, since we're on the subject of animated pictures, you might be interested in my pet project: 1000 animated(!) fantasy avatar faces, 100% AI-generated. Check it out, it's free and beautiful; it feels like next-gen.
I made it shamelessly with https://www.fantasy-faces.com/ for the GAN and myheritage.fr for the animation. It was still a lot of work to select the 1000 most beautiful ones.
While the client-side has input validation, it appears that the server-side does not, as I can edit the request body freely and it'll return accordingly.
It's interesting to see how the model fails at extreme values. I can see why they chose the cutoffs they did!
Seems to use the EXIF data to display the image, so if you're uploading a portrait-orientation photo you might want to strip that, or it breaks.
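A quick way to avoid that is to bake the rotation into the pixels before uploading. This is a sketch assuming Pillow is installed; the function name is mine, not from the site:

```python
from PIL import Image, ImageOps

def normalize_orientation(src_path: str, dst_path: str) -> None:
    """Apply the EXIF Orientation tag (274) to the pixel data and save
    an upright copy, so tools that misread the tag still show the
    photo correctly."""
    img = Image.open(src_path)
    # exif_transpose rotates/flips per the tag and removes the tag
    # from the returned copy's EXIF
    upright = ImageOps.exif_transpose(img)
    upright.save(dst_path)
```

After this, the saved file carries no Orientation tag that a viewer could misinterpret.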
It's pretty cool, but one thing always bugs me: why is the demo page so bad?
For the majority of people, this is the only way they'll ever see this thing working. Would it really have hurt to have an actual frontend dev knock something together for it? It takes away from the great work behind the scenes, imho.
I'm quite impressed by how well NVIDIA Broadcast already cleans up a simple webcam image on a 3070 GPU; the background blur even handles the gap between the headphone bridge and my head with sharp cuts. That's impressive enough in my book to warrant a gaming-grade GPU for work purposes, at least if you're a remote worker.
I have my cam off to the side; I'm really looking forward to being able to try the angle correction!