Pose Animator: SVG animation tool using real-time TensorFlow.js models (github.com/yemount)
719 points by hardmaru on May 9, 2020 | 74 comments



This is like creating a live version of an author's drawings, e.g. those of New Yorker contributors:

https://www.newyorker.com/contributors

How interesting would it be to have your own live emotionally expressive avatar for videoconferencing, when you don't want to worry about your hair, makeup, lighting, or general visual state at all?


David Foster Wallace ruminated on this in Infinite Jest.

"The proposed solution to what the telecommunications industry's psychological consultants termed Video-Physiognmoic Dsyphoria (or VPD) was, of course, the advent of High-Definition Masking. Mask-wise, the initial option of High-Definition Photographic Imaging — i.e. taking the most flattering elements of a variety of flattering multi-angle photos of a given phone-consumer and‚ thanks to existing image-configuration equipment already pioneered by the cosmetics and law-enforcement industries — combining them into a wildly attractive high-def broadcastable composite of a face wearing an earnest, slightly overintense expression of complete attention."

Always thought it was fascinating that he came up with this in 1996!


"You can look like a gorilla or a dragon or a giant talking penis in the Metaverse. Spend five minutes walking down the Street and you will see all of these. Hiro's avatar just looks like Hiro, with the difference that no matter what Hiro is wearing in Reality, his avatar always wears a black leather kimono. Most hacker types don't go in for garish avatars, because they know that it takes a lot more sophistication to render a realistic human face than a talking penis. Kind of the way people who really know clothing can appreciate the fine details that separate a cheap gray wool suit from an expensive hand-tailored gray wool suit.", Neil Stephenson, Snowcrash, 1992.


You can already do that with some premade and custom models via https://facerig.com/. They're not perfect, but they are pretty expressive.


iOS has the animated animal characters (Animoji) that mimic gestures pretty well.


I was playing with openFrameworks and Kinect years ago, and you could pretty much do this already, so I am curious why you would use TF.js instead of plain OpenCV or other libraries that don't need ML or DL. Or am I mistaken, and is it only using simple bits of TF.js?

In 2012, at the site in Macau, we had a large flip-disc (or flip-dot) wall we purchased from a company: a wall of black and white discs that flipped to create a cool effect by tracking people in front of it with Kinect units in real time. It also made a cool clicky sound, like the old physical train/airport arrival/departure boards did.

Over 10 years ago, I had a feature on my HTC phone, or maybe in early Skype, where a cartoon cat mimicked my mouth, eyes, and head movements live on camera. I can't find a reference to it, but I have a screen recording of it from talking with my kids in the US while I was in Macau.

I also remember Animata, software that animated 2D puppet-like cut-outs to music, which I played with 7 or so years ago; it was really cool [1].

[1] https://www.youtube.com/watch?v=Dz8OMxB8m_M


Re the feature in HTC or Skype: it was Skype. There was a set of novelty faces that mimicked your movement (I can only remember the dog, but there were a few, some more effective than others).


Wow, if there existed an easy way to create animated character cartoons, this would completely change how animation is done! It would launch a thousand South Park / Rick and Morty-type shows. A team with 10k could launch a show.


I believe Adobe has had this for a long time in Character Animator:

https://www.adobe.com/products/character-animator.html


Since 2015, according to Wikipedia. https://en.wikipedia.org/wiki/Adobe_Character_Animator

They won an Emmy for it this year; their post about it (https://theblog.adobe.com/adobe-character-animator-receives-...) notes that it was used for a live broadcast of The Simpsons, and for a Showtime series called "Our Cartoon President" (https://www.sho.com/our-cartoon-president).


Adobe Character Animator is not bad; it has skeletal rigging and tracks the mouth, but it doesn't do deformations like this one does.


Good to know, it's an awesome feature. I am surprised I don't see more successful shows made with this on Reddit and YouTube.


That's because the animation is very much the least important part of a successful show; it's the writing and voice acting. South Park is a great example of this: it's successful because it is funny and edgy. The animation could have gone a lot of different ways and, I think, had similar success.


I think fundraising is the hardest part of a project, especially for creatives. If a group of writers can wrangle some amateur voice actors to work (which I think would be the lowest-paid kind of actor), it should be easy to get a project up.


On the other hand, I expect a rash of poorly done "explainer"-type videos is coming.


Heh, literally my first thought when I saw the sample gifs was that I could use this to make some explainer videos. And yes, they would probably end up looking poorly done.


The industry has had several tools like this for years. They can be useful, but they haven't caught on in the past.


Please give us URLs to other open source tools the industry uses for this kind of work, thank you very much.


https://www.blender.org/user-stories/japanese-anime-studio-k...

Blender. Performance capture is used just like it is in computer games but the 3D objects are shaded differently to give them a more 2D look.


This is something I have been tracking for a few years now. If anyone is interested please DM me.


It would be cool privacy-wise to have some sort of virtual webcam that could share such an animation rather than what the webcam actually sees.


This is fairly easy to wire up on Linux using v4l2loopback and pyfakewebcam.

I am currently using a little setup that uses OpenCV to acquire frames from the real camera, TensorFlow/BodyPix to compute an alpha mask for the foreground (me), and then OpenCV again to transform and composite myself behind news desks, into car infotainment screens, and the like, eventually writing the result to a virtual webcam I can use from Zoom (over Zoom's own virtual-background feature, this adds layering and perspective transforms), Jitsi, Teams, etc.
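
Here's a minimal sketch of that loop, for anyone who wants to try it (assumptions: v4l2loopback has created /dev/video20, 'newsdesk.jpg' stands in for whatever backdrop you want, and segment_person() is a hypothetical placeholder for the BodyPix mask, which really runs behind its own model API):

    import cv2
    import numpy as np
    import pyfakewebcam

    WIDTH, HEIGHT = 640, 480

    real = cv2.VideoCapture(0)                   # the physical webcam
    real.set(cv2.CAP_PROP_FRAME_WIDTH, WIDTH)
    real.set(cv2.CAP_PROP_FRAME_HEIGHT, HEIGHT)
    fake = pyfakewebcam.FakeWebcam('/dev/video20', WIDTH, HEIGHT)

    background = cv2.resize(cv2.imread('newsdesk.jpg'), (WIDTH, HEIGHT))

    def segment_person(frame):
        # Hypothetical stand-in for the BodyPix call: return a float
        # mask in [0, 1] that is 1.0 wherever the person is.
        return np.ones(frame.shape[:2], dtype=np.float32)

    while True:
        ok, frame = real.read()
        if not ok:
            continue
        frame = cv2.resize(frame, (WIDTH, HEIGHT))
        mask = segment_person(frame)[:, :, np.newaxis]
        composite = (frame * mask + background * (1.0 - mask)).astype(np.uint8)
        # pyfakewebcam expects RGB; OpenCV delivers BGR
        fake.schedule_frame(cv2.cvtColor(composite, cv2.COLOR_BGR2RGB))

Zoom, Jitsi, Teams, etc. then see /dev/video20 as just another camera.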

Pose Animator looks like another fun thing to add. Time to go full Who Framed Roger Rabbit? ...


Do you have any more details written up anywhere on how you do this? That sounds like a great project.


There was a recent technical write-up about something similar: https://news.ycombinator.com/item?id=22823070


Yep, completely agree with sbarre. Could you kindly share more details? I'd definitely like to do this as a weekend hack.


It sounds like a cool idea. Probably a platform to act out a play/script as the characters themselves? Like a social-theatre kind of thing, with pre-animated backgrounds and such, where you only do your character's part.


Aha, that's a completely different application than what I had in mind (just an alt webcam that's better than a static picture for communication while still protecting your visual identity), but it would be amazingly cool =).


The Snap Camera desktop app offers features somewhat similar to what you're seeking, as long as you combine it with the right filters. I imagine that the open source repo in the OP will eventually be used in an app like this.


This is a great idea! It should be great for bandwidth too: rather than sending full cam frames multiple times a second, it only needs to send deformation information describing movements.
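
Back-of-the-envelope, assuming PoseNet-style output (17 keypoints) on a 640x480 feed, purely for illustration:

    import struct

    frame_bytes = 640 * 480 * 3      # one raw RGB frame: 921,600 bytes
    keypoints = [(0.5, 0.5)] * 17    # dummy (x, y) pairs in normalized coords
    packet = struct.pack('<34f', *[c for kp in keypoints for c in kp])
    print(frame_bytes, len(packet))  # 921600 vs. 136 bytes, roughly 6800x smaller

Raw frames are an unfair baseline, of course, but even a good video codec needs orders of magnitude more than a handful of floats per frame.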


Hah that's what video compressors do!


OBS Studio can do that.


More details, please: I cannot find any feature on the OBS website similar to what we see here. Do we need some plugin/extension? Please give us some URLs, thank you!


As a commercial solution, there's FaceRig + Live2D, which already does that.


> This is not an officially supported Google product.

Does that indicate the author is a Google employee who happened to make this in their free time, and has to say this somewhere by Google policy?


"Unless your project is an official Google product, you must state “This is not an officially supported Google product” in an appropriate location such as the project’s README file."

https://opensource.google/docs/releasing/publishing/


Maybe they used their 20% time on it.


I, for one, am ready for the metaverse


do let me know when you’ve got a decent motorcycle model up and running.


Exactly :)


The annotated SVG skeleton is a separate download, but that server is out of quota :( Does someone have it?


You can find about 5 SVG skeletons inside the GitHub repo's folder resources/illustration/. Have fun!


Looking forward to seeing this evolve. Very fun project. Commenting to save for later. Thanks.


Thanks, found 'em :)


Why is this downvoted? I hate it when people do that and offer no explanation. Totally useless input, even harmful. HN should force a comment on downvotes.


Posting like this breaks the site guidelines. Would you please read them? https://news.ycombinator.com/newsguidelines.html

One reason that guideline exists is that unfair downvotes frequently get canceled by users who come along, see the situation, and make a corrective upvote. Meanwhile complaints like this linger on in the thread, inaccurate and off-topic—they don't garbage-collect themselves. As an example, I noticed your other comment and upvoted it before I saw this comment here. Similarly, other users have upvoted the GP.

As with any stochastic process, there is a lot of error and spillage with downvotes. There's no way to perfect it; you have to ask whether the system is better off with it than without it. Forcing comments wouldn't help, and posting complaints certainly doesn't help.


Why not have a meta discussion area, like a Wikipedia Talk page, so comments like these would have room and critique won't be shut down?

Yesterday (you can look in my comment history) I ran into a situation where the person doing the downvoting turned out to be basing it on their opinion, not fact. After they finally stated their opinion on the matter, which contradicts peer-reviewed research on the topic, I realized why they were downvoting: insufficient depth of understanding of the topic. And no one came after them to correct the situation...

Either have people explain why they downvoted, or have a Talk page where people can discuss their reasons, complain, etc., behind the scenes.


HN is a site for intellectual curiosity (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...) and a meta forum would be a step away from that. It would fill up with litigious bickering and nitpicking, and demands for bureaucratic administration—all things that intellectually curious discussion requires having the restraint to avoid.

Edit: and it's for similar reasons that we don't publish a moderation log.


> HN should force a comment on downvotes

This is actually a great idea


My other comment on this thread followed others in congratulating the author. It specifically said "Bravo!" and guess what? It got downvoted.

So the explanation may be that it is useless input, but believe me, it is not useless to the author to have people congratulate them for the fruit of their labor. It's a human desire to be recognized by others. How is it not a positive comment? Why downvote something as benign and empathic as saying congratulations? So force a comment on each downvote and things might get a little bit better, because at least we'll know these aren't just random acts of hostility and that the downvoter has a rational reason for it, at least in their mind.


Great to see this thing open sourced. Adobe has a product for such things.


At the level of fidelity of the demo, it's not yet a product.

The demo is very promising, though.


Also, Cartoon Animator 4 includes this, using both FaceID and RGB webcams.


I would really like this imposed on Teams, Zoom, Lync, or WebMeeting. That way you can still be in pajamas...


This is freaking amazing. So many potential applications. Great work!


This is great :) 10 years ago I was using Animata (http://animata.kibu.hu) + Kinect; this rolls it all into one web-based thing. Cool stuff. Would it be easy to move it to a local GPU-accelerated version for more FPS?


Hmm, tfjs does use the GPU, and there are plenty of small PoseNet models that run fast on mobile, so maybe this is an old version. (Or maybe the bottleneck is in the face or SVG stuff.) But to answer your question: yeah, it could be faster (everything could be faster given enough time! :)


This is great! It's amazing to see what can be made out of open-sourced, machine-learned building blocks.


Awesome, this is getting the idea mill in my head started. Animated short comics would be a great use case.

Need some South Park vector models!


IIRC there was a similar thing done for one of the VR systems recently, but for the hands instead of for the face. Has anyone seen any open models or software for tracking hand shape and location in this way? I would love to use it in connection with sign language processing.


Yep! I do have something bookmarked somewhere. I use sign language myself and want the same thing. Curious about your background.


If you find that bookmark, could you share it with me? One project I've looked at is sign-puppet [1]. It has all of the animation basics needed; the tough thing is inputting the parameters to animate the puppet. Traditionally, capturing sign language data for the computer requires really sophisticated tracking equipment (gloves, etc.). Being able to do it with a webcam could be a game changer!

I studied linguistics and CS in school, and I learned a little JSL to speak with deaf friends in Japan. I think sign language processing is a really neat combination of computer vision/graphics and linguistics. Lately there have been so many great advances in speech processing, but there hasn't been a huge leap forward for sign language processing, though I feel there should be.

Deaf people are already really disadvantaged in many places, and getting left behind technologically doesn't help. I really resented looking for JSL books in the "disabled" section of the book stores in Japan, and when I spoke with some people about JSL, they didn't believe it was its own language. Even just linguistic work for sign languages is limited; I haven't seen a single reference grammar (re: comprehensive documentation) on any sign language. I think the difficulty of working with sign language data makes it more daunting to work with. (Paucity of speakers is certainly not a deterrent for linguists.)

[1] https://github.com/aslfont/sign-puppet


Wow, really cool! I'm wondering how much work it is to define the specific anchor/interest points on the SVG for the correct mapping to occur.

But I guess since "modern" illustrations are quite minimal, said work probably shouldn't take too long.


Next step: output to a deep-fake synthesizer instead of SVG. Jump from animation to cinema.


Awesome stuff! I was just looking at face-api.js the other day, this kicks it up a notch.


Works okay for a browser application, but are there any native alternatives for this?


That's what a side project should look like. Curious to see the blog post!


This is very cool! I wonder if using input from multiple cameras, plus a 3D rig can improve the accuracy of the rendering.


Certainly, there are lots of apps leveraging the TrueDepth sensor on iPhones that reach an even higher level of fidelity as far as pose estimation is concerned.

The limiting factor for accuracy in a lot of these technologies is the actual rigging process of the characters, probably because that is very difficult to standardize or generalize across different geometries, art styles, animation drivers, 2D vs 3D, etc.


Head-turning doesn't work that well; Live2D looks much better here.


Is this very similar to the tech that powers Adobe Character Animator?


This is awesome! Great work, thanks for sharing.


This is incredibly awesome


This is awesome!


Bravo!



