LivePortrait: A fast, controllable portrait animation model (github.com/kwaivgi)
203 points by cleardusk 5 months ago | 25 comments
We are excited to announce the release of our video-driven portrait animation model! The model can vividly animate a single portrait, achieving a generation time of 12.8 ms per frame on an RTX 4090 GPU with `torch.compile` from PyTorch. We are actively updating and improving this repo!
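For readers curious how a `torch.compile` speedup like this is typically enabled, here is a minimal sketch. The tiny model below is a hypothetical stand-in, not LivePortrait's actual architecture; see the repo for the real model classes and loading code.

```python
import torch

# Hypothetical stand-in for the animation network (placeholder, not the repo's API).
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(64, 3, 3, padding=1),
).cuda().eval()

# torch.compile traces the forward pass and fuses it into optimized kernels.
# The first call pays the compilation cost; steady-state calls are fast,
# which is the latency that gets benchmarked.
compiled_model = torch.compile(model)

with torch.no_grad():
    frame = torch.randn(1, 3, 256, 256, device="cuda")
    out = compiled_model(frame)
```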

Related Resources:

- Homepage: https://liveportrait.github.io

- Paper: https://arxiv.org/abs/2407.03168

- Code: https://github.com/KwaiVGI/LivePortrait

- Jupyter: https://github.com/camenduru/LivePortrait-jupyter

- ComfyUI: https://github.com/kijai/ComfyUI-LivePortraitKJ and https://github.com/shadowcz007/comfyui-liveportrait

We hope you give it a try and enjoy!




The videos on your homepage are encoded with HEVC which can't be viewed in Firefox. Please consider using an open codec like AV1.


Thanks for the reminder. The homepage is a GitHub page and does not support git lfs, so I have compressed the files as much as possible to reduce their size. We will consider re-encoding the mp4 files to x264 and providing a packed zip of the homepage.
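For reference, a re-encode to H.264 is a one-liner with ffmpeg. The sketch below shells out from Python; it assumes ffmpeg is on PATH, and the file names are placeholders.

```python
import subprocess

# Re-encode an HEVC mp4 to H.264, which virtually all browsers can decode.
# -crf trades quality for size (lower = higher quality); 23 is a sane default.
subprocess.run([
    "ffmpeg", "-i", "input_hevc.mp4",
    "-c:v", "libx264", "-crf", "23",
    "-pix_fmt", "yuv420p",       # widest player compatibility
    "-movflags", "+faststart",   # lets playback start before the full download
    "output_h264.mp4",
], check=True)
```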


I scanned the open-source codec page (https://en.wikipedia.org/wiki/List_of_open-source_codecs) and I'm a little confused: is H.265 not OPEN? :-(


It's a bit of a mess. The implementation of a codec, that is, an encoder or a decoder, can be open source even though the format itself is not open. H.265 does have open-source implementations, but the format itself is not open. The opposite can be true as well: there are, for example, proprietary encoders for open formats. The actual list of open video formats: https://en.wikipedia.org/wiki/List_of_open_file_formats#Vide...

What OP meant is that they would like an open format on the website, which can then be viewed in any modern browser. I think caniuse is a good resource in this regard.

https://caniuse.com/av1

https://caniuse.com/hevc

WebM with VP9 video is a good general browser target, I think:

https://caniuse.com/webm

But funnily enough, even though H.264 is not open, it's also a widely decodable video format:

https://caniuse.com/mpeg4
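If you want to verify which codec an mp4 actually carries before publishing it, ffprobe can report it. A small sketch, assuming ffprobe is installed; the file name is a placeholder:

```python
import subprocess

# Print the codec name (e.g. "h264", "hevc", "av1") of the first video stream.
codec = subprocess.run(
    ["ffprobe", "-v", "error", "-select_streams", "v:0",
     "-show_entries", "stream=codec_name",
     "-of", "default=noprint_wrappers=1:nokey=1",
     "homepage_video.mp4"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print(codec)
```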


This is exactly why I am not convinced that VVC is going to be useful; it seems to have little advantage over AV1, and it's late to the party in the first place.


Well yeah, they want to collect licensing rent, so they have to keep developing these things. It also depends on what business deals they make in the background. If the format gets locked into some applications, that might cement it as a quasi-standard, which they can then leverage for further popularity.

I hope open standards keep winning. Overall, everyone wins when the infrastructure is openly accessible, especially the common folk.


This is amazing. I can immediately see it being used by the Stable Diffusion and other generative-image communities. It gives life to those lifeless faces, and it doesn't look outstandingly odd, at least not to my eyes.

Edit: it's definitely being used already: https://www.reddit.com/r/StableDiffusion/comments/1dvepjx/li...


It will allow for more realistic emotions in current SD model merges and fine-tunes by generating frames correctly labelled with their associated emotions.

Most SD1.x/SDXL models depict humans with the same expression, so the frames generated by LivePortrait will help with training datasets.

I believe the Pixar animators on Toy Story used a facial expression/emotion database called FACS (the Facial Action Coding System) to make the characters more humanly relatable.

It's not clear if the "expressions" will generalise to new faces.


This is... remarkably fast. Fast as in a quick response to Microsoft's announcement earlier this year, and as in low latency. I love it.

I'd love to see a database of facial-expression videos that could be used for some sort of standardized expression testing... are you guys aware of one?


The "generalization to animals" part seems like it opens a lot of interesting avenues!


Did you manage to make it work? The model can't find my cat's face.


Never mind, https://github.com/KwaiVGI/LivePortrait/issues/20#issuecomme... explains that it needs custom fine-tuning to work.


- Fast!
- Getting some unstable results: the head keeps moving up and down by a few pixels; maybe it needs some stabilization.
- Single-frame renders are quite good, though a bit cartoony.
- No lip syncing.
- Head rotation is bad; it deforms the head completely.


There's a typo right at the beginning of your paper's page: "exsiting".


Fixed! h_h


Is this how the Luma Dream Machine works?


no


How does that work?


looks good


For everyone wanting to use this commercially: be wary of the InsightFace models' licensing...


https://github.com/deepinsight/insightface?tab=readme-ov-fil...

> The code of InsightFace is released under the MIT License. There is no limitation for both academic and commercial usage.


That is the code. The weights are non-commercial:

> Both manual-downloading models from our github repo and auto-downloading models with our python-library follow the above license policy(which is for non-commercial research purposes only).


Understood. The core dependency on InsightFace in LivePortrait is the face detection algo. The face detection can easily be replaced with a self-developed or MIT-licensed model.


Exactly, just replace it with any detection model, lol. FastSAM or a YOLO model can find the face. No reason to be using InsightFace for that; see the sketch below.
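As a concrete example of a permissively licensed swap-in (not the detector the thread names, but the same idea), here is a minimal face-detection sketch using MediaPipe, which is Apache-2.0 licensed. The image path is a placeholder.

```python
import cv2
import mediapipe as mp

# Load an image and convert BGR (OpenCV default) to RGB for MediaPipe.
image = cv2.imread("portrait.jpg")
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Apache-2.0-licensed face detector, usable commercially.
with mp.solutions.face_detection.FaceDetection(
    model_selection=0, min_detection_confidence=0.5
) as detector:
    results = detector.process(rgb)

if results.detections:
    # Convert the first detection's relative bounding box to pixel coords.
    h, w, _ = image.shape
    box = results.detections[0].location_data.relative_bounding_box
    x, y = int(box.xmin * w), int(box.ymin * h)
    bw, bh = int(box.width * w), int(box.height * h)
    print(f"face at ({x}, {y}), size {bw}x{bh}")
```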




