Alibaba’s new AI system ‘EMO’ (Emote Portrait Alive) creates realistic talking and singing videos from photos
The EMO system employs an AI technique known as a diffusion model, which has shown tremendous promise for generating realistic synthetic imagery. The researchers trained the model on a dataset of over 250 hours of talking head videos curated from speeches, films, TV shows, and singing performances.
It can generate several kinds of video from still images. Examples in this post:
– Make a portrait sing
– Different languages & portrait styles
– Rapid rhythm
– Talking with different characters
– Cross-actor performance
