Microsoft's New AI Can Make Photographs Sing and Talk

04/22/2024

AI model already has the Mona Lisa lip-syncing

Microsoft published a research paper this week describing a new AI model called VASA-1 that can turn a single picture of a person and an audio clip into a realistic video of that person lip-syncing, complete with facial expressions and head movements.

The researchers demonstrated the model on AI-generated images from generators like DALL·E-3, which they then paired with audio clips. The results are still images turned into videos of talking faces.

The researchers built on technology from competitors such as Runway and Nvidia, but state in the paper that their method produces higher-quality, more realistic results and "significantly outperforms" existing methods.

Please select this link to read the complete article from Entrepreneur.
