Animating Faces from Photos: New Microsoft AI brings the Mona Lisa to life with rapping skills

Microsoft has developed a new AI model called VASA-1 that can create realistic videos of a person speaking by using a still image of their face and an audio clip. The videos created by VASA-1 are complete with compelling lip syncing and natural face and head movements. In one demo, researchers animated the Mona Lisa to recite a comedic rap by Anne Hathaway. While the technology could be used for education or improving accessibility for individuals with communication challenges, there are concerns about potential misuse to impersonate real people.

There is growing concern that misuse of AI-generated images, videos, and audio could lead to new forms of misinformation. Experts also worry about the impact on creative industries, from film to advertising, as these technologies become more advanced. Microsoft has said it does not plan to release the VASA-1 model to the public immediately. OpenAI has taken a similar approach with its AI-generated video tool, Sora, making it available only to some professional users and cybersecurity professors for testing.

Microsoft’s new AI model, VASA-1, was trained on numerous videos of people’s faces while speaking, allowing it to recognize natural face and head movements such as lip motion, facial expressions, eye gaze, and blinking. This results in a more lifelike video when animating a still photo. The AI tool can be directed to produce videos where the subject is looking in a certain direction or expressing a specific emotion. While there are still signs that the videos are machine-generated, Microsoft believes its model outperforms other similar tools and allows for real-time engagements with lifelike avatars.

The technology behind the VASA-1 AI model has potential applications beyond creating entertaining videos, such as in education or improving accessibility for individuals with communication challenges. However, there are concerns that the technology could be misused to impersonate real people or create misleading content. Microsoft has stated that it is opposed to any behavior that creates misleading or harmful content about real people and that it has no plans to release the product publicly until it is certain the technology will be used responsibly and in accordance with regulations.

As more tools emerge for creating convincing AI-generated images, videos, and audio, concern is growing about the potential for misuse and the impact on creative industries. Microsoft’s VASA-1 model is designed to create realistic videos of people speaking using a still image of their face and an audio clip. While the technology has potential applications for education and accessibility, there are concerns about its misuse, and Microsoft says responsible use and regulatory compliance must be assured before it is made publicly available.
