This AI is so scary that Microsoft dare not release it

AI even makes Mona Lisa come alive and sing. It is so insane that Microsoft is afraid to release it.

Leonardo da Vinci’s Mona Lisa is the most famous painting in the world. Every day, hundreds of people continuously stand in a packed hall of the Louvre to make eye contact with this mysteriously smiling lady. Microsoft makes it even more special.

Behind that mysterious smile is a hidden talent. It turns out she can sing, thanks to Microsoft, which is itself terrified of what it has made. It has everything to do with AI, but what exactly is going on?

Microsoft’s singing Mona Lisa

Did you think AI was already insane? Just wait, because Microsoft is making things even crazier. It now has a tool that lets you create videos of people based on a single photograph. The Mona Lisa comes to life in a way that would have Leonardo da Vinci turning in his grave.

With that one photo, Microsoft’s AI can make the person in it do anything its user wants, whether that is singing, rapping or simply speaking. It is so insane that Microsoft is too afraid to actually make this technology available.

Microsoft just dropped VASA-1.

This AI can make single image sing and talk from audio reference expressively. Similar to EMO from Alibaba

10 wild examples:

1. Mona Lisa rapping Paparazzi pic.twitter.com/LSGF3mMVnD

– Min Choi (@minchoi) April 18, 2024

Microsoft calls the AI technique VASA-1 (Visual Affective Skills Animator). In the future, it could also make it possible to create virtual avatars that say anything their creator wants and behave just like humans, as the animated Mona Lisa already shows.

Microsoft trained this model using Oxford University’s VoxCeleb2 dataset. This is a database of more than a million utterances by 6,112 celebrities from YouTube videos.

A stunning feat of engineering

But Microsoft goes a step further. The videos have a resolution and frame rate comparable to what you normally see during a video call on FaceTime or Teams. The team has created several demo videos, including one featuring a rapping Mona Lisa.

Still, there is one thing the tool thankfully cannot do: clone voices. That would make it even more dangerous. In any case, Microsoft is only using VASA-1 as a showcase and does not intend to release the program.

7. Power of disentanglement

Example of same motion sequence with different photos pic.twitter.com/MSLFobwJTx

– Min Choi (@minchoi) April 18, 2024

And while you might be all too eager to share such a stunning feat of engineering, it’s a good thing Microsoft isn’t. If the Mona Lisa already looks this convincing while rapping, imagine the impact of fake videos of people we actually know.

Even more fear of deepfakes

These are so-called deepfakes, and they are certainly not new. We’ve seen quite a few pass by in recent years. Until now, however, making a deepfake took quite a bit of work. With newer AI techniques it is becoming increasingly simple, as Microsoft is now demonstrating.

Microsoft’s new technology therefore also raises new concerns. Soon, a single photo that you post on social media, such as Instagram, could be enough to turn you into a fake video. So it may become even more important in the future to think carefully about what kind of photos you post of yourself. By comparison, a singing Mona Lisa is pretty harmless.

Although Microsoft is not releasing this technology, it sees more advantages than disadvantages. For example, it could increase educational equality, improve accessibility for people with communication difficulties and, according to the company, provide therapeutic support for people in need. Microsoft emphasizes that the technology can also promote human well-being.
