Deepfakes: Audio artificial intelligence impacts on society

TikTok’s new AI audio raises questions about AI and ethics. PHOTO CREDIT: analyticssteps.com

Artificial intelligence is becoming increasingly popular among social media users, as exemplified by recent trends and posts on sites like TikTok and Instagram.

Recently, AI-generated audio featuring high-profile celebrities, like Taylor Swift and Harry Styles, as well as politicians, has been circulating around the internet.

According to Indrani Mandal, assistant teaching professor of computer science at the University of Rhode Island, this type of technology is created through data systems that can copy the mannerisms of any subject they are programmed to imitate.

Mandal said certain types of AI audio technology can be very effective and constructive to society — for example, the technology used in self-driving cars and certain robots. However, other technologies, like “deepfakes,” can be dangerous.

“Deepfakes [are] a model where they use deep learning techniques to train, like, you know, videos and audios of famous people and thereby generating this, you know, audio and videos of famous politicians and celebrities,” Mandal said. “So this can create a lot of confusion because now you don’t know what is real and what is fake because some of these deepfakes that are generated by AI are — you can’t tell the difference. They look very real.”

Madhukara Kekulandara, a computer science Ph.D. student at URI, said this technology is dangerous because it can create issues where it looks like people are saying things that they aren’t. In particular, he stated that this could be very dangerous to the career of a politician if someone programs their voice to say something they never said or don’t stand behind.

“The implication, or like, the opportunity is that the harm that can [be created] from creating these audios or deepfake videos can go from anywhere and can make many complications in society,” Kekulandara said.

The technology used to create AI audio and deepfakes is called a generative adversarial network (GAN).

According to both Kekulandara and Mandal, the audio used in deepfakes is generated when GAN technology creates an “alphabet” from the syllables and mannerisms it has learned from a subject.

“Sounds are like alphabets, right?” Mandal said. “If you make that analogy, the sounds are like the phonics. Once all the phonics of a person is mimicked correctly by the model, now it can put those phonics together to generate any complex sentence.”

The more people interact with AI technology, the more of it they will see, according to Mandal. This is because social media sites gather data on what people are most interested in so that they can produce more of that specific kind of content.

To promote internet safety and computer literacy, URI’s AI Lab in Carothers Library hosts summer camps for students from fourth grade through high school. The camps teach students not only about AI technology but also about coding.

For more information on artificial intelligence and related technologies, visit the URI AI Lab or its website.