In the past two years, police in Zhejiang, Anhui, Jiangsu, and other provinces have arrested a number of suspects for stealing personal information. The suspects' methods were strikingly similar: first, illegally obtain or purchase other people's photos, voice recordings, and other "raw material"; then use artificial intelligence to "animate" the photos and synthesize dynamic video. These videos could directly defeat the facial verification mechanisms of social platforms and Alipay accounts for illegal profit, or fool the manual review step in mobile phone card registration, after which the suspects used other people's phone numbers for telecom fraud, online gambling, and the like, leaving the people whose information was harvested to suffer security threats and financial losses.
In front of a demonstration computer in the laboratory of the Institute for Artificial Intelligence at Tsinghua University, the reporter watched as a static facial photo of a stranger, just downloaded from WeChat Moments, was imported into the machine. Under a technician's operation, the person in the photo instantly "came alive," blinking, opening the mouth, frowning, and making other fine movements and expression changes on command; a smooth video was generated in just ten seconds.
"The technology that drives a static image into motion is called deep synthesis, a kind of AI content-synthesis technology," said Xiao Zihao, an engineer at the Institute for Artificial Intelligence at Tsinghua University. Deep synthesis, he explained, has spawned a family of techniques, including image synthesis, video synthesis, voice synthesis, and text generation.
With such technology, stealing a face is no longer difficult. In processes that require dynamic face recognition, such as mobile phone card registration, bank card applications, and payment app logins, these forged videos can help criminals pass back-end verification.
The technician also demonstrated voice synthesis for the reporter. From just a few 60-second samples of a stranger's voice, deep synthesis can generate speech such as "No need to sign in person, just transfer the money to me on WeChat" and "You don't have to pick up the kids today; I'm near the school and will get them on the way," with an effect that sounds like a real person. The implications of such voice synthesis are chilling.
On content and social platforms at home and abroad, deep synthetic content is growing in both quantity and quality. In particular, synthesized clips from films and TV dramas and face-swap videos of trending figures have spread widely because of their strong entertainment value.
Ten Trends in Deep Synthesis (2022), jointly released by the Institute for Artificial Intelligence of Tsinghua University, Beijing Ruilai Wisdom Technology Co., Ltd., the Intelligent Media Research Center of Tsinghua University, the National Industrial Information Security Development Research Center, and the Beijing Big Data Center, shows that from 2017 to 2021 the number of deep-synthesis videos on mainstream domestic and foreign audio/video websites and social media platforms grew at an average annual rate of more than 77.8%. The number of new deep-synthesis videos released in 2021 was 11 times the 2017 figure. Meanwhile, the exposure, attention, and reach of deep-synthesis content have also grown exponentially: deep-synthesis videos newly released in 2021 received more than 300 million likes.
"Video and audio circulating on the Internet are not necessarily filmed or recorded by real people," said Ren Kui, dean of the School of Cyberspace Security at Zhejiang University. It is often impossible for the human eye to tell whether content is a full-face synthesis, an audio synthesis, or a genuine recording.
Zhu Jun, professor of computer science at Tsinghua University and director of the Basic Theory Research Center of the Institute for Artificial Intelligence, believes deep synthesis is changing the underlying logic and complexity of the trust chain in information dissemination, and the risks are rising rapidly. First, the meaning of "seeing is believing" has changed: although the public has long known that static information such as photos is easy to tamper with, it still places a high degree of trust in dynamic information such as video and sound, and deep synthesis has now dismantled that trust logic as well. Second, the wide reach of short video means that abuse of deep synthesis can have broad influence and destructive power.
Xue Lan, dean and professor of Schwarzman College at Tsinghua University, believes that when artificial intelligence technologies such as deep synthesis are abused, they bring a series of ethical and governance problems: at the lighter end, violations of personal property security and harm to personal dignity and privacy; at the more serious end, threats to national security and social stability.
Technology is a double-edged sword. Wielding it well means neither letting technology become a runaway horse nor letting technological innovation stand still.
On making good use of technology, Wu Hequan, an academician of the Chinese Academy of Engineering and an information technology expert, argued that new applications and developments of technology should not be banned or interfered with across the board, lest innovation be stifled. But the security problems technology creates should be addressed at the source, with detection capabilities continuously improved and iterated through technological innovation and adversarial testing.
Zhu Jun believes that detection technology for deep-synthesis applications is still exploratory and its methods immature. He suggests fully mobilizing research institutes and technology enterprises to build effective and efficient detection capabilities for deep-synthesis applications as soon as possible, so as to secure a technological advantage in public-opinion and information warfare.
On risk governance, Qiu Huijun, deputy chief engineer of the National Industrial Information Security Development Research Center, noted that the digital transformation of recent years has pushed many countries to put AI security risk management into practice. The European Union has taken the lead with legislation in the field, focusing on a risk-based regulatory framework for high-risk AI systems.
"Artificial intelligence security comprises data security, framework security, algorithm security, model security, operational security, and other components. We should build an integrated governance system of regulations, standards, and laws: issue guidelines, standards, and evaluation norms for risk governance, and improve legislation when conditions allow," Qiu Huijun suggested. Focusing on data, algorithms, models, and operations, he proposed three steps: first, establish quality standards for data collection; second, classify the risks of AI systems by application scenario; third, establish a security responsibility system that clarifies the respective responsibilities of design and development units, operation and maintenance units, and data providers.
Chen Jihong, a partner at Zhonglun Law Firm, said that to crack down on "face-swap" fraud, regulation should address the lawful boundaries of the technology's use, its security assessment procedures, and legal rules against its abuse, so as to raise the cost of illegal misuse.
Zhu Jun suggested that the public form an accurate understanding of deep-synthesis technologies and their applications, stay alert to their malicious uses, protect personal information such as voiceprints and photos, and avoid casually giving others biometric information such as faces, fingerprints, and irises.