A single photo can be "animated" into a video? Beware of AI deep synthesis breaking through the risk bottom line

Technology | Published 2022 | Source: 半月谈 (Ban Yue Tan)
A video clip or a voice recording is not necessarily filmed or recorded by a real person. Behind a mobile app you never see, at a payment screen, or at an access-control gate, someone may be using a stolen copy of your face. As artificial intelligence (AI) deep synthesis technology grows ever more sophisticated, forged audio, video and other synthetic content is becoming increasingly hard to tell from the real thing. There is no doubt that the real world we live in faces the risks and challenges of technology abuse.

Stealing faces and faking voices? No big deal for them

Over the past two years, public security authorities in Zhejiang, Anhui, Jiangsu and other provinces have arrested a number of suspects for stealing personal information. Their methods are strikingly similar: first, they illegally obtain other people's photos or pay for "materials" such as recordings of their voices, then use AI to "animate" the photos and synthesize dynamic videos. With these, they either directly fool the face-verification mechanisms of social platforms and Alipay accounts for illegal profit, or trick the manual review step in mobile SIM card registration and then use phone numbers registered under other people's names for telecom and online fraud, online gambling and other crimes, exposing the people whose information was harvested to security threats and financial losses.

How does a stranger's photo get "animated" into a video?


At a demonstration computer in a laboratory of the Institute for Artificial Intelligence at Tsinghua University, a Ban Yue Tan reporter watched as a static, front-facing photo of a stranger, freshly downloaded from WeChat Moments, was imported into the computer. In the hands of a technician, the person in the photo instantly came "alive", blinking, opening the mouth, frowning and making other fine movements and expression changes on command, and a smooth video was generated in little more than ten seconds.

"The technology that drives this step from static to moving is called deep synthesis, a form of AI content synthesis," said Xiao Zihao, an engineer at the Institute for Artificial Intelligence at Tsinghua University. Deep synthesis has spawned a range of techniques, including image synthesis, video synthesis, voice synthesis and text generation.
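The "driving" step described above broadly follows the keypoint-based motion-transfer idea behind many face-reenactment systems: estimate keypoints on the source photo and on each frame of a driving sequence, derive a geometric transform from how the keypoints move, and warp the source image accordingly. The sketch below is a deliberately minimal illustration of that idea only, not the lab's actual system; the function names are made up for this example, and real systems replace the single affine warp with learned dense motion fields and a neural renderer.

```python
# Minimal sketch of keypoint-driven motion transfer (illustrative only).
# Real reenactment models learn dense motion fields and render with a
# neural network; here a single least-squares affine warp stands in.
import numpy as np
from scipy import ndimage

def estimate_affine(src_kp: np.ndarray, drv_kp: np.ndarray) -> np.ndarray:
    """Fit a 2x3 affine transform mapping source keypoints to driving keypoints."""
    ones = np.ones((src_kp.shape[0], 1))
    params, *_ = np.linalg.lstsq(np.hstack([src_kp, ones]), drv_kp, rcond=None)
    return params.T  # shape (2, 3): [M | t]

def warp_frame(source: np.ndarray, affine: np.ndarray) -> np.ndarray:
    """Warp the source image by the estimated forward affine transform."""
    M, t = affine[:, :2], affine[:, 2]
    M_inv = np.linalg.inv(M)  # affine_transform maps output coords back to input
    return ndimage.affine_transform(source, M_inv, offset=-M_inv @ t, order=1)

# Toy usage: a gray square stands in for the "photo"; keypoints are nudged
# frame by frame to imitate a driving video, producing one warped frame each.
source = np.zeros((64, 64))
source[20:44, 20:44] = 1.0
src_kp = np.array([[20.0, 20.0], [20.0, 44.0], [44.0, 20.0], [44.0, 44.0]])
frames = []
for step in range(5):
    drv_kp = src_kp + np.array([0.0, 1.5 * step])  # simulated driving motion
    frames.append(warp_frame(source, estimate_affine(src_kp, drv_kp)))
print(len(frames), frames[0].shape)  # 5 (64, 64)
```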

With this technology in hand, hijacking someone's face is no longer difficult. In mobile SIM registration, bank card applications, payment app logins and other steps that require dynamic facial recognition, these forged synthetic videos can help criminals pass back-end review and verification.

The technicians also demonstrated voice synthesis for the reporter. From a few 60-second clips of a stranger's speech, deep synthesis could generate utterances such as "No need to put it in my card, just transfer the money to me on WeChat" or "You don't need to pick up the kids today. I'm right near the school, I'll get them on the way," sounding just like the real person's voice. The more one thinks about this kind of voice synthesis, the more unsettling it becomes.

Deep synthesis is dismantling "seeing is believing"

On content and social platforms at home and abroad, deep synthesis content is rising in both quantity and quality. Synthetic film and TV clips and face-swap videos of trending figures, in particular, spread widely because of their strong entertainment value.

According to the Report on Ten Trends in Deep Synthesis (2022), jointly released by the Institute for Artificial Intelligence at Tsinghua University, Beijing Ruilai Zhihui Technology Co., Ltd. (RealAI), Tsinghua University's Intelligent Media Research Center, the National Industrial Information Security Development Research Center and the Beijing Big Data Center, the number of deep synthesis videos on mainstream audio and video sites and social media platforms at home and abroad grew at an average annual rate of more than 77.8% between 2017 and 2021. The number of deep synthesis videos newly published in 2021 was 11 times that of 2017. At the same time, the exposure, attention and reach of deep synthesis content have grown exponentially: deep synthesis videos newly published in 2021 have drawn more than 300 million likes.
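A quick arithmetic check (ours, not the report's) shows the two headline figures are consistent: an 11-fold increase over the four year-on-year steps from 2017 to 2021 implies a compound annual growth rate of roughly 82%, which sits above the reported 77.8% floor.

```python
# Consistency check of the reported growth figures (illustrative only).
factor = 11          # 2021 volume relative to 2017, per the report
steps = 2021 - 2017  # four year-on-year steps
cagr = factor ** (1 / steps) - 1
print(f"Implied compound annual growth rate: {cagr:.1%}")  # about 82.1%
```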

“网上流传的视频、语音,未必是真人拍摄或录制。”浙江大学网络空间安全学院院长任奎说,是全脸合成、音频合成,还是真实拍摄录制,许多时候凭借人眼难以分辨。

Zhu Jun, a professor in Tsinghua University's Department of Computer Science and director of the Basic Theory Research Center at its Institute for Artificial Intelligence, believes deep synthesis is changing the underlying logic and complexity of the trust chain in information dissemination, and that the hidden risks are growing rapidly. First, the meaning of "seeing is believing" is changing: the public already understands that static content such as photos is easily tampered with, but still places considerable trust in dynamic content such as video and audio, and deep synthesis is now dismantling that trust as well. Second, the wide reach of short video means that abuse of deep synthesis can exert influence, and do damage, on a large scale.

Xue Lan, professor and dean of Schwarzman College at Tsinghua University, argues that once AI technologies such as deep synthesis slide into abuse, they bring a series of ethical and governance problems: at the lighter end, violations of personal property and harm to individual dignity and privacy; at the heavier end, threats to national security and social stability.

Guide technology toward good and improve the AI risk governance system

Technology is a double-edged sword. Wielding it well means neither letting the technology run wild like an unbridled horse nor letting technological innovation stand still.

On making good use of the technology, Wu Hequan, an academician of the Chinese Academy of Engineering and an information technology expert, argues that new applications and developments should not be banned or restricted in a one-size-fits-all way, lest innovation be stifled. Instead, the security problems the technology gives rise to should be tackled at the source, using technological innovation and adversarial techniques to continuously improve and iterate detection capabilities.

Zhu Jun notes that detection technology for deep synthesis applications is still at an exploratory stage and the available methods remain immature. He suggests drawing fully on research institutes, technology companies and other forces to build effective and efficient detection capabilities for deep synthesis as quickly as possible, so as to secure a technical edge in opinion warfare and information warfare.
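One direction such detection work explores, for example, is looking for statistical artifacts that generative models tend to leave in the high-frequency part of an image's spectrum. The toy score below only illustrates that general idea; it is a minimal sketch under our own assumptions, not any of the detection systems discussed in the article, and practical detectors rely on trained classifiers rather than a single hand-crafted feature.

```python
# Minimal illustration of frequency-domain artifact scoring for images,
# one idea explored in deepfake-detection research. Purely a toy: real
# detectors use trained models, not a single hand-set threshold.
import numpy as np

def high_freq_energy_ratio(gray: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of spectral energy beyond `cutoff` of the half-image radius."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    return float(spectrum[radius > cutoff].sum() / spectrum.sum())

# Toy usage: compare a smooth gradient patch with a noisier one.
rng = np.random.default_rng(0)
smooth = np.outer(np.linspace(0, 1, 128), np.linspace(0, 1, 128))
noisy = smooth + 0.1 * rng.standard_normal((128, 128))
print(high_freq_energy_ratio(smooth), high_freq_energy_ratio(noisy))
```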

On risk governance, Qiu Huijun, deputy chief engineer of the National Industrial Information Security Development Research Center, points out that digital transformation in recent years has pushed many countries to put AI security risk governance into practice. The European Union has taken the lead in legislating on artificial intelligence, adopting a risk-based approach focused on a regulatory framework for high-risk AI systems.

"AI security spans data security, framework security, algorithm security, model security, operational security and other components. We should build an integrated system of governance rules combining regulations, standards and law, issue guidelines, standards and assessment specifications for risk governance, and refine legislation when conditions allow," Qiu Huijun said, suggesting a focus on data, algorithms, models, and operations and maintenance: first, establish quality specifications for data collection; second, grade and classify the systemic risks of AI systems according to their application scenarios; third, build a security responsibility system that clarifies the respective responsibilities of design and development units, operation and maintenance units, and data providers.

Chen Jihong, a partner at Zhong Lun Law Firm, says that cracking down on "face-swap" fraud requires rules on the boundaries of lawful use of the technology, security assessment procedures for it, and legal regulation of its abuse, so as to raise the cost of breaking the law by misusing it.

Zhu Jun reminds the public to form an accurate understanding of new deep synthesis technologies and applications, stay alert to their malicious uses, protect information such as personal voiceprints and photos, and avoid casually handing over biometric data such as facial images, fingerprints and iris scans to others.

