By: Lotus Ruan, Jeffrey Knockel, Jason Q. Ng, and Masashi Crete-Nishihata
Read a blog post on the report from Citizen Lab Director Ron Deibert.
Media coverage: Bloomberg, Reuters, Wall Street Journal, The Globe and Mail, Quartz.
Keyword filtering on WeChat is only enabled for users with accounts registered to mainland China phone numbers, and persists even if these users later link the account to an International number.
Keyword censorship is no longer transparent. In the past, users received notification when their message was blocked; now censorship of chat messages happens without any user notice.
More keywords are blocked on group chat, where messages can reach a larger audience, than on one-to-one chat.
Keyword censorship is dynamic. Some keywords that triggered censorship in our original tests were found to be permissible in later tests. Some newly censored keywords appear to have been added in response to current news events.
WeChat’s internal browser blocks China-based accounts from accessing a range of websites, including gambling sites, Falun Gong sites, and media that report critically on China. Websites that were blocked for China accounts were fully accessible for International accounts, but there was intermittent blocking of gambling and pornography websites on International accounts.
WeChat (Weixin 微信 in Chinese) is the dominant chat application in China and the fourth largest in the world, with 806 million monthly active users.
WeChat encompasses more than just text, voice, and video chat; it includes a rich set of features such as gaming, mobile payments, and ride hailing, which make it more of a lifestyle platform than a mere chat app. It is estimated that Chinese users spend a third of their mobile online time on WeChat and typically return to the app ten times a day or more. WeChat is owned and operated by Tencent, one of China’s largest technology companies.
Operating a chat application in China requires following laws and regulations on content control and monitoring. Accordingly, the popularity of WeChat has also been met with suspicions of surveillance and media reports of censorship. Despite these concerns, there is limited technical research into the operation and scale of content monitoring and filtering. In this report, we provide the first systematic analysis of keyword censorship and URL filtering on WeChat to determine how the app filters content and the type of content that is blocked.
We found that keyword filtering is enabled on WeChat for users with accounts registered to mainland China phone numbers. Filtering remains enabled even if users later link their account with a non-mainland China number, which means that users with accounts registered to mainland China will remain under censorship regardless of whether they travel or unlink their Chinese phone number from the account. Differentiating content access based on user registration seemingly creates a “one app, two systems” model of censorship.
WeChat performs censorship on the server-side. When you send a message it passes through a remote server that contains rules for implementing censorship. If the message includes a keyword that has been targeted for blocking, the message will not be sent. Documenting censorship on a system with a server-side implementation such as WeChat’s requires devising a sample of keywords to test, running those keywords through the app, and recording the results.
We used a sample of keywords found blocked on other apps used in China and systematically tested that sample in two modes: one-to-one chat and group chat. We found a greater number of keywords blocked on group chat compared to one-to-one chat, which suggests that communications on group chat are specifically targeted, potentially because group chats can reach a larger number of users.
In both chat modes, users no longer see the warning message that previous reports documented when they enter blocked keywords. This change means users get no feedback that censorship has occurred, making the restrictions on WeChat less transparent. Censored keywords spanned a range of content, including current events, politics, and social issues.
In addition to keyword censorship, WeChat implements a URL filtering system in its built-in browser, which uses different lists of blacklisted and whitelisted websites for China and International accounts. To sample which URLs WeChat censors, we used a script to automatically test the Alexa Top One Million list of websites using both China and International accounts.
We found that 41 of the websites we tested were blocked only on accounts registered with mainland Chinese phone numbers. Moreover, every site that is uniquely blocked on China accounts is fully accessible on International accounts, meaning that international users can successfully access the same URLs with WeChat’s internal browser. However, we did find intermittent blocking of other gambling and pornography websites on International accounts.
We proceed by providing an overview of the legal and regulatory system in China, reviewing past work on censorship on WeChat, reporting our new results, and concluding with a discussion of the implications of our findings.
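The comparison between account types described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary layout (True meaning the in-app browser blocked the site), and the placeholder domains are all hypothetical and do not reproduce the script or data used in the study.

```python
from typing import Dict, List


def blocked_only_on_china(china: Dict[str, bool], international: Dict[str, bool]) -> List[str]:
    """Return sites blocked on the China account and confirmed accessible on the International one."""
    return sorted(
        site
        for site, blocked in china.items()
        # Require an explicit "not blocked" observation on the International account,
        # so sites that were never tested there are not counted.
        if blocked and international.get(site) is False
    )


# Placeholder observations for illustration only: True means the site was blocked.
china_results = {"a.example": True, "b.example": True, "c.example": False}
international_results = {"a.example": False, "b.example": True, "c.example": False}

print(blocked_only_on_china(china_results, international_results))
```

Requiring an explicit observation on both account types keeps untested sites out of the "uniquely blocked" set. Intermittent blocking, as noted for some sites on International accounts, would need repeated observations per site rather than a single True or False.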
Legal and Regulatory Environment
WeChat thrives on the huge user base it has amassed in China, but the Chinese market carries unique challenges. Any Internet company operating in China is subject to laws and regulations that hold companies legally responsible for content on their platforms. Companies are expected to invest in staff and filtering technologies to moderate content and stay in compliance with government regulations. Failure to comply can lead to fines or revocation of operating licenses. In 2010, China’s State Council Information Office (SCIO) published a major government-issued document on its Internet policy. It includes a list of prohibited topics that are vaguely defined, including “disrupting social order and stability” and “damaging state honor and interests.” In late May 2014, China’s State Internet Information Office (SIIO), Ministry of Public Security (MPS), and the Ministry of Industry and Information Technology (MIIT) jointly launched a month-long campaign targeting Chinese instant messaging (IM) services in a bid to clean up “illegal and harmful information” and to fend off “hostile forces at home and abroad.”
In recent years, WeChat has faced increased regulatory pressures. WeChat offers a microblogging feature called “Public Accounts” that allows certain users to publish daily posts. On March 13, 2014, Tencent shut down nearly 40 Public Accounts without giving any prior notice. Popular Public Accounts that discuss current affairs and politics, such as the Consensus Website (共识网), Truth Channel (真话频道), Luo Changping (罗昌平), and Elephant Magazine (大象工会), were shut down overnight. Tencent issued a statement explaining that it “strictly prohibits publishing pornographic, vulgar, violent, bloody, political rumors and any illegal content.” The company said the action was “part of the commitment to providing quality user experience on Weixin in China,” and that it would “continually review and take measures” on suspicious content.
In August 2014, the SIIO announced rules on instant messaging tools, requiring service providers to obtain “Internet news service qualifications,” users to authenticate their identities before registering, and public account owners to undergo “examination and verification” by the companies, with this information to be stored on file with the “controlling department for Internet information and content.”
This strict regulatory environment has led to suspicions that communications on WeChat may be monitored. There have also been cases of Tibetans being arrested for sharing chat messages, songs, and photos on WeChat with content related to the Dalai Lama and Tibetan culture that Chinese authorities alleged carried “anti-China” sentiments.
Beyond the Chinese market, WeChat has made considerable efforts to grow its user base internationally. Tencent launched advertising campaigns targeting foreign markets, recruiting football star Lionel Messi and Bollywood actors to endorse the app. However, the impact of these efforts has been questionable. Tencent has never disclosed how many active users it has outside of China, but WeChat has yet to make the same impact in other countries as it has in its home market. Some commentators speculate that WeChat has not enjoyed the same success internationally because outside of China the application does not have the same rich set of features, such as mobile payments and taxi hailing, that make it a compelling platform for users within China.
Market growth outside of China has also been hampered by incidents that remind international users of the restrictions WeChat faces at home. In January 2013, media reported that WeChat users outside of China experienced censorship of chat messages that contained the keywords “法轮功” (“Falun Gong”) or “南方周末” (“Southern Weekend”), a Guangzhou-based liberal newspaper in China (see Figure 1). Tencent responded with a statement that claimed a technical error had enabled keyword filtering for international users temporarily and that immediate actions would be taken to rectify the issue.
Figure 1: Screenshots from media reports show international users experiencing keyword censorship on WeChat. Source: Tech in Asia and The Next Web
In 2015, WeChat introduced a temporary feature to commemorate Martin Luther King Day in the United States. If users typed “civil rights” into the chat window, animated American flag emojis would rain down on the screen (see Figure 2). This feature was only intended for users based in the U.S., but was accidentally enabled for China-based users. Tencent was criticized in China for the mistake, quickly disabled the feature for China-based users, and issued a statement: “WeChat’s path to internationalization isn’t easy… We will try even harder!”
These incidents demonstrate the balancing act Chinese tech companies must perform, as they attempt to grow outside of China while staying within the lines of domestic regulations.
Figure 2: Screenshot of animated American flags raining down on the screen when WeChat users typed “civil rights” into the chat window.
How WeChat Censors Keywords
Keyword censorship can be implemented in two ways: on the client-side (i.e., on the application itself) or on the server-side (i.e., on a remote server). In a client-side implementation, all of the rules to perform censorship are inside of the application running on your device. Often the application has a built-in list of keywords that it uses to perform checks to determine if any of these keywords are present in your chat messages before your messages are sent. If your message contains a keyword from the list, then the message is not sent. In a server-side implementation, the rules to perform censorship are on a remote server. When a message is sent, it passes through the server, which checks if banned keywords are present and, if any are detected, blocks the message.
Client-side implementations can be analyzed by reverse engineering the application and extracting the keyword lists used to trigger censorship. A censorship keyword list provides a comprehensive look into exactly what content an application was censoring over a specific period of time. Previous research has uncovered client-side censorship in TOM-Skype (the version of Skype available for the Chinese market until 2013), Sina UC (a chat app that was provided by Sina Corporation), and live-streaming platforms used in China. Client-side censorship was also found in LINE, a mobile chat client developed by a Japanese company and marketed to countries around the world, including China. The keyword filtering features in LINE were only enabled for users with accounts registered to mainland China phone numbers in an effort to comply with Chinese regulations.
Server-side implementations pose greater challenges for researchers. Analyzing server-side implementations generally relies on sample testing, in which researchers develop a set of content suspected to be blocked by a platform, send the sample to the platform, and record the results. In the case of a chat app, this process means developing a set of keywords suspected to be blocked, sending these keywords in a chat, and documenting whether each keyword is received and whether any warning message is presented. In comparison to extracting keyword lists from client-side implementations, sample testing cannot give a comprehensive view of what a platform is censoring, as the results are only as accurate as the overlap between the sample and the actual content filtered.
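As a rough illustration of the check described above, the following Python sketch tests a message against a keyword list before sending. It assumes plain substring matching. The function names and the one-entry keyword list are hypothetical and are not taken from any real application, whose matching rules may be more elaborate.

```python
from typing import Iterable, Optional


def find_blocked_keyword(message: str, keywords: Iterable[str]) -> Optional[str]:
    """Return the first listed keyword found in the message, or None."""
    for keyword in keywords:
        # Assumption: a keyword matches if it appears anywhere in the message.
        if keyword and keyword in message:
            return keyword
    return None


def should_send(message: str, keywords: Iterable[str]) -> bool:
    """Return True if no listed keyword is present, False if the message would be dropped."""
    return find_blocked_keyword(message, keywords) is None


# Hypothetical keyword list for illustration only.
EXAMPLE_KEYWORDS = ["法轮功"]

print(should_send("hello", EXAMPLE_KEYWORDS))
print(should_send("关于法轮功的消息", EXAMPLE_KEYWORDS))
```

The same check could run in either location. What differs is where the list lives: inside the app, where it can be extracted, or on a remote server, where it cannot be inspected directly.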
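The sample testing procedure above can be sketched as a small Python harness. This is a minimal sketch, not the tooling used in the research: `run_sample_test`, `blocked_in`, and the simulated platform with its hidden list are all hypothetical stand-ins, and a real test would replace `simulated_send_and_check` with code that sends a message through the app and observes whether it arrives.

```python
from typing import Callable, Dict, Iterable, List


def run_sample_test(sample: Iterable[str], send_and_check: Callable[[str], bool]) -> Dict[str, bool]:
    """Send each sample keyword and record whether it was received (True) or not (False)."""
    results = {}
    for keyword in sample:
        results[keyword] = send_and_check(keyword)
    return results


def blocked_in(results: Dict[str, bool]) -> List[str]:
    """List the sample keywords that were not received."""
    return sorted(k for k, received in results.items() if not received)


# Simulated platform for illustration only; its list is invisible to the tester.
_HIDDEN_LIST = {"法轮功", "term-not-in-sample"}


def simulated_send_and_check(message: str) -> bool:
    """Pretend to send a message; return True if it would be delivered."""
    return not any(keyword in message for keyword in _HIDDEN_LIST)


sample = ["法轮功", "南方周末", "hello"]
results = run_sample_test(sample, simulated_send_and_check)
print(blocked_in(results))
```

In this simulation, the second entry in the hidden list is never discovered because it is not in the sample. This is the limitation noted above: results reveal only the overlap between the sample and the filtered content.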
WeChat censors keywords on the server-side. Therefore, sample testing has to be used to determine what specific keywords are blocked. In the next section, we describe previous sample testing results on WeChat.
Previous Examples of WeChat Keyword Censorship
Following the 2013 media reports of international users experiencing keyword filtering, we ran a series of tests to attempt to document the presence of the filtering and determine the conditions that trigger it.
In May 2013, using a WeChat account registered to a U.S. phone number while on a Chinese network, we found keyword censorship of “法轮功” (Falun Gong) but not of “南方周末” (Southern Weekend). Running the same test from a U.S.-based network with the same account resulted in no censorship of “法轮功”. These results suggest that, at that time, censorship was triggered depending on what network the user was on.
In December 2013, we ran a similar test using an account registered to a mainland China phone number while on a Canadian network. Again, our test found that the keyword “法轮功” (Falun Gong) was being filtered but “南方周末” (Southern Weekend) was not. Figure 3 shows screenshots from our two rounds of testing in 2013.