How the University of Hong Kong is tracking China’s censorship of Weibo users.
The Weiboscope project documents Chinese censorship of the popular social network. Now, HKU is building a similar tool for WeChat.
A version of this story first appeared in Nieman Lab on March 9, 2018. It is republished here under a Creative Commons license.
Xi Jinping and Winnie the Pooh; the letter “n,” and Pu Yi, too.
A hodgepodge of text and images has been deleted by Chinese censors as thousands of delegates gathered in Beijing for the National People’s Congress this month to officially vote to abolish the two-term limit for Chinese presidents, paving the way for Xi Jinping to remain in power beyond 2023, and potentially for life.
The reasoning behind censoring posts ranges from obvious to absurd to convoluted. Pooh memes, because the cartoon bear’s figure, when viewed alongside the trimmer Tigger, resembles a photograph taken of Chinese president Xi Jinping and then-U.S. president Barack Obama in California in 2013. Images of Pu Yi, China’s last emperor, with a caption referencing the return of the Qing dynasty, for their ridicule of the consolidation of power. The letter “n,” maybe because n = number of terms, and n > 2.
King-wa Fu, an associate professor at the University of Hong Kong’s Journalism and Media Centre, and his team at HKU have monitored these types of censored content since 2011 on China’s Weibo service, a social media site made in the image of Twitter (Twitter itself is officially blocked in China). The team’s Weiboscope tool tracks a large sample of accounts for posts that have been deleted, and has collected a substantial dataset of the types of terms and content that seem to trigger censors, whether human, algorithmic, or a combination of both, and of the days and events on which censorship of Weibo has been highest. (When Weiboscope checks back on a post and gets the message “permission denied,” that suggests the post has been censored. Weiboscope, of course, can only track posts that managed to be published in the first place.)
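For readers curious what such a re-check might look like in practice, here is a minimal sketch. It is not Weiboscope's actual code: only the “permission denied” message comes from the description above, while the function names, the status labels, and the stand-in `fetch` callable are assumptions made for illustration. No real Weibo API call is shown.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

# Only the "permission denied" message comes from the article; every other
# name and status label here is a hypothetical choice for illustration.
CENSORED_MARKER = "permission denied"


@dataclass(frozen=True)
class RecheckResult:
    post_id: str
    status: str  # "visible", "likely_censored", or "missing"


def classify_recheck(post_id: str, response_text: Optional[str]) -> RecheckResult:
    """Label one previously seen post based on what a re-check returned."""
    if response_text is None:
        # Nothing came back at all; this sketch makes no claim about why.
        return RecheckResult(post_id, "missing")
    if CENSORED_MARKER in response_text.lower():
        return RecheckResult(post_id, "likely_censored")
    return RecheckResult(post_id, "visible")


def recheck_posts(
    post_ids: Iterable[str],
    fetch: Callable[[str], Optional[str]],
) -> List[RecheckResult]:
    """Re-check each post ID using a caller-supplied fetch function."""
    return [classify_recheck(pid, fetch(pid)) for pid in post_ids]


# Stand-in for a real lookup: a dictionary of canned responses.
fake_responses = {"1001": "Sunny day in Beijing", "1002": "Permission denied"}
results = recheck_posts(["1001", "1002", "1003"], fake_responses.get)
likely_censored = [r.post_id for r in results if r.status == "likely_censored"]
```

The sketch keeps “likely censored” separate from “missing” because, as the article notes, the message only suggests censorship. A post that simply vanishes may have been removed by its author.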
From Weibo to WeChat, a team at University of Hong Kong is working to track censorship of China's social media users. Story by @shansquared at @NiemanLab
Paths of censorship
Over the course of 2017, the team captured 20,561 deleted posts across 120,000 Weibo users, a sample that includes both users with large followings and tens of thousands of randomly chosen users. Among the most heavily censored events: Nobel Prize winner Liu Xiaobo’s death on July 13, 2017. Among the words commonly found in censored posts of the past year: 习近平 (Xi Jinping). (The HKU team published its 10 most-censored Weibo events of 2017 roundup at ChinaFile last Thursday.)
Fu has seen enough aggregate data to recognize paths of censorship and the ways netizens — the term of choice in English-language reporting on Chinese internet users — try to circumvent censors or cloak their dissent. A Weibo user with a tiny number of followers may find they’re able to post about topics in relative freedom, but as they gain followers (粉丝, or “fans”), they may find they reach critical mass and trigger notice. Some terms are on a filter list, so users are blocked from posting them to a platform like Weibo in the first place, but a term may not always be continuously censored.
“[In early March], if you keyed in something related to the term ‘emperor,’ I think it was blocked for the first few days, and a few days later it was changed,” Fu says. “If you are too strict controlling terms used in posts, people will find it extremely inconvenient: There are a lot of things that can contain ‘emperor.’ Blocking that makes it very difficult for users. It depends on the period of time.”
These past weeks, with the looming constitutional change, have been a sensitive period of time.
“We’ve gotten a lot of calls these past weeks from people asking about the censorship system, and they’ve told me a lot of stories,” Fu says. He has heard about many examples of people whose accounts on Weibo were suspended, leaving them unable to post anything at all. He recalls receiving a call from a reporter in Beijing who told him about someone who posted a comment to social media related to issues around the term-limit change and was subsequently detained over the post.
“If you just look at the announcement last week about the constitutional changes of Xi Jinping, there’s been no real dissent in public, except from people online,” Fu says.
“You can’t see any of that in the mainstream media. You don’t see people protesting within China. There’s no assembly. It’s only online where you can still see quite [a] substantial number of people who disagree with this constitutional change.”
Images and symbols have long been a popular way to skirt censors as they pass and get tweaked from user to user; they require some human intervention to recognize. The year 2012, within the first couple of years of Weiboscope, felt like a turning point, Fu says. It was the year Chen Guangcheng, a blind, dissident lawyer, managed to escape Beijing for the U.S. It was also the year a prominent party chief in western China, Bo Xilai, and his wife became embroiled in a scandal involving the murder of a British businessman, and fell from power. Online, Chinese people used codewords like “tomato” (西红柿, whose first two characters mean “West” and “red” individually) to refer to Bo Xilai, since the name “薄熙来” (Bo Xilai) was easily blocked.
“Looking back, that I think was one of the pivotal years, when the government began to use a larger-scale censorship system to try to silence the public response to these kinds of public issues,” Fu says. “People have been able to use these ways to extend survival time of discussions online. But in terms of the government, over the past few years, strategies have become increasingly more sophisticated, and a lot of new measures have been introduced by the government and by service providers, from filtering keywords to more groups of human censors.
“As I’m sure you know, there are laws to regulate online information. They can arrest people. They have targets, they can maintain blacklists of influential people on these platforms. They’re cracking down on VPNs, which many in China use to access blocked websites,” he says. “It’s a multidimensional system to regulate online speech in China.”
Fu and his team have faced increasing technical restrictions on their own Weiboscope work. Sina, Weibo’s parent company, has made its API much less open in the past five years.
“In the early days when we started Weiboscope, we could use basically most of the API to get what we needed. We had the timeline, search, we could get the individual posts — you go through their documentation and find all the API calls accessible,” Fu says. “Now most of the API calls they serve are for commercial partners. As individual developers, we don’t have budget for their tokens; we only can use one or two to assess the data, but we’ve tried to make use of this very restricted space to access to the biggest sample on Weibo we can.”
The HKU team is now building out a similar tool for WeChat, China’s ubiquitous messaging platform that now services a huge range of commercial and social activities. (Disclosure: Nieman Lab, which produced this story, has a content partnership with Tencent, WeChat’s parent company.) WeChatscope can’t monitor user-to-user messages, but it will track public accounts and their posts for any content that gets deleted, starting with a few hundred public accounts and gradually scaling up. The team has hacked together a solution to scrape the app for published posts; the WeChatscope tool then follows up at intervals to see if any posts have been removed.
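The follow-up step described above amounts to comparing what was visible at one scrape with what is visible at a later one. The sketch below illustrates that comparison under stated assumptions: snapshots are modeled as a mapping from a public account name to the set of post IDs seen, and the function name, account names, and IDs are all hypothetical. It says nothing about how the HKU team actually scrapes or stores WeChat data.

```python
from typing import Dict, Set

# A snapshot maps a public account name to the post IDs visible at scrape time.
Snapshot = Dict[str, Set[str]]


def removed_since(earlier: Snapshot, later: Snapshot) -> Snapshot:
    """Return, per account, post IDs seen earlier that are absent later."""
    removed: Snapshot = {}
    for account, old_ids in earlier.items():
        # Set difference: anything in the old snapshot but not the new one.
        gone = old_ids - later.get(account, set())
        if gone:
            removed[account] = gone
    return removed


# Hypothetical data: two scrapes of the same two public accounts.
first_scrape = {"acct_a": {"p1", "p2", "p3"}, "acct_b": {"p9"}}
second_scrape = {"acct_a": {"p1", "p3", "p4"}, "acct_b": {"p9"}}

removed = removed_since(first_scrape, second_scrape)
```

In this example `removed` contains only post `p2` from `acct_a`; the new post `p4` is ignored because it was never in the earlier snapshot. A disappearance on its own does not establish who removed a post or why, so a real tool would need further checks before counting it as censorship.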
“We’re developing a pretty complicated tool to try to hack the system,” Fu says. “I think it’s only in China that you would ever require this particular kind of workaround.”