How the University of Hong Kong is tracking China’s censorship of Weibo users.

The Weiboscope project documents Chinese censorship of the popular social network. Now, HKU is building a similar tool for WeChat.

By Shan Wang
Nieman Lab

A version of this story first appeared in Nieman Lab on March 9, 2018. It is republished here under a Creative Commons license.

Xi Jinping and Winnie the Pooh; the letter “n,” and Pu Yi, too.

A hodgepodge of text and images has been deleted by Chinese censors as thousands of delegates gathered in Beijing for the National People’s Congress this month to officially vote to abolish the two-term limit for Chinese presidents, paving the way for Xi Jinping to remain in power beyond 2023, and potentially for life.

The reasoning behind censoring posts ranges from obvious to absurd to convoluted. Pooh memes, because the cartoon bear’s figure, when viewed alongside the trimmer Tigger, resembles a photograph taken of Chinese president Xi Jinping and Barack Obama in California in 2013 (judge for yourself here). Images of Pu Yi, China’s last emperor, with a caption referencing the return of the Qing dynasty, for its ridicule of consolidation of power. The letter “n,” maybe because n = number of terms, and n > 2.

King-wa Fu, an associate professor at the University of Hong Kong’s Journalism and Media Centre, and his team at HKU have monitored these types of censored content since 2011 on China’s Weibo service, a social media site made in the image of Twitter (Twitter itself is officially blocked in China). The team’s Weiboscope tool tracks a large sample of accounts for posts that have been deleted, and has collected a substantial dataset of the types of terms and content that seem to trigger censors, whether human or algorithmic or a combination of both, and on which days and over what events censorship of Weibo has been highest. (When Weiboscope checks back on a post and gets the message “permission denied,” that suggests the post has been censored. Weiboscope, of course, can only track posts that managed to be published in the first place.)
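That check can be pictured as a simple classification step. The following is a minimal sketch, not Weiboscope's actual code: the function name, the labels, and the assumption that a re-check yields a plain text message are all hypothetical. Only the “permission denied” signal comes from the description above.

```python
from typing import Optional


def classify_recheck(response_text: Optional[str]) -> str:
    """Label the result of re-checking a previously captured post.

    `response_text` is a hypothetical stand-in for whatever message the
    platform returns when the post is requested again.
    """
    if response_text is None:
        # No response at all: nothing can be concluded from this check.
        return "unknown"
    if "permission denied" in response_text.lower():
        # The one signal described in the article as suggesting censorship.
        return "likely_censored"
    return "not_flagged"


print(classify_recheck("Permission denied"))
print(classify_recheck("post body text"))
```

A label such as `likely_censored` is deliberately hedged: the message suggests censorship but does not by itself prove who removed the post.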

Paths of censorship

Over the course of 2017, the team captured 20,561 deleted posts across 120,000 Weibo users, a sample that includes both users with large followings and tens of thousands of randomly chosen users. Among the most heavily censored events: Nobel Prize winner Liu Xiaobo’s death on July 13, 2017. Among the words commonly found in censored posts of the past year: 习近平 (Xi Jinping). (The HKU team published its 10 most-censored Weibo events of 2017 roundup at ChinaFile last Thursday.)

Fu has seen enough aggregate data to recognize paths of censorship and the ways netizens — the term of choice in English-language reporting on Chinese internet users — try to circumvent censors or cloak their dissent. A Weibo user with a tiny number of followers may find they’re able to post about topics in relative freedom, but as they gain followers (粉丝, or “fans”), they may find they reach critical mass and trigger notice. Some terms are on a filter list, so users are blocked from posting them to a platform like Weibo in the first place, but a term may not always be continuously censored.

“[In early March], if you keyed in something related to the term ’emperor,’ I think it was blocked for the first few days, and a few days later it was changed,” Fu says. “If you are too strict controlling terms used in posts, people will find it extremely inconvenient: There are a lot of things that can contain ’emperor.’ Blocking that makes it very difficult for users. It depends on the period of time.”

These past weeks, with the looming constitutional change, have been a sensitive period of time.

“We’ve gotten a lot of calls these past weeks from people asking about the censorship system, and they’ve told me a lot of stories,” Fu says. He has heard about many examples of people whose accounts on Weibo were suspended, leaving them unable to post anything at all. He recalls receiving a call from a reporter in Beijing who told him about someone who posted a comment to social media related to issues around the term-limit change and was subsequently detained over the post.

“If you just look at the announcement last week about the constitutional changes of Xi Jinping, there’s been no real dissent in public, except from people online,” Fu says.

“You can’t see any of that in the mainstream media. You don’t see people protesting within China. There’s no assembly. It’s only online where you can still see quite substantial number of people who disagree with this constitutional change.”

Images and symbols have long been a popular way to skirt censors as they pass and get tweaked from user to user; they require some human intervention to recognize. 2012, within the first couple of years of Weiboscope, Fu says, felt like a turning point. 2012 was the year Chen Guangcheng, a blind, dissident lawyer, managed to escape Beijing for the U.S. It was the year a prominent party chief in western China, Bo Xilai, and his wife became embroiled in a scandal involving the murder of a British businessman, and fell from power. Online, Chinese people used codewords like “tomato” (西红柿, whose first two characters mean “West” and “red” individually) to refer to Bo Xilai, since the name “薄熙来” (Bo Xilai) was easily blocked.

“Looking back, that I think was one of the pivotal years, when the government began to use a larger-scale censorship system to try to silence the public response to these kinds of public issues,” Fu says. “People have been able to use these ways to extend survival time of discussions online. But in terms of the government, over the past few years, strategies have become increasingly more sophisticated, and a lot of new measures have been introduced by the government and by service providers, from filtering keywords to more groups of human censors.

“As I’m sure you know, there are laws to regulate online information. They can arrest people. They have targets, they can maintain blacklists of influential people on these platforms. They’re cracking down on VPNs, which many in China use to access blocked websites,” he says. “It’s a multidimensional system to regulate online speech in China.”

Complex workarounds

Fu and his team have faced increasing technological restrictions on their own Weiboscope work. Sina, Weibo’s parent company, has made its API much less open in the past five years.

“In the early days when we started Weiboscope, we could use basically most of the API to get what we needed. We had the timeline, search, we could get the individual posts — you go through their documentation and find all the API calls accessible,” Fu says. “Now most of the API calls they serve are for commercial partners. As individual developers, we don’t have budget for their tokens; we only can use one or two to assess the data, but we’ve tried to make use of this very restricted space to access to the biggest sample on Weibo we can.”

The HKU team is now building out a similar tool for WeChat, China’s ubiquitous messaging platform that now services a huge range of commercial and social activities. (Disclosure: Nieman Lab, which produced this story, has a content partnership with Tencent, WeChat’s parent company.) WeChatscope can’t monitor user-to-user messages, but will track public accounts and their posts for any content that gets deleted, starting with a few hundred public accounts and gradually scaling up. The team has hacked together a solution to scrape the app for published posts; the WeChatscope tool then follows up in intervals to see if any posts have been removed.
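The follow-up step, re-checking captured posts at intervals, might look roughly like this. It is an illustrative sketch only: `TrackedPost`, `recheck`, and the `is_available` callback are hypothetical names, and because the article does not describe how WeChatscope actually fetches posts, that part is left as an injected function.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class TrackedPost:
    post_id: str                        # identifier assigned when the post was first captured
    account: str                        # public account that published it
    removed_at: Optional[float] = None  # time of the check that first found it missing


def recheck(posts: Iterable[TrackedPost],
            is_available: Callable[[str], bool],
            now: float) -> List[TrackedPost]:
    """Run one follow-up pass and return posts newly found to be missing."""
    newly_removed = []
    for post in posts:
        if post.removed_at is not None:
            continue  # already recorded as removed on an earlier pass
        if not is_available(post.post_id):
            post.removed_at = now
            newly_removed.append(post)
    return newly_removed


# Example pass with a fake availability check standing in for the scraper.
tracked = [TrackedPost("a", "account_1"), TrackedPost("b", "account_2")]
still_live = {"a"}
gone = recheck(tracked, lambda pid: pid in still_live, now=100.0)
print([p.post_id for p in gone])
```

Running `recheck` on a schedule would approximate the interval checks. A post that disappears between two passes is timestamped at the later pass, so the recorded removal time is an upper bound rather than the exact moment of deletion.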

“We’re developing a pretty complicated tool to try to hack the system,” Fu says. “I think it’s only in China that you would ever require this particular kind of workaround.”

Shan Wang

Shan Wang is a staff writer at the Lab. She previously worked in editorial at Harvard University Press, and has reported for Boston.com and the New England Center for Investigative Reporting. One of the first news stories she ever wrote was about Muggle Quidditch for The Harvard Crimson. She was born in Shanghai, grew up in Connecticut and Massachusetts, and is a Ray Allen devotee.

From this week

Platforms

Zuck was prepared. The Senators weren’t.

Tuesday’s 5-hour Congressional hearings showed how little American politicians understand tech platforms, let alone how to regulate them. The lawmakers, who came across as out-of-touch old men, had no points to make, apart from trying to score soundbites (compare this with Singapore’s grilling of Facebook a few weeks back). They couldn’t figure out what ad tech is, how advertising works, or what data Facebook collects. Zuck seemed nervous but confident; he’s been coached by internal and external consultants on this. His team even reconfigured a conference room to look like a congressional hearing room. Never underestimate this guy’s ability to learn. Investors were encouraged by Zuck’s performance and by the lowered risk of regulation. They sent the stock up 4.5% at the end of the testimony on Tuesday.
CNN

Talent

Shen Lu left China to study journalism in the U.S. She writes perfectly in both languages.

But she finds it hard to get the career she wants on either side. She faces press restrictions in China, while U.S. newsrooms aren’t keen on hiring Chinese journalists. “If I had it to do over again, knowing what I know now, I doubt I’d make the decision to study journalism again, because the news industry—both in China and in the U.S.—seems to be a world designed to keep people like me out.”
China File

Design

A year after they jumped on, The Economist gets over 7 million viewers a month on Snapchat.

Their daily Snaps are deep topical dives into things like racial divisions, vaping, the possibility of another cold — or World — war, and office predators. The experience is true to the medium (brisk, anonymous-but-personal) but also true to the brand (highly produced, perfectly-researched, documentary-style editing). Surprisingly, some of the visual design can be downright hideous: for a timeline of sexual harassment, they choose to go with white, drop-shadowed text against a background image of a cave painting. “Others are “top-Snap only” content. These often have lots of text on them... intended to be shareable/screenshot-able primers on a topic.” Huh. The ads are programmatically annoying (I got one that promised more Instagram followers). But I love how The Economist is using the platform, as do millions of Snapchatters. This day in the life of Lucy Rohr, their Snapchat editor, is revealing, and it's the only reason I have exhumed my Snapchat app after my short-lived and bewildered run-in with it many years ago. Okay fiiiiine, I also want to Boost My Brows.
Digiday

I saw The Curious Incident of the Dog in the Night-Time at Singapore’s Esplanade last week.

The design of the production, by the otherworldly talent Bunny Christie, is brilliant, sensitive, frantic, filled with more LED wattage than a slap upside the head, and it has many more decibels than are physically available in the world. Christie, winner of a mountain of awards, is known for designing “psychology as well as space”, and she certainly redesigned my psychology for good. Interestingly, but not surprisingly, the production comes in another flavour: Relaxed. I think this is excellent design that addresses a very real need for audiences in the market for a different pace. From the programme: “Sound and lighting cues are modified to be less startling. Leaving the auditorium to take a break is fine and there is a designated Quiet Area available. There is an easy-going attitude to noise and movement, doors remain open and lights are dimmed staying on throughout the show.” Nice. Now if only life came with a Relaxed mode.
The Guardian

Notables