Toutiao’s approach to curbing fake news: Teach the AI to write it so that the machines can fight it.

Bots vs bots is what we need to solve a human problem.

By Frank Hersey

A version of this article first appeared on TechNode, a site covering the rapid changes in China’s technology and startups in real time. It’s re-published here with permission of TechNode.

Toutiao’s AI software did not generate this headline, but for the 20 million pieces of content that flow through the platform each day, headline generation and AB testing are just two of the AI services Toutiao uses to get more people tapping.

Speaking to foreign journalists for the first time as head of the Jinri Toutiao AI Lab and vice president of the app’s owner Bytedance, Dr. Ma Wei-Ying talked about the tech that his lab is working on, why it has a bot that generates fake news and what it knows about its users.

Jinri Toutiao is a news recommendation app that is trained and updated in real time on a user’s behavior. Unlike conventional search engines, Ma pointed out, its search function is personalized to each user rather than producing one ranking for everyone.

“This is the democratization of content creation,” said Ma, putting Bytedance in line with other Chinese tech companies that have recently declared themselves as content companies. “Toutiao is becoming a new information platform for people to find information and connect with information. People are using their smartphones not just to access information, but to create information. They don’t need their own website–they can use Toutiao to directly upload and publish the information and content they create.”

The tremendous amount of data generated by users and creators allows the training of neural network models. Applying AI to the data gathered is generating a better understanding of the world these users live in.

“We are moving from a digital representation of the world to a semantic representation of the world.”

Ma believes the system is going to improve across the board. “Content creation will be fundamentally revolutionized in next few years” as AI allows the “mining of human intelligence to close the feedback loop” of each stage of the lifecycle of content creation, moderation, dissemination, and consumption. Here’s how.

Make fake news to beat fake news

Bytedance has a different approach to tackling fake news: writing it. The AI lab that Ma heads has developed a bot that uses the company’s growing database of real fake news stories to generate its own fake fake news. It then has another bot for detecting fake news which is trained by analyzing its counterpart’s fake feed, and by drawing on a matching database of real news. “One is good at writing, which means this also helps us to advance machine writing, and the other is machine reading. These two can push each other to improve by using the label data and assimilated data through our algorithms,” said Ma.

Ma believes that having two competing algorithms allows each of them to improve. Toutiao also lets users report what they believe to be fake news and analyzes comments to detect whether they suggest the content might be fake. When the system identifies a piece of fake news that has got through, it notifies everyone who read it that the story was fake.

Bytedance is using this “dual-learning” technique in other ways. It machine translates news from Chinese into English, then has another program to translate that article from English into Chinese to improve both processes. Fake news can also be translated to allow the algorithms to train for Toutiao’s global expansion. Other aspects of global expansion are language-independent, such as video, meaning those algorithms have already been trained on large numbers of Chinese users.
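The round-trip idea can be illustrated with a toy sketch. It assumes two hypothetical word-for-word lookup tables in place of real translation models, and it only computes the consistency signal (how much of the original survives the trip there and back). How Bytedance actually feeds such a signal into training is not described in the briefing.

```python
# Toy lookup tables standing in for two translation models (hypothetical data).
ZH_EN = {"假": "fake", "新闻": "news", "机器": "machine"}
EN_ZH = {"fake": "假", "news": "新闻", "machine": "机械"}  # last entry is deliberately off


def translate(text, table):
    """Translate word by word; words missing from the table pass through unchanged."""
    return " ".join(table.get(word, word) for word in text.split())


def round_trip_score(text, forward, backward):
    """Fraction of words that survive translating forward and then back again."""
    original = text.split()
    restored = translate(translate(text, forward), backward).split()
    matches = sum(a == b for a, b in zip(original, restored))
    return matches / len(original) if original else 1.0


score = round_trip_score("假 新闻 机器", ZH_EN, EN_ZH)
print(f"round-trip consistency: {score:.2f}")
```

A low score flags a pair of models that disagree with each other, which is the kind of feedback each side could use to improve.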

In the future, the culmination of analyzing successful pieces, building a database of popular topics, and developing machine writing will mean Toutiao will be able to automatically generate articles for its readers on their favorite subjects.

Better algorithms, better articles

“We adjust our strategy every week. It’s a constant experiment,” said Ma. The system monitors content in real time and also works to predict whether a piece will be a success.

Algorithms offer article writers four headlines, then conduct AB testing to determine which has the most impact. Because of the computing power involved, not every article gets this treatment. Only when a piece starts to gain traction will it get extra help.
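As a rough illustration of headline AB testing, the sketch below picks the variant with the highest click-through rate once it has been shown often enough to be worth comparing. The `HeadlineStats` class, the `pick_winner` function, and the 1,000-impression threshold are assumptions made for the example, not details of Toutiao’s system.

```python
from dataclasses import dataclass


@dataclass
class HeadlineStats:
    """Counts collected for one headline variant (hypothetical structure)."""
    headline: str
    impressions: int = 0
    clicks: int = 0

    @property
    def ctr(self) -> float:
        # Click-through rate; zero when the headline has not been shown yet.
        return self.clicks / self.impressions if self.impressions else 0.0


def pick_winner(variants, min_impressions=1000):
    """Return the variant with the best CTR among those shown often enough, else None."""
    eligible = [v for v in variants if v.impressions >= min_impressions]
    if not eligible:
        return None
    return max(eligible, key=lambda v: v.ctr)


variants = [
    HeadlineStats("A", impressions=2000, clicks=100),
    HeadlineStats("B", impressions=1500, clicks=120),
    HeadlineStats("C", impressions=50, clicks=40),  # too few views to trust
]
winner = pick_winner(variants)
print(winner.headline if winner else "not enough data yet")
```

The impression threshold mirrors the point about traction: a variant is only compared once enough readers have seen it.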

Machine learning is used for viral prediction. The system compares incoming articles with previous content that has taken off, and its accuracy increases with constant feedback on those predictions. Ma acknowledged that care has to be taken to prevent the algorithms from distorting the popularity of particular elements of content, or from blocking content from new users who have yet to establish a positive profile in the system.

Automated sports commentary

Object recognition in video is also well developed and is used to fuel more personalization. Bytedance is working on smarter, personalized sports coverage, explained Ma. The current one-feed-fits-all approach will be replaced with a tailored viewing experience when fan data indicates an interest in, for example, a particular player. Coverage will focus more on that player, with the end goal being a personalized, automated commentary and onscreen captions.

Location, location, location. And time.

Toutiao builds up an idea of users’ lives, including their whereabouts and habits. As well as understanding what content the user is interested in, the AI adjusts recommendations based on current and historic location. Ma gave an example that shows the sophistication of the tool. Chinese people living in the U.S. who use Toutiao as part of their everyday lives there are generating a footprint. Then Chinese New Year comes around and a user’s location suddenly changes from the U.S. to somewhere in China. The news may change accordingly there and then, but once the user heads back to the States, the software assumes that the user’s location at Chinese New Year was significant to them, and probably their hometown. Once the user is back in the U.S., any news stories that crop up in the supposed hometown will show up in the user’s feed.
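The hometown guess Ma describes can be sketched as a simple heuristic: find the city where a user normally is, then look for a different city that shows up during the holiday. The function name, the data layout, and the sample cities below are hypothetical. The dates are the first two days of Chinese New Year 2017, used only as sample data.

```python
from collections import Counter
from datetime import date


def infer_hometown(visits, holiday_dates):
    """Guess a hometown from (day, city) records; return None if there is no signal.

    The usual city is the one seen most often overall. The guess is the most
    common *other* city seen on the holiday dates.
    """
    if not visits:
        return None
    usual_city = Counter(city for _, city in visits).most_common(1)[0][0]
    holiday_cities = Counter(
        city for day, city in visits
        if day in holiday_dates and city != usual_city
    )
    return holiday_cities.most_common(1)[0][0] if holiday_cities else None


holiday = {date(2017, 1, 28), date(2017, 1, 29)}
visits = [
    (date(2017, 1, 10), "San Francisco"),
    (date(2017, 1, 11), "San Francisco"),
    (date(2017, 1, 12), "San Francisco"),
    (date(2017, 1, 28), "Chengdu"),
    (date(2017, 1, 29), "Chengdu"),
    (date(2017, 2, 5), "San Francisco"),
]
print(infer_hometown(visits, holiday))
```

A production system would presumably weigh many more signals. This sketch only captures the "where were you at New Year" step from Ma’s example.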

Time is used as a gauge for what is appropriate to send. Algorithms work out when a person is busy, so the app will not bombard them with too much content and will instead save it until they are free. On a larger scale, the data is providing profiles of cities and areas of cities in terms of people’s working habits. On an individual scale, these patterns can suggest what a person’s occupation is, but the data is anonymized. The system generates a user ID per smartphone, made up of a billion factors, which only an algorithm can identify.

Moderation and government relations

In a separate briefing, Bytedance senior vice-president for corporate development Liu Zhen revealed that of the 20 million pieces of content uploaded to Toutiao each day, 90% are machine moderated, meaning the other 2 million pieces are human-reviewed. Although Toutiao has been working on its moderation for five years, humans are and always will be needed, according to Ma.

“We have a very good communication channel between the company and the government. So far we’ve been working very hard because we are a new platform, a new kind of application exploring a new frontier. Things have been going quite smoothly because the communication channel is very open and very healthy,” said Ma.

Frank Hersey

Frank Hersey is a Beijing-based tech reporter who’s been visiting China since 2001. He tries to go beyond the headlines to explain the context and impact of developments in China’s tech sector.

From this week



Governments & policy

Civil society groups in Singapore are concerned that a proposed public order bill would confer on police the authority to shut down communications in times of unrest.

No live broadcast of police operations. No transmission of text, photos or videos of the incident. No documentation of police action. The government’s apparent goal is to preserve an official version of information, but you can see how this is a source of worry for journalists trying to cover incidents on the ground. The definitions are so broad that the law could even be used to crack down on peaceful gatherings.




Noun Project is a delicious vocabulary of visual abbreviation.

In plain-speak: crowd-sourced icons for everything. But it’s interesting to think about what a hyper-simplified icon says about race, and this essay nails it. “Many depictions of race — icons or otherwise — rely on outdated tropes, stereotypical depictions, or fetishized myths to accomplish recognizability. Icons of race should celebrate physical differences as representative forms because failing to do so will result in misrepresentation by homogenization.”
The Noun Project

The Sydney Morning Herald launched its redesign a couple of weeks ago.

The new design system has a read-later feature called Shortlist (do people still use those?), contextual info-dives, and skimmable ‘talking points’ boxes (us old newspaper designers loved those). The website is now more horizontal in its scroll navigation, with cards in rows going across, compared to the earlier version with sections tottering about in columns. But good design is always about more than just what you see. I asked Frames subscriber and SEO dude Vahe Arabian what he thought. “I like that they’re using Varnish for the shortlist tool, and to boost site speed. But they still need to work on optimising their topic hubs. They also need better linking to related stories instead of over-relying on their tool.” Thanks, Vahe!
Sydney Morning Herald

One of our favourite beards and designers, Van Schneider, hung out with the folks at Farmgroup, a design firm in Bangkok.

They’re largely a branding shop, but have grown into a full service design consultancy. They work on lots of interior and restaurant design projects, and I really like what they do with menus. “Graphic design is relatively young in this country; we are all still finding a place to stand in the world. I think one of our fortés is being crafty (in both meanings).”
Van Schneider