AI chatbots show China bias in 75% of Chinese-language queries

A Wall Street Journal experiment found that AI chatbots including DeepSeek and ChatGPT deliver markedly different answers about China depending on the language used — and how forcefully the user pushes back.

When Jeff He, a California-based reader, translated a Wall Street Journal opinion column into Chinese and shared it with high-school classmates in China, the response was swift. One friend asked DeepSeek, China's leading homegrown AI model, to write a rebuttal. The bot produced an essay titled "The Future Does Not Belong to America," arguing that China has Huawei, Tencent, ByteDance and BYD while America has produced little beyond "a search engine that's a bit chattier than the old ones."

He then accessed DeepSeek from his California office — the same web address — pasted the rebuttal and asked the bot to verify each claim. The overseas version dismantled it, flagging "selective use of data," "false dichotomies" and "multiple factual errors and logical fallacies" across eight points.

"The 'no-mercy' criticism from the overseas DeepSeek really surprised me," He told the Journal.

The divergence reflects a structural feature of large language models that researchers are only beginning to quantify. A study published in Nature last week by Molly Roberts, co-director of China Data Lab at the University of California San Diego, and her team found that state-aligned media from authoritarian countries can seep into training data and shape chatbot responses — even without deliberate programming.

Roberts said the mainland-versus-overseas gap He observed likely stems from differences in post-training alignment, the step where models are given instructions about what is "safe" to say. "State media ending up in the training data will affect LLMs generally," she said. "Post-training should induce refusal or skewed responses in LLMs that are affected by regulations from a particular state."

The Nature study tested Claude and ChatGPT with identical political questions in English and Chinese. In 75% of cases, Chinese-language prompts generated answers more favorable to the Chinese government. Across 37 autocratic countries including Vietnam, Turkmenistan and Uzbekistan, both chatbots gave more pro-regime answers when prompted in the dominant local language. By contrast, in nations with the highest press freedom, the LLMs were often more critical of the government when queried in the local tongue.
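
To make the 75% figure concrete: a paired comparison of this kind boils down to counting how often the Chinese-language answer to a question is rated more favorable than the English-language answer to the same question. The sketch below is illustrative only. The scoring scale, the `PairedAnswer` record and the toy numbers are assumptions made for the example, not the study's data or its actual scoring method.

```python
from dataclasses import dataclass


@dataclass
class PairedAnswer:
    """One political question asked in both languages (hypothetical record)."""
    question_id: str
    score_en: float  # assumed favorability rating of the English answer
    score_zh: float  # assumed favorability rating of the Chinese answer


def share_more_favorable(pairs):
    """Fraction of questions where the Chinese answer scores strictly higher."""
    if not pairs:
        raise ValueError("no paired answers to compare")
    wins = sum(1 for p in pairs if p.score_zh > p.score_en)
    return wins / len(pairs)


# Toy values invented for illustration; they are NOT taken from the study.
pairs = [
    PairedAnswer("q1", score_en=0.2, score_zh=0.6),
    PairedAnswer("q2", score_en=0.5, score_zh=0.4),
    PairedAnswer("q3", score_en=-0.1, score_zh=0.3),
    PairedAnswer("q4", score_en=0.0, score_zh=0.7),
]

share = share_more_favorable(pairs)
print(f"Chinese answer more favorable in {share:.0%} of paired questions")
```

Ties count against the "more favorable" share here, which is a design choice of the sketch rather than anything the article specifies.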

The mechanism is straightforward: state-aligned media produces vast amounts of text behind few paywalls. In the open-source training dataset CulturaX, Chinese state propaganda documents were 41 times more prominent than Chinese-language Wikipedia articles — typically a core training source. When the researchers added scripted state media to a test model's training data, the model became measurably more favorable to the Chinese Communist Party.
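
A ratio like "41 times" can be read as a comparison of how many documents from each source appear in a corpus. The sketch below assumes prominence is measured by simple document counts and that each document carries a source label. Both are assumptions for illustration, as are the label names and the toy corpus, and the researchers' actual measurement may differ.

```python
from collections import Counter


def source_ratio(docs, numerator, denominator):
    """Ratio of document counts between two source labels.

    `docs` is an iterable of (source_label, text) pairs; the labels are
    hypothetical and would come from whatever tagging a real corpus uses.
    """
    counts = Counter(source for source, _text in docs)
    if counts[denominator] == 0:
        raise ValueError(f"no documents labeled {denominator!r}")
    return counts[numerator] / counts[denominator]


# Toy corpus invented for illustration; it does not reflect CulturaX.
corpus = [
    ("state_media", "article a"),
    ("state_media", "article b"),
    ("state_media", "article c"),
    ("state_media", "article d"),
    ("state_media", "article e"),
    ("state_media", "article f"),
    ("wikipedia", "entry a"),
    ("wikipedia", "entry b"),
    ("other", "forum post"),
]

ratio = source_ratio(corpus, "state_media", "wikipedia")
print(f"state-media documents per Wikipedia document: {ratio:.1f}")
```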

Pushback matters — but not everyone pushes

Other WSJ readers reported similar patterns with ChatGPT in English. Chas Gile, a private-equity investor in Texas, asked ChatGPT whether China was "in some ways as democratic as Western countries." The first answer offered a careful comparative analysis, noting that Freedom House rates China "Not Free" but that the regime offers "performance accountability" and "high reported public satisfaction."

When Gile pushed back — telling the bot he thought it had been affected by Chinese propaganda — ChatGPT apologized within seconds and issued a sharper answer. Asked to "remain truly objective," it sharpened further: "China may offer a powerful alternative model of state capacity, but it does not offer a democratic alternative."

The episode shows a single chatbot shifting its position from one turn to the next depending on how hard the user pushes, a dynamic that favors confident, informed users over casual ones.

What this means for the AI industry

The findings arrive as the frontier AI labs prepare for public listings. Anthropic and OpenAI are both planning initial public offerings; DeepSeek is raising fresh capital from investors aligned with Beijing's push for technology self-sufficiency. The financial stakes amplify the need for what Roberts calls "source transparency" — a nutrition label for AI training data.

"AI companies have a role in being as transparent as possible," Roberts said. "We need to educate the public to think critically about the output of AI and not rely on it blindly."

The policy implications extend beyond consumer chatbots. If major LLMs are influenced by authoritarian propaganda, they could serve as uniquely effective apologists for autocratic regimes: machines that can synthesize all recorded knowledge yet deliver answers shaped by state media, in ways users may not recognize as biased. Unlike a state newspaper, a chatbot will engage in hours-long dialogue and provide detailed answers to skeptical questions, making its influence harder to detect.

Beijing appears to view American chatbots as a threat: ChatGPT is banned in China. Yet the Nature study suggests such chatbots may still improve the information environment relative to domestic alternatives. In a separate experiment, ChatGPT prompted in Chinese still expressed broadly anti-authoritarian views and provided advice on how to protest the government, suggesting frontier models may remain less biased than state-controlled media even with training-data contamination.

The question for regulators and investors is whether the current trajectory — where a user's language and persistence determine the quality of information they receive — is acceptable as AI becomes the primary information interface for more than a billion weekly users.
