Anthropic’s Claude AI Targeted in Massive Distillation Attacks by Chinese AI Labs
Anthropic, a leading artificial intelligence research company, has recently accused three major Chinese AI firms—DeepSeek, Moonshot AI, and MiniMax—of orchestrating large-scale distillation attacks aimed at extracting advanced capabilities from its Claude AI models. These coordinated operations reportedly involved approximately 24,000 fraudulent accounts and generated over 16 million interactions with Claude, violating Anthropic’s terms of service and regional access restrictions.
Understanding Distillation in AI
Distillation is a common technique in AI development in which a smaller “student” model is trained to reproduce the outputs of a larger “teacher” model. It is typically used to create cheaper, more efficient versions of existing systems. When applied illicitly to a competitor’s model, however, distillation enables rapid transfer of capabilities at a fraction of the original development cost and time.
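To make the mechanics concrete, here is a minimal PyTorch sketch of classic soft-label distillation. The function name and temperature value are illustrative only, not a description of any lab’s actual pipeline:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Classic soft-label distillation: the student is trained to match
    the teacher's softened output distribution via KL divergence."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, soft_targets,
                    reduction="batchmean") * temperature ** 2

# In API-based extraction of the kind described in this article, the attacker
# never sees teacher logits -- only sampled text -- so the student would
# instead be fine-tuned directly on (prompt, teacher response) pairs.
```

As the closing comment notes, querying a model through its public API yields only generated text, which is why the campaigns described here revolved around harvesting responses (and especially reasoning traces) at enormous volume.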
Anthropic has expressed concern that these unauthorized distillation efforts may result in copies of Claude lacking the robust safety measures integrated into U.S. frontier models. These safeguards are designed to prevent misuse in areas such as bioweapons development or malicious cyber operations. The company warns that unprotected AI capabilities could be exploited by authoritarian governments for military, intelligence, or surveillance purposes, or could be open-sourced, spreading dangerous AI tools beyond any single nation’s control.
Details of the Distillation Campaigns
DeepSeek
– Scale: Over 150,000 exchanges
– Targets: Advanced reasoning, rubric-based grading (to train reward models), and censorship-compliant responses to politically sensitive queries
– Tactics: Synchronized traffic across accounts, shared payment methods, and prompts designed to extract step-by-step chain-of-thought reasoning
Moonshot AI (Kimi models)
– Scale: Over 3.4 million exchanges
– Targets: Agentic reasoning, tool use, coding, data analysis, computer-use agents, and computer vision
– Tactics: Hundreds of fraudulent accounts across multiple access paths; later phases focused on reconstructing Claude’s reasoning traces
MiniMax
– Scale: Over 13 million exchanges (the largest campaign)
– Targets: Agentic coding and tool-use orchestration
– Tactics: The campaign was still active when detected; within 24 hours of Anthropic releasing a new model, MiniMax redirected nearly half its traffic to the updated system
Anthropic has attributed these campaigns with high confidence using IP correlations, request metadata, infrastructure fingerprints, and corroboration from industry partners. In one instance, request metadata directly matched the public profiles of senior researchers at the implicated labs.
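Anthropic has not published its attribution methodology. Purely to illustrate one such signal, the toy sketch below groups accounts that share an exact infrastructure fingerprint; the field names are hypothetical, and a real system would combine many weighted, fuzzier signals rather than exact matches:

```python
from collections import defaultdict

def cluster_accounts(accounts):
    """Toy coordination signal: group accounts that share any
    infrastructure fingerprint (IP prefix, payment hash, TLS fingerprint).
    Fields are hypothetical and matching is exact, for illustration only."""
    clusters = defaultdict(set)
    for acct in accounts:
        for key in ("ip_prefix", "payment_hash", "tls_fingerprint"):
            if acct.get(key):
                clusters[(key, acct[key])].add(acct["account_id"])
    # Keep only fingerprints shared by more than one account.
    return {fp: ids for fp, ids in clusters.items() if len(ids) > 1}

accounts = [
    {"account_id": "a1", "ip_prefix": "203.0.113", "payment_hash": "p9"},
    {"account_id": "a2", "ip_prefix": "203.0.113", "payment_hash": "p9"},
    {"account_id": "a3", "ip_prefix": "198.51.100", "payment_hash": "p4"},
]
print(cluster_accounts(accounts))
# e.g. {('ip_prefix', '203.0.113'): {'a1', 'a2'}, ('payment_hash', 'p9'): {'a1', 'a2'}}
```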
Circumventing Regional Restrictions
Anthropic does not offer commercial access to Claude in China. To bypass this restriction, the Chinese AI labs reportedly purchased access through third-party commercial proxy services that resell API calls at scale. These services operate extensive networks of fraudulent accounts, blending distillation traffic with legitimate customer requests, thereby complicating detection efforts.
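Why blending complicates detection can be seen with a toy simulation: if a detection signature fires on most distillation requests but occasionally on normal ones, a dedicated extraction account stands out sharply, while a reseller account mixing in legitimate traffic looks far more ambiguous. All rates below are invented for illustration:

```python
import random

random.seed(0)

def account_signal(n_requests: int, frac_distill: float) -> float:
    """Fraction of an account's requests matching a distillation signature,
    assuming (hypothetically) the signature fires on 90% of distillation
    requests and 5% of legitimate ones."""
    hits = 0
    for _ in range(n_requests):
        is_distill = random.random() < frac_distill
        p_hit = 0.9 if is_distill else 0.05
        hits += random.random() < p_hit
    return hits / n_requests

print(f"dedicated extraction account: {account_signal(10_000, 1.0):.2f}")  # ~0.90
print(f"blended reseller account:     {account_signal(10_000, 0.1):.2f}")  # ~0.13
```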
Anthropic’s Response and Future Measures
In response to these attacks, Anthropic is investing heavily in new detection systems, including classifiers for chain-of-thought elicitation and behavioral fingerprinting to identify coordinated activity. The company is also sharing technical indicators with other AI labs, cloud providers, and authorities, while tightening verification processes to prevent future unauthorized access.
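Anthropic has not disclosed how these classifiers work. As a loose illustration of the idea behind chain-of-thought-elicitation detection, the keyword heuristic below flags prompts that ask a model to expose its reasoning; a production classifier would be a trained model over many features, and any enforcement decision would hinge on per-account aggregates, not single prompts:

```python
import re

# Illustrative patterns only; real detectors are not keyword matchers.
ELICITATION_PATTERNS = [
    r"step[- ]by[- ]step",
    r"show (all )?your (reasoning|work|thought process)",
    r"chain[- ]of[- ]thought",
    r"explain how you arrived",
]

def elicitation_score(prompt: str) -> int:
    """Count how many elicitation patterns a prompt matches."""
    text = prompt.lower()
    return sum(bool(re.search(p, text)) for p in ELICITATION_PATTERNS)

print(elicitation_score("Solve this step-by-step and show your reasoning."))  # 2
```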