AI Model Price War Heats Up: DeepSeek Cuts 75%, Tokens Enter 'Utility Era'
In May 2026, the AI industry witnessed something more structurally significant than any new model release: DeepSeek announced a permanent 75% price cut on V4 Pro API.
This is not a promotion. This is a transfer of pricing power.
Three Signals of Price Collapse
| Event | Time | Significance |
|---|---|---|
| DeepSeek V4 Pro API price cut 75% | May 2026 | First domestic LLM to hit global lowest price |
| Three major operators launch Token packages | May 2026 | Compute enters ‘pay-as-you-go’ infrastructure phase |
| China daily Token calls exceed 140 trillion | May 2026 | ~1000x growth in two years, steep demand curve |
These three events together signal one thing: Token is shifting from a ‘technology cost’ to an ‘infrastructure cost’, similar to bandwidth and storage in the cloud era.
Global Price Comparison: Who Still Has Margin?
| Vendor | Model | Input Price (/million Tokens) | Output Price (/million Tokens) |
|---|---|---|---|
| DeepSeek | V4 Pro | ~ ¥0.5 | ~ ¥2 |
| OpenAI | GPT-5.5 Instant | ~ $0.5 | ~ $1.5 |
| Baidu | Wenxin 5.1 | ~ ¥1.2 | ~ ¥4 |
| Gemini 3.5 | ~ $0.35 | ~ $1.4 |
DeepSeek’s pricing strategy is clear: trade extreme low prices for ecosystem positioning, lock developers into its calling system first, then monetize through enterprise services and vertical scenarios.
This is not vicious competition. It is a path already validated in the cloud era.
What Operator Entry Means
China Mobile, Unicom, and Telecom simultaneously launched Token packages, a signal deeper than model price cuts.
- Compute commoditization: Token is bundled and sold like data traffic, allowing enterprises to forecast monthly AI costs
- Network edge deployment: Operators can push inference nodes to provincial-level data centers, reducing latency
- Compliance closed loop: Data stays within borders, calls are traceable, meeting strict regulatory requirements for finance and government
When operators start selling Token, large models cease to be the exclusive game of tech companies and become part of national digital infrastructure.
Practical Impact on Developers
A 75% price cut is not a numbers game. It directly changes the technical-economic model of products.
What was previously impossible is now feasible:
- Long-context RAG (Retrieval-Augmented Generation): Previously too expensive for 100k-token contexts, now viable for routine use
- Real-time voice transcription + translation: The cost barrier for streaming calls disappears
- Batch document processing: Processing 1000 PDF contracts in one run drops from thousands of yuan to hundreds
What was previously risky to try is now worth experimenting:
- Multi-model routing: Automatically switch models based on task complexity, costs remain controllable
- High-frequency fine-tuning: Continuously optimize small models with real call data, marginal cost approaches zero
Risk: What Comes After the Price War?
Low price does not equal health. Three potential issues to watch:
-
Service quality dilution: Does extreme low pricing come with increased response latency and reduced availability? DeepSeek’s concurrent load capacity has yet to be tested at large scale.
-
Innovation motivation shift: When basic calls are unprofitable, where will vendors turn? Enterprise services, private deployment, industry vertical models — this is actually positive, indicating market stratification is forming.
-
Overseas pricing power: Domestic models are priced extremely low at home, but how to price overseas? If global markets follow the price drop, OpenAI and Anthropic’s profit margins will be compressed, potentially triggering more intense technology arms races.
Conclusion
The collapse of Token prices is not the end; it is the beginning.
It means the innovation barrier for the AI application layer is drastically lowered. In 2024, building an AI app required considering model costs; in 2026, that constraint has essentially disappeared. The next competitive focus will shift to:
- Product experience design
- Data flywheel construction
- Industry know-how depth
In other words, the model layer competes on price, the application layer competes on value. This is bullish for developers.
Source: DeepSeek official announcement, three major operators’ press releases, National Cyberspace Administration public information, industry data compilation.