"Token" Economics: AI Needs to Recalculate the Books
Source: Beijing Business Daily
“Token” is becoming the hottest word in the AI industry. At the recent 2026 Zhongguancun Forum annual meeting, Kimi founder and CEO Yang Zhilin and Zhipu CEO Zhang Peng could not avoid the topic. Yang Zhilin defined the Token as the GDP of the future, while Zhang Peng said bluntly, “Long-term competition on low Token prices is not conducive to the development of the industry.” More than 1,000 kilometers away, Tencent senior executive vice president Tang Daosheng and vice president Li Qiang were also talking about Tokens. The former said, “With the same model capability, different Harness (scaffolding) designs can lead to huge differences in Token cost.” The latter argued that switching Token providers is easy, and that low stickiness plus the end of subsidies can easily cause customers to churn. As OpenClaw (nicknamed “lobsters” by netizens) triggered an exponential surge in Token consumption, the Token is no longer just a technical term but a key variable in business models.
The “money-burning” cost of Tokens
The Agent boom sparked by the “lobsters” has caused Token consumption to explode exponentially. What is a Token? According to the National Data Bureau’s definition, it is the smallest unit in which AI large models process information. Tokens can be measured, priced, and traded.
Zhang Ting, product lead of Baidu Qianfan, explained to a Beijing Business Daily reporter, “It is neither completely equal to a single character, nor completely equal to a word. It’s a kind of ‘language fragment’ in between. For example, the Chinese character ‘我’ is a Token, ‘today’ might be a Token, but ‘internationalization’ could be split into two Tokens: ‘international’ and ‘ization.’ Because the language a large model faces is global, the Token is a universal ‘greatest common divisor,’ allowing the model to handle all languages and symbols in a unified way.”
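The splitting Zhang Ting describes can be illustrated with a toy subword tokenizer. The vocabulary and the greedy longest-match strategy below are illustrative assumptions, not Baidu Qianfan's actual tokenizer:

```python
# Toy illustration of subword tokenization: a word absent from the
# vocabulary is split into smaller known pieces, so any language or
# symbol can be encoded. The vocabulary here is hypothetical.
VOCAB = {"international", "ization", "today", "我", "intern", "al"}

def tokenize(word: str) -> list[str]:
    """Greedily split `word` into the longest matching vocabulary pieces."""
    tokens, i = [], 0
    while i < len(word):
        # try the longest remaining prefix that is in the vocabulary
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character: emit it as-is
            i += 1
    return tokens

print(tokenize("internationalization"))  # ['international', 'ization']
print(tokenize("我"))                    # ['我']
```

Real tokenizers learn their vocabulary from data (for example via byte-pair encoding), but the effect is the same: one "language fragment" per Token, across all languages.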
According to the National Data Bureau, in early 2024 China’s daily Token calls averaged 100 billion; by the end of 2025 that number had jumped to 100 trillion; and by March 2026 it had already surpassed 140 trillion, growth of more than 1,000x in two years.
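A quick sanity check on the figures quoted above:

```python
# National Data Bureau figures cited in the article.
daily_tokens_2024 = 100e9    # 100 billion daily Token calls, early 2024
daily_tokens_2026 = 140e12   # 140 trillion daily Token calls, March 2026

growth = daily_tokens_2026 / daily_tokens_2024
print(f"growth factor over two years: {growth:.0f}x")  # 1400x
```

The 1,400x factor is consistent with the article's "more than 1,000x over two years."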
Starting in February, cloud vendors and AI large-model companies began to react. Zhipu canceled the first-purchase discount for its GLM Coding Plan, raising package prices overall by at least 30%. In early March, Tencent Cloud raised the prices of two of its self-developed models; the Tencent HY2.0 Instruct model rose 463%. In the second half of the month, Alibaba Cloud and Baidu Intelligent Cloud announced on the same day that they would raise AI compute prices, with increases of up to 34%.
As for why the Agent boom drives Token consumption up, Zhang Peng recently offered a detailed explanation: when an Agent faces a complex task, the model’s reasoning chain is very long, so Token consumption is very high, and the cost of model inference rises accordingly. It is therefore necessary to bring Token pricing back to its normal business value; competing on low prices over the long term is not good for the development of the industry as a whole.
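The mechanism Zhang Peng describes can be sketched numerically. In a typical Agent loop the accumulated context is resent at each reasoning step, so total Tokens grow roughly quadratically with the number of steps; the per-step figures below are illustrative assumptions:

```python
# Why Agent tasks burn Tokens: each step is billed for the growing
# context it resends plus its new output. Numbers are hypothetical.
def agent_token_cost(steps: int, prompt: int = 500, output_per_step: int = 300) -> int:
    total, context = 0, prompt
    for _ in range(steps):
        total += context + output_per_step  # input context + new output, billed per step
        context += output_per_step          # the output is appended to the context
    return total

print(agent_token_cost(1))   # a single Q&A turn: 800 tokens
print(agent_token_cost(20))  # a 20-step Agent task: 73,000 tokens, far more than 20x one turn
```

Under these assumptions, a 20-step task costs about 91x a single turn, not 20x, which is the cost explosion the pricing debate is about.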
In an interview with Beijing Business Daily reporter and other media, Li Qiang said, “The economic value of Token will quickly draw attention from all customers. If you only consider consumption volume and not economics, your price or cost on the user side may end up being higher, which will have a negative impact on the company’s long-term healthy development.”
Harness “scaffolding” hidden beneath the surface
How exactly is a Token priced? Zhang Ting broke it down with an example for the Beijing Business Daily reporter: “For instance, ‘What’s the weather like in Beijing today,’ plus the AI’s answer, would consume about 50–100 Tokens. If you ask the AI to write an 800-word essay, including your prompt and the full output, it would consume about 1,000–1,500 Tokens. Translated into money: currently, the price of mainstream models on Baidu Qianfan is on the order of a few cents per million Tokens. That means $1 can have the AI write about 1,000 800-word essays,” Zhang Ting said.
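Zhang Ting's arithmetic can be reproduced in a few lines. The price constant below is an illustrative assumption, not Baidu Qianfan's actual rate card:

```python
# Back-of-the-envelope Token cost estimate, following Zhang Ting's
# examples. The per-million price is hypothetical.
PRICE_PER_MILLION_USD = 0.05  # assumption: "a few cents per million Tokens"

def cost_usd(tokens: int) -> float:
    """Dollar cost of consuming `tokens` Tokens at the assumed rate."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_USD

weather_qa = cost_usd(100)    # "What's the weather in Beijing" + answer
essay      = cost_usd(1_500)  # prompt + an 800-word essay
print(f"weather Q&A: ${weather_qa:.6f}, one essay: ${essay:.6f}")
```

At rates this low, a single query is effectively free; the economics only bite at the exponential volumes described above.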
But when Token consumption grows exponentially, a deeper issue comes to the fore: not all of these Tokens are being spent where they count. “Tokens are like gasoline, and an Agent is like the engine. If you only focus on fuel consumption and don’t care about the economics and output capability of the engine, the customer will ultimately give it up.” Li Qiang explained Token efficiency with this fuel-consumption analogy.
Li Di, founder of Nextie (Mingri Xincheng) and “the father of Xiaobing,” also told the Beijing Business Daily reporter, “The surge in Token consumption points to an interesting phenomenon: Tokenmaxxing, a race to rack up Token consumption. Right now, many developers and companies are madly boosting Token consumption, even treating it as a display of ‘compute muscle.’ But this unrestrained burning leads to a huge ROI (return on investment) imbalance.”
Against the above backdrop, another concept—Harness—has rapidly gone mainstream in Silicon Valley and in domestic tech circles.
Li Di explained in detail to the Beijing Business Daily reporter that Harness literally translates as “saddlery” or “reins”: if a large model is a powerful wild horse with no fixed direction, the Harness is the restraint system that lets it run along a predetermined track.
“Getting AI into real-world use isn’t just an algorithm problem, it’s an engineering problem,” Tang Daosheng said. “With the same model capability, different scaffolding or Harness designs, such as which tools the model calls, layered context engineering, long-term memory management, and workflow implementation, have a significant impact on actual usage outcomes and Token costs.”
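The design levers Tang Daosheng lists can be sketched as a minimal harness loop: it routes the model's tool calls, trims the context to control Token cost, and enforces a Token budget. `call_model`, the tool registry, and the character-based Token accounting are all stand-ins, not any vendor's real API:

```python
# A minimal Harness sketch: tool routing + context trimming + a Token
# budget. `call_model` is a stand-in for a real LLM call.
def call_model(context: list[str]) -> str:
    # stub model: asks for a tool until it has enough context, then answers
    return "tool:search weather Beijing" if len(context) < 3 else "final:Sunny, 20°C"

TOOLS = {"search": lambda query: f"result for '{query}'"}  # hypothetical tool registry

def run_harness(task: str, token_budget: int = 2_000, max_context: int = 6) -> str:
    context, spent = [task], 0
    while True:
        spent += sum(len(m) for m in context)  # crude Token accounting by characters
        if spent > token_budget:
            return "aborted: Token budget exceeded"
        action = call_model(context)
        if action.startswith("final:"):
            return action[len("final:"):]
        name, _, arg = action[len("tool:"):].partition(" ")
        context.append(TOOLS[name](arg))       # run the requested tool
        context = context[-max_context:]       # trim context to cap Token cost

print(run_harness("What's the weather in Beijing today?"))  # Sunny, 20°C
```

With the same stub "model," a harness that forgot to trim context or cap the budget would burn strictly more Tokens for the same answer, which is exactly the cost gap Tang Daosheng describes.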
Luofu Li, head of Xiaomi MiMo large models, also mentioned the term when interpreting the value of OpenClaw: “OpenClaw raises the ceiling of domestic models to a level once reserved for closed-source models. Meanwhile, with a Harness (constraint control system) and many other designs, it ensures the quality and accuracy of the model’s task completion, so the floor is also well guaranteed.”
Cloud vendors rebuild the “foundation”
Concretely, at the engineering level, Tencent Cloud’s agent development platform ADP connects an Agent to a “library” through capabilities like RAG (retrieval-augmented generation) and knowledge bases, the equivalent of keeping an industry expert on call at all times. Claw then runs in the secure sandbox of the Agent Runtime: as the neural center of an intelligent system, Claw continuously learns and accumulates the ability to discover and download Skills from a skill library, and triggers actions by sending and receiving instructions through the large model. The Agent Runtime sandbox can also be used to verify the program results of large-model reinforcement learning, improving the efficiency of reinforcement-learning training.
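The RAG idea behind platforms like ADP can be sketched in miniature: retrieve the knowledge-base entries most relevant to a query and prepend them to the model prompt. The scoring below is naive word overlap rather than vector embeddings, and all data and names are hypothetical, not Tencent's implementation:

```python
import re

# A minimal RAG sketch: score knowledge-base entries by word overlap
# with the query, then build a prompt from the best match. A real
# system would use embedding similarity instead. Data is hypothetical.
KNOWLEDGE_BASE = [
    "Token is the smallest unit a large model uses to process text.",
    "Harness design controls tool calls, context and memory.",
    "Agent Runtime sandboxes let skills execute safely.",
]

def words(s: str) -> set[str]:
    return set(re.findall(r"\w+", s.lower()))

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k knowledge-base entries sharing the most words with the query."""
    q = words(query)
    scored = sorted(KNOWLEDGE_BASE, key=lambda doc: len(q & words(doc)), reverse=True)
    return scored[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How does Harness design control tool calls?"))
```

Grounding the model in retrieved context this way also shapes Token spend: the harness decides how much "library" material each call carries.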
This is only the tip of the infrastructure iceberg.
“Where compute ends may be electricity.” Li Qiang revealed in an interview that Tencent began exploring compute-and-power coordination two years ago. “Together with partners in Inner Mongolia, we power data centers directly with local wind power and wind-solar-storage, balance clean-energy peaks and troughs using hydrogen and energy storage, and at the same time coordinate compute peaks and valleys. On one hand this dramatically reduces electricity costs; on the other, it reduces carbon emissions.”
Another layer of change happens in the scheduling mechanism. “In the current cloud computing era