2026-01-12 08:28:13

Benchmarking is essentially writing values into code.

All our expectations and fears about AI get baked into these scoring tools: what counts as progress, what should be feared, what needs to be optimized. We then pretend that these things can be precisely quantified. The problem is that some things simply can't be measured. Behind every selected metric lie the designer's own assumptions. The choices you make in testing amount to defining what AI should become. Conversely, the things left unchosen may be the most important.


SignatureLiquidator

· 4h ago

Metrics are just a smokescreen; you see what you choose to see. What goes unseen is the truly terrifying part.


EternalMiner

· 5h ago

That's right, metrics are essentially power.


PanicSeller

· 5h ago

Benchmarks are a contest over who controls the discourse; whoever sets the criteria wins.


GateUser-7b078580

· 5h ago

The data shows that this scoring system itself is unreasonable. But who decided the selected indicators? Miners are taking too much, and so are the benchmarks.


ChainBrain

· 5h ago

Wow, that's why those rankings are all nonsense.


MetaMisery

· 5h ago

This is the truth: whoever sets the targets holds the power of discourse.


TokenTherapist

· 5h ago

Hmm… Benchmarking is essentially codifying someone's values, and that's the real issue. --- Exactly, the things that aren't included in the metrics are the truly terrifying ones. --- So basically, designers are playing a power game with numbers. --- Quantification itself is a form of filtering; that's a very absolute statement, haha. --- Once the metrics are set, they become self-fulfilling prophecies. --- Every time I look at a benchmark, I want to ask: who says these things should be measured? --- The most absurd thing is pretending that precise quantification can solve value conflicts.


TopBuyerBottomSeller