- Focusing on AI metrics like token counts or GPU hours can be misleading, leading businesses to track activity rather than genuine impact.
- The practice of “tokenmaxxing” (maximizing AI token consumption without regard for business value) can lead to significant financial waste and distract from core priorities.
- Companies like Amazon and Uber have learned that misaligned AI incentives can result in budget overruns and a lack of meaningful business outcomes.
- Effective AI measurement should prioritize tangible business results such as increased revenue, faster processes, and reduced operational friction.
- A strategic shift is needed to define clear, measurable business outcomes for each AI project before implementation.
- Implementing AI usage limits and conducting small-scale experiments can help prevent wasteful spending and ensure targeted, valuable AI investments.
Amazon once used an internal leaderboard to track artificial intelligence usage, focusing on the number of “tokens” processed by different teams. Tokens are small pieces of text that AI models handle, similar to words or parts of words. The goal was to encourage AI adoption, assuming more tokens meant more progress.
However, this approach led to unexpected results. Teams increased their AI tasks, but the company saw diminishing returns. Employees were using AI simply to meet a metric, not to solve actual business problems. This led Amazon to discontinue the leaderboard.
An Amazon senior vice president advised staff to avoid using AI without a clear purpose. This situation serves as a critical warning for many businesses currently integrating AI. The key question is whether companies are tracking AI activity rather than its tangible impact. Relying on metrics like token counts, GPU hours, or inference costs can be misleading and financially detrimental.
The Amazon Leaderboard: A Lesson in Misaligned Incentives
Amazon’s experience highlights a common pitfall: setting the wrong incentives. When you measure activity, you get more activity. If token usage is the primary metric, teams will focus on maximizing it, regardless of its business value. This phenomenon is known as “tokenmaxxing,” where the goal becomes increasing AI token consumption without regard for the actual benefit of the output.
Tokenmaxxing is akin to measuring the mileage of delivery trucks without checking if they delivered any packages. Amazon’s leaderboard inadvertently created a competition, encouraging teams to use AI for tasks that didn’t require it, such as summarizing documents nobody read or using chatbots for simple FAQ answers. The AI was active, but the business saw no improvement.
This is not solely a technical issue but a management challenge. When leadership demands AI metrics, teams will provide easily reportable numbers like token usage, which can mask underlying inefficiencies. Amazon’s decision to eliminate the leaderboard is a significant signal that other companies should heed. If a data-centric organization like Amazon can fall into this trap, others are equally vulnerable.
Tokenmaxxing: The Hype That’s Costing Businesses
The phrase “Tokens are the new oil for the enterprise” has circulated widely, sounding impressive but potentially leading to poor decision-making. Tokenmaxxing itself has become a buzzword, describing the drive to use as many AI tokens as possible under the assumption that high usage equates to staying competitive.
The core problem is conflating consumption with value. Consider a factory that measures success by machine operating hours. Managers might run machines unnecessarily, producing unwanted goods simply to keep them running. This creates an illusion of efficiency while wasting resources.
Tokenmaxxing similarly makes teams appear busy and fills dashboards with impressive figures, but it doesn’t guarantee that the AI is enhancing customer experience, reducing costs, or boosting sales. Often, it leads to wasted spending on computing power and diverts focus from critical business priorities. The pressure to demonstrate AI adoption can create a cycle where increased AI spending benefits AI providers more than the businesses themselves.
If AI expenses are rising while revenue remains stagnant, it indicates a measurement problem where activity is mistaken for progress.
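One rough way to check for this pattern is to compare the growth of AI spending with the growth of revenue over the same period. The sketch below is illustrative only: the function names, the 5-point tolerance, and the quarterly figures are assumptions made for the example, not numbers from any company mentioned here.

```python
def growth_rate(previous: float, current: float) -> float:
    """Fractional change from previous to current (0.10 means +10%)."""
    if previous <= 0:
        raise ValueError("previous must be positive")
    return (current - previous) / previous


def activity_outpaces_results(
    ai_cost_prev: float,
    ai_cost_now: float,
    revenue_prev: float,
    revenue_now: float,
    tolerance: float = 0.05,  # assumed threshold; tune it to the business
) -> bool:
    """True when AI spend grew faster than revenue by more than `tolerance`."""
    cost_growth = growth_rate(ai_cost_prev, ai_cost_now)
    revenue_growth = growth_rate(revenue_prev, revenue_now)
    return cost_growth - revenue_growth > tolerance


# Made-up quarterly figures: AI spend up 50%, revenue up 1%.
flag = activity_outpaces_results(200_000, 300_000, 10_000_000, 10_100_000)
print(flag)
```

A flag like this does not prove waste on its own, since revenue can lag investment, but it is a prompt to ask which outcome the extra spending was supposed to buy.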
Uber’s $40 Million AI Budget Overrun
Uber’s experience provides another cautionary tale. The company’s AI coding budget for 2026, reportedly intended to last two years, was depleted in just four months. This significant overspend suggests a fundamental issue with how AI usage was being tracked and incentivized.
Uber was likely measuring token usage and encouraging the use of AI for coding tasks. While AI tools can accelerate code generation, faster output doesn’t automatically translate into better-quality code or fewer bugs. It can instead produce more complex codebases, more time spent correcting errors, and higher computational demands.
CEO Dara Khosrowshahi has emphasized cost control, highlighting the challenge of limiting AI usage without stifling innovation. Striking this balance is crucial: encouraging experimentation while preventing indiscriminate AI application.
Uber’s situation demonstrates that even large tech companies struggle with AI cost management due to the novelty of the tools and evolving best practices. For smaller businesses, the financial risks are amplified; a rapid budget overrun like Uber’s could be devastating.
Google’s Sevenfold Token Growth: A Potential Warning Sign
Google CEO Sundar Pichai noted a sevenfold increase in the company’s token usage within a year. While this sounds like success, its implications vary. For Google, which profits from AI through cloud services and model sales, increased internal usage might drive product development.
However, for most other companies, such rapid growth in token usage would be a red flag. Token usage represents a cost: computing power and electricity. Without a direct link to increased revenue or cost savings, it’s simply an expense. Rapidly escalating token counts without clear business outcomes mirror a factory increasing raw material purchases while sales remain flat. That is waste, not progress.
Google’s figures should serve as a benchmark, not a target. If leadership pushes to match Google’s growth, ask what return that investment is expected to deliver. Without a clear answer, scaling token usage is premature.
Other major tech firms like Meta, Microsoft, and Salesforce are reportedly implementing measures to control token usage, recognizing the same potential for waste that Amazon identified. They are seeking to establish sensible limits on AI consumption.
The True Metrics: Revenue, Speed, and Friction Reduction
Instead of focusing on token counts, GPU hours, or inference costs, companies should measure what truly matters to their business. This includes direct impacts on revenue, improvements in speed, and reductions in friction.
Revenue: Is the AI tool directly contributing to increased sales? For instance, an AI chatbot on an e-commerce site that helps customers find products can be measured by its impact on purchase conversion rates.
Speed: Does the AI accelerate decision-making or processes? This could manifest as faster customer service response times, quicker product development cycles, or streamlined approval workflows. An AI that reduces customer complaint resolution time from days to hours provides tangible value.
Friction Reduction: Does the AI simplify tasks for customers or employees? Automating repetitive or tedious tasks, like form filling or report summarization, can save significant employee time and boost productivity.
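The three outcome categories above each reduce to simple arithmetic once a baseline exists. The following is a minimal sketch; the function names and all sample figures are hypothetical, and the conversion comparison assumes a control group of visitors who did not use the AI tool so that the difference can be attributed to it.

```python
def conversion_lift(control_orders: int, control_visitors: int,
                    ai_orders: int, ai_visitors: int) -> float:
    """Revenue: relative change in conversion rate versus a control group."""
    control_rate = control_orders / control_visitors
    ai_rate = ai_orders / ai_visitors
    return (ai_rate - control_rate) / control_rate


def speedup_factor(hours_before: float, hours_after: float) -> float:
    """Speed: how many times faster a process completes."""
    return hours_before / hours_after


def hours_saved_per_month(tasks_per_month: int,
                          minutes_before: float,
                          minutes_after: float) -> float:
    """Friction: employee hours freed up by shortening a repetitive task."""
    return tasks_per_month * (minutes_before - minutes_after) / 60


# Hypothetical figures for illustration only.
lift = conversion_lift(200, 10_000, 230, 10_000)   # 2.0% vs 2.3% conversion
speedup = speedup_factor(48, 4)                    # two days down to four hours
saved = hours_saved_per_month(1200, 15, 5)         # 1,200 reports a month

print(f"conversion lift: {lift:.0%}")
print(f"resolution speedup: {speedup:.0f}x")
print(f"hours saved per month: {saved:.0f}")
```

None of these calculations mention tokens; token cost enters only later, as the price paid for the outcome.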
Consulting firms like Bain emphasize outcome-based measurement over consumption-based metrics, urging companies to focus on what AI achieves rather than how much is used. IDC notes that emerging agentic AI, capable of autonomous action, requires new frameworks for ROI tracking.
Successful companies report AI ROI by linking investments to specific business outcomes, such as “Our AI customer service tool saved $Y million by reducing call times by Z percent,” rather than just stating token usage figures. The correlation between token usage and revenue growth is not automatic; effective AI implementation focuses on improving core business processes for tangible results.
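A statement of that form can be backed by a short calculation that ties the savings to the cost of the tool. This is a sketch under stated assumptions: the call volume, per-minute agent cost, reduction percentage, and AI spend are invented inputs, and the function names are illustrative rather than taken from any framework.

```python
def annual_savings(calls_per_year: int,
                   minutes_per_call_before: float,
                   reduction_pct: float,
                   cost_per_agent_minute: float) -> float:
    """Money saved by shortening calls by `reduction_pct` percent."""
    minutes_saved = calls_per_year * minutes_per_call_before * reduction_pct / 100
    return minutes_saved * cost_per_agent_minute


def roi(savings: float, ai_cost: float) -> float:
    """Return on investment as a fraction (0.6 means a 60% return)."""
    if ai_cost <= 0:
        raise ValueError("ai_cost must be positive")
    return (savings - ai_cost) / ai_cost


# Assumed inputs: 1M calls a year, 8 minutes each, 10% shorter, $1 per agent minute.
savings = annual_savings(1_000_000, 8, 10, 1.00)
# Assumed total yearly AI spend, including tokens, licences, and integration.
result = roi(savings, ai_cost=500_000)

print(f"savings: ${savings:,.0f}")
print(f"ROI: {result:.0%}")
```

Here token spend appears only inside `ai_cost`, which is where a consumption figure belongs: in the denominator, not in the headline.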
Three Steps to Correct AI Measurement
Improving AI measurement requires a shift in perspective, achievable in three key steps:
Step 1: Deprioritize Token Counts. Remove token usage, GPU hours, and inference costs from primary dashboards. Treat them as cost-tracking metrics, similar to utility bills, rather than performance indicators. Celebrate efficiency, not just consumption.
Step 2: Define Success for Each AI Project. Before launching any AI initiative, clearly articulate the expected business outcome in plain language. For example, “This chatbot will reduce support ticket volume by 20% within three months” or “This code assistant will decrease average development time for new features by 30%.” Projects without clear, measurable outcomes should not proceed.
Step 3: Measure Only the Defined Outcome. Track progress against the established goals. If the project succeeds, scale it. If it fails, reassess, pivot, or terminate it. Avoid continuing AI projects solely based on high token usage or superficial appeal.
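The three steps can be sketched as a small data structure plus one decision rule. This is an illustration under assumptions, not a prescribed process: the `AIProject` class, its field names, and the two decision labels are hypothetical, and the ticket figures are invented.

```python
from dataclasses import dataclass


@dataclass
class AIProject:
    name: str
    outcome_metric: str        # plain-language outcome, e.g. "monthly support tickets"
    baseline: float            # value of the metric before the project starts
    target_change_pct: float   # -20.0 means "reduce by 20%"

    def __post_init__(self) -> None:
        # Step 2: without a clear, measurable outcome the project does not proceed.
        if not self.outcome_metric.strip() or self.target_change_pct == 0:
            raise ValueError(f"{self.name}: define a measurable outcome first")
        if self.baseline <= 0:
            raise ValueError(f"{self.name}: baseline must be positive")


def decide(project: AIProject, measured: float) -> str:
    """Step 3: judge the project on its defined outcome only, never on usage."""
    change_pct = (measured - project.baseline) / project.baseline * 100
    if project.target_change_pct < 0:
        met = change_pct <= project.target_change_pct
    else:
        met = change_pct >= project.target_change_pct
    return "scale" if met else "reassess"


# Hypothetical chatbot project: cut 5,000 monthly tickets by 20%.
chatbot = AIProject("support chatbot", "monthly support tickets", 5000, -20.0)
print(decide(chatbot, 3900))  # 22% fewer tickets than baseline
print(decide(chatbot, 4700))  # 6% fewer tickets than baseline
```

Token counts are deliberately absent from the record (Step 1): they can be tracked elsewhere as a cost, but they play no part in the decision.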
Companies like Meta and Salesforce are reportedly setting token usage limits per team or use case and requiring justification for exceeding them. This encourages thoughtful AI investment. Industry experts also recommend small-scale experiments before broad rollouts to validate AI tools and avoid costly, unnecessary implementations.
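A per-team limit with a justification requirement could look roughly like the sketch below. It does not describe how any of the companies named here actually enforce limits; the class, its fields, and the numbers are hypothetical, and in practice a human would review the justification rather than the code simply accepting any non-empty text.

```python
from dataclasses import dataclass


@dataclass
class TeamTokenBudget:
    team: str
    monthly_limit: int
    used: int = 0

    def request(self, tokens: int, justification: str = "") -> bool:
        """Approve usage within the limit; beyond it, require a justification."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        over_limit = self.used + tokens > self.monthly_limit
        if over_limit and not justification.strip():
            return False  # rejected: over budget with no stated business reason
        self.used += tokens
        return True


# Hypothetical team with a 1M-token monthly limit.
budget = TeamTokenBudget("support", 1_000_000)
first = budget.request(900_000)
second = budget.request(200_000)  # would exceed the limit, no reason given
third = budget.request(200_000, "Ticket triage pilot, target: 20% fewer tickets")

print(first, second, third, budget.used)
```

The design choice worth noting is that the limit does not block usage outright; it forces the team to name the business outcome before spending more.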
The most crucial change is cultural: Leaders should shift from asking “How much AI are we using?” to “What is AI accomplishing for us?” This reframing encourages teams to maximize value rather than just usage.
Amazon’s decision to dismantle its problematic leaderboard exemplifies this necessary shift. Businesses should critically evaluate their AI dashboards, distinguishing between genuine progress and mere activity. If the metrics reflect only activity, it’s time for a change.
AI holds immense potential, but it will be squandered if not measured correctly. Good management principles (measuring what matters and investing where it yields results) remain timeless. Avoid using AI simply for the sake of it; focus on its genuine contribution to business success.
Frequently Asked Questions
What is 'tokenmaxxing' in the context of AI?
Tokenmaxxing refers to the practice of maximizing the consumption of AI tokens without ensuring that the AI's output provides actual business value. It's about increasing AI activity for its own sake, rather than for achieving specific goals.
Why did Amazon shut down its AI leaderboard?
Amazon discontinued its AI leaderboard because it incentivized teams to use more AI tokens, leading to increased activity but not necessarily better business results. Employees were focused on hitting metrics rather than solving problems.
What are better metrics for measuring AI success than token counts?
Instead of token counts, businesses should measure AI's impact on revenue generation, improvements in process speed (e.g., faster customer service), and reduction of friction (e.g., automating tedious tasks). These metrics reflect tangible business value.
How can companies avoid overspending on AI?
Companies can avoid overspending by defining clear, measurable business outcomes for each AI project before starting. They should also track these specific outcomes rigorously and consider implementing usage limits or running small experiments to validate AI tools before large-scale deployment.
Is Google's high token usage a sign of success?
For Google, which profits directly from AI services, high token usage might indicate successful product development. However, for most other companies, a sevenfold increase in token usage without a clear link to revenue or cost savings is a warning sign of potential waste.
What is the main takeaway from Uber's AI budget issues?
Uber's experience shows that even sophisticated tech companies can face massive AI budget overruns if usage isn't carefully managed and tied to clear business value. It highlights the difficulty in controlling costs when AI tools are powerful and easily accessible.
How can a company shift its AI measurement strategy?
The shift involves moving from counting AI usage (like tokens) to measuring specific business outcomes. Leaders should ask 'What is AI doing for us?' instead of 'How much AI are we using?' This requires defining success metrics upfront and holding AI projects accountable for achieving them.