Goldman Sachs released a report on May 5 arguing that the artificial intelligence industry is on the cusp of shifting from a cost-heavy infrastructure buildout to a profit-generating business. The bank expects a gross margin inflection point for hyperscale cloud providers and large model vendors within the next 3 to 12 months.

The Token Economics Shift
The bank forecasts that consumer and enterprise AI agents will drive a 24-fold increase in global token consumption from 2026 levels by 2030, to approximately 120 quadrillion tokens per month. If enterprise agents reach peak adoption by 2040, consumption could expand to 55 times current levels.
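For readers who want to check the arithmetic behind those multiples, the short Python sketch below backs out the implied starting point and growth rate. The token figures come from the forecast as summarized here; the even compounding over four years and the use of the 2026 baseline for the 55-fold scenario are assumptions made for illustration, not part of the report.

```python
# Figures as quoted in the forecast summarized above.
TOKENS_2030_PER_MONTH = 120e15   # ~120 quadrillion tokens per month by 2030
GROWTH_MULTIPLE_2030 = 24        # 24-fold increase over 2026 levels
PEAK_MULTIPLE = 55               # multiple if peak enterprise adoption is reached

# Assumption (not from the report): growth compounds evenly from 2026 to 2030.
YEARS = 2030 - 2026

# Monthly volume implied for 2026 by dividing the 2030 figure by the multiple.
baseline_per_month = TOKENS_2030_PER_MONTH / GROWTH_MULTIPLE_2030

# Constant annual growth rate that yields a 24-fold rise over YEARS years.
implied_annual_growth = GROWTH_MULTIPLE_2030 ** (1 / YEARS) - 1

# Assumption: the 55-fold scenario is measured against the same 2026 baseline.
peak_per_month = baseline_per_month * PEAK_MULTIPLE

print(f"Implied 2026 baseline: {baseline_per_month / 1e15:.1f} quadrillion tokens/month")
print(f"Implied annual growth: {implied_annual_growth:.0%}")
print(f"Implied peak volume:   {peak_per_month / 1e15:.0f} quadrillion tokens/month")
```

Under these assumptions the forecast implies a 2026 baseline of about 5 quadrillion tokens per month, usage that more than doubles every year through 2030, and roughly 275 quadrillion tokens per month at peak adoption.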
The core thesis rests on a widening gap between token pricing and computing costs. Mainstream large model token pricing has stabilized or slightly rebounded after previously declining about 40% annually. Meanwhile, the per-token computing cost on chips from Nvidia and AMD, Google’s TPUs, and Amazon’s Trainium continues to fall at an annual rate of 60% to 70%. The divergence is widening profit margins, so that increased usage now translates into profit growth rather than a rising cost burden.
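A rough Python sketch shows how quickly that gap can compound. Only the 60% to 70% annual cost decline comes from the report; the starting price, the starting cost, and the flat price path are hypothetical values chosen for illustration, and `gross_margin_path` is a helper name invented here.

```python
# Hypothetical starting values, for illustration only (not from the report).
START_PRICE = 1.00   # price per million tokens, in dollars
START_COST = 0.80    # compute cost per million tokens, in dollars
PRICE_CHANGE = 0.0   # "stabilized" pricing, modeled here as flat

# Annual decline in per-token compute cost, as quoted in the report.
COST_DECLINES = (0.60, 0.70)


def gross_margin_path(price, cost, price_change, cost_decline, years):
    """Return gross margin as a fraction of price for year 0 through `years`."""
    margins = []
    for _ in range(years + 1):
        margins.append((price - cost) / price)
        price *= 1 + price_change   # apply the yearly price change
        cost *= 1 - cost_decline    # apply the yearly cost decline
    return margins


for decline in COST_DECLINES:
    path = gross_margin_path(START_PRICE, START_COST, PRICE_CHANGE, decline, years=3)
    formatted = ", ".join(f"{margin:.0%}" for margin in path)
    print(f"cost decline {decline:.0%}: {formatted}")
```

With these made-up starting values, a 20% gross margin rises to roughly 68% after a single year of 60% cost declines. The exact numbers depend entirely on the assumed starting point, but the direction matches the report's argument: flat prices and falling costs widen margins with every year of usage.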
Goldman Sachs described a self-reinforcing flywheel: lower per-token costs enable richer, more complex agents, which in turn consume far more tokens through longer context windows, repeated reasoning loops, and continuous monitoring. The added volume improves infrastructure utilization and unit economics.
Consumer and Enterprise Demand
Consumer-facing AI agents alone could add approximately 60 quadrillion tokens per month by 2030, a 12-fold increase, as AI assistants evolve from on-demand tools to always-on “residential” agents that continuously monitor and act on behalf of users. Daily AI queries are projected to grow from about 5 billion in 2025 to 23 billion by 2030.
Enterprise agents represent the larger multiplier. Goldman Sachs estimates that a programming agent consumes about 7 million tokens daily at roughly $13 per day in API costs, far below the cost of human labor, which helps explain why software development is seeing the fastest adoption. The bank projects enterprise workloads will account for over 70% of total global token usage at peak adoption.
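The programming-agent estimate can be turned into a unit price with a few lines of Python. The daily token count and daily cost are the report's figures, read here as applying to a single agent; the 250-day working year is an assumption added for illustration.

```python
# Figures as quoted in the report, read as per-agent values.
TOKENS_PER_DAY = 7_000_000   # tokens consumed per day
API_COST_PER_DAY = 13.00     # dollars per day in API costs

# Assumption (not from the report): the agent runs 250 working days a year.
WORKDAYS_PER_YEAR = 250

# Blended price the agent effectively pays per million tokens.
cost_per_million_tokens = API_COST_PER_DAY / (TOKENS_PER_DAY / 1_000_000)

# Annual API bill for one agent under the working-day assumption.
annual_agent_cost = API_COST_PER_DAY * WORKDAYS_PER_YEAR

print(f"Blended cost: ${cost_per_million_tokens:.2f} per million tokens")
print(f"Annual cost:  ${annual_agent_cost:,.0f} per agent")
```

That works out to a blended rate of about $1.86 per million tokens and, under the working-day assumption, about $3,250 per agent per year.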
Investment Implications
The report arrives as hyperscaler capital expenditure continues to climb. Goldman Sachs previously estimated cumulative AI capex of $7.6 trillion between 2026 and 2031, and this latest analysis suggests improved profitability could make those spending levels more sustainable. The bank pointed to AWS revenue growth of 28% year-over-year in the first quarter and Google Cloud’s 63% growth as early evidence of the shift.
Goldman Sachs cautioned that not all AI workloads will reach a positive profit inflection: highly commoditized text-only chatbots may still see prices fall faster than costs.