The AI bubble debate misses the point: the bottleneck is physical

AUTHORS

EXTERNAL CONTRIBUTORS

Mayank Kapoor

Principal, Carbon Direct Capital

      Key takeaways

      • Agentic inference has changed the economics of AI. Tokens are becoming units of work, and the economic driver is now the work produced, not token generation. Per-token costs are falling while willingness to pay for work produced is rising; these two trends compound. This tailwind improves AI economics and spills over to every layer of the AI stack.

      • The AI infrastructure question has shifted from whether demand will show up to whether the physical stack can scale quickly enough. That stack includes power generation, grid capacity, interconnection, compute, memory, networking, cooling, siting, and community acceptance.

      • Carbon Direct Capital and Carbon Direct Inc. have a differentiated view because the two entities work across both sides of the constraint: Carbon Direct Inc. advises hyperscalers and energy buyers on power and grid bottlenecks, while Carbon Direct Capital invests in the technologies that relieve those bottlenecks. 

      • Carbon Direct Capital sees better risk-adjusted returns investing in the physical foundations of AI, including clean firm power, energy system efficiency, data center efficiency, and inference-optimized compute, rather than chasing late-stage AI application valuations.

      A better question than "Is AI a bubble?"

      The most important development in AI economics is agentic AI turning tokens into work, a shift that reframes the bubble debate that dominated investor conversations, sell-side notes, and Chief Information Officer surveys through early 2026. Hyperscalers spent approximately US$380 billion on capital expenditure (capex) in 2025 and have guided to approximately US$720 billion of capex in 2026. Carbon Direct Capital and Carbon Direct Inc. have worked together to build project-level models for both training and inference facilities to demystify the numbers and understand financial and technical sensitivities. The core finding was that the assets could be bankable using standard assumptions and that the binding constraints were physical, not financial. Recent developments have reinforced that conclusion.

      Concretely, AI is moving from single prompts and answers to multi-step workflows that plan, reason, call tools, verify outputs, and keep state. This shift to inference is the structural successor to training in the initial AI capex cycle; it changes power requirements, time to power, and compute architectures all at once. Goldman Sachs estimates that agentic AI could drive a 24-fold increase, relative to a 2026 baseline, to roughly 120 quadrillion tokens per month globally by 2030 as per-token costs continue to fall. SemiAnalysis makes the same point from another angle: the value of frontier tokens has risen as agentic workflows become useful, while hardware and software improvements have reduced the cost of producing each token.
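
      As a rough check on the growth figures cited above, the short sketch below backs out the 2026 baseline and the annual growth rate implied by a 24-fold increase to roughly 120 quadrillion tokens per month. It assumes the increase compounds evenly over four years (2026 to 2030); that window and the function names are illustrative assumptions, not part of the cited estimate.

```python
# Back-of-envelope check on the token growth figures cited above.
# Assumption: the 24-fold increase compounds evenly over four years (2026 to 2030).

TARGET_TOKENS_PER_MONTH_2030 = 120e15  # ~120 quadrillion, from the cited estimate
FOLD_INCREASE = 24.0                   # relative to the 2026 baseline
YEARS = 2030 - 2026                    # assumed compounding window


def implied_baseline(target: float, fold: float) -> float:
    """Baseline volume implied by a target volume and a fold increase."""
    return target / fold


def implied_cagr(fold: float, years: int) -> float:
    """Compound annual growth rate implied by a fold increase over `years`."""
    return fold ** (1.0 / years) - 1.0


baseline_2026 = implied_baseline(TARGET_TOKENS_PER_MONTH_2030, FOLD_INCREASE)
annual_growth = implied_cagr(FOLD_INCREASE, YEARS)

print(f"Implied 2026 baseline: {baseline_2026 / 1e15:.1f} quadrillion tokens/month")
print(f"Implied annual growth: {annual_growth:.0%}")
```

      Under these assumptions, the implied baseline is about 5 quadrillion tokens per month and token volume would need to slightly more than double every year through 2030.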

      This does not mean every AI company is attractive, every data center project works, or every valuation is justified. It means the easy bubble framing is missing the more investable question. If token demand is compounding and the unit value of work produced is rising, the scarce resource is not abstract enthusiasm. It is the physical infrastructure required to turn that demand into work produced. 

      The data center model still matters, but the box is not a black box

      Our internal modeling for an illustrative 167-megawatt inference data center using Nvidia Blackwell graphics processing unit (GPU) servers suggests the potential for high-teens percent equity returns under a defined set of assumptions. We built the underwriting model bottom-up, beginning with the number of users served per inference data center, assuming approximately how many tokens they will demand daily, and translating that token demand into the compute needed based on industry-standard quantization and utilization rates. We then inferred the number of GPUs and servers needed to deliver that compute, which ultimately drove the total invested capital and power demand based on assumed thermal design power and power usage effectiveness (PUE). On the revenue side, we used GPU-as-a-service rental rates as one proxy for the market value of compute capacity. A hyperscaler would not rent scarce compute externally if it had higher-value internal demand for that same capacity. As we will detail below, GPU rental prices have been steadily increasing on the back of inflecting inference demand.
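
      The chain described above (users to tokens to compute to GPUs to capital and power) can be sketched in a few lines. This is a minimal illustration, not the underlying model: the facility figures are taken from the key assumptions listed in the footnotes, the 167 megawatts is assumed here to be total facility load rather than IT load, and the demand-side inputs passed to `gpus_needed` are hypothetical placeholders because the article does not disclose them.

```python
import math

# Figures from the article's stated key assumptions (footnote 2).
FACILITY_MW = 167.0            # assumed here to be total facility load
PUE = 1.25                     # power usage effectiveness
GPU_COUNT = 93_200             # approximate Nvidia B200 count
INVESTED_CAPITAL_USD = 6.7e9   # approximate invested capital
PRICE_PER_GPU_HOUR_USD = 3.50  # initial topline pricing
HOURS_PER_YEAR = 8_760
SECONDS_PER_DAY = 86_400


def gpus_needed(users, tokens_per_user_per_day, tokens_per_gpu_second, utilization):
    """Demand side: GPUs required to serve an average daily token load."""
    tokens_per_second = users * tokens_per_user_per_day / SECONDS_PER_DAY
    effective_rate = tokens_per_gpu_second * utilization
    return math.ceil(tokens_per_second / effective_rate)


def it_load_mw(facility_mw, pue):
    """IT load implied by total facility load and PUE."""
    return facility_mw / pue


def annual_energy_twh(facility_mw, load_factor=1.0):
    """Annual electricity use in TWh at a given average load factor."""
    return facility_mw * HOURS_PER_YEAR * load_factor / 1e6


def revenue_ceiling_usd(gpu_count, price_per_gpu_hour, utilization=1.0):
    """Annual rental revenue if every GPU-hour were sold at the stated price."""
    return gpu_count * price_per_gpu_hour * HOURS_PER_YEAR * utilization


# Hypothetical demand-side placeholders, NOT figures from the model.
example_gpus = gpus_needed(
    users=1_000_000,
    tokens_per_user_per_day=86_400,
    tokens_per_gpu_second=1_000,
    utilization=0.5,
)

it_mw = it_load_mw(FACILITY_MW, PUE)
kw_per_gpu = it_mw * 1_000 / GPU_COUNT
energy_twh = annual_energy_twh(FACILITY_MW)
capital_per_gpu = INVESTED_CAPITAL_USD / GPU_COUNT
revenue_cap = revenue_ceiling_usd(GPU_COUNT, PRICE_PER_GPU_HOUR_USD)

print(f"Placeholder demand example: {example_gpus} GPUs")
print(f"IT load: {it_mw:.1f} MW, about {kw_per_gpu:.2f} kW per GPU all-in")
print(f"Annual energy at constant full load: {energy_twh:.2f} TWh")
print(f"Invested capital per GPU: US${capital_per_gpu:,.0f}")
print(f"Revenue ceiling at 100% utilization: US${revenue_cap / 1e9:.2f} billion")
```

      Running 167 megawatts for all 8,760 hours gives about 1.46 terawatt-hours, which is consistent with the roughly 1.5 terawatt-hours stated in the assumptions. The revenue figure is an upper bound only; realized revenue depends on a utilization rate that the article does not specify.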

      This model is not the entire argument; it is the starting point. An important lesson is that power cost alone does not break data center economics. Electricity is slightly over 10% of total costs in our model: a 50% increase in power price reduces equity-level returns by less than 2%. Access to power, speed of interconnection, and equipment availability matter more. In other words, the economics of the model facility are workable, but only if the facility can be built and powered on the timeline customers need.
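
      The intuition behind the muted sensitivity is simple arithmetic: a price shock on an input moves total costs by roughly the input's share of costs times the size of the shock. The sketch below uses the energy and price figures from the stated assumptions and rounds the electricity share to 10%; it does not reproduce the equity-return result, which depends on financing and revenue assumptions not shown here.

```python
# First-order view of a power price shock, using figures from the stated assumptions.
ANNUAL_ENERGY_MWH = 1.5e6         # ~1.5 TWh per year
POWER_PRICE_USD_PER_MWH = 100.0   # stated power price
POWER_SHARE_OF_COSTS = 0.10       # "slightly over 10%", rounded down here
PRICE_SHOCK = 0.50                # the 50% increase discussed above


def annual_power_bill(energy_mwh: float, price_usd_per_mwh: float) -> float:
    """Annual electricity spend in US dollars."""
    return energy_mwh * price_usd_per_mwh


def total_cost_change(share_of_costs: float, price_shock: float) -> float:
    """Fractional change in total costs from a shock to one cost line."""
    return share_of_costs * price_shock


base_bill = annual_power_bill(ANNUAL_ENERGY_MWH, POWER_PRICE_USD_PER_MWH)
shocked_bill = annual_power_bill(
    ANNUAL_ENERGY_MWH, POWER_PRICE_USD_PER_MWH * (1.0 + PRICE_SHOCK)
)
extra_spend = shocked_bill - base_bill
cost_change = total_cost_change(POWER_SHARE_OF_COSTS, PRICE_SHOCK)

print(f"Base power bill: US${base_bill / 1e6:.0f} million per year")
print(f"Added spend after shock: US${extra_spend / 1e6:.0f} million per year")
print(f"Approximate rise in total costs: {cost_change:.0%}")
```

      A 50% jump in power price adds about US$75 million a year to a US$150 million bill, which lifts total costs by only about 5% when electricity is a tenth of the cost base.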

      That is where most AI commentary remains too superficial. It treats the data center as a black box: capex goes in, tokens come out. That misses the bottlenecks inside and around the box. AI racks are moving far beyond traditional cloud power density. Cooling is shifting from air to liquid and two-phase systems. Networking and high-bandwidth memory become binding constraints in inference architectures. Grid interconnection queue wait times stretch to years. Communities can and do block projects. The technical, physical, and political constraints are increasingly the drivers of potential returns.

      Inference makes the constraint structural

      While training is episodic, inference is recurring. A training run can be delayed, accelerated, or redesigned. Inference happens every time a user asks a question, a developer runs an agent, a business automates a workflow, or an application calls a model in the background. Agentic inference multiplies that load because one user action can become many model calls, validation loops, and memory reads; industry benchmarks show that agentic systems consume 5-30 times more tokens than a standard chat interaction.

      Inference demand is also resilient in both directions. If efficiency gains lower the cost per token, more workflows become economic and total token consumption rises, the classic Jevons Paradox. However, token prices do not necessarily need to fall for inference spend to grow. As the economic unit shifts from tokens generated to work produced, customers may pay more per token when an agent delivers work that is worth more than the inference cost. Regardless of token price, all tokens must route through the same physical bottlenecks, and we are seeing an increase in inference demand.

      The architecture of inference is also changing. Some workloads will prioritize low-latency answers. Others, especially agentic work without a human waiting on every token, will prioritize memory, state, context, and cost per completed task. That means the AI infrastructure stack will become more heterogeneous, not less: XPUs, custom silicon, photonics, memory hierarchies, and edge or regional deployment models will all matter. The pricing data shows demand for more AI infrastructure overall: on-demand GPU rental capacity is effectively sold out across all chip generations in early 2026, with one-year Hopper H100 contract pricing rising 15-20% month-on-month through March 2026 and Blackwell B200 rental rates up 23% in March alone. When rental rates rise into a wave of new chip supply, supply is not catching up to demand.
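
      Month-on-month increases of this size compound quickly. The sketch below shows the cumulative effect of a 15% or 20% monthly rise; the three-month window is an assumption chosen for illustration, since the text above does not state how many consecutive months the increases covered.

```python
def compounded_change(monthly_rate: float, months: int) -> float:
    """Cumulative fractional change after `months` of compounding at `monthly_rate`."""
    return (1.0 + monthly_rate) ** months - 1.0


ASSUMED_MONTHS = 3  # hypothetical window, not stated in the source

for rate in (0.15, 0.20):
    cumulative = compounded_change(rate, ASSUMED_MONTHS)
    print(f"{rate:.0%} per month over {ASSUMED_MONTHS} months: +{cumulative:.1%}")
```

      Over an assumed three months, the cited range would translate into a cumulative increase of roughly 52% to 73% in contract pricing.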

      Power is not one constraint, it is several

      Saying "AI is power constrained" is true, but not specific enough. The real problem has several layers. First, data centers need more electricity than many local grids can deliver on hyperscalers' timelines. Crucially, some grids can supply sufficient power but not continuously for 8,760 hours per year, conflicting with traditional assumptions about service reliability and leading to novel strategies around flexibility and intermittent self-supply. Second, the grid must be able to absorb large, fast-moving computational loads without creating reliability risks. Third, customers need energy procurement strategies that satisfy cost, reliability, climate, and public-acceptance requirements. Fourth, projects must get built in real communities, through real interconnection processes and real permitting fights. Power is not simply a commodity to purchase. It is an infrastructure development problem.

      This is where Carbon Direct Inc. is directly relevant. Carbon Direct Inc. has assembled a team of scientific, engineering, and market experts to support a paying power and energy advisory practice serving hyperscalers, energy buyers, and power producers. Its work answers the questions customers are asking before the market prices them: how to get more capacity out of existing physical grid infrastructure; how to assess the costs and value of load flexibility through advanced modeling capabilities; how to make clean firm generation bankable; how to reduce data center energy intensity; how to validate "bring your own power" and "bring your own compute" structures; and how to build projects that communities will accept. In the last twelve months alone, Carbon Direct Inc. has supported hyperscalers on bankability assessments for next-generation geothermal, scoped load-flexibility programs for multi-hundred-megawatt, single-customer sites, and modeled the carbon and reliability profile of "bring your own power" configurations against grid-tied baselines. Carbon Direct Capital leverages Carbon Direct Inc.’s network of technical experts, including power engineers, geologists, and electrochemists, to conduct credible technical diligence and to gain insights into early-stage market trends and emerging preferences.

      What Carbon Direct Capital is investing behind

      Our investment focus follows the bottlenecks. On the power side, we are investing in technologies that can deliver reliable power on AI timelines. Sage Geosystems is a next-generation geothermal platform with hyperscaler buy-in; Carbon Direct Capital co-led its US$97 million Series B with Ormat Technologies. We could not have made this investment without the deep expertise of the Carbon Direct Inc. research team, which analyzed Sage's technical results to date to help underwrite future project feasibility. ION Clean Energy retrofits carbon capture technology onto natural gas combined cycle plants to create "blue electrons"; Carbon Direct Inc. is in active dialogue with multiple large power users on this topic. Carbon Direct Capital is also actively evaluating the enabling picks and shovels around geothermal, nuclear, fuel cells, and more.

      On the data center efficiency side, we are investing in technologies that reduce the amount of power required for a unit of AI work. While it is encouraging to see incremental annual gains in chip efficiency, these are scaling far more slowly than compute demand, driving the need for more innovative technological solutions. As one example, a team at Carbon Direct Inc. helped us understand the fundamental energy consumption requirements of standard CMOS chips and the potential of all-optical computing as an alternative. This led to Carbon Direct Capital investing in Neurophos, a photonic compute company targeting step-function gains in energy efficiency per chip beyond those achievable by existing GPUs. Carbon Direct Capital joined the company's US$110 million Series A alongside Gates Frontier, Microsoft's M12, Aramco Ventures, Bosch Ventures, and others. More broadly, we are studying other layers of the data center technology stack, including networking, memory, cooling, and inference-optimized architectures, because the next phase of AI infrastructure will not be solved by simply buying more of yesterday's hardware.

      In May 2026, two companies comparable to two recent Carbon Direct Capital portfolio companies priced public listings within a week of each other; both listings ran hot. Fervo Energy, a next-generation geothermal developer, like Sage Geosystems, went public at a roughly US$8 billion valuation and the stock is currently trading around 39% higher than the initial listing. Cerebras Systems, a wafer-scale alternative to mainstream GPUs focused on inference compute, like Neurophos, priced at about US$56 billion and is currently trading approximately 28% higher than the initial listing. The market is willing to underwrite frontier technologies in the physical layers of the AI stack.

      The bear case deserves to be taken seriously

      There are real risks to the AI boom. Hyperscaler free cash flow can compress if capex grows faster than revenue. Model efficiency gains can reduce the amount of compute required for a given task. Training demand may be more episodic than the market assumes. Local opposition can slow or cancel data center and power projects. Some new data center capacity could become expensive cloud infrastructure competing on price if AI revenue disappoints.

      Those risks are why Carbon Direct Capital frames this as an investment in constraints, not in AI enthusiasm. If efficiency improves, inference use cases expand and the bottleneck shifts to deployment, memory, power, and cost per unit of work produced. If training demand slows, inference and enterprise agents still require recurring capacity. If local grids cannot absorb load, technologies that unlock power, reduce energy intensity, or improve flexibility become more valuable. If some AI applications or model developers fail, the upstream physical bottlenecks remain for the rest.

      The investment conclusion

      The AI infrastructure opportunity sits at the intersection of frontier technology risk, project-finance economics, and energy-system engineering. Underwriting this opportunity well requires addressing all three at once; Carbon Direct Capital is built to do just that. The technical team at Carbon Direct Inc. has a pulse on emerging stakeholder preferences and scientific breakthroughs, understands novel technologies deeply, and is highly experienced in conducting detailed technical diligence to ensure that projects are viable and scalable. Carbon Direct Capital combines these market and technical insights with our commercial underwriting to facilitate new investments. We are not picking AI winners. We are not picking pure energy assets. We are investing in the companies and technologies that have to exist for AI to sustainably scale.

      1 Numbers based on the most recent earnings releases from Alphabet, Amazon, Meta, and Microsoft.

      2 Key assumptions include: 167-megawatt illustrative inference data center located in Virginia, ~93,200 Nvidia B200 GPUs, ~US$6.7 billion invested capital, 1.25 power usage effectiveness, ~1.5 terawatt-hours annual electricity consumption, US$100/megawatt-hour power, 50% debt ratio, and US$3.50/GPU-hour initial topline pricing with a 3% annual escalator.

      3 Market trading data for Fervo Energy and Cerebras Systems current as of June 2, 2026.

      Disclaimer

      Carbon Direct Capital Management LLC is an investment adviser registered with the US Securities and Exchange Commission (SEC). Registration as an investment adviser does not imply any particular level of skill or training. Additional information about Carbon Direct Capital Management LLC, including our Form ADV Part 2A Brochure, is available on the SEC's website at adviserinfo.sec.gov.

      This content is provided for informational purposes only and should not be construed as or relied upon as investment, legal, tax, or other advice. You should consult your own advisers regarding legal, business, tax, and other matters related to any investment. Any projections, estimates, forecasts, targets, prospects, or opinions expressed are subject to change without notice and may differ from opinions expressed by other employees of Carbon Direct Capital Management LLC, its affiliates, investors, portfolio companies and other individuals, groups or entities. Certain information contained herein may have been obtained from third-party sources believed to be reliable; however, Carbon Direct Capital Management LLC makes no representations about the accuracy or completeness of any such information or its appropriateness for any given situation. Any investments or portfolio companies mentioned are not representative of all investments made by funds managed by Carbon Direct Capital Management LLC, and there can be no assurance that any investment will be profitable or that future investments will have similar characteristics or results. Past performance is not indicative of future results. The content speaks only as of the date indicated. This content does not constitute an offer to sell or a solicitation of an offer to buy any security. Any such offering will be made only pursuant to formal offering documents.
