2025 in Review: Building for the agentic era

2024 was about raising the bar, turning innovations into defaults, and proving that infrastructure could be both powerful and invisible. If last year was about setting the foundation, 2025 has been about unleashing velocity by removing the bottlenecks and friction that stand between an idea and a shipped product.

This year, I focused on simplifying the complex. From changing how we pay for builds to redefining the execution model of serverless compute, the goal was to create a platform ready for the modern era, one where humans and AI agents build side by side.

The ships this year have been career-defining: On-Demand Concurrency for builds, a new execution model with Fluid compute, Active CPU Pricing, and Vercel Sandbox. These were not just regular features; they were fundamental shifts in how the web is built and scaled.

This review is bittersweet, though. It marks not just the end of a year, but the end of my chapter at Vercel. Before we get to what’s next, let’s look at how we reshaped compute in 2025.

Unblocking the pipeline: On-Demand Concurrency

We started the year by tackling a legacy constraint: build slots. For years, teams had to buy a fixed number of slots, which naturally created bottlenecks. If you had five developers or agents pushing code but only two slots, builds queued up the moment more than two of them pushed changes at the same time.

We shipped On-Demand Concurrent Builds to solve this. It is a simple but incredibly powerful change to our pricing model: instead of provisioning capacity by buying slots, you simply get high concurrency and pay on demand, per build minute.

This shift was critical. In a world where AI agents are pushing code alongside humans, the volume of concurrent workstreams is exploding. On-demand concurrency ensures that building and deploying your changes never becomes the bottleneck.

The best of both worlds: Fluid compute

Following that, I shipped what is undoubtedly one of the highlights of my career so far: Fluid compute.

I led a new execution model for serverless compute designed specifically for modern workloads. For years, we optimized for short idle times and millisecond transactions, assuming a faster response was always better. But the paradigm has shifted.

With the rise of agents, you now have far more idle time, and sometimes you are actually happy to have it. You want to give the LLM time to "think" so it gives you a better answer, or you want to keep a connection open while a function streams a response back. Traditional serverless was not built for this. It penalizes you for waiting: it charges for all resources based on duration and locks an instance to a single request, regardless of whether the CPU is idle.

Fluid compute changes the equation by allowing requests to share an instance. It combines the efficiency and performance of servers with the scalable nature of serverless, eliminating the trade-off developers used to make and offering the best of both worlds.
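To make the workload shape concrete, here is a minimal sketch of the kind of function Fluid is built for: almost all of its wall-clock time is spent waiting on an upstream model and piping tokens back, with the CPU nearly idle. The endpoint and environment variables (LLM_API_URL, LLM_API_KEY) are hypothetical placeholders, not a specific provider's API.

```ts
// Sketch of an LLM-backed route handler: mostly network wait, very little CPU.
// LLM_API_URL and LLM_API_KEY are placeholder names for your own provider.
export async function POST(req: Request): Promise<Response> {
  const { prompt } = await req.json();

  // The function now waits on the network. On a shared Fluid instance,
  // other requests can use this worker's otherwise idle CPU in the meantime.
  const upstream = await fetch(process.env.LLM_API_URL!, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.LLM_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ prompt, stream: true }),
  });

  // Pipe the streamed tokens straight through to the client. The connection
  // stays open for the whole generation, but each chunk needs only light CPU work.
  return new Response(upstream.body, {
    headers: { "Content-Type": "text/event-stream" },
  });
}
```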

Aligning cost with value: Active CPU Pricing

Shipping Fluid was a massive step, but true to the philosophy of iteration, it was only the beginning. To truly support these new agentic workloads, we needed to fix the economics.

We fixed this with one of our most ambitious ships: Active CPU Pricing for Fluid compute.

The concept is radically fair: you should only pay for what you use. This is critical for agentic workflows. When your function is waiting for an LLM to respond, you are not using the CPU. In the old model, you paid for that wait. Now, you don't.

Regardless of the runtime, when running on Fluid compute, you pay primarily for the resources consumed. If your workload is not CPU intensive, like waiting on an LLM API or streaming tokens back to the user, you are not billed for CPU cycles you are not using. You only pay for the reserved memory. This unified the model and drastically lowered costs, making it economically viable to run complex, long-running agentic tasks on serverless.
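A back-of-envelope comparison shows why this matters for agentic workloads. The rates below are made-up placeholders, not Vercel's actual prices; only the shape of the formula matters.

```ts
// One invocation that spends 30s waiting on an LLM but only 0.3s doing real work.
const wallClockSeconds = 30;   // total request time, mostly idle waiting
const activeCpuSeconds = 0.3;  // CPU actually busy (parsing, streaming chunks)
const memoryGb = 1;            // memory reserved for the instance

// Hypothetical unit prices, purely illustrative.
const cpuRatePerSecond = 0.000128;     // price per CPU-second
const memRatePerGbSecond = 0.0000106;  // price per reserved GB-second

// Duration-based model: CPU is billed for the full wall-clock time, idle or not.
const durationBilledCost =
  wallClockSeconds * cpuRatePerSecond +
  wallClockSeconds * memoryGb * memRatePerGbSecond;

// Active CPU model: CPU is billed only while it is busy; the idle 29.7 seconds
// accrue only the much smaller reserved-memory charge.
const activeCpuCost =
  activeCpuSeconds * cpuRatePerSecond +
  wallClockSeconds * memoryGb * memRatePerGbSecond;

console.log({ durationBilledCost, activeCpuCost }); // the gap grows with idle time
```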

The agentic primitive: Vercel Sandbox

While we were refining Fluid, we were cooking something else in parallel: Vercel Sandbox.

We have been running untrusted code securely for years to power our own builds. It made sense to expose this primitive to the world. Built on top of Hive (the general compute platform I’ve been building with the team for years), and powered by Fluid compute with Active CPU pricing, Vercel Sandbox allows anyone to spin up secure environments for untrusted code instantly.
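In practice, the workflow looks roughly like the sketch below: create an isolated sandbox, run the untrusted code inside it, and tear it down. The @vercel/sandbox import and the create/runCommand/stop call shapes are written from memory and should be treated as assumptions; check the SDK documentation for the exact options.

```ts
// Minimal sketch: run agent-generated code in an isolated sandbox instead of
// inside your own function. Call shapes are assumptions; see the SDK docs.
import { Sandbox } from "@vercel/sandbox";

async function runUntrustedSnippet(code: string) {
  // Each sandbox is its own isolated environment, billed like any other
  // Fluid workload with Active CPU pricing.
  const sandbox = await Sandbox.create();

  try {
    // Execute the untrusted snippet inside the sandbox.
    return await sandbox.runCommand({
      cmd: "node",
      args: ["-e", code],
    });
  } finally {
    await sandbox.stop();
  }
}
```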

This is perhaps one of the strongest primitives for the agentic era. Whether it is an AI agent generating a UI or a code interpreter running data analysis, developers can now hand agents an isolated sandbox for every use case. We have already seen the community embrace this with incredible creativity, including projects like the OSS Vibe Coding Platform, which lets anyone build their own vibe coding platform.

Closing a Chapter

2025 was, without a doubt, one of the greatest years of my professional life. We did not just ship features. We fundamentally changed the economics and mechanics of the platform.

However, as the year closes, so does my time at Vercel.

I have always believed in leaving on a high note, and I am incredibly proud that I am doing exactly that. We have set the bar higher than ever, and the platform is in a stronger position than it has ever been. I am leaving with a heart full of gratitude for the team and the community. I am excited to watch Vercel continue to crush it, but from now on, I will be rooting from the outside.

I have a deeply optimistic view of what the future holds.

See you in 2026.