Every engineer who has used Claude Code, Cursor or a similar AI coding tool seriously for more than a week has the same story. Mid-flow, deep in a refactor, the model stops responding. A red banner appears. Daily limit reached. Weekly quota exhausted. Try again in 4 hours. The session that was about to ship a feature dies on the rate limiter.
This is not a failure of the model. It is a failure of how access to the model is sold. Subscription pricing for AI Coding is fundamentally mismatched with how engineers actually work, and the friction it creates has become the single most-complained-about thing in the AI Coding tool space in 2026.
The lived experience
Pick any engineer using AI Coding heavily. The pattern is identical.
Monday morning: pay for Claude Pro at $20/month. Use it intensely until Wednesday afternoon when the 5-hour rolling window kicks in. Wait three hours for it to lift. Pick up where you left off. Hit the daily limit by Wednesday evening. Wait until tomorrow.
Or upgrade to Claude Max at $100 or $200 a month. Burn the weekly quota in three days of focused work. Watch the model refuse for the rest of the week. Pay $200 next month for a tool you used at full bore for three days out of thirty.
The choices on offer are: a subscription that throttles you exactly when you're being productive, a more expensive subscription that throttles you slightly later, or paying per-token through the API where the cost is so opaque you have no idea what a session will run you until the bill arrives.
Engineers are voting with their feet. The subreddits and Hacker News threads on this topic are pages of identical frustration. "Hit my Claude limit in two hours." "Paid for Max, still capped." "Cursor's pro tier is basically a teaser before they ask for more."
Why credit limits exist
To be fair to the providers: the limits are not arbitrary. Anthropic, OpenAI and the others sell access to a finite, expensive compute pool. A heavy user of Claude Code can run hundreds of dollars of inference in a day. The flat-rate subscription model only works because most subscribers don't use the tool that hard. Power users effectively subsidise themselves with the credits that idle users don't burn.
When a few thousand power users start using the tool as designed — running an AI agent for hours a day across multiple repos — the economics break. The provider's choices are: raise prices, add stricter limits, or both. They mostly choose both.
This is a structural problem with subscription pricing for an inference-heavy product. It is not solvable by another tier or a better-worded error message. The mismatch is between a fixed monthly fee and a workload that scales with how productive the engineer is.
The UX bug nobody is fixing
Here is the part that should not be acceptable: the tool punishes its best users.
The engineers most invested in AI Coding — the ones who restructured their workflow around it, the ones who can drive an agent productively for hours, the ones who would happily pay more for more access — are exactly the engineers who hit the wall first. The tool gets less useful the more useful you become at using it.
A productivity tool that gets worse the more productively you use it is a UX bug. It would be unacceptable in any other category of software. Imagine an IDE that started lagging after you'd written 5,000 lines of code that day. You'd switch IDEs.
The flat-rate-with-quotas model also creates the wrong incentives for the engineer. Instead of using the tool freely, they start hoarding. They batch their AI questions. They hesitate before kicking off a long-running agent task. The cognitive overhead of "is this prompt worth burning some quota" cancels out a meaningful chunk of the productivity win the tool was supposed to deliver.
What engineers actually want
The pattern of complaints points at a clear shape:
- Pay for what you use. Not a subscription that may or may not match the month's workload. A direct relationship between usage and cost.
- No commitment. A heavy week and a quiet week should cost different amounts. Engineers' workloads are bursty; their pricing should be too.
- Predictable per-unit cost. Per-token billing is too granular and too opaque. Engineers want to know what a session costs before they start it, not after the fact.
- No throttling mid-flow. If the engineer is willing to pay for the next hour of compute, the tool should let them buy it on the spot, not bench them for four hours.
This is not a request for cheaper AI Coding. Engineers know it costs money to run. It is a request for honest pricing that scales with the work, not a flat fee that punishes power users.
The shape of a saner model
Pay-per-session AI Coding flips the equation. Book a session, get the AI Coding capacity you need, walk away. No subscription that sits idle on quiet weeks. No quota wall mid-refactor. No surprise per-token bill. Just a known price for a known block of focused work.
Codeforless built its entire product around this model. Sessions come in 1-hour blocks. The price is fixed before you buy. You pay for the hours you code, and only for the hours you code. The next post in this series walks through how the math actually works — and why a focused engineer can ship more on $3 of pay-per-session AI Coding than they can on a $200/month subscription that capped them out by Wednesday.
If you are tired of the credit-limit treadmill, the pricing page is worth a minute of your time.
