
Customer support is the department most founders quietly underweight. It is not the team closing deals. It is not the team running paid acquisition. It is not the team shipping the product. So when budgets are drawn up, support tends to get the leftover line item - the one you fund just enough to keep the queue from catching fire.
That framing is wrong, and it has been wrong for a long time. Companies with strong support post measurably higher revenue than peers with mediocre support, and a clear majority of consumers say a good service experience is what turns them into repeat buyers. Support sits directly on top of retention, expansion, and word of mouth. Treating it as an afterthought is leaving real money on the table.
But this post is not another argument for why support matters. You already believe that, or you would not be reading. The harder question - the one most operators duck - is the one we want to put numbers behind:
What does it actually cost to support your customers, and what is the right way to spend that money in 2026?
Belief does not pay payroll. At some point you have to put the spreadsheet on the table and look at the line item honestly. How many agents should you have at your stage? How much of your margin should support eat? And when the open-weight model market has crashed inference prices to a fraction of a cent per resolution, what is the right blend of humans and AI? Let's walk through it.
What it actually costs to staff a support team
Start with people, because people are still the largest line.
In the United States, the average customer support agent earns somewhere between $42,000 and $58,000 per year in 2026, with technical or tier-two roles routinely pushing past $75,000. That is just base pay. Once you add benefits, payroll taxes, equipment, software seats, training time, management overhead, and the indirect cost of recruiting and onboarding, the fully loaded cost lands closer to $65,000 to $85,000 per agent per year. Operators who only look at the salary number consistently underestimate the real spend by 30 to 40 percent.
Going offshore changes the absolute number but not the shape of the problem:
- South and Southeast Asia: roughly $7,000 to $18,000 per agent per year, with experienced English-language agents at the top of that band.
- Eastern Europe: roughly $13,000 to $22,000 per year, depending on country and seniority.
- Latin America: roughly $11,000 to $20,000 per year, with the additional pull of time-zone overlap with North American business hours.
These numbers are real savings, but they are not free, and they come with their own management tax - vendor contracts, quality-assurance overhead, cultural-fit calibration, and the occasional turnover spike when a competing BPO opens up across the street. Once you scale past ten or twenty agents, even a few hundred dollars of monthly variance per head turns into a meaningful annual figure.
Volume is where most teams are honest with themselves for the first time. If you sell into e-commerce, fintech, consumer SaaS, or anything subscription-shaped, you are looking at hundreds to thousands of tickets a week. A mid-sized Shopify merchant routinely handles 300 to 600 tickets a day during peak season. A SaaS product with tens of thousands of active users churns through password resets, billing edge cases, integration questions, and tier-one bug reports every hour of the day. As a rough planning ratio, fast-growing companies tend to need one full-time support agent for every 500 to 1,000 active users, scaling with ticket complexity.
So picture a fairly typical growing company:
- 20,000 active users
- Ticket rate of about 1.5% of users per day
- Roughly 300 tickets a day, every day
To staff that load across multiple shifts without burning out the team, you need six to ten full-time agents plus at least one team lead. At U.S. fully loaded rates that is $450,000 to $850,000 a year in support payroll. Even at offshore rates you are still looking at $110,000 to $220,000 a year. And payroll is only one part of the bill.
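The headcount arithmetic above can be sketched in a few lines. The ratios here (1.5% daily ticket rate, ~50 tickets per agent per day, 50% headroom for shifts and PTO) are planning assumptions drawn from this post's example, not Berrydesk benchmarks - substitute your own numbers.

```python
# Rough support-staffing estimate. All ratios are illustrative assumptions.

def staffing_estimate(active_users, daily_ticket_rate=0.015,
                      tickets_per_agent_day=50,
                      loaded_cost_low=65_000, loaded_cost_high=85_000):
    daily_tickets = active_users * daily_ticket_rate
    # Add ~50% headroom for multi-shift coverage, PTO, and peak days.
    agents = round(daily_tickets / tickets_per_agent_day * 1.5)
    headcount = agents + 1  # plus one team lead
    return {
        "daily_tickets": daily_tickets,
        "agents": agents,
        "annual_payroll_low": headcount * loaded_cost_low,
        "annual_payroll_high": headcount * loaded_cost_high,
    }

est = staffing_estimate(20_000)
# 20,000 users at 1.5%/day -> 300 tickets/day -> roughly 9 agents + 1 lead,
# landing near the top of the $450k-$850k U.S. fully loaded range.
```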
The hidden second budget: tools, training, and overhead
Salaries are visible. The rest of the support stack is the budget founders forget about until they are doing year-end planning and discover it has quietly grown into a six-figure line.
A realistic monthly bill for a small but credible support operation looks roughly like this:
- Helpdesk platform (Zendesk, Freshdesk, Intercom, Help Scout): roughly $80 to $115 per agent per month, scaling with seats.
- Shared inbox or collaboration tool for internal triage: $200 to $300 per month.
- Knowledge base for both customer-facing articles and internal SOPs: $120 to $200 per month.
- Quality assurance and conversation analytics: $180 to $250 per month.
- Training and onboarding platform for new hires: $120 to $180 per month.
- Workforce management and scheduling for shift coverage: $100 to $200 per month.
- Reporting, BI, and CSAT survey tools: $120 to $180 per month.
For a team of four agents and a lead, that comes to roughly $1,200 to $1,500 a month in tooling alone - about $14,000 to $18,000 a year, on top of payroll. Layer in turnover (the average support agent stays under 18 months), recruiter fees, and the productivity hit of the first six weeks while a new hire ramps, and the picture gets uglier.
Pull it together for a small but serious team:
- Four agents at a modest $2,200 per month each: $8,800/month
- One lead at $2,800 per month: $2,800/month
- Tooling: $1,300/month
That is $12,900 a month, or about $155,000 a year, to keep a four-agent operation running. And we are still being conservative - no health benefits, no weekend coverage premium, no recruiting fees. This is the floor, not the ceiling. Now imagine the same shape of spend at twenty agents, or fifty. You can see how quickly support becomes one of the largest non-engineering line items on the P&L.
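As a sanity check, the four-agent budget above reduces to one multiplication and two additions. The figures are the illustrative numbers from this post, not vendor quotes.

```python
# The four-agent floor budget from the example above, verified.

agents = 4
agent_monthly = 2_200   # modest per-agent monthly rate
lead_monthly = 2_800
tooling_monthly = 1_300  # helpdesk, KB, QA, training, reporting

monthly = agents * agent_monthly + lead_monthly + tooling_monthly
annual = monthly * 12

print(monthly)  # 12900
print(annual)   # 154800 -> "about $155,000 a year"
```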
What changed in 2026: the model market did most of the work for you
A year ago you could make a decent argument that AI support was good for cheap, predictable questions and not much else. That argument no longer holds. The 2026 model landscape has done two things at once: it has pushed quality up at the frontier, and it has pushed price down across the open-weight tier so aggressively that the unit economics of support have inverted.
On the closed-frontier side, GPT-5.5 and GPT-5.5 Pro added parallel reasoning to OpenAI's stack in April. Claude Opus 4.7 leads SWE-bench Pro at 64.3% and is the model many teams pick when an AI Action has to actually execute - refunds, order changes, booking flows, anything that touches a real system of record. Claude Opus 4.6 and Sonnet 4.6 ship with a 1M-token context window at no surcharge, which means an agent can hold an entire knowledge base, a full prior conversation history, and your policy documents in a single prompt. Gemini 3.1 Ultra stretches that to 2M tokens and is natively multimodal across text, image, audio, and video, which matters the moment a customer attaches a screenshot of a broken checkout.
The open-weight side is where the cost story really lives.
- DeepSeek V4 Flash ships at $0.14 per million input tokens and $0.28 per million output tokens, with a 1M-token context. At those prices, a typical support resolution costs a small fraction of a cent.
- MiniMax M2.7 is a 230B / 10B-active MoE priced at roughly 8% of Claude Sonnet's rate while running at about twice the speed, and it posts 56% on SWE-bench Pro - strong enough for serious tool use, cheap enough to run at FAQ scale.
- Z.ai's GLM-5.1 (MIT-licensed, 754B-param MoE) and Moonshot's Kimi K2.6 (1T-param MoE, agentic-first, swarms up to 300 sub-agents) are agentic-grade models you can run on your own infrastructure when compliance or latency demands it.
- Alibaba Qwen 3.6 ships a 27B dense Apache-2.0 variant that beats much larger MoE rivals on agentic coding benchmarks, and a 35B-A3B open MoE variant - both genuinely viable for local or VPC deployment.
- Xiaomi MiMo-V2-Pro offers a >1T-param, 42B-active, 1M-context reasoning-first model with weights open under MIT.
The practical consequence is simple: the cost floor for a competent AI support agent is now measured in tenths of a cent per resolution, not dollars. Berrydesk lets you pick from this entire menu - GPT, Claude, Gemini, DeepSeek, Kimi, GLM, Qwen, MiniMax, MiMo, and others - and route different traffic to different models so you can land exactly where you want to be on the cost-versus-quality curve.
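To make "tenths of a cent" concrete: at DeepSeek V4 Flash's listed rates of $0.14 and $0.28 per million input and output tokens, a grounded support resolution is cheap even with a large prompt. The token counts below are assumptions for a typical exchange (policy docs plus conversation history in, a short reply out), not measured figures.

```python
# Per-resolution model cost at the open-weight price floor.

def resolution_cost(input_tokens, output_tokens,
                    in_price_per_m=0.14, out_price_per_m=0.28):
    return (input_tokens * in_price_per_m +
            output_tokens * out_price_per_m) / 1_000_000

# ~6k tokens of grounding and ~800 tokens of reply:
cost = resolution_cost(6_000, 800)
print(f"${cost:.6f}")  # $0.001064 - about a tenth of a cent
```

Even 10,000 resolutions a month at this rate is roughly ten dollars of model spend.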
Traditional support versus an AI-first support stack
Let's run the same company through the AI-first version of the math.
The traditional baseline
Same shape as before. Four agents at $2,200 a month, one lead at $2,800 a month, plus tooling:
- Payroll: $11,600/month
- Helpdesk and seats: $520/month
- Internal collaboration: $220/month
- Knowledge base: $150/month
- QA and analytics: $220/month
- Training and reporting: $280/month
- Recruiting, HR, miscellaneous overhead: $320/month
Total: roughly $13,310/month, or about $160,000 a year, for a four-agent team that can handle maybe 800 to 1,200 tickets a week before quality starts to slip.
The AI-first version
Now wire a Berrydesk AI agent into the front of your queue. The agent is trained on your help center, your product docs, your Notion workspace, your Google Drive folders, and selected YouTube walkthroughs. It carries AI Actions for the high-frequency, high-friction tasks: looking up an order, processing a refund within policy, rescheduling an appointment, escalating to a human with full context, and creating a ticket in your existing helpdesk when a human really is needed.
You do not need to fire your team. You re-shape it. Instead of five people grinding through tier-one volume, you keep two strong agents who specialize in escalations, complex troubleshooting, VIP accounts, and continuous improvement of the AI's behavior.
- Two senior agents at $2,500 a month: $5,000/month
- Berrydesk plan, sized for your traffic: $80 to $500/month depending on usage
- Underlying model spend, routed mostly to DeepSeek V4 Flash or MiniMax M2.7 with frontier escalations to Claude Opus 4.7 or GPT-5.5: $50 to $300/month for a company in this range
- Helpdesk and integrations, scaled down to escalation-only volume: $120/month
- Knowledge base (which the AI also reads): $80/month
Low end: $5,330/month. High end: $6,000/month.
Compared with the traditional baseline of about $13,310/month, you are cutting 55 to 60 percent of monthly support cost without giving up coverage. In fact, coverage gets better - the AI agent does not sleep, does not need shift differentials, and answers in seconds at three in the morning the same way it answers at three in the afternoon. When traffic spikes during a launch or a Black Friday window, the AI scales horizontally; you do not have to hire and onboard a temporary team in October to absorb November.
This is the other half of the story. AI support is not just cheaper. It is more elastic. Traditional support cost scales linearly with volume because every additional ticket needs a human minute. AI support cost scales sub-linearly: doubling your ticket volume might double your model spend by a few hundred dollars, while a doubled human team would cost six figures more.
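The linear-versus-sub-linear claim is easy to see in a sketch. The constants here are illustrative assumptions in the spirit of this post's examples: each human agent absorbs a fixed block of tickets per month, while the AI stack is a flat platform fee plus a tiny per-resolution model cost.

```python
# Cost scaling: human support grows with headcount, AI with model spend.

def human_cost(tickets_per_month, tickets_per_agent=800, loaded_monthly=5_500):
    # Every additional block of tickets needs another agent (ceiling division).
    agents = -(-tickets_per_month // tickets_per_agent)
    return agents * loaded_monthly

def ai_cost(tickets_per_month, platform_fee=300, cost_per_resolution=0.002):
    # The platform fee is flat; only per-resolution model spend scales.
    return platform_fee + tickets_per_month * cost_per_resolution

for volume in (5_000, 10_000, 20_000):
    print(volume, human_cost(volume), round(ai_cost(volume), 2))
# Doubling volume adds five figures of human payroll but ~$20 of model spend.
```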
Picking the right model mix for your stack
A common mistake is to treat "AI support" as a single decision - you either use it or you don't, with one model behind the curtain. The teams getting the best results in 2026 are routing their traffic across multiple models based on the kind of question being asked.
A pattern that works well:
- Tier-zero, high-volume, low-risk traffic - order status checks, password resets, "where do I find X" questions, basic policy lookups. Route to DeepSeek V4 Flash or MiniMax M2.7. These models are fast, agentic enough for simple tool use, and so cheap that running them at scale barely registers.
- Tier-one, mid-complexity traffic - billing questions that require pulling the customer's plan, refund requests inside policy, rescheduling, multi-step troubleshooting. Qwen 3.6, GLM-5.1, or Claude Sonnet 4.6 with the 1M-token context window work well here. The long context means the agent can hold the full conversation, the policy doc, and the customer's account state without juggling chunks.
- Tier-two, high-stakes or ambiguous traffic - anything where a wrong answer is genuinely costly: complex refund decisions, fraud-adjacent flows, regulated-industry questions, executive escalations. Reserve Claude Opus 4.7 or GPT-5.5 Pro for these. They are more expensive per call, but a single avoided wrong refund pays for a month of frontier inference.
- On-prem or air-gapped environments - for regulated industries that cannot send conversations to a third-party API at all, the MIT- and Apache-licensed Chinese open weights (GLM-5.1, Qwen 3.6-27B, MiMo-V2-Pro) make a self-hosted, agentic support agent realistic for the first time.
Berrydesk handles the routing for you. You configure thresholds - confidence levels, ticket categories, customer tiers - and the platform sends each conversation to the right model without you having to wire it yourself.
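The routing idea is simple enough to sketch. Berrydesk does this internally; the thresholds, category names, and model identifiers below are illustrative stand-ins, not the product's configuration API.

```python
# A minimal tiered-routing sketch: cheap models for safe traffic,
# frontier models for low-confidence or high-stakes conversations.

ROUTES = {
    "tier0": "deepseek-v4-flash",  # order status, password resets, FAQs
    "tier1": "claude-sonnet-4.6",  # billing, in-policy refunds, troubleshooting
    "tier2": "claude-opus-4.7",    # high-stakes, ambiguous, VIP escalations
}

def route(category, confidence, customer_tier="standard"):
    # Low-confidence or VIP traffic is promoted to the frontier tier.
    if customer_tier == "vip" or confidence < 0.6:
        return ROUTES["tier2"]
    if category in ("order_status", "password_reset", "faq"):
        return ROUTES["tier0"]
    return ROUTES["tier1"]

print(route("faq", 0.95))    # deepseek-v4-flash
print(route("refund", 0.8))  # claude-sonnet-4.6
print(route("refund", 0.5))  # claude-opus-4.7
```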
Where AI support quietly wins beyond cost
Cost is the headline. The rest of the wins are easy to overlook because they do not show up on a single line of the spreadsheet.
Onboarding collapses. A new human agent needs four to eight weeks to ramp before they are net-positive on tickets. A new model is "onboarded" the moment you point Berrydesk at your knowledge base and turn it on. When you ship a new feature on Tuesday, the AI agent knows about it on Tuesday. The human team gets a training session on Friday and is fully fluent the following Monday, if you are lucky.
Quality stays consistent. Human agents have good days and bad days, energy peaks and slumps, mood swings, and personal context that bleeds into tone. A well-tuned AI agent answers question one and question one thousand the same way. That consistency is the thing customers notice without being able to name it.
Coverage becomes 24/7 by default. You do not need to staff a night shift in Manila to keep response times under five minutes for European customers. The AI agent handles overnight traffic at full quality, and your humans show up to a queue that is mostly already resolved.
Knowledge feedback loops shorten. Every conversation the AI handles surfaces a gap or an ambiguity in your documentation. Berrydesk can flag the questions where the agent had low confidence, or where customers escalated to a human, and you can patch the underlying article. The next thousand customers asking the same question get a better answer.
What to watch out for
It is not all upside. Three pitfalls are common enough that they are worth naming.
Do not let the agent act outside its policy. AI Actions are the part of the system that touches real money, real bookings, and real customer accounts. Define hard limits - refund ceilings, plan-change rules, escalation triggers - and have the agent refuse cleanly when a request crosses them. Frontier-grade tool-use models like Claude Opus 4.7, Kimi K2.6, and GLM-5.1 are reliable enough to entrust with these flows, but reliability is a function of how tightly you scoped the action, not just how good the model is.
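The "hard limits" principle is worth showing in code: the action layer enforces the ceiling regardless of what the model asked for. The function name, limits, and return shape below are illustrative, not Berrydesk's actual AI Actions API.

```python
# Policy guardrail sketch: the action layer, not the model, owns the limits.

REFUND_CEILING = 200.00  # per-transaction limit the agent may approve alone

def execute_refund(amount, order_total):
    if amount > order_total:
        return ("refuse", "refund exceeds order total")
    if amount > REFUND_CEILING:
        # The model never gets to override this - route to a human instead.
        return ("escalate", "amount above policy ceiling")
    return ("approve", f"refunded {amount:.2f}")

print(execute_refund(45.00, 120.00))   # ('approve', 'refunded 45.00')
print(execute_refund(350.00, 500.00))  # ('escalate', 'amount above policy ceiling')
print(execute_refund(600.00, 500.00))  # ('refuse', 'refund exceeds order total')
```

The escalation path is the important design choice: the agent refuses cleanly and hands off with context, rather than improvising.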
Do not skimp on the human layer. Cutting from five agents to zero is a mistake. Cutting from five to two senior agents who handle escalations, calibrate the AI, and own the customer relationships that matter most is the move. The AI multiplies a good support team. It does not replace one.
Do not treat the knowledge base as a one-time setup. The AI is only as good as what it can read. Stale articles, contradictory policy docs, and missing pages show up immediately as low-confidence answers and unhappy escalations. Treat the knowledge base as a living product surface, not a one-quarter project.
Get started with AI customer support
The math in 2026 is not subtle. Open-weight frontier models from DeepSeek, Z.ai, Moonshot, Alibaba, MiniMax, and Xiaomi have collapsed the cost of running production-grade support agents. Frontier closed models from OpenAI, Anthropic, and Google have made tool use, long-context grounding, and multimodal understanding genuinely production-ready. The combination is what makes an AI-first support stack realistic for a serious company, not just a demo.
You can keep paying $150,000-plus a year for a small support team that struggles to maintain coverage during launches, weekends, and time zones. Or you can route 80 to 90 percent of your tier-one traffic to an AI agent that resolves in seconds, and reserve your humans for the 10 to 20 percent of conversations where their judgment is the actual product.
Berrydesk gets you live in five steps: pick a model from the full 2026 menu; train it on your docs, websites, Notion, Google Drive, and YouTube content; brand the chat widget to match your product; wire AI Actions for the bookings, refunds, and lookups your team handles every day; and deploy to your website, Slack, Discord, WhatsApp, and the rest of the channels your customers actually use.
Cut the cost. Keep the quality. Build your support agent for free at berrydesk.com.
Chirag Asarpota is the founder of Strawberry Labs, the team behind Berrydesk - the AI agent platform that helps businesses deploy intelligent customer support, sales and operations agents across web, WhatsApp, Slack, Instagram, Discord and more. Chirag writes about agentic AI, frontier model selection, retrieval and 1M-token context strategy, AI Actions, and the engineering it takes to ship production-grade conversational AI that customers actually trust.



