
Chatbots stopped being a curiosity a while ago. By 2026 they sit on the front line of every serious customer-facing workflow - answering questions, taking payments, booking appointments, escalating to humans, recovering carts, onboarding employees, even triaging medical concerns. The interesting question is no longer "should we have one?" but "what does a good one look like, and what can we steal from the teams who got it right?"
This piece walks through twenty-one chatbots - some famous, some niche - and unpacks what each one is actually doing under the hood. The point isn't to admire them. It's to extract the design patterns you can reuse when you build your own.
A note on what's changed under these examples since most of them were first profiled. The models powering chatbots in 2026 look almost nothing like the GPT‑3.5- and GPT‑4-era systems people associate with the first wave of "ChatGPT for X" launches. Today's leading deployments mix and match across an entirely new bench: GPT-5.5 and GPT-5.5 Pro for hard reasoning, Claude Opus 4.7 (currently leading SWE-Bench Pro at 64.3%) for tool-heavy agentic work, Gemini 3.1 Ultra with its 2M-token context for sprawling knowledge bases, and an aggressive open-weight tier - DeepSeek V4 Flash at $0.14 / $0.28 per million tokens, Moonshot Kimi K2.6, Z.ai's GLM-5.1, Alibaba's Qwen 3.6 family, MiniMax M2.7, Xiaomi's MiMo-V2-Pro - that has crushed the cost floor for routine traffic. Most of the chatbots below would be rebuilt very differently if launched today, and we'll point out where that matters.
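To make that cost floor concrete, here is a back-of-the-envelope sketch. It assumes the two DeepSeek V4 Flash prices quoted above are the input and output rates per million tokens (the usual convention, but an assumption here), and the token counts for a single support conversation are invented for illustration.

```python
# Rough cost estimate for one support conversation.
# Assumption: $0.14 is the input rate and $0.28 the output rate, per million tokens.
INPUT_USD_PER_M = 0.14
OUTPUT_USD_PER_M = 0.28


def conversation_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one conversation."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical conversation: 3,000 tokens of prompt and context, 500 tokens of replies.
cost = conversation_cost(3_000, 500)
print(f"${cost:.5f} per conversation")
```

Under those assumptions a routine conversation costs well under a tenth of a cent, which is why the cheap tier is the natural default for FAQ-style traffic.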
1. Berrydesk
Berrydesk is an AI support agent platform built for teams who want a production-grade chatbot live in an afternoon, not a quarter. You pick a model - GPT-5.5, Claude Opus 4.7, Gemini 3.1, DeepSeek V4, Kimi K2.6, GLM, Qwen, MiniMax, and others - train it on your docs, websites, Notion, Google Drive, or YouTube channel, brand the widget, wire up AI Actions for booking and payments, and deploy to your site, Slack, Discord, or WhatsApp.
What's interesting about it: Berrydesk lets you route traffic across models. Cheap, fast open-weight models like DeepSeek V4 Flash or MiniMax M2.7 handle routine FAQs at fractions of a cent per resolution; Claude Opus 4.7 or GPT-5.5 Pro pick up the complicated escalations. With 1M+ token context windows now standard on flagship models, an agent can hold your entire knowledge base, the full conversation history, and your refund policy in-context - RAG becomes a tuning lever, not a hard requirement.
The lesson: the best support chatbot in 2026 isn't a single model - it's a routing layer that puts the right model on the right ticket.
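A routing layer can start out very simple. The sketch below is not Berrydesk's implementation; it is a minimal illustration in which the model identifier strings, the escalation keywords, and the turn threshold are all hypothetical choices.

```python
from dataclasses import dataclass

# Hypothetical identifiers for a cheap tier and a premium tier.
CHEAP_MODEL = "deepseek-v4-flash"
PREMIUM_MODEL = "claude-opus-4.7"

# Illustrative keywords that suggest a ticket needs the stronger model.
ESCALATION_KEYWORDS = {"refund", "chargeback", "legal", "cancel", "complaint"}


@dataclass
class Ticket:
    text: str
    prior_turns: int = 0  # how many messages have already been exchanged


def pick_model(ticket: Ticket, max_cheap_turns: int = 4) -> str:
    """Send routine tickets to the cheap tier, risky or long-running ones to the premium tier."""
    words = {w.strip(".,!?").lower() for w in ticket.text.split()}
    if words & ESCALATION_KEYWORDS:
        return PREMIUM_MODEL
    if ticket.prior_turns > max_cheap_turns:
        return PREMIUM_MODEL
    return CHEAP_MODEL
```

In practice the keyword check would likely be replaced by a small classifier, but the shape stays the same: decide the tier first, then call the model.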
2. Sephora's Virtual Artist
Sephora's chatbot pairs augmented-reality try-ons with a conversational stylist. You upload a selfie or use the live camera, the assistant suggests products based on tone and occasion, and you can virtually apply lipstick, eyeshadow, or foundation before buying.
Key features:
- AR makeup try-on layered onto a live video feed
- Personalized product picks tied to skin tone, eye colour, and previous purchases
- In-chat tutorials for techniques the user can practice in front of the camera
- Tight integration with Sephora's e-commerce stack - products are added to cart inside the chat
The lesson: the chatbot is the thinnest layer of the experience. The defensibility is in the AR pipeline and the product graph. If you build for retail, the chat interface is a discovery surface, not the product itself.
3. Domino's "Dom"
Domino's lets customers order pizza through Messenger, voice assistants, smart watches, and its own app - all powered by a shared ordering bot. Repeat orders are one tap. New orders are guided. Drivers get tracked in real time.
Key features:
- Saved Easy Orders for one-message reorders
- Cross-platform parity (the same bot logic powers Messenger, Slack, voice, and SMS)
- Live tracking from oven to doorstep
- Loyalty integration so points accrue regardless of where you order
The lesson: the boring part - letting customers do the thing they already want to do, with one fewer step - is where most of the ROI lives. Don't out-clever the use case.
4. Duolingo's Conversation Practice
Duolingo's in-app conversation tutor uses LLM-driven roleplay to let learners practice a target language with a fictional character - ordering coffee, checking into a hotel, arguing with a sibling. The bot adjusts difficulty in real time and corrects errors gently.
Key features:
- Free-form conversations in dozens of languages
- Difficulty scaling tied to the learner's current proficiency level
- Context-aware feedback (it explains why a phrasing is off, not just that it is)
- Gamified streaks and XP that extend into the chat surface
The lesson: when the chatbot is the product, persona consistency matters more than raw IQ. Duolingo's characters have voices, quirks, and stable backstories - that's what makes the practice feel real.
5. Replika
Replika is an AI companion designed for emotional support and casual conversation. Users build a persistent persona over time, and the bot remembers past conversations, shared memories, and stated preferences.
Key features:
- Long-running memory across sessions
- A configurable avatar and personality
- Mood check-ins and journaling prompts
- Voice and AR modes for users who want presence beyond text
The lesson: Replika is the loudest commercial validation of long-context memory. The reason 1M-2M-token windows in Gemini 3.1 Ultra and DeepSeek V4 matter for enterprise support is the same reason they matter for Replika - the agent that remembers what you told it three weeks ago feels qualitatively different from one that doesn't.
6. Bank of America's Erica
Erica is BofA's in-app virtual assistant. She can show balances, flag unusual spending, surface upcoming bills, transfer money between accounts, and explain charges in plain English. She has handled more than two billion interactions across BofA's customer base.
Key features:
- Voice, text, and tap-driven inputs inside the mobile app
- Predictive spending insights ("you spent 23% more on dining this month")
- Fraud alerts with one-tap dispute flow
- Bill-pay reminders and automated transfers
The lesson: in regulated industries, the chatbot's job is to compress the most-asked questions into a one-step interface. Erica is not trying to be impressive. She is trying to make checking your balance faster than opening the accounts tab.
7. Hopper-style Travel Bots
Hipmunk's original travel bot is gone, but its DNA shows up in every modern travel agent: Hopper, Kayak's chatbot, the assistant inside Google Flights. The pattern is the same - describe a trip in natural language, get a ranked set of options, watch prices, get nudged when to book.
Key features:
- Natural-language trip search ("a beach somewhere warm in February under $800")
- Price prediction and "wait or book" recommendations
- Calendar-aware itinerary planning
- Bundled hotel-plus-flight packages with explainable trade-offs
The lesson: travel queries are messy and high-stakes. Long-context models - Gemini 3.1 Ultra at 2M tokens, Claude Sonnet 4.6 and DeepSeek V4 at 1M - finally let an agent juggle a multi-leg trip, hotel preferences, loyalty constraints, and ten browser tabs of past prices in a single conversation. Pre-2025 chatbots couldn't do that without brittle pipelines.
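Whatever model sits behind the bot, a request like "a beach somewhere warm in February under $800" has to end up as structured search filters. In production that extraction would most likely be done by the model itself; the regex version below is only a stand-in to show the target shape, and the field names are hypothetical.

```python
import re
from typing import Optional, TypedDict


class TripQuery(TypedDict):
    month: Optional[str]
    max_budget_usd: Optional[int]


MONTHS = [
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
]


def parse_trip_query(text: str) -> TripQuery:
    """Pull a travel month and a budget ceiling out of a free-text request."""
    lowered = text.lower()
    # Word boundaries stop "may" from matching inside "maybe".
    month = next((m for m in MONTHS if re.search(rf"\b{m}\b", lowered)), None)
    budget = re.search(r"under \$?(\d[\d,]*)", lowered)
    return {
        "month": month,
        "max_budget_usd": int(budget.group(1).replace(",", "")) if budget else None,
    }


print(parse_trip_query("a beach somewhere warm in February under $800"))
```

Fields the parser cannot find stay `None`, which gives the bot an obvious cue to ask a follow-up question instead of guessing.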
8. Whole Foods' Recipe Assistant
Whole Foods' chatbot sits on Messenger and helps shoppers turn "what do I make for dinner" into a recipe and a basket. It filters by dietary restriction, surfaces seasonal ingredients, and links straight into the e-commerce flow.
Key features:
- Recipe search by ingredient, cuisine, or dietary need
- Filters for vegan, keto, gluten-free, allergen-aware
- Nutritional breakdowns inline
- "Add all ingredients to cart" as a single click
The lesson: retailers underrate the value of content discovery chatbots. Recipes are a Trojan horse for groceries. Anywhere a customer's actual goal is one step removed from your SKUs, a chatbot can bridge the gap.
9. Spotify's DJ and Playlist Bots
Spotify's DJ feature is a generative-AI-driven music host that talks listeners through their own taste, builds personalized sessions, and surfaces context for tracks ("your most-played artist this fall is back with a new album"). The underlying playlist bots take prompts like "music for a dinner party that doesn't suck" and return a sequenced set.
Key features:
- Voice-driven host that narrates context between songs
- Mood- and activity-based playlist generation from natural-language prompts
- Personalisation grounded in years of listening history
- Cross-format coverage - songs, podcasts, audiobooks
The lesson: the moat is the data. A model on its own can't recommend music as well as a model wired into ten years of your listening history. For support agents this maps directly onto the value of training on your actual ticket history, not just your help centre.
10. H&M's Style Assistant
H&M's chatbot acts as a personal shopper. It runs a short style quiz, then assembles outfits - top, bottom, layers, accessories - pulling from H&M's catalogue. It also handles size and fit, which is where most fashion returns originate.
Key features:
- Style fingerprinting from a short quiz
- Outfit-level (not item-level) recommendations
- Size and fit guidance based on past purchases
- Direct add-to-cart inside the chat
The lesson: in fashion, the unit of value is the outfit, not the item. Whatever your business sells, ask whether your chatbot is matching the unit your customer actually shops for.
11. Marriott's Booking Concierge
Marriott's chatbot handles room search, booking, in-stay requests, and local recommendations across hundreds of properties. It's plumbed into the Bonvoy loyalty system and the property-management system at each hotel.
Key features:
- Natural-language room search ("a quiet king room with a desk near the conference centre")
- In-stay requests routed to housekeeping, the front desk, or room service
- Loyalty-aware pricing
- Local recommendations from staff-curated content
The lesson: hospitality chatbots are a stress test for integration. The bot can be brilliant, but if it can't actually trigger the action - a late checkout, a reservation, a maintenance ticket - it's a brochure. Berrydesk's AI Actions, Kimi K2.6's 4,000-step orchestration, and Claude Opus 4.7's tool-use accuracy exist precisely to make this layer reliable.
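The gap between a brochure and an agent is a dispatch layer that turns a model's proposed action into a real call. The sketch below is a generic illustration, not Berrydesk's AI Actions API; the registry, the `late_checkout` action, and the call format are all assumptions.

```python
from typing import Callable, Dict

# Registry mapping action names to plain Python callables.
ACTIONS: Dict[str, Callable[..., str]] = {}


def action(name: str):
    """Decorator that registers a function under an action name."""
    def register(fn):
        ACTIONS[name] = fn
        return fn
    return register


@action("late_checkout")
def late_checkout(room: str, until: str) -> str:
    # A real deployment would call the property-management system here.
    return f"Late checkout for room {room} confirmed until {until}"


def dispatch(call: dict) -> str:
    """Execute a model-proposed call shaped like {"name": ..., "args": {...}}."""
    fn = ACTIONS.get(call.get("name"))
    if fn is None:
        return f"Unknown action: {call.get('name')}"
    try:
        return fn(**call.get("args", {}))
    except TypeError as exc:  # missing or unexpected arguments
        return f"Invalid arguments for {call['name']}: {exc}"
```

Returning an error string instead of raising lets the model see what went wrong and either retry with corrected arguments or hand off to a human.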
12. NASA's Space Bot
NASA runs conversational interfaces that make missions, instruments, and astronomy approachable for the public. You can ask about upcoming launches, the status of a Mars rover, or what's in tonight's sky and get a researcher-grade answer in plain English.
Key features:
- Live mission status from official NASA feeds
- Astronaut profiles and interview archives
- Sky-tonight tools tied to your location
- Educational deep-dives written for schools
The lesson: trust comes from sources, not eloquence. A great public-facing chatbot cites where it got each answer. With 1M-token context windows on Claude Sonnet 4.6 and DeepSeek V4, agents can keep entire mission briefs in working memory and quote them precisely.
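One way to enforce "cite or refuse" is to make the answer path depend on finding a supporting passage. The sketch below uses naive word overlap in place of a real retriever or model; the passages, file names, and overlap threshold are all made up for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Passage:
    source: str  # e.g. a document title or feed name
    text: str


def words(text: str) -> Set[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_passage(question: str, passages: List[Passage], min_overlap: int = 3) -> Optional[Passage]:
    """Return the passage sharing the most words with the question, or None if the match is weak."""
    q = words(question)
    best, best_score = None, 0
    for p in passages:
        score = len(q & words(p.text))
        if score > best_score:
            best, best_score = p, score
    return best if best_score >= min_overlap else None


def answer_with_citation(question: str, passages: List[Passage]) -> str:
    hit = best_passage(question, passages)
    if hit is None:
        return "I don't have a source for that."
    return f"{hit.text} [source: {hit.source}]"


# Hypothetical knowledge base with placeholder content.
DOCS = [
    Passage("mission-brief.txt", "The demo rover is parked at waypoint seven."),
    Passage("launch-schedule.txt", "The demo launch window opens next week."),
]
print(answer_with_citation("Where is the rover parked?", DOCS))
```

The important design choice is the refusal branch: when nothing in the sources supports an answer, the bot says so rather than improvising.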
13. White-Label Bank Chatbots
Mastercard, Plaid, and a wave of AI-first vendors now sell chatbot infrastructure to mid-sized banks that can't build their own. The white-label pattern lets a regional bank ship a branded assistant in weeks instead of years.
Key features:
- Account inquiries and transaction history
- Card activation, lock/unlock, and dispute flow
- Fraud detection alerts with one-tap response
- Branch and ATM locator
The lesson: B2B2C chatbot platforms are a real market category. If you serve other businesses, the question isn't "should our customers' customers see a chatbot?" - it's "do we build it or buy the underlying agent?"
14. Lemonade's Insurance Bot
Lemonade's claims chatbot is famous for paying out simple claims in seconds. The user describes the loss in chat or video, the bot evaluates the claim against policy and fraud signals, and - if it passes - funds hit the customer's account before the conversation ends.
Key features:
- Quote generation from a short conversation
- Policy management inside the same chat
- Video-driven claim filing
- Instant payouts for low-risk approved claims
The lesson: the right chatbot doesn't just answer questions - it closes the loop. With agentic models like Kimi K2.6 (300-sub-agent swarms, 4,000-step plans) and GLM-5.1 (8-hour autonomous plan-execute-test-fix loops) shipping as open weights, instant-resolution flows are now feasible far outside fintech.
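The "evaluate, then pay if it passes" step can be pictured as a small gate in front of the payout action. This is an illustrative sketch only: the thresholds, the fraud score, and the two outcomes are hypothetical and say nothing about how Lemonade actually scores claims.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    amount: float        # claimed amount in dollars
    policy_limit: float  # maximum the policy covers
    fraud_score: float   # 0.0 (clean) to 1.0 (suspicious), from a hypothetical upstream model


def decide(claim: Claim, auto_limit: float = 500.0, max_fraud: float = 0.2) -> str:
    """Return 'pay_now' for small, covered, low-risk claims; everything else goes to a person."""
    within_policy = claim.amount <= claim.policy_limit
    small_enough = claim.amount <= auto_limit
    low_risk = claim.fraud_score <= max_fraud
    if within_policy and small_enough and low_risk:
        return "pay_now"
    return "human_review"
```

Keeping the automatic path narrow and defaulting to human review is what makes an instant-resolution flow safe to ship outside its original domain.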
15. Starbucks' Order Bot
Starbucks lets customers order via voice and chat across Messenger, Alexa, the in-app assistant, and Apple's voice surfaces. The bot remembers your usual, knows your nearest store, and accounts for wait times.
Key features:
- Multi-platform voice and text ordering
- Drink customisation in plain English
- Live wait-time estimates per store
- Rewards integration
The lesson: for high-frequency, low-complexity transactions, the chatbot's job is to remove keystrokes. Every tap saved compounds across millions of orders.
16. HealthTap and Modern Triage Bots
HealthTap's symptom checker - and the wave of triage bots that followed it - guide patients through structured questions to figure out whether to self-care, see a doctor virtually, or escalate to urgent care. They're not diagnosing; they're routing.
Key features:
- Structured symptom intake
- General health information with sourced answers
- Medication reminders
- Triage to the right level of care
The lesson: in healthcare, the routing is the value, not the diagnosis. The MIT- and Apache-licensed Chinese open-weight models - GLM-5.1, Qwen3.6-27B, MiMo-V2 - have made on-prem and air-gapped deploys economical for hospitals and insurers that can't send PHI to a public API.
17. Expedia's Travel Agent
Expedia's chatbot, available both as a ChatGPT integration and inside Expedia's own apps, handles end-to-end travel planning - flights, hotels, cars, activities - and supports the trip after booking.
Key features:
- Cross-product travel booking in one conversation
- Price alerts and rebooking when fares drop
- Itinerary management with calendar sync
- Live customer support for disruptions
The lesson: the chatbot's job extends past purchase. Most travel value comes from handling the long tail of changes - delayed flights, missed connections, refund disputes - and that's exactly where 1M-token context plus reliable tool-use shine.
18. Amtrak's Julie
Julie has been Amtrak's voice and text assistant for years. She handles schedule lookups, ticket booking, station information, and delay alerts at a scale that would require thousands of agents otherwise.
Key features:
- Schedule and route inquiries
- Booking and ticket management
- Station and on-board service information
- Real-time delay notifications
The lesson: the longest-running assistants in production are also the most boring. They do one job, they do it well, and they survive every model upgrade because the interface stays stable while the model behind it gets smarter.
19. Real Estate Lead Bots
Roof AI and a generation of similar tools live on agent and brokerage websites, qualifying leads, scheduling tours, and answering common questions about neighborhoods, schools, and financing.
Key features:
- Property search with filters drawn from natural-language queries
- Virtual tour scheduling that actually books to an agent's calendar
- Mortgage estimates inside the chat
- Lead capture with smart routing to the right agent
The lesson: for high-ticket, low-volume transactions, the chatbot's value is qualification - separating the window-shoppers from the people who'll close, and routing each to a different next step.
20. Kuki and the Conversational Showcase Bots
Kuki - a five-time Loebner Prize winner - represents an entirely different lineage: chatbots optimised for human-like open-ended conversation, not transactions. The descendants of this line now show up as Character.AI personalities, companion bots, and roleplay surfaces.
Key features:
- Broad knowledge with persona consistency
- A stable sense of humor and personality
- Continuous learning from conversations
- Cross-platform deployments - chat, voice, embodied avatars
The lesson: even commerce-driven chatbots benefit from the persona discipline of the showcase bots. A bot with a voice gets remembered. A generic "How can I help you today?" assistant gets ignored.
21. UNICEF's U-Report
U-Report is a chatbot platform UNICEF runs across SMS, Messenger, WhatsApp, and Telegram in dozens of countries. It polls young people on issues that affect their lives, surfaces crisis information, and connects users to local services.
Key features:
- Poll-driven civic engagement at scale
- Multilingual operation across low-bandwidth channels
- Local-services routing for users in crisis
- Data feedback loops to programs and policymakers
The lesson: chatbots aren't only for revenue. Anywhere two-way communication needs to scale across millions of people in dozens of languages, the chat interface is the cheapest, most accessible UI on the planet.
Patterns worth stealing
Step back from the twenty-one examples and a handful of design patterns repeat:
Route across models, don't pick one. The best 2026 deployments use cheap open-weight models - DeepSeek V4 Flash, MiniMax M2.7, Qwen3.6-27B - for routine traffic and reserve Claude Opus 4.7, GPT-5.5 Pro, or Gemini 3.1 Ultra for complex escalations. This is exactly what Berrydesk's model picker is built for.
Treat actions as first-class. The chatbots that move metrics close the loop - Lemonade pays the claim, Domino's confirms the order, Marriott books the room. AI Actions in Berrydesk, Kimi K2.6's swarm orchestration, and Claude Opus 4.7's tool-use reliability turn this from demoware into production behavior.
Persona is a moat. Generic assistants get ignored. Distinctive ones get used. Pick a voice and hold it across every surface.
Long context changes the architecture. With 1M-2M-token windows now standard, the question shifts from "what should we retrieve?" to "what should we leave in working memory?" RAG becomes a precision tool rather than a default.
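A quick way to answer the "working memory or retrieval?" question is to estimate whether the knowledge base fits in the window at all. The sketch below assumes roughly four characters per token, which is only a rough rule of thumb for English text, and the window and reserve sizes are illustrative defaults.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def choose_strategy(documents: List[str], context_window: int = 1_000_000,
                    reserve_for_chat: int = 200_000) -> str:
    """Return 'in_context' if every document fits in the window, else 'retrieve'."""
    budget = context_window - reserve_for_chat  # leave room for the conversation itself
    total = sum(estimate_tokens(d) for d in documents)
    return "in_context" if total <= budget else "retrieve"
```

A real deployment would count tokens with the target model's own tokenizer and would also weigh per-request cost, since filling a large window on every turn is not free.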
Source matters more than fluency. Public-facing and regulated bots earn trust by citing what they know and refusing what they don't. Open-weight models under MIT and Apache 2.0 - GLM-5.1, Qwen3.6-27B, MiMo-V2 - have made on-prem deploys viable for industries where data can't leave the building.
What to watch out for
The same shift that's made chatbots vastly more capable has also widened the gap between good and bad deployments. A few traps to avoid:
- Picking a model and walking away. Models ship every few weeks. A deployment frozen on GPT-5.0 today is leaving cost and quality on the table compared to the same workload on Claude Opus 4.7 plus DeepSeek V4 Flash.
- Skipping evaluation. "It seemed to answer well in testing" is not enough. Sample real ticket history, grade responses, and re-grade after every model swap.
- Over-engineering RAG when long context would do. With 1M-token windows you can often skip the embedding pipeline entirely for medium-sized knowledge bases. Try the simple thing first.
- Treating the chatbot as a side project. The top performers above are owned by product teams with metrics, dashboards, and roadmap reviews. Anything less and the bot stagnates.
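For the evaluation trap in particular, even a tiny harness beats eyeballing. The sketch below grades answers by required phrases and compares two stand-in bots; the test cases and both bot functions are hypothetical, and real cases should be sampled from your own ticket history.

```python
from typing import Callable, Dict, List


def grade(answer: str, must_include: List[str]) -> bool:
    """Pass if the answer mentions every required phrase (case-insensitive)."""
    lowered = answer.lower()
    return all(phrase.lower() in lowered for phrase in must_include)


def pass_rate(bot: Callable[[str], str], cases: List[Dict]) -> float:
    """Run every case through the bot and return the fraction that pass."""
    passed = sum(grade(bot(c["question"]), c["must_include"]) for c in cases)
    return passed / len(cases)


# Hypothetical cases standing in for sampled tickets.
CASES = [
    {"question": "How long do refunds take?", "must_include": ["5 business days"]},
    {"question": "Can I change my email?", "must_include": ["settings"]},
]


def old_bot(question: str) -> str:
    return "Refunds take 5 business days."


def new_bot(question: str) -> str:
    return "Refunds take 5 business days. You can change your email in Settings."


print(pass_rate(old_bot, CASES), pass_rate(new_bot, CASES))
```

Re-running the same cases after every model swap turns "it seemed fine" into a number you can track on a dashboard.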
Building your own
The teams behind the chatbots above didn't get there by hiring an army. They picked a clear job, a stable persona, the right model mix, and a way to actually take action. Then they iterated.
If you want to skip the integration work and start at iteration, Berrydesk is the fastest way to get there - pick a model, point it at your knowledge, brand the widget, wire up your actions, and ship. The examples in this post took years to build the first time. They don't have to take you that long.
Chirag Asarpota is the founder of Strawberry Labs, the team behind Berrydesk - the AI agent platform that helps businesses deploy intelligent customer support, sales and operations agents across web, WhatsApp, Slack, Instagram, Discord and more. Chirag writes about agentic AI, frontier model selection, retrieval and 1M-token context strategy, AI Actions, and the engineering it takes to ship production-grade conversational AI that customers actually trust.



