Automation Playbooks | February 20, 2026 | 9 min read

How We Built an AI Support Agent That Handles 80% of Tickets

The Problem

Our client, a direct-to-consumer e-commerce brand doing $320k in annual revenue, had a support problem. The founder was spending three hours every day reading, categorizing, and responding to customer support tickets. The tickets came through Zendesk and covered everything from "Where is my order?" to complex return authorization requests.

The manual process looked like this: read the ticket, determine the category (shipping, returns, product questions, complaints), look up the relevant order in Shopify, draft a response, and send it. For returns, there was an additional step of checking the return policy, generating a return label, and updating the order status.

Three hours per day. Fifteen hours per week. Sixty hours per month. At the founder's opportunity cost, this was roughly $4,500 per month in lost productive capacity. That is the kind of problem AI automation was built to solve.
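The arithmetic behind those figures can be checked in a few lines. This is a rough sketch: the five-day week and four-week month are assumptions implied by the 15-hour and 60-hour totals, and the hourly rate is derived from the $4,500 figure rather than stated by the client.

```python
# Figures taken from the paragraph above.
HOURS_PER_DAY = 3
MONTHLY_COST_USD = 4500

# Assumptions implied by "fifteen hours per week" and "sixty hours per month".
WORKDAYS_PER_WEEK = 5
WEEKS_PER_MONTH = 4

hours_per_week = HOURS_PER_DAY * WORKDAYS_PER_WEEK
hours_per_month = hours_per_week * WEEKS_PER_MONTH

# Derived value: the hourly opportunity cost the $4,500 estimate implies.
implied_hourly_rate = MONTHLY_COST_USD / hours_per_month

print(hours_per_week, hours_per_month, implied_hourly_rate)
```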

The Architecture

We did not slap a chatbot on the website and call it a day. We built a purpose-designed triage system using our A.N.T. framework (Acumen, Nuance, Trust). Here is the architecture.

Layer 1: Acumen (The Knowledge Layer)

Before writing any code, we spent a full week auditing the client's support operation. We analyzed 500 historical tickets and categorized them into distinct types:

Shipping status inquiries: 35% of all tickets. Customer wants to know where their order is.

Return and exchange requests: 25% of all tickets. Customer wants to return or exchange a product.

Product questions: 20% of all tickets. Questions about sizing, materials, care instructions, or compatibility.

Order modifications: 10% of all tickets. Changes to shipping address, adding items, or canceling orders.

Complaints and escalations: 10% of all tickets. Unhappy customers requiring empathetic, nuanced responses.

This analysis revealed that 80% of tickets (shipping, returns, and product questions) followed predictable patterns that could be automated with high confidence. The remaining 20% required human judgment and empathy.
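The 80/20 split follows directly from the category shares above. A minimal sketch, with category keys that are illustrative labels rather than names from the actual system:

```python
# Share of the 500 audited tickets per category, in percent.
TICKET_SHARE = {
    "shipping": 35,
    "returns": 25,
    "product_question": 20,
    "order_modification": 10,
    "complaint": 10,
}

# Categories judged predictable enough to automate.
AUTOMATABLE = {"shipping", "returns", "product_question"}

automatable_share = sum(
    share for category, share in TICKET_SHARE.items() if category in AUTOMATABLE
)
human_share = sum(TICKET_SHARE.values()) - automatable_share

print(automatable_share, human_share)
```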

Layer 2: Nuance (The Intelligence Layer)

For each automatable category, we built specialized prompt chains. This is not a single generic prompt. Each ticket type has its own processing pipeline.

The shipping status pipeline works like this: The AI reads the incoming ticket and identifies it as a shipping inquiry. It extracts the order number (or looks it up by customer email through the Shopify API). It queries the shipping carrier's tracking API for real-time status. It generates a response that includes the current status, estimated delivery date, and tracking link. If the package is delayed, it proactively acknowledges the delay and provides an updated timeline.
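To make the shipping flow concrete, here is a minimal sketch of its control flow. Everything named here is hypothetical: the `#12345` order-number format, the `TrackingInfo` fields, and the two lookup callables, which stand in for the Shopify and carrier API calls. A fixed template also stands in for the model-generated reply.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass
class TrackingInfo:
    status: str
    estimated_delivery: date
    tracking_url: str
    delayed: bool = False


# Hypothetical order number format: "#" followed by at least four digits.
ORDER_NUMBER = re.compile(r"#(\d{4,})")


def handle_shipping_ticket(
    body: str,
    customer_email: str,
    find_order_by_email: Callable[[str], Optional[str]],  # stand-in for a store lookup
    get_tracking: Callable[[str], TrackingInfo],          # stand-in for a carrier lookup
) -> Optional[str]:
    """Return a reply, or None when no order can be identified (escalate)."""
    match = ORDER_NUMBER.search(body)
    order_id = match.group(1) if match else find_order_by_email(customer_email)
    if order_id is None:
        return None

    info = get_tracking(order_id)
    lines = [
        f"Order #{order_id} is currently: {info.status}.",
        f"Estimated delivery: {info.estimated_delivery.isoformat()}.",
        f"Track it here: {info.tracking_url}",
    ]
    if info.delayed:
        # Acknowledge the delay up front instead of waiting to be asked.
        lines.insert(0, "We are sorry, your package is running behind schedule.")
    return "\n".join(lines)
```

Passing the lookups in as callables keeps the flow testable without network access, which matters during a shadow-mode rollout.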

The returns pipeline is more complex: The AI identifies the return request, checks the order date against the 30-day return window, verifies the item is in a returnable category, generates a return authorization number, creates a prepaid shipping label through the carrier API, and sends a response with step-by-step return instructions. If the item falls outside the return window or is in a non-returnable category, it drafts a polite explanation and escalates to the founder for a judgment call.
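The eligibility check at the heart of the returns flow can be sketched as a pure function. The category names are hypothetical, and treating day 30 as still inside the window is an assumption; the article only specifies a 30-day window measured from the order date.

```python
from datetime import date, timedelta

RETURN_WINDOW = timedelta(days=30)

# Hypothetical non-returnable categories, for illustration only.
NON_RETURNABLE_CATEGORIES = {"final_sale", "gift_card"}


def return_decision(order_date: date, category: str, today: date) -> str:
    """Decide whether a return can be authorized or must go to the founder."""
    if today - order_date > RETURN_WINDOW:
        return "escalate_outside_window"
    if category in NON_RETURNABLE_CATEGORIES:
        return "escalate_non_returnable"
    return "authorize_return"


print(return_decision(date(2026, 1, 1), "apparel", today=date(2026, 1, 31)))
print(return_decision(date(2026, 1, 1), "apparel", today=date(2026, 2, 1)))
```

Both ineligible outcomes escalate rather than reject, matching the rule that the founder makes the judgment call.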

The product questions pipeline: The AI searches a curated knowledge base built from the brand's product descriptions, FAQ page, care instructions, and sizing guides. It generates contextually relevant answers and includes links to the specific product pages. If the question is not covered in the knowledge base, it flags the ticket for human response and adds the question to a "knowledge gap" tracker so we can update the knowledge base.
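The answer-or-flag behavior can be sketched as follows. Simple keyword overlap stands in for whatever retrieval method the real system uses, and the entry fields, the `min_overlap` cutoff, and the example URL are all assumptions made for illustration.

```python
import re
from typing import Optional


def answer_product_question(
    question: str,
    knowledge_base: list[dict],
    gap_tracker: list[str],
    min_overlap: int = 2,  # hypothetical cutoff for "covered by the knowledge base"
) -> Optional[str]:
    """Return an answer with a product link, or None after logging a knowledge gap."""
    question_words = set(re.findall(r"[a-z]+", question.lower()))

    best_entry, best_score = None, 0
    for entry in knowledge_base:
        score = len(question_words & entry["keywords"])
        if score > best_score:
            best_entry, best_score = entry, score

    if best_entry is None or best_score < min_overlap:
        gap_tracker.append(question)  # reviewed later to extend the knowledge base
        return None
    return f"{best_entry['answer']} More details: {best_entry['url']}"


kb = [
    {
        "keywords": {"wash", "machine", "care", "wool"},
        "answer": "Hand wash cold and lay flat to dry.",
        "url": "https://example.com/products/wool-sweater",
    }
]
gaps: list[str] = []
print(answer_product_question("Can I machine wash the wool sweater?", kb, gaps))
print(answer_product_question("Do you ship to Canada?", kb, gaps), gaps)
```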

Layer 3: Trust (The Safety Layer)

This is where most AI implementations fail, and where we invest the most engineering effort. The Trust layer is designed to keep the AI from making mistakes that cost the client money or damage customer relationships.

Confidence scoring: Every AI-generated response gets a confidence score from 0 to 100. Responses scoring below 85 are routed to the founder for review instead of being sent automatically. This keeps the AI from confidently sending a wrong answer.
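The routing rule itself is small. A minimal sketch, assuming the score arrives as an integer and that a score of exactly 85 counts as high confidence (the text says only that responses below 85 are routed for review):

```python
CONFIDENCE_THRESHOLD = 85


def route_by_confidence(confidence: int) -> str:
    """Map a 0-100 confidence score to a delivery route."""
    if not 0 <= confidence <= 100:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence >= CONFIDENCE_THRESHOLD:
        return "auto_send"
    return "founder_review"


print(route_by_confidence(91), route_by_confidence(84))
```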

Financial guardrails: The AI can never issue refunds, process charges, or modify order values without human approval. It can generate return labels (a cost-controlled action with a maximum of one label per request), but any action involving money requires explicit founder sign-off.
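One way to express these guardrails is an authorization check that every proposed action must pass before it runs. The action names are hypothetical, and denying unknown actions by default is a design assumption, not something the article states.

```python
# Hypothetical action names; anything touching money needs founder sign-off.
MONEY_ACTIONS = {"issue_refund", "process_charge", "modify_order_value"}
MAX_LABELS_PER_REQUEST = 1


def is_authorized(
    action: str,
    labels_already_issued: int = 0,
    founder_approved: bool = False,
) -> bool:
    """Return True only if the AI may perform the action right now."""
    if action in MONEY_ACTIONS:
        return founder_approved
    if action == "generate_return_label":
        return labels_already_issued < MAX_LABELS_PER_REQUEST
    # Assumption: default-deny, so a newly added action is blocked until reviewed.
    return False


print(is_authorized("issue_refund"))
print(is_authorized("generate_return_label"))
print(is_authorized("generate_return_label", labels_already_issued=1))
```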

Tone validation: Every outgoing response passes through a tone checker that ensures the message matches the brand's voice. Too formal? It gets adjusted. Too casual? It gets adjusted. Contains anything that could be interpreted as confrontational? It gets flagged.

Escalation triggers: Certain keywords and patterns automatically bypass the AI and route directly to the founder. These include mentions of legal action, social media threats, requests for the owner specifically, and tickets from VIP customers (identified by lifetime order value).
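A sketch of how such triggers might be checked before the AI sees a ticket. The specific patterns and the $1,000 VIP threshold are illustrative assumptions; the article names the trigger types but not the exact keywords or cutoff.

```python
import re

# Illustrative patterns for the trigger types listed above.
ESCALATION_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"\b(lawyer|attorney|lawsuit|legal action|sue)\b",  # legal action
        r"\b(twitter|instagram|tiktok|bad review)\b",       # social media threats
        r"\b(owner|founder|ceo)\b",                         # asks for the owner
    )
]

VIP_LIFETIME_VALUE_USD = 1000  # hypothetical cutoff


def must_escalate(body: str, lifetime_order_value: float) -> bool:
    """True when a ticket should bypass the AI and go straight to the founder."""
    if lifetime_order_value >= VIP_LIFETIME_VALUE_USD:
        return True
    return any(pattern.search(body) for pattern in ESCALATION_PATTERNS)


print(must_escalate("I will be contacting my lawyer.", 50))
print(must_escalate("I have an issue with sizing.", 50))
```

The word boundaries matter: without them, "sue" would match inside "issue" and flood the founder with false escalations.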

Audit logging: Every single AI action is logged with the input ticket, the AI's reasoning, the generated response, the confidence score, and whether it was auto-sent or escalated. This creates a complete audit trail and provides training data for continuous improvement.
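An audit record with those fields could look like the sketch below. The field names, the `ticket_id` key, and the JSON-lines output format are assumptions; the article specifies what is logged, not how it is stored.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditRecord:
    ticket_id: str      # hypothetical identifier
    input_ticket: str
    reasoning: str
    response: str
    confidence: int
    outcome: str        # "auto_sent" or "escalated"
    logged_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = AuditRecord(
    ticket_id="T-1",
    input_ticket="Where is my order?",
    reasoning="Classified as shipping inquiry; tracking shows in transit.",
    response="Your order is in transit.",
    confidence=91,
    outcome="auto_sent",
)
print(to_log_line(record))
```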

The Deployment

We deployed in three phases over two weeks.

Phase one (days one through three) was shadow mode. The AI processed every incoming ticket and generated responses, but nothing was sent automatically. Instead, the responses appeared as internal notes on each ticket for the founder to review. This let us validate accuracy before going live.

Phase two (days four through seven) was assisted mode. The AI generated responses and pre-populated the reply field. The founder reviewed each one and either sent it as-is, edited it slightly, or rewrote it entirely. We tracked the edit rate to measure accuracy.

Phase three (days eight through fourteen) was autonomous mode. High-confidence responses for shipping and product questions were sent automatically. Returns still required founder approval for the return authorization. Complaints and escalations were always routed to the founder.
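The three phases differ only in what happens to a generated response, so the rollout can be modeled as one switch. This is a sketch under assumptions: the mode and action names are illustrative, and the 85 threshold is borrowed from the confidence scoring rule.

```python
from enum import Enum


class Mode(Enum):
    SHADOW = "shadow"          # days 1-3
    ASSISTED = "assisted"      # days 4-7
    AUTONOMOUS = "autonomous"  # days 8-14

# Categories allowed to send without review in autonomous mode.
AUTO_SEND_CATEGORIES = {"shipping", "product_question"}


def delivery_action(mode: Mode, category: str, confidence: int, threshold: int = 85) -> str:
    """Decide what to do with a generated response in the current phase."""
    if mode is Mode.SHADOW:
        return "internal_note"     # visible to the founder only
    if mode is Mode.ASSISTED:
        return "prefill_reply"     # founder sends, edits, or rewrites
    if category in AUTO_SEND_CATEGORIES and confidence >= threshold:
        return "auto_send"
    return "founder_review"        # returns, complaints, low confidence


print(delivery_action(Mode.AUTONOMOUS, "shipping", 92))
print(delivery_action(Mode.AUTONOMOUS, "returns", 99))
```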

The Results

After 30 days of autonomous operation, here are the numbers.

Total tickets processed: 847.

Tickets handled autonomously by AI: 678 (80%).

Tickets escalated to founder: 169 (20%).

AI accuracy rate (responses that required no editing): 94%.

Average response time (AI): 47 seconds.

Average response time (previous manual): 4.2 hours.

Customer satisfaction score: improved from 3.2/5 to 4.7/5.

The founder's daily support time dropped from 3 hours to 35 minutes. She now only handles the 20% of tickets that genuinely require her personal attention: complex complaints, VIP customers, and edge cases the AI has not encountered before.
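The headline percentages can be re-derived from the raw counts. The speedup factor and the reduction in founder time are computed here from the reported figures; they are not numbers the client reported directly.

```python
total, autonomous, escalated = 847, 678, 169

autonomous_pct = round(100 * autonomous / total)
escalated_pct = round(100 * escalated / total)

# Response time: 4.2 hours manually versus 47 seconds for the AI.
manual_seconds = 4.2 * 3600
speedup = manual_seconds / 47

# Founder's daily support time: 3 hours down to 35 minutes.
time_saved_pct = round(100 * (180 - 35) / 180)

print(autonomous + escalated == total, autonomous_pct, escalated_pct)
print(round(speedup), time_saved_pct)
```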

Lessons Learned

Building this system taught us several things that we now apply to every support automation engagement.

First, the knowledge base is everything. The AI is only as good as the information it has access to. We spent more time curating the knowledge base than we did on the AI logic itself. Every product description, every policy document, every FAQ answer had to be accurate, complete, and up to date.

Second, confidence thresholds matter more than accuracy. An AI that is right 95% of the time but sends everything automatically will eventually send a catastrophically wrong response. An AI that is right 95% of the time and escalates every response scoring below 85 is far less likely to embarrass you. We would rather have a few false escalations than a single wrong automated response.

Third, the escalation experience must be excellent. The 20% of tickets that reach the founder need to arrive with full context: the customer's order history, the AI's attempted response, and the reason for escalation. This turns a 15-minute research task into a 2-minute decision.

Fourth, continuous learning is non-negotiable. Every month, we review the escalated tickets, identify new patterns that could be automated, update the knowledge base, and refine the confidence thresholds. The system gets better every single month.

The Template Is Replicable

While every business has unique support challenges, the architecture we built is replicable across industries. The three-layer approach (Acumen for knowledge mapping, Nuance for intelligent processing, Trust for safety) works whether you are triaging HVAC service requests, qualifying real estate leads, or routing IT support tickets.

The key is resisting the temptation to automate everything from day one. Start with the 80% that is predictable. Master that. Then gradually expand the AI's capabilities as you build confidence in the system and accumulate training data from real interactions.

Tags: customer support, AI agents, automation
