
Does The AI Training Model You Use Really Matter? Yes. Here’s Why

Machielle Thomas

If you’ve already rolled out AI in your support org, or are about to, you’ve likely run into systems that don’t perform as promised. Maybe the AI couldn’t deflect real tickets, gave answers that didn’t match your policies, or couldn’t resolve anything without human help.

The root cause is often overlooked: how the AI was trained.

Most AI tools are trained on public data and generic help articles. They weren’t built to reflect how your team actually works, what your customers ask, or how your agents solve issues. When that’s the case, the AI can’t reason through a ticket or act with context.

The way an AI solution is trained shapes what the AI knows, how it behaves, and whether it delivers real ROI. To understand how different training methods impact performance, Forethought surveyed over 600 CX professionals across U.S. mid-market companies in our 2025 AI in CX Benchmark Report. The results showed that companies that trained their AI on real, historical ticket data saw better outcomes across every major metric, including deflection, CSAT, cost per resolution, and retention.

How AI is trained impacts performance, but not all training is equal

Many companies selling AI-powered support tools today are not building their own models. Instead, they’re using foundation models like GPT-4, Claude, or Gemini as the backbone of their product. These models are trained on public data and aren’t fine-tuned for individual businesses by default.

In most cases, vendors don’t offer that level of customization. They plug into a general-purpose model and layer on a simple interface or workflow builder. The result is a system that may sound intelligent but doesn’t actually understand your tickets, policies, or workflows, and has no way to learn them.

According to the Wall Street Journal, 92% of Fortune 500 companies are now using ChatGPT in some form, and a recent survey by Andreessen Horowitz found that 66% of enterprises are using OpenAI’s models in their operations. Tools like ChatGPT Enterprise are built on GPT-4, but OpenAI makes it clear that these models don’t train on your business data unless you pay for custom fine-tuning—a service that can cost millions.

This approach speeds up deployment, but it sacrifices accuracy and relevance. You only get control over training if the platform allows you to upload your own data, see what’s being used, and manage ongoing training cycles. This training data affects performance in two critical ways (illustrated in the sketch after this list):

  • Grounding: Historical ticket data gives the model examples of real issues and correct answers. It stops the AI from hallucinating.
  • Context: Internal documents, macros, workflows, and CRM fields show how your team solves problems. That gives the AI enough understanding to resolve tickets, not just talk about them.
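To make grounding concrete, here is a minimal sketch of the idea: retrieve the most similar historical tickets and force the model to answer from them. The retrieval method (TF-IDF), the ticket texts, and the prompt wording are all hypothetical illustrations, not Forethought’s implementation.

```python
# A minimal sketch of "grounding": retrieve similar historical tickets
# so the model answers from real resolutions instead of guessing.
# All ticket texts and the prompt template are hypothetical examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Historical tickets paired with the answers agents actually gave.
history = [
    ("How do I reset my password?",
     "Use the 'Forgot password' link; reset emails can take 5 minutes."),
    ("My invoice shows the wrong billing address",
     "Update the address under Settings > Billing, then regenerate the invoice."),
    ("Can I export my campaign data to CSV?",
     "Yes - open the campaign report and click Export > CSV."),
]

vectorizer = TfidfVectorizer()
ticket_matrix = vectorizer.fit_transform([q for q, _ in history])

def grounded_prompt(new_ticket: str, k: int = 2) -> str:
    """Build a prompt grounded in the k most similar past tickets,
    rather than letting the model answer from generic public data."""
    scores = cosine_similarity(
        vectorizer.transform([new_ticket]), ticket_matrix
    )[0]
    top = scores.argsort()[::-1][:k]
    examples = "\n".join(
        f"Q: {history[i][0]}\nA: {history[i][1]}" for i in top
    )
    return (
        "Answer using ONLY the resolved tickets below; "
        "escalate if none apply.\n\n"
        f"{examples}\n\nNew ticket: {new_ticket}\nAnswer:"
    )

print(grounded_prompt("I forgot my password, how do I get back in?"))
```

The same pattern extends to context: the more internal documents, macros, and workflows you index, the more of your team’s actual problem-solving the model can draw on.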

Even if you connect one of these solutions to your help center, the system won’t ingest and learn from real tickets or workflows if it’s not designed to do so. You need a partner that gives you control over what goes into training, how the model learns, and how it applies that knowledge.

New data: Training your AI model on historical tickets dramatically improves key CX metrics

AI models can be trained in different ways, but not all of them lead to better outcomes. To understand how much the training method matters, we compared companies that trained their AI on historical support tickets and CRM data to those that relied on other sources, such as help center articles or vendor-supplied content built on open-source models. Then we looked at how each group performed across four core CX metrics: deflection rate, CSAT, cost per resolution, and customer retention.

1. Deflection rate

Deflection is one of the clearest indicators of whether AI is working. If your system can’t resolve issues before they reach an agent, it’s not delivering value—it's just rerouting tickets.

The training method you use plays a major role in whether deflection actually happens. Companies that trained their AI on historical support tickets saw significantly stronger results. On average, their deflection rates were 38% higher than those using generic data. Some teams achieved up to 2.3x higher deflection by training on real examples of how their agents resolved issues in the past.
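To put those multipliers in concrete terms, here is a quick back-of-the-envelope calculation. Only the 38% and 2.3x figures come from the survey; the baseline ticket volume and deflection rate are hypothetical.

```python
# Back-of-the-envelope math on what the survey's lifts mean in practice.
# The baseline volume and rate below are hypothetical, not report data.
def deflection_rate(ai_resolved: int, total_tickets: int) -> float:
    """Share of tickets fully resolved by AI before reaching an agent."""
    return ai_resolved / total_tickets

baseline = deflection_rate(ai_resolved=2_000, total_tickets=10_000)  # 20%
avg_lift = baseline * 1.38   # the survey's average 38% lift -> 27.6%
best_case = baseline * 2.3   # the report's top end (2.3x)   -> 46%
print(f"baseline {baseline:.0%} -> {avg_lift:.1%} (average) "
      f"or {best_case:.0%} (best case)")
```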

That advantage shows up in how teams perceive the impact as well. 77% of companies using historical data said AI had a positive effect on deflection, including 22% who described the impact as highly positive. Among companies that didn’t use historical data, only 65% saw a positive impact, and nearly 1 in 5 said AI made deflection worse.

If your AI isn’t improving deflection, this may be why. Models that don’t learn from real tickets aren’t equipped to resolve them.

2. CSAT

Deflecting tickets away from your human agents is only useful if customers walk away satisfied. Our findings show that training your AI models on historical ticket data also improves customer satisfaction.

Companies that trained their AI on past support tickets and CRM data reported an average CSAT of 84%, compared to 75% for those using other data sources. That’s a nine-point gap.

When AI is trained on real examples from your environment, it’s more likely to give accurate, relevant answers the first time. That leads to faster resolutions, fewer escalations, and better experiences overall.

3. Cost per resolution

AI is supposed to reduce workload, but if it’s not trained well, it does the opposite. When AI gives the wrong answer, or can’t complete the workflow, tickets still reach your human agents. In many cases, they take longer to resolve because the agent has to redo or clarify what the AI already attempted.

Better training can help here too. Companies that trained their models on historical ticket data saw an average cost per resolution of $14, while those using generic sources averaged $18. Across the full sample, the average was $16.
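Here is what that $4 gap means at scale; the monthly resolution volume below is a hypothetical example, not survey data.

```python
# What the $14 vs. $18 cost-per-resolution gap means at scale.
# The monthly volume is a hypothetical example, not from the report.
resolutions_per_month = 10_000
historical_cost = 14 * resolutions_per_month  # $140,000/month
generic_cost = 18 * resolutions_per_month     # $180,000/month
monthly_savings = generic_cost - historical_cost
print(f"${monthly_savings:,}/month, ${monthly_savings * 12:,}/year saved")
# -> $40,000/month, $480,000/year saved
```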

The trend data tells the same story. 56% of companies using historical data said their resolution costs were improving month over month, while only 46% of those using other data said the same, and 16% of that group reported that costs were getting worse.

Good training creates a compounding effect. The more the AI sees real examples from your environment, the more efficient it becomes.

4. Customer retention

Retention is often shaped by a long chain of support experiences. It’s not just about whether a ticket is resolved, but how many times a customer had to ask, how confident they felt in the answer, and how quickly they got help. AI plays a growing role in those moments.

Our research found that companies training their AI on historical support tickets reported a customer retention rate of 76%. That’s above the 74% average and notably higher than the 72% rate for companies using other types of training data.

Nearly half (48%) of companies using historical data said retention was improving over time, compared to 40% of companies using generic sources. And while only 1% of the historical-data group saw retention declining, 16% of the generic-data group did.

The same training that improves CSAT also improves retention over time. When AI is trained on how your team actually supports customers, it’s more likely to leave them with a good experience—one that builds loyalty instead of eroding it.

3 real-world examples of how AI works better when trained on historical ticket data

The strongest results in our research came from teams that trained their AI on real support data, like past tickets, internal documentation, and CRM notes, rather than generic templates or public articles. Our AI-powered solutions learn from your historical data, while Discover, our analytics solution, surfaces content gaps, auto-generates article drafts, and continuously improves the model’s performance over time.

These three examples from Forethought customers show how they’ve used this training approach to drive meaningful, measurable improvements in accuracy, efficiency, and customer experience.

1. Qover uses AI trained on historical data to meet strict regulatory standards

Qover, a provider of embedded insurance products through software, operates in one of the most tightly regulated industries in Europe. A single inaccurate answer creates compliance risk and potential legal exposure, which makes accuracy non-negotiable when evaluating AI vendors.

Forethought stood out not just for its performance, but for how it learns. Unlike tools that rely on public content or generic templates, Forethought Solve was trained directly on Qover’s support tickets, knowledge base, and policy documentation. That allowed the system to learn how Qover’s agents had handled past questions across multiple insurance products and respond accordingly. They were able to achieve:

  • 95% accuracy on written responses
  • 80% ticket deflection
  • 4 minutes saved per ticket
  • CSAT scores equal to human agents
  • Hallucination rates under 2%, well below industry norms

Just as important, Qover retained control. Their team was able to configure intents and Autoflows themselves, ensuring that the AI would behave predictably and escalate when needed.

2. TestGorilla uses AI trained on real tickets to support two distinct user groups

TestGorilla provides pre-employment screening tools to job candidates and recruiters. Candidates often need immediate help during live assessments, while recruiters tend to ask more complex, context-heavy platform questions. This dual pressure made it essential to automate high-volume requests without sacrificing accuracy.

Early AI tools didn’t meet the bar. They hallucinated, escalated too often, and introduced friction into what needed to be a fast, seamless experience. TestGorilla shut them off and reset their approach. When they found Forethought, they started small.

The team trained Solve on historical ticket data and launched it in chat for one use case: deflecting repetitive candidate questions. More complex issues still went to humans. This allowed them to test, refine, and build trust in the system’s behavior. That early success gave them the confidence to scale across use cases and workflows, with impressive results:

  • 89% ticket deflection (up 21% from their previous chatbot)
  • 92% CSAT, equal to their human-assisted support
  • Improved agent morale, as the team focused on more strategic work

Because Solve was trained on real conversations, it understood the specific issues TestGorilla’s customers actually raised. That gave the team confidence to demonstrate the system live to their executive team, where it performed exactly as expected.

3. ActiveCampaign uses agentic AI to deflect high-volume tickets without rule-based setup

When ActiveCampaign began evaluating AI support solutions, most vendors promised future capabilities but offered little in the way of proven results. What stood out about Forethought was that it could start delivering value immediately without requiring rule-based configuration or months of tuning.

The team trained Solve on their internal knowledge base and historical support content. This gave the system a strong foundation to understand customer intent and respond naturally. That alone helped deflect common requests around campaign setup, feature walkthroughs, and usage guidance.

As they scaled, Autoflows—Forethought’s agentic AI technology—extended that impact and enabled the AI to take action inside their connected systems, like CRMs and help desks. That eliminated the need for manual intervention and unlocked a new level of automation, where AI wasn’t just answering questions, but resolving them.
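As a generic illustration of that distinction (not Forethought’s Autoflows implementation), the sketch below shows an agent that executes a hypothetical action in a connected system when one matches the intent, and falls back to a text answer when none does.

```python
# A generic illustration of "answering vs. resolving". This is NOT
# Forethought's Autoflows implementation; the actions and intents are
# hypothetical stand-ins for operations in a connected CRM or help desk.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    name: str
    run: Callable[[dict], str]  # performs a side effect in a connected system

# Hypothetical actions the agent is allowed to take.
ACTIONS = {
    "resend_invoice": Action(
        name="resend_invoice",
        run=lambda ticket: f"Re-sent invoice {ticket['invoice_id']}",
    ),
    "update_billing_address": Action(
        name="update_billing_address",
        run=lambda ticket: f"Updated address for {ticket['customer_id']}",
    ),
}

def handle(ticket: dict, intent: str) -> str:
    """Resolve the ticket by acting when a matching action exists;
    otherwise fall back to merely answering with instructions."""
    if intent in ACTIONS:
        return ACTIONS[intent].run(ticket)  # resolving, not just replying
    return "Here is how to do that yourself: ..."  # answering only

print(handle({"invoice_id": "inv_42"}, "resend_invoice"))
```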

Forethought makes training your AI agent on historical tickets easy

Most AI tools ask you to trust a black box. Forethought’s models are trained on your actual support data, including tickets, knowledge base content, macros, and CRM notes, so your AI agent’s responses and actions always reflect how your team really works, not how a template says it should.

Implementation doesn’t require building a custom model or writing prompt libraries from scratch. To get it up and running, you just (a rough sketch of the final step follows the list):

  • Plug into your help desk
  • Connect your knowledge base
  • Sync your historical support data
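As a hedged sketch of what that final step might involve, the snippet below pulls resolved tickets from a help desk and shapes them into question/answer pairs. The endpoint, field names, and token are hypothetical placeholders, not Forethought’s (or any vendor’s) actual API.

```python
# A hedged sketch of "sync your historical support data": pull resolved
# tickets from a help desk and shape them into question/answer pairs.
# The URL, fields, and token are hypothetical placeholders, not a real API.
import requests

HELP_DESK_API = "https://example-helpdesk.com/api/tickets"  # placeholder

def fetch_resolved_tickets(token: str) -> list[dict]:
    """Fetch solved tickets, which carry both the question and the fix."""
    resp = requests.get(
        HELP_DESK_API,
        params={"status": "solved"},
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["tickets"]

def to_training_pairs(tickets: list[dict]) -> list[dict]:
    """Keep what grounds the model: the customer's ask and the agent's
    final resolution, skipping tickets without a recorded fix."""
    return [
        {"question": t["subject"], "answer": t["resolution"]}
        for t in tickets
        if t.get("resolution")
    ]
```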

Once trained and integrated properly, you can trust Forethought to respond accurately and take action on behalf of your team.

Request a demo to see how a well-trained AI agent can help you scale your support organization without additional headcount.
