Enterprise AI Analysis
Generating Frequently Asked Questions from Technical Support Tickets using Large Language Models
High-performance computing (HPC) systems face increasing user demand and complexity, leading to a high volume of technical support tickets. This paper introduces a novel pipeline for automatically generating clear, accurate, and relevant Frequently Asked Questions (FAQs) directly from raw support ticket data. The methodology involves semantic clustering, representation learning, and large language models (LLMs). It filters tickets by recency and anomaly frequency, cleans and summarizes them into issue-resolution pairs, performs unsupervised semantic clustering, and generates FAQs from top clusters. A structured evaluation by subject matter experts (SMEs) showed that the approach effectively transforms raw tickets into understandable and pertinent FAQs, enhancing scalability and efficiency in HPC support workflows.
Executive Impact
Our AI-driven FAQ generation pipeline can significantly enhance operational efficiency and user satisfaction in high-performance computing (HPC) environments by automating the creation of FAQs for common support issues.
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
Our pipeline integrates semantic clustering, representation learning, and LLMs to transform unstructured ticket content into structured FAQs. Key steps include ticket filtering, cleaning and summarization, embedding and subclustering, and FAQ generation from representative clusters. This multi-stage process ensures adaptability to evolving support trends.
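The multi-stage flow described above can be sketched as a chain of functions. This is a minimal illustration, not the paper's implementation: the summarizer and clusterer below are trivial stand-ins for the LLM summarization and embedding/subclustering components, and all function and field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Ticket:
    opened: date
    text: str

def filter_recent(tickets, days=90, today=date(2024, 1, 1)):
    """Stage 1: keep only tickets opened within the recency window."""
    cutoff = today - timedelta(days=days)
    return [t for t in tickets if t.opened >= cutoff]

def summarize(ticket):
    """Stage 2 stand-in for the LLM summarizer: emit an issue-resolution pair."""
    return {"issue": ticket.text, "resolution": "see ticket resolution notes"}

def cluster(pairs):
    """Stage 3 stand-in for embedding + subclustering: group by leading keyword."""
    groups = {}
    for p in pairs:
        groups.setdefault(p["issue"].split()[0].lower(), []).append(p)
    return groups

def generate_faqs(groups, top_n=2):
    """Stage 4 stand-in for LLM FAQ generation: one Q/A per largest subcluster."""
    ranked = sorted(groups.values(), key=len, reverse=True)[:top_n]
    return [{"question": g[0]["issue"], "answer": g[0]["resolution"]} for g in ranked]

def pipeline(tickets):
    return generate_faqs(cluster([summarize(t) for t in filter_recent(tickets)]))
```

In the real pipeline, `summarize` is an instruction-tuned LLM and `cluster` operates on sentence embeddings; the point here is only the stage-by-stage composition.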
We employed fine-tuned instruction-tuned LLMs (Mistral-7B-Instruct-v0.3 for summarization and Phi-4 for FAQ generation) with Low-Rank Adapters (LoRA) and specific prompting strategies to handle domain-specific terminology, ensure consistent formatting, and mitigate hallucinations, crucial for HPC support. Prompt engineering included role-based framing, clear task definitions, negative examples, and formatted output specifications.
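The prompting strategies listed above (role-based framing, a clear task definition, negative examples, and an output-format specification) can be combined into a single template. The wording below is illustrative only, not the paper's actual prompt text:

```python
def build_faq_prompt(issue_summaries):
    """Assemble an FAQ-generation prompt from its four ingredients.
    All prompt wording here is a hypothetical example."""
    role = "You are an HPC support engineer writing user-facing FAQs."
    task = ("Write one FAQ entry that covers the common issue in the "
            "summaries below. Use only facts present in the summaries.")
    negative = ("Do NOT invent module names, file paths, or commands that "
                "are not mentioned in the summaries.")
    fmt = "Respond exactly as:\nQ: <question>\nA: <answer>"
    body = "\n".join(f"- {s}" for s in issue_summaries)
    return f"{role}\n\n{task}\n{negative}\n\nSummaries:\n{body}\n\n{fmt}"
```

The negative example and fixed output format are what make downstream parsing reliable and help curb hallucinated HPC-specific details.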
To group tickets with similar underlying issues, we utilized MPNet sentence embeddings and K-Means clustering after PCA dimensionality reduction. This allowed for the identification of fine-grained subclusters that represent tightly grouped instances of the same technical issue, crucial for generating precise and relevant FAQs from the voluminous and often inconsistent support ticket data.
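The grouping step is standard K-Means (Lloyd's algorithm) applied to reduced embeddings. As a minimal sketch, here is K-Means over plain tuples; in the actual pipeline the points would be PCA-reduced MPNet sentence embeddings rather than toy 2-D vectors:

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Minimal K-Means (Lloyd's algorithm) over dense vectors.
    Returns final centers and the cluster membership lists."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[j].append(p)
        # Update step: recompute each center as its cluster's mean.
        new = [tuple(sum(x) / len(c) for x in zip(*c)) if c else centers[i]
               for i, c in enumerate(clusters)]
        if new == centers:  # converged
            break
        centers = new
    return centers, clusters
```

Running PCA before clustering reduces the dimensionality of the embedding space, which both speeds up the distance computations above and tends to stabilize the resulting subclusters.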
Enterprise Process Flow
| Model | Silhouette Score ↑ | DBI (Davies-Bouldin Index) ↓ |
|---|---|---|
| Nomic | 0.051 / 0.043 | 3.874 / 3.523 |
| BGE | 0.039 / 0.045 | 4.844 / 4.315 |
| MiniLM | 0.061 / 0.055 | 3.472 / 3.221 |
| MPNet | 0.066 / 0.066 | 3.324 / 3.156 |
MPNet achieved the highest Silhouette Score and lowest DBI values, indicating the most effective clustering of issue summaries.
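The Silhouette Score used to compare the embedding models above rewards clusters that are internally tight and well separated from their neighbors. A minimal implementation (assuming every cluster has at least two points; the DBI computation is omitted for brevity):

```python
import math

def silhouette(clusters):
    """Mean silhouette coefficient over all points.
    clusters: list of clusters, each a list of equal-length vectors
    with at least two members."""
    scores = []
    for i, own in enumerate(clusters):
        for p in own:
            # a: mean distance to the other members of p's own cluster.
            a = sum(math.dist(p, q) for q in own if q is not p) / (len(own) - 1)
            # b: mean distance to the nearest *other* cluster.
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for j, other in enumerate(clusters) if j != i)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)
```

Scores range from -1 to 1; values near 1 indicate clusters that are compact relative to their separation, which is why higher is better in the table above.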
SME Evaluation Highlights
“Our SMEs validated the pipeline's ability to generate high-quality outputs, emphasizing clarity and accuracy. For seven of the ten FAQs, reviewers agreed on relevance and accuracy, with minor suggestions for further content enhancement. This human-centered approach is crucial for specialized domains like HPC.”
— HPC Support Staff
Calculate Your Potential ROI
Estimate the potential return on investment for implementing an AI-driven FAQ system in your organization.
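A back-of-the-envelope version of this estimate treats savings as the tickets deflected by the FAQ system times the cost of handling a ticket, minus the system's running cost. The formula and all parameter values below are illustrative assumptions, not figures from the paper:

```python
def faq_roi(tickets_per_month, deflection_rate, cost_per_ticket,
            monthly_system_cost):
    """Estimated net monthly savings from an FAQ system.
    deflection_rate: fraction of tickets users resolve via FAQs
    instead of filing. All inputs are illustrative assumptions."""
    savings = tickets_per_month * deflection_rate * cost_per_ticket
    return savings - monthly_system_cost
```

For example, 1,000 tickets per month, a 20% deflection rate, and a $25 handling cost per ticket yield $5,000 in gross monthly savings before subtracting the system's running cost.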
Implementation Roadmap
A phased approach to integrate AI into your enterprise workflows.
Data Ingestion & Preprocessing
Establish secure connections to support ticket systems and implement robust data redaction for PII and irrelevant content, ensuring a clean dataset for analysis.
LLM Fine-tuning & Summarization
Fine-tune instruction-tuned LLMs with domain-specific data to generate structured issue-resolution pairs, maintaining consistency and accuracy.
Semantic Clustering & Ranking
Apply advanced embedding models and clustering algorithms to identify fine-grained subclusters of similar issues, prioritizing high-impact topics for FAQ generation.
FAQ Generation & Refinement
Utilize a multi-stage LLM process to generate clear, accurate, and relevant FAQs from top subclusters, incorporating iterative refinement for quality assurance.
Deployment & Continuous Improvement
Integrate the FAQ system into existing support workflows, establishing feedback mechanisms and automated monitoring to ensure long-term relevance and performance.
Ready to Transform Your HPC Support?
Book a strategy session to discuss how our AI solutions can streamline your operations.