Enterprise AI Analysis
Can Generative AI Produce Test Cases? An Experience from the Automotive Domain
This paper explores the use of generative AI to automatically convert informal test case specifications into executable test scripts in the automotive domain. By integrating large language models with few-shot learning and retrieval-augmented generation, the proposed approach highlights the potential of generative AI to support industrial software testing processes. Our solution assumes test case specifications defined in Rational Quality Manager and test scripts specified in ecu.test. We evaluated our solution on an industrial benchmark of 200 unique pairs of informal test step descriptions and executable test instructions. Our results show that while generative AI can produce correct or near-correct test cases in many scenarios, the quality of the results depends significantly on prompt design, large language model selection, and the accuracy of context retrieval. Our study underscores the need for human oversight to address subtle errors in logic sequencing and value assignments, ensuring functional correctness. Future research should prioritize improving retrieval mechanisms, expanding dataset diversity, and exploring hybrid human-AI workflows to enhance generative AI's scalability, reliability, and broader applicability in industrial settings.
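As a rough illustration of the approach described above, the sketch below wires retrieval-augmented generation and few-shot prompting together in Python. All names, the token-overlap similarity, and the prompt layout are assumptions for illustration, not the paper's implementation:

```python
from dataclasses import dataclass

@dataclass
class Example:
    spec: str    # informal test step description (e.g., from Rational Quality Manager)
    script: str  # corresponding executable test script (e.g., ecu.test)

def retrieve_context(spec: str, corpus: list[Example], k: int = 3) -> list[Example]:
    """Pick the k corpus entries most similar to the new specification.
    Token overlap stands in for the real retrieval/similarity mechanism."""
    def overlap(e: Example) -> int:
        return len(set(spec.lower().split()) & set(e.spec.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def build_prompt(spec: str, shots: list[Example]) -> str:
    """Assemble a few-shot prompt: retrieved spec/script pairs, then the new spec."""
    examples = "\n\n".join(
        f"Specification:\n{e.spec}\nTest script:\n{e.script}" for e in shots
    )
    return f"{examples}\n\nSpecification:\n{spec}\nTest script:\n"
```

The generated prompt is then sent to the LLM, and the completion is treated as the candidate test script for human review.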
Key Findings & Impact
Our analysis reveals significant opportunities for efficiency gains and quality improvements in enterprise testing workflows.
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
Few-shot learning, when properly configured, can produce test scripts that correctly implement their informal specification in 49.5% to 64.5% of cases. Even when not fully correct, the generated scripts are highly similar to the human-written ones and can serve as a valid starting point for implementing the test cases.
Although in most cases prompt order and context order do not affect the effectiveness of our solution, our results suggest ordering the context entries alphabetically and placing the context before the informal specification, as sketched below.
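A minimal sketch of what those two ordering choices look like when assembling the prompt; the function and field names are illustrative assumptions, not the paper's code:

```python
def assemble_prompt(spec: str, context_entries: list[str]) -> str:
    """Apply both ordering choices: sort context entries alphabetically
    and place the context block before the informal specification."""
    context = "\n".join(sorted(context_entries))  # alphabetical context order
    return f"Context:\n{context}\n\nSpecification:\n{spec}\nTest script:\n"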
Our results suggest that LLMs tuned for code generation outperform their base counterparts. Instruction tuning decreased effectiveness when the ideal context was used, and provided no improvement with the unfiltered context.
Unnecessary and inaccurate context significantly reduces the effectiveness of test generation.
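One common mitigation is to filter retrieved entries before they reach the prompt. The sketch below uses a simple Jaccard token overlap as a stand-in for whatever similarity score a production retriever would use; the threshold value is an arbitrary assumption:

```python
def filter_context(spec: str, candidates: list[str], threshold: float = 0.2) -> list[str]:
    """Drop retrieved entries whose similarity to the spec is below a threshold,
    since unnecessary or inaccurate context hurts generation quality."""
    spec_tokens = set(spec.lower().split())
    def jaccard(entry: str) -> float:
        entry_tokens = set(entry.lower().split())
        union = spec_tokens | entry_tokens
        return len(spec_tokens & entry_tokens) / len(union) if union else 0.0
    return [c for c in candidates if jaccard(c) >= threshold]
```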
Enterprise Process Flow
| Comparison | CHRF | EMR | LMR |
|---|---|---|---|
| Code Generation Tuning vs. Base | | | |
| Instruction Following vs. Base | | | |
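The table compares tuning variants on three similarity metrics. CHRF is the standard character n-gram F-score; assuming EMR denotes the exact match ratio (generated script identical to its reference) and LMR the line match ratio (fraction of reference lines reproduced), a minimal sketch of how these could be computed:

```python
def exact_match_ratio(generated: list[str], references: list[str]) -> float:
    """EMR (assumed meaning): fraction of scripts identical to their reference."""
    hits = sum(g.strip() == r.strip() for g, r in zip(generated, references))
    return hits / len(references)

def line_match_ratio(generated: str, reference: str) -> float:
    """LMR (assumed meaning): fraction of reference lines reproduced verbatim."""
    gen_lines = {line.strip() for line in generated.splitlines() if line.strip()}
    ref_lines = [line.strip() for line in reference.splitlines() if line.strip()]
    return sum(line in gen_lines for line in ref_lines) / max(len(ref_lines), 1)
```

Line-level matching is more forgiving than exact match and reflects the finding that near-miss scripts are still useful starting points for engineers.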
Industrial Context & Confidentiality
The study was conducted in collaboration with a large automotive company, using Rational Quality Manager for informal specifications and ecu.test for test scripts. Industrial confidentiality (Requirement R3) was maintained by executing models locally. This ensures real-world applicability while respecting data sensitivity.
- Collaboration with automotive OEM.
- Use of RQM and ecu.test (Assumptions A1, A2).
- Local execution for data confidentiality (R3).
- Results directly relevant to automotive domain.
Estimate Your Test Automation ROI
Quantify the potential time and cost savings from automating test case generation in your enterprise.
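As a back-of-the-envelope illustration (not from the paper), the sketch below shows the arithmetic such an estimate typically reduces to; all figures are hypothetical inputs:

```python
def estimate_annual_savings(tests_per_year: int,
                            minutes_manual: float,
                            minutes_with_ai: float,
                            hourly_rate: float) -> float:
    """Savings = (manual effort - AI-assisted effort) x volume x rate.
    AI-assisted time should include the human review the study recommends."""
    saved_hours = tests_per_year * (minutes_manual - minutes_with_ai) / 60.0
    return saved_hours * hourly_rate

# Illustrative only: 2,000 scripts/year, 45 min manual vs. 15 min reviewed AI output, 80/h
print(estimate_annual_savings(2000, 45, 15, 80))  # -> 80000.0
```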
AI Test Automation Implementation Roadmap
A strategic phased approach to integrate generative AI into your testing workflow.
Phase 1: Pilot & Integration
Integrate AI-powered test case generation into a pilot project, focusing on a critical subsystem. Establish initial RAG context and few-shot examples.
Phase 2: Feedback & Refinement
Collect feedback from engineers on AI-generated scripts. Refine prompt design, context retrieval, and LLM configuration based on real-world usage.
Phase 3: Scaling & Training
Expand AI test generation to additional projects. Develop internal training programs for engineers to leverage and oversee AI effectively.
Phase 4: Advanced Features & Optimization
Explore advanced features like automated context retrieval, integration with CI/CD pipelines, and continuous LLM model optimization.
Ready to Transform Your Testing Workflow?
Connect with our experts to explore how Generative AI can revolutionize your test case generation and accelerate software delivery.