Precision Oncology & AI
Real-world application of large language models for automated TNM staging using unstructured gynecologic oncology reports
Manual cancer registry data entry is time-consuming and error-prone. This study demonstrates that Large Language Models (LLMs) can accurately extract TNM classifications from unstructured gynecologic oncology reports using prompt engineering alone, outperforming manual entries. Both the cloud-based model (Gemini 1.5) and the top-performing local model (Qwen2.5 72B) achieved high accuracy for T, N, and M classifications, highlighting their potential to improve data integrity and streamline clinical workflows without complex fine-tuning or data anonymization.
Executive Impact at a Glance
Key performance indicators demonstrating the potential of LLM integration in clinical data management.
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
Manual data entry in cancer registries is time-consuming and prone to error, with reported inaccuracy rates ranging from 5% to 17%. The complexity of TNM classification criteria and their frequent updates exacerbate these errors, underscoring the need for reliable and efficient data registration methods.
Cloud-based LLM (Gemini 1.5) achieved 0.994 accuracy for pT and 0.993 for pN, outperforming manual entries. The top-performing local model (Qwen2.5 72B) also showed strong results with 0.971 accuracy for pT and 0.923 for pN. These models effectively extracted pathological T and N classifications from unstructured reports using prompt engineering alone.
Gemini 1.5 achieved 0.909 accuracy for clinical M (cM) classification, and Qwen2.5 72B achieved 0.895. While still robust, M-stage performance was lower than for the T and N stages, often because peritoneal dissemination and extra-regional lymph node metastases were misinterpreted as M1.
The study utilized out-of-the-box LLMs with prompt engineering on raw, unstructured medical records, bypassing complex fine-tuning or data anonymization. Implementation in secure cloud-based and local offline environments ensures data confidentiality and practical applicability in clinical workflows. Pydantic-constrained decoding significantly improved output consistency and accuracy.
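As a rough illustration of the prompt-only approach, the sketch below shows one way such an extraction instruction could be framed. The wording, fields, and guideline references of the study's actual prompts are not reproduced here; everything in this template is an assumption.

```python
# A hypothetical extraction prompt; the study's actual prompt wording,
# field definitions, and guideline references are not reproduced here.
EXTRACTION_PROMPT = """You are assisting a cancer registry.
Read the pathology and clinical report below and extract the TNM
classification according to the applicable staging guideline.

Return ONLY a JSON object with the keys "pT", "pN", and "cM".
Use null for any classification that cannot be determined from the report.
Do not add explanations or any text outside the JSON object.

Report:
{report_text}
"""

def build_prompt(report_text: str) -> str:
    """Fill the template with one raw, unstructured report."""
    return EXTRACTION_PROMPT.format(report_text=report_text)
```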
Chart: percentage of pT classification errors in cervical cancer.
Enterprise Process Flow: Automated TNM Staging Workflow
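A minimal sketch of how such a workflow could be wired together, assuming a hypothetical `call_llm` client; it is meant to show the flow (report in, validated structured staging out), not the study's implementation.

```python
import json
from typing import Optional

def call_llm(prompt: str) -> str:
    """Placeholder for the LLM client (hypothetical); in practice this would
    call Gemini 1.5 over a secure cloud API or a local Qwen2.5 72B endpoint."""
    return '{"pT": "pT1b", "pN": "pN0", "cM": "cM0"}'  # canned output for the sketch

def extract_staging(report_text: str) -> Optional[dict]:
    """Run one unstructured report through the automated staging workflow."""
    # 1. Build the extraction prompt from the raw, unstructured report.
    prompt = f"Extract pT, pN and cM as a JSON object from the following report:\n{report_text}"
    # 2. Query the LLM (secure cloud or offline local deployment).
    raw_output = call_llm(prompt)
    # 3. Parse and sanity-check the structured output before registry entry.
    try:
        staging = json.loads(raw_output)
    except json.JSONDecodeError:
        return None  # flag for manual review instead of silently failing
    return {key: staging.get(key) for key in ("pT", "pN", "cM")}

print(extract_staging("Example report text ..."))
```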
LLM Performance: Cloud vs. Local
Both cloud-based and local LLMs achieved high accuracy; in this study the cloud model held a slight edge, likely reflecting its larger scale and continuous optimization.
| Feature | Cloud-based LLMs (e.g., Gemini 1.5) | Local LLMs (e.g., Qwen2.5 72B) |
|---|---|---|
| pT Classification Accuracy | 0.994 | 0.971 |
| pN Classification Accuracy | 0.993 | 0.923 |
| cM Classification Accuracy | 0.909 | 0.895 |
| Data Security & Compliance | Requires a secure cloud environment to preserve confidentiality | Runs offline, keeping data within the local environment |
| Customization & Fine-tuning | Prompt engineering only; no fine-tuning required in this study | Prompt engineering only; no fine-tuning required in this study |
Improving Data Integrity with Pydantic-constrained Decoding
Challenge: Conventional prompt-based structured output often leads to format variations, extraneous explanations, and structural inconsistencies, requiring manual post-processing and reducing reliability.
Solution: Implemented Pydantic-constrained decoding for forced JSON output generation, ensuring consistent and valid JSON formats without irrelevant text.
Impact: Significantly improved accuracy (mean difference 0.0268, p=0.004) and F1 score (mean difference 0.0271, p=0.006) for pT classification, eliminating output verbosity and structural inconsistencies and enhancing automation.
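A minimal sketch of what Pydantic-constrained decoding can look like, assuming Pydantic v2. The field names and allowed category values are illustrative, not the study's exact schema, and the way the JSON Schema is handed to the inference backend will depend on the serving stack used.

```python
from typing import Literal, Optional
from pydantic import BaseModel, Field, ValidationError

class TNMStaging(BaseModel):
    """Illustrative schema; field names and allowed values are assumptions."""
    pT: Optional[str] = Field(None, description="Pathological T, e.g. 'pT1b'")
    pN: Optional[str] = Field(None, description="Pathological N, e.g. 'pN0'")
    cM: Optional[Literal["cM0", "cM1"]] = Field(None, description="Clinical M")

# The JSON Schema derived from the model can be passed to an inference
# backend that supports schema- or grammar-constrained generation, so the
# LLM can only emit valid instances of TNMStaging.
schema = TNMStaging.model_json_schema()

def parse_output(raw: str) -> Optional[TNMStaging]:
    """Validate raw model output; invalid JSON is routed to manual review
    rather than entering the registry."""
    try:
        return TNMStaging.model_validate_json(raw)
    except ValidationError:
        return None

print(parse_output('{"pT": "pT1b", "pN": "pN0", "cM": "cM0"}'))
```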
Calculate Your AI Automation ROI
Estimate the potential cost savings and efficiency gains for your organization by automating manual data entry and classification tasks with LLMs.
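For a back-of-the-envelope version of this calculation, the sketch below estimates annual hours and cost saved when full manual entry is replaced by a shorter human review of LLM output. All input figures are placeholders to be replaced with your own registry's numbers, not values from the study.

```python
def estimate_annual_savings(
    reports_per_year: int,
    manual_minutes_per_report: float,
    review_minutes_per_report: float,
    hourly_cost: float,
) -> dict:
    """Rough ROI estimate: automation replaces full manual entry with a
    shorter human review step. All inputs are assumptions, not study data."""
    minutes_saved = reports_per_year * (manual_minutes_per_report - review_minutes_per_report)
    hours_saved = minutes_saved / 60
    return {
        "hours_saved_per_year": round(hours_saved, 1),
        "cost_saved_per_year": round(hours_saved * hourly_cost, 2),
    }

# Example with placeholder values: 5,000 reports, 10 min manual entry vs. 2 min review, $40/hour.
print(estimate_annual_savings(5000, 10, 2, 40.0))
```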
Your AI Implementation Roadmap
A phased approach to integrating LLM-based automation into your clinical data workflows.
Phase 1: Discovery & Strategy
Assess current manual processes, identify key data points for automation (e.g., TNM staging), and define success metrics. Develop a secure data handling strategy (cloud vs. local LLMs) and compliance framework.
Phase 2: Pilot & Validation
Implement LLM solution with prompt engineering on a subset of real-world, unstructured reports. Validate accuracy against ground truth, measure efficiency gains, and refine prompts based on initial results. Focus on secure environment setup.
Phase 3: Integration & Scale
Integrate the validated LLM solution into existing clinical workflows and registry systems. Train staff on new automated processes and monitoring protocols. Expand to broader datasets and additional classification tasks.
Phase 4: Optimization & Future-proofing
Continuously monitor LLM performance, update prompts for evolving classification guidelines, and explore advanced techniques like Pydantic-constrained decoding for further accuracy and consistency. Stay agile with LLM advancements.
Ready to Transform Your Data Management?
Schedule a personalized consultation with our AI specialists to explore how LLM automation can enhance accuracy, reduce workload, and future-proof your cancer registry operations.