Enterprise AI Analysis
Sentiment Analysis & Random Forest for LLM vs. Human Text Classification
Authors: Javier J. Sanchez-Medina
Since the launch of ChatGPT v.4 there has been a lively global discussion about the ability of this AI-powered platform, and of similar ones, to automatically produce all kinds of texts, including scientific and technical texts. This has prompted many institutions to reflect on whether education and academic procedures should be adapted to the fact that, in future, many of the texts we read will not be written by humans (students, scholars, etc.), at least not entirely. This work proposes a new methodology for classifying texts as coming from an automatic text production engine or from a human. It uses Sentiment Analysis as a source for feature engineering of the independent variables, which are then used to train a Random Forest classification algorithm. Using four different sentiment lexicons, a number of new features were produced and then fed to a random forest machine learning methodology to train such a model. The results suggest that this may be a promising line of research for detecting fraud in environments where humans are supposed to be the source of texts.
Executive Impact: Quantifying AI's Edge in Text Classification
This research presents a sentiment analysis-driven approach to distinguishing human-authored from LLM-generated scientific texts. On a 145-abstract dataset, the method reaches roughly 84% classification accuracy, making it a candidate tool for safeguarding academic integrity and content authenticity in the age of generative AI.
Deep Analysis & Enterprise Applications
The Challenge of AI-Generated Content
The rise of powerful Large Language Models (LLMs) such as ChatGPT makes it significantly harder to distinguish human-authored from machine-generated texts. This is particularly critical in academic and scientific fields, where authenticity and original thought are paramount.
1950: the year Alan Turing proposed the Imitation Game as a benchmark for Artificial Intelligence, a challenge that LLMs increasingly approach.
Comparing Text Origination
Aspect | Human-Authored Text | LLM-Generated Text |
---|---|---|
Production Speed | Time-consuming | Rapid generation (seconds) |
Originality & Nuance | High, with unique insights and style | Mimics existing patterns, potential for lack of true originality |
Detection Methods (Previous) | Often manual review, plagiarism tools, or complex statistical models requiring a human in the loop | Statistical approaches (e.g., perplexity), watermarking (requires LLM cooperation) |
Challenges | Variability in style, potential for error | Ethical concerns, factual inaccuracies, evolving mimicry of human text |
Our Novel Classification Methodology
We introduce a new approach that leverages sentiment analysis to create robust features for discriminating between human and LLM-generated scientific texts. These features feed into a Random Forest model for classification.
The sentiment-derived features used for classification were generated from four distinct lexicons: Bing, AFINN, NRC, and Loughran-McDonald.
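As a rough illustration of how lexicon lookups can be turned into classifier features, the sketch below uses two tiny made-up lexicons: one with polarity labels (in the style of Bing) and one with integer scores (in the style of AFINN). The word lists, the feature names, and the `sentiment_features` helper are assumptions for illustration only and are not the feature set used in the research.

```python
import re

# Toy lexicons, invented for illustration only. The real lexicons (Bing,
# AFINN, NRC, Loughran-McDonald) are far larger and are not reproduced here.
POLARITY_LEXICON = {"novel": "positive", "robust": "positive",
                    "failure": "negative", "stress": "negative"}
SCORE_LEXICON = {"novel": 2, "robust": 2, "failure": -2, "stress": -1}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_features(text):
    """Return a dict of hypothetical sentiment features for one document."""
    tokens = tokenize(text)
    n = len(tokens) or 1  # avoid division by zero on empty input
    pos = sum(1 for t in tokens if POLARITY_LEXICON.get(t) == "positive")
    neg = sum(1 for t in tokens if POLARITY_LEXICON.get(t) == "negative")
    score = sum(SCORE_LEXICON.get(t, 0) for t in tokens)
    return {
        "polarity_pos_ratio": pos / n,
        "polarity_neg_ratio": neg / n,
        "polarity_net": pos - neg,
        "score_sum": score,
        "score_mean": score / n,
    }


features = sentiment_features("A novel and robust response to drought stress.")
print(features)
```

Each abstract would yield one such feature row, and the rows, together with the class label, would form the training table for the classifier.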
Origin of Training Data
Our model was trained on a nearly balanced dataset of 145 instances: 72 human-authored abstracts from the journal 'New Phytologist' and 73 abstracts generated by ChatGPT v3.5 from the corresponding titles. With classes this close in size, overall accuracy is a meaningful summary metric.
Key Data Points: 145 Total Instances, 4 Sentiment Lexicons, Human & ChatGPT Sources
Performance & Future Outlook
The Random Forest model demonstrated strong performance in classifying texts, with an overall accuracy of 84.1% (122 of 145 instances correctly classified), suggesting this methodology is a promising avenue for detecting AI-generated text fraud.
Detailed Model Performance (Per Class and Weighted Average)
Metric | ChatGPT (Class 'a') | NewPhytologist (Class 'b') | Weighted Avg |
---|---|---|---|
TP Rate | 0.849 | 0.833 | 0.841 |
FP Rate | 0.167 | 0.151 | 0.159 |
Precision | 0.838 | 0.845 | 0.841 |
Recall | 0.849 | 0.833 | 0.841 |
F-Measure | 0.844 | 0.839 | 0.841 |
ROC Area | 0.880 | 0.880 | 0.880 |
Confusion Matrix Snapshot
The model correctly classified 62 ChatGPT instances and 60 NewPhytologist instances. Only 11 ChatGPT texts were misclassified as human, and 12 human texts were misclassified as ChatGPT, demonstrating good discrimination.
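The per-class and weighted figures in the performance table can be recomputed from these four confusion matrix counts. The sketch below is a plain-Python check under that assumption; the variable and function names are illustrative, and the weighting by true class size follows the usual convention for weighted averages.

```python
# Confusion matrix counts as reported: outer keys are the true class,
# inner keys are the predicted class.
confusion = {
    "ChatGPT": {"ChatGPT": 62, "NewPhytologist": 11},
    "NewPhytologist": {"ChatGPT": 12, "NewPhytologist": 60},
}
classes = list(confusion)
total = sum(sum(row.values()) for row in confusion.values())


def per_class_metrics(cls):
    """Compute TP rate (recall), FP rate, precision and F-measure for one class."""
    tp = confusion[cls][cls]
    fn = sum(confusion[cls].values()) - tp
    fp = sum(confusion[other][cls] for other in classes if other != cls)
    tn = total - tp - fn - fp
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "support": tp + fn,
        "tp_rate": recall,
        "fp_rate": fp / (fp + tn),
        "precision": precision,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


metrics = {cls: per_class_metrics(cls) for cls in classes}
accuracy = sum(confusion[c][c] for c in classes) / total

# Weighted average: each class contributes in proportion to its true count.
weighted = {
    name: sum(m[name] * m["support"] for m in metrics.values()) / total
    for name in ("tp_rate", "fp_rate", "precision", "f_measure")
}

for cls, m in metrics.items():
    print(cls, {k: round(v, 3) for k, v in m.items() if k != "support"})
print("weighted", {k: round(v, 3) for k, v in weighted.items()})
print("accuracy", round(accuracy, 4))
```

Rounded to three decimals, the computed values match the table rows for TP rate, FP rate, precision, and F-measure. ROC area is not included because it cannot be derived from the confusion matrix alone.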
Future Directions for AI Detection
While promising, future research will need to address the rapid evolution of LLMs, including GPT-4 and beyond. Exploring combinations of sentiment features with other statistical or linguistic analysis techniques, as well as continuous adaptation and validation, will be crucial for maintaining detection efficacy.
Next Steps: Adaptation, Integration, Ongoing Validation with Evolving LLMs
Your AI Implementation Roadmap
A structured approach to integrating advanced AI detection into your enterprise workflows.
Phase 1: Discovery & Strategy
Comprehensive assessment of your current text processing workflows, identification of key detection needs, and development of a tailored AI strategy.
Phase 2: Data Preparation & Feature Engineering
Collection and cleaning of relevant text datasets, generation of sentiment analysis-based features, and initial model prototyping.
Phase 3: Model Training & Validation
Training of robust classification models (e.g., Random Forest) using advanced techniques like stratified cross-validation, followed by rigorous performance validation.
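To make the idea of stratified cross-validation concrete, here is a minimal standard-library sketch that splits sample indices into folds while keeping the class proportions similar in each fold. The `stratified_folds` helper, the round-robin dealing strategy, the seed, and the choice of 10 folds are all assumptions for illustration; the class sizes are taken from the dataset description, and no fold count is stated on this page.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds, keeping class ratios similar."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped
        # so that fold sizes stay as even as possible.
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


# Class sizes taken from the dataset description; k=10 is an assumption.
labels = ["ChatGPT"] * 73 + ["NewPhytologist"] * 72
folds = stratified_folds(labels, k=10)

for fold_id, test_indices in enumerate(folds):
    held_out = set(test_indices)
    train_indices = [i for i in range(len(labels)) if i not in held_out]
    counts = Counter(labels[i] for i in test_indices)
    # A classifier would be fit on train_indices and scored on test_indices here.
    print(fold_id, len(train_indices), dict(counts))
```

In practice a library routine would normally replace this helper; the point of the sketch is only that every held-out fold contains both classes in roughly the same ratio as the full dataset.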
Phase 4: Deployment & Integration
Seamless integration of the trained AI detection system into your existing platforms and workflows, ensuring minimal disruption and maximum impact.
Phase 5: Monitoring & Optimization
Continuous monitoring of model performance, regular updates to adapt to new LLM capabilities, and ongoing optimization for accuracy and efficiency.