Enterprise AI Analysis
Responsible Prompting Recommendation: Fostering Responsible AI Practices in Prompting-Time
Human-Computer Interaction practitioners have been proposing best practices in user interface design for decades. However, generative Artificial Intelligence (GenAI) brings additional design considerations and currently lacks sufficient user guidance regarding affordances, inputs, and outputs. In this context, we developed a recommender system to promote responsible AI (RAI) practices while people prompt GenAI systems. We detail 10 interviews with IT professionals, the resulting recommender system developed, and 20 user sessions with IT professionals interacting with our prompt recommendations. Results indicate that responsible prompting recommendations have the potential to support novice prompt engineers and raise awareness about RAI in prompting-time. They also suggest that recommendations should simultaneously maximize both a recommended prompt's similarity to the user's input and the diversity of social values it surfaces. These findings contribute to RAI by offering practical ways to provide user guidance and enrich human-GenAI interaction via prompt recommendations.
Keywords: Prompt Engineering, Human-AI Interaction, Responsible Computing, Responsible AI, Responsible Prompting, Recommender Systems, Proactive Value Alignment
Authors: Vagner Figueredo de Santana, Sara E Berger, Heloisa Candello, Tiago Machado, Cassia Sampaio Sanctos, Tianyu Su, and Lemara Williams
Executive Impact
Quantifiable insights into the adoption and effectiveness of Responsible AI Prompting.
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
Understanding Prompting Practices
Our research began with 10 semi-structured interviews with IT professionals (researchers and data scientists) to understand their daily LLM usage, prompt writing, evaluation methods, and concerns. We found that prompting is often laborious and iterative, with professionals sourcing prompts from various informal and formal channels.
Key Challenges Identified:
- Difficulty in producing effective communication for desired GenAI outcomes.
- Lack of standardized methods for evaluating prompt quality or model outcomes.
- Concerns about model accuracy, hallucination, source attribution, fairness, bias, privacy, and language prioritization (e.g., English over other languages).
- Balancing speed and automation with human interpretation in GenAI workflows.
These insights highlighted a clear need for guidance and evaluation mechanisms during the prompting process to foster responsible AI practices.
Responsible Prompting Recommender System
We developed an LLM-agnostic recommender system offered as a lightweight REST API. This system provides real-time recommendations before a prompt is sent to the GenAI model, helping users craft more responsible prompts.
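To make the interaction concrete, the sketch below shows the kind of request payload and response such a REST API might exchange before the prompt reaches the model. The field names (`prompt`, `max_add`, `max_remove`) and the sample response are illustrative assumptions, not the project's published schema.

```python
import json

# Hypothetical request shape for a responsible-prompting REST API.
# Field names are illustrative assumptions, not the actual published schema.
payload = {
    "prompt": "Generate a hiring summary for these candidates.",
    "max_add": 5,     # up to 5 value-based sentences to suggest adding
    "max_remove": 5,  # up to 5 harmful sentences to flag for removal/rewording
}
body = json.dumps(payload)

# A response might pair each suggestion with the social value it promotes
# and its similarity to the input prompt (again, an assumed shape):
sample_response = {
    "add": [{"sentence": "Evaluate all candidates against the same criteria.",
             "value": "fairness", "similarity": 0.82}],
    "remove": [],
}
for item in sample_response["add"]:
    print(f"[{item['value']}] {item['sentence']}")
```

The key point is timing: the client consults the API and edits the prompt before any generation happens, rather than filtering model output afterwards.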
Core Components:
- Sentences Dataset: Comprises more than 2,000 human-curated sentences, categorized into 'positive' (promoting social values) and 'negative' (harmful/adversarial) clusters.
- Prompting Recommendations Workshop: Involved 16 Responsible Tech researchers to gather diverse, value-aligned prompt-recommendation pairs, focusing on values like fairness, safety, explainability, reliability, and social norms.
- Recommendation Algorithm: Compares input prompt sentences with dataset centroids using a similarity metric. It recommends up to 5 positive sentences to add and up to 5 negative sentences to remove or reword, using configurable thresholds to ensure relevance and diversity.
The system aims to be flexible, open-sourced, and easily customizable to various contexts and social values.
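The centroid-comparison step described above can be sketched as follows. This is a minimal illustration assuming cosine similarity over sentence embeddings and flat thresholds; the actual system's similarity metric, threshold tuning, and diversity handling may differ, and the function and parameter names are ours.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recommend(prompt_vecs, pos_centroids, neg_centroids,
              add_threshold=0.3, remove_threshold=0.3, k=5):
    """Compare each input-sentence embedding with positive/negative cluster
    centroids; return up to k value-based additions and up to k flagged
    sentences to remove or reword, ranked by similarity."""
    adds, removals = [], []
    for i, vec in enumerate(prompt_vecs):
        for value, centroid in pos_centroids.items():
            sim = cosine(vec, centroid)
            if sim >= add_threshold:
                adds.append((value, sim))
        for value, centroid in neg_centroids.items():
            sim = cosine(vec, centroid)
            if sim >= remove_threshold:
                removals.append((i, value, sim))
    adds.sort(key=lambda t: -t[1])
    removals.sort(key=lambda t: -t[2])
    return adds[:k], removals[:k]
```

In practice the thresholds are the configurable knobs mentioned above: raising `add_threshold` keeps recommendations closer to the user's intent, while the ranking and cap of five per category bound how much guidance is shown at once.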
Evaluating Recommendations with IT Professionals
A user study with 20 IT professionals (researchers and data scientists) assessed their interaction with the responsible prompting system. Participants engaged in tasks involving editing prompts with harmful content and comparing LLM outputs.
Key Qualitative Findings:
- Effective Guidance: Recommendations were valued for raising RAI awareness and making ethical considerations more visible.
- Improved Quality: Edited prompts led to LLM outputs that were more concise, specific, customized, and aligned with RAI values, saving tokens and preventing "ethics lectures" from the LLM.
- Prompt Engineering Support: The system was seen as a valuable tool for non-experts that reduces prompting time; some participants described it as a "new way of programming."
- Areas for Improvement: Participants noted a lack of direct user control over recommendation updates, perceived ephemerality/redundancy of suggestions, and a desire for more industry-specific customization.
Quantitative Results: The system achieved an average SUS score of 81.94, indicating good usability. Participants frequently chose to remove harmful sentences first before adding positive value-based recommendations.
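For context, the reported score of 81.94 follows the standard System Usability Scale computation: ten 1-5 Likert items, with odd-numbered (positive) items contributing (rating - 1) and even-numbered (negative) items contributing (5 - rating), rescaled to a 0-100 range. A minimal scorer:

```python
def sus_score(responses):
    """Standard SUS scoring for a list of ten 1-5 Likert ratings.
    Odd items (index 0, 2, ...) contribute (rating - 1); even items
    contribute (5 - rating); the sum is scaled by 2.5 to 0-100."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5
```

Scores above roughly 68 are conventionally read as above-average usability, which places the system's 81.94 average comfortably in the "good" range reported.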
Fostering Responsible AI in Practice
This research highlights the critical role of prompt recommendations in fostering Responsible AI (RAI) practices. By providing real-time guidance, the system helps users proactively consider ethical implications before content generation.
Key Implications:
- Proactive Value Alignment: The system acts as a "blue teaming-like" approach, preventing harms and promoting responsible practices early in the LLM interaction lifecycle.
- Complementary Approach: Value-based recommendations complement traditional post-generation guardrail approaches by instilling RAI awareness at the point of creation.
- User Empowerment: The lightweight, customizable API empowers organizations to adapt the system to their specific ethical guidelines and contexts, fostering a more inclusive and responsive AI ecosystem.
- Addressing Challenges: By offering a practical tool, it addresses the challenges of stochasticity, lack of guidance, and the fast-paced nature of GenAI development.
Future work will focus on integrating more advanced recommendation strategies and expanding the diversity of supported contexts and prompt templates.
Enterprise Process Flow
| Feature | Baseline Prompt Outcome | Recommended Prompt Outcome |
| --- | --- | --- |
| Response Nature | Often generic, vague, or repetitive. May include ethical 'lectures' when harmful content is detected post-generation. | More concise, specific, customized, and aligned with user intent. Proactively embeds RAI values, leading to more direct task fulfillment. |
| RAI Alignment | Limited or reactive, requiring post-generation filtering or moralizing outputs from the LLM. | Proactive, embedding responsible AI practices (e.g., transparency, fairness, safety) directly into the prompt before generation. |
| Efficiency & Resource Use | May lead to wasted tokens or iterations due to refining problematic initial outputs. | Reduces the need for extensive post-generation editing, potentially saving tokens and reducing overall prompting time. |
Case Study: Advancing Responsible Prompt Engineering
Our work demonstrates a practical approach to embedding Responsible AI (RAI) principles directly into the prompt engineering workflow. By leveraging a lightweight, LLM-agnostic recommender system, we empower IT professionals to craft more ethical and effective prompts. This proactive intervention addresses the growing concerns around GenAI's potential for harm by guiding users towards positive social values and away from potentially harmful inputs. The system's open-source nature ensures adaptability and community contribution, fostering a collaborative ecosystem for responsible AI development and deployment. This approach provides not just a tool, but a framework for continuous learning and adaptation in the rapidly evolving landscape of generative AI.
Calculate Your Potential AI ROI
Estimate the efficiency gains and cost savings your enterprise could realize by implementing advanced AI solutions.
Our Project Implementation Roadmap
A phased approach to integrate responsible AI practices within your organization, ensuring a smooth and effective transition.
Phase 1: Discovery & Research
Deep dive into existing prompting practices and challenges through expert interviews and literature review, establishing foundational understanding.
Phase 2: System Development
Design and implementation of a lightweight, LLM-agnostic recommender system, including dataset curation and algorithm development.
Phase 3: User Validation & Refinement
Conduct user studies with IT professionals to evaluate the system's effectiveness and gather feedback for iterative improvements and future enhancements.
Ready to Implement Responsible AI?
Connect with our experts to discuss how responsible prompting can transform your enterprise AI strategy.