Simulate the assistant with a human operator (a Wizard-of-Oz setup) to explore realistic conversations safely. Script the guardrails the operator must follow, log each decision the operator makes, and debrief participants immediately after the session. Disclose the setup to participants so the study stays ethical. Use the findings to harden prompts, clarify scope, and identify places where automation should never replace human judgment.
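As a rough illustration of scripting guardrails and logging operator decisions, the sketch below checks each operator reply against a list of blocked topics and appends the decision to a JSON Lines log. The names `BLOCKED_TOPICS`, `OperatorDecision`, `check_guardrails`, and `log_decision` are hypothetical, and the substring match is an assumption made for brevity; a real study would script its own rules.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import TextIO

# Hypothetical guardrail list; a real study would define its own.
BLOCKED_TOPICS = ("diagnosis", "legal advice")


@dataclass
class OperatorDecision:
    session_id: str
    user_turn: str
    operator_reply: str
    violations: list[str]
    timestamp: str


def check_guardrails(reply: str, blocked: tuple[str, ...] = BLOCKED_TOPICS) -> list[str]:
    """Return the blocked topics mentioned in the reply (case-insensitive substring match)."""
    lowered = reply.lower()
    return [topic for topic in blocked if topic in lowered]


def log_decision(log: TextIO, session_id: str, user_turn: str, operator_reply: str) -> OperatorDecision:
    """Write one operator decision to the log as a JSON line and return it."""
    decision = OperatorDecision(
        session_id=session_id,
        user_turn=user_turn,
        operator_reply=operator_reply,
        violations=check_guardrails(operator_reply),
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    log.write(json.dumps(asdict(decision)) + "\n")
    return decision
```

Opening the log file in append mode (`open(path, "a")`) keeps earlier decisions intact, and the recorded violations give the debrief a concrete list of moments to review.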
Iterate on prompts, microcopy, and turn-taking using text, voice, and visual affordances. Track how timing, latency, and interruptions affect comprehension. Prototype repair paths, confirmations, and summaries. Compare variants with A/B tests that combine qualitative feedback and behavioral metrics, then codify learnings in reusable patterns and checklists.
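For the behavioral side of an A/B comparison, one common option is a two-proportion z-test on a metric such as task completion. The sketch below is an assumption about how such a comparison might be run, not a prescribed method; `compare_completion_rates` and `ABResult` are hypothetical names, and qualitative feedback still has to be reviewed separately.

```python
import math
from typing import NamedTuple


class ABResult(NamedTuple):
    rate_a: float
    rate_b: float
    z: float
    p_value: float  # two-sided


def compare_completion_rates(success_a: int, total_a: int,
                             success_b: int, total_b: int) -> ABResult:
    """Two-proportion z-test on completion counts for variants A and B."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each variant needs at least one session")
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        # All sessions succeeded or all failed; the variants are indistinguishable.
        return ABResult(rate_a, rate_b, 0.0, 1.0)
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return ABResult(rate_a, rate_b, z, p_value)
```

The normal approximation behind this test is unreliable for small samples, so treat a result from a handful of sessions as a prompt for more testing rather than a conclusion.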
Collect only what you need, anonymize aggressively, and store identifiers separately from content. Version prompts, datasets, and evaluation scripts so that outcomes can be reproduced. Communicate risk clearly to participants and stakeholders. Share experiment notes with your team so future projects avoid repeating mistakes and can build on proven insights.
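A minimal sketch of both habits, under stated assumptions: `split_record` moves identifier fields into a separate mapping linked to the content only by a random token, and `fingerprint` derives a short hash from the prompt, dataset version, and evaluation script so a result can be tied to the exact inputs that produced it. The field names in `ID_FIELDS` and both function names are hypothetical.

```python
import hashlib
import json
import secrets

# Hypothetical identifier fields; adjust to whatever the study actually collects.
ID_FIELDS = ("name", "email")


def split_record(record: dict, id_fields: tuple[str, ...] = ID_FIELDS) -> tuple[str, dict, dict]:
    """Split a record into (token, identifiers, content).

    The random token is the only link between the two parts, so the
    identifiers can be stored elsewhere with stricter access controls.
    """
    token = secrets.token_hex(8)
    identifiers = {k: v for k, v in record.items() if k in id_fields}
    content = {k: v for k, v in record.items() if k not in id_fields}
    return token, identifiers, content


def fingerprint(prompt: str, dataset_version: str, eval_script: str) -> str:
    """Return a short, deterministic hash of the inputs behind an experiment run."""
    payload = json.dumps(
        {"prompt": prompt, "dataset": dataset_version, "eval": eval_script},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
```

Removing identifier fields does not scrub names or contact details that participants type into free-text turns, so transcripts still need their own review. Recording the fingerprint alongside each result makes it easy to tell when two runs are not comparable.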