AI · STRATEGIC DESIGN · 0→1 · PROMPT ENGINEERING

An AI that makes you rehearse the room before you walk in

You know the work is right — but the words don’t come out under pressure. Own The Room simulates the CEO, CTO, and PM you’re about to present to, asks what they’d actually ask, and scores how you answer.

ROLE

Product Designer: Strategy, Prompt Design, Build

MY CONTRIBUTIONS

Problem Framing, AI Workflow Design, Prompt Engineering, Build, Live Testing, Strategic Analysis

TIMELINE

5 Days

STATUS

Functional Prototype Shipped, Hi-Fi In Progress

THE PREMISE

The most important question in building an AI tool isn’t what the AI does. It’s what the human never gives up.

OVERVIEW

Have you ever had a great design idea but struggled to explain it to someone who isn't a designer?

You know the work is right. You know the research backs it. But the moment a CEO, a CTO, or a PM asks you to justify it on the spot, the words don't come out the way they should. That's the gap Own The Room is built to close.

This happens to a lot of designers. We're trained to think inside the work. We're not trained to translate that thinking on the spot, under pressure, to people who weren't in the room when the decisions were made.

Own The Room is an AI coaching tool that lets you practice that translation before the real conversation happens. You document your design context, upload it, and define your stakeholders. The AI then takes on each persona, asking the questions they'd actually ask, scoring your responses, and telling you specifically what to fix. The goal isn't to come more prepared on paper. It's to be able to say what you already know, clearly, to whoever's asking.

This case study documents the full architecture: the problem, the design decisions, the build, and the strategic analysis behind it. Every choice about what the AI owns and what it doesn't was deliberate. The final section applies the Five Capabilities Framework from HCDE 561: Strategic Human-Centered AI at the University of Washington to evaluate those decisions critically: where the tool works, where it introduces new risks, and what it means for how designers work with AI at scale.

01 — SITUATION

It's not a design problem. It's a translation problem.

A presentation was coming up with a CEO, a CTO, and a PM in the room. I'd spend hours writing notes about what I wanted to say. Sometimes I'd ask a colleague: "What do you think the CTO will ask?" They'd give me their best guess. I'd practice my answer in my head. Then I'd walk in and the CTO would ask something completely different, and I'd answer it like a designer: context-first, methodology-heavy, when what they needed was a business answer in 30 seconds.

I knew the work. I couldn't explain it to the room. And this isn't unusual.

In 2026, NN/g asked 150 designers their number-one problem. Half named the same thing — not design quality, not research methods, but alignment.

Nielsen Norman Group — What Designers Actually Struggle With on Product Teams

Tom Greever named the underlying tension in Articulating Design Decisions: "The most articulate person often wins." That's the uncomfortable truth behind every design review. You can have the best solution in the room and still lose the meeting because someone else was clearer, faster, and more tuned to what that specific audience needed to hear.

When that communication breaks down, it doesn't just lose a meeting. It loses budget, roadmap influence, and trust. NN/g's State of UX 2026 now lists stakeholder management as a core competency alongside research and design craft. The field has named the gap. What it hasn't built is a structured way to practice closing it.

TARGET USER

A product designer who has already done the work (research, decisions, rationale) and has a meeting with stakeholders who think differently than they do. Not for designers figuring out what they think. For designers who already know what they think but struggle to translate it when the room doesn't think like a designer.

02 — RESEARCH

Every existing tool solves an adjacent problem.

Speech coaches grade how you talk. Deck builders make the slides. None start from your design context, and none simulate the specific people in your specific room.

| TOOL | WHAT IT DOES | WHAT IT MISSES |
| --- | --- | --- |
| Yoodli · Orai · Speeko | AI speech coaches: pacing, filler words, delivery | Delivery only. They grade how you speak, not what you’re saying about your actual decisions |
| VirtualSpeech | VR practice environments for public speaking | Immersive but generic. No domain knowledge, no stakeholder specificity |
| Beautiful.ai · Gamma · Pitch | AI slide generation and deck building | Build the presentation, not the live conversation after you stop presenting |
| Toastmasters · LinkedIn Learning | General communication skills and courses | Async and generic. No simulation rooted in your work and your stakeholders |
| Asking a colleague | One honest opinion from someone who knows you | One guess, 10 minutes, and they may not know what your CTO actually cares about |

THE GAP

None start from your design context. None simulate a CEO or CTO asking questions rooted in what that role cares about. None tie feedback to what you specifically said. They teach you to speak better in general. Own The Room trains you for this meeting, with these stakeholders, about this design.

03 — FRAMEWORK

Two frameworks became the architecture.

One mapped who owns what. The other defined what good actually looks like. Together they decided every line the tool draws between human and AI.

TOP-DOWN DECOMPOSITION

Who owns what?

I took the full workflow and asked one question of each task: does AI do this better, or does the human? Simulating questions and scoring performance are exactly what AI is good at. Writing context and choosing stakeholders require insider knowledge AI is structurally bad at. AI owns analysis, synthesis, and simulation. The human owns context, judgment, and authorization.

| TASK | HOW IT HAPPENS TODAY | WHO OWNS IT |
| --- | --- | --- |
| Write design context | Designers write it (hours to days) | Human |
| Upload the context | .md / .docx / .pptx (1 minute) | Human |
| Choose design phase | Decide which phase to present (1 minute) | Human |
| Analyze the context | Memory or intuition (5 minutes) | |
| Deliver the analysis | No structured output exists today | AI delivers. Human reviews & approves |
| Simulate stakeholders | Not a standard preparation step | AI simulates by role. Human picks the room |
| Run the simulation | Practice alone, or not at all | AI plays the stakeholder. Human answers |
| Score & suggest fixes | Doesn’t happen (5–10 min when it does) | AI scores & explains. Human decides what to act on |

OUTPUT-FIRST

What does good actually look like?

The output-first framework inverts the normal design process. Instead of designing the workflow first, you start with the artifact you want at the end, described as specifically as possible, and work backward to what it takes to produce it reliably.

I wrote out what a good version of this tool would do and what a bad version would do. Not in categories, in specifics. A good version asks probing questions rooted in this designer's actual project. A bad version cycles through stock prompts like "Tell me about your process." A good version scores performance tied to what the designer actually said. A bad version gives hollow praise without substance.

That spec became the foundation of the master instructions that govern the AI in every session. The "what to avoid" list became hard stops. The "what AI needs to know" list became the pre-session checklist. Working backward forced me to be precise about failure modes before I'd built anything.

Output-first Prompt Sheet: Good vs. Bad, Worked Backward

04 — DESIGN DECISIONS

Six choices, and the tradeoffs I accepted.

Every decision names what it costs. A tool that only lists its strengths is hiding something.

A human authorization gate before every session

Why

The AI produces a five-part context summary before coaching begins — the design problem, key decisions, stakeholders, presentation focus, and company context. The AI stops completely until the designer reads it and explicitly confirms it's accurate. No probing questions, no feedback, no simulation until then.

Tradeoff

Friction at the start of every session. A designer eager to jump in has to slow down. That friction is intentional — it protects everything that comes after.

Tension

The check only works if the designer actually reads the summary carefully. A rushed confirmation defeats the entire mechanism. The tool is honest about that limitation and prompts the designer to check each section, but it can't force real attention.
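The gate described above can be pictured as a small state machine. The sketch below is only an illustration in Python: the prototype expresses this rule in prompt files, not code, and the `Session` class, its method names, and `GateError` are hypothetical. Only the five summary parts come from the case study.

```python
from dataclasses import dataclass, field

# The five summary parts named in the case study.
SUMMARY_PARTS = (
    "design problem",
    "key decisions",
    "stakeholders",
    "presentation focus",
    "company context",
)


class GateError(RuntimeError):
    """Raised when coaching is requested before the designer confirms."""


@dataclass
class Session:
    # Hypothetical session object; the prototype has no such class.
    summary: dict = field(default_factory=dict)
    confirmed: bool = False

    def present_summary(self, summary: dict) -> None:
        """Store a complete five-part summary and wait for confirmation."""
        missing = [part for part in SUMMARY_PARTS if part not in summary]
        if missing:
            raise ValueError(f"summary is missing: {missing}")
        self.summary = summary
        self.confirmed = False  # a new or revised summary resets confirmation

    def confirm(self) -> None:
        """Record the designer's explicit confirmation."""
        if not self.summary:
            raise GateError("nothing to confirm yet")
        self.confirmed = True

    def ask_question(self, question: str) -> str:
        """Release a coaching question only after confirmation."""
        if not self.confirmed:
            raise GateError("hard stop: summary not confirmed")
        return question
```

Note that `confirm()` records only that the designer said yes. As with the real tool, nothing here can check whether the summary was actually read.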

Data-only stakeholder simulation

Why

The AI pulls from trusted sources only — industry research, annual reports, reputable publications. “What’s your timeline?” is not what a CTO asks. The CTO informed by your company’s actual priorities asks something far more specific.

Tradeoff

It slows setup. The AI has to research the company and roles before simulating anything useful. Guessing would be faster — and wrong.

Tension

It can simulate what a CTO generically cares about. It cannot simulate your CTO — their history with you, the unspoken concerns. That gap is irreducible, and the tool returns it to the designer to fill.

Limitation transparency at the start of every session

Why

The biggest risk isn’t that the tool fails. It’s that it works well enough that you forget it’s not a prediction. So every session it states plainly: it doesn’t know your actual stakeholders, the politics, or the mood in the room.

Tradeoff

The disclaimer can feel tedious after enough use. But letting it fade is worse — confidence built through repetition could override the human's memory of what the tool can’t do.

Tension

After enough sessions the disclaimer becomes wallpaper. The tool can repeat the warning. It can’t make you keep taking it seriously.

Questions appear without preview

Why

If you know the questions in advance you’re memorizing answers, not practicing the skill. The discomfort of not knowing what’s coming is the point.

Tradeoff

Harder and less comfortable, especially early. An anxious designer gets no safety net.

Tension

There’s no difficulty setting. If it feels overwhelming early, there’s no way to ease in — you push through or stop.

Scoring tied to what the designer actually said

Why

“When you said X, you did Y well, but Z needs work. Try this instead.” Every note anchors to a real moment, graded Strong / Adequate / Weak by how the stakeholder would actually react. Vague feedback is the most common failure mode of coaching tools.

Tradeoff

It requires the AI to actually pay close attention to what the designer said, not just generate a generic response. When the AI misses nuance or scores something incorrectly, the format makes the error visible. That's a feature, not a bug. But it means the designer has to stay skeptical of the feedback, not just accept it.

Tension

The format makes errors visible, but creates a second problem: the designer must judge whether the AI’s evaluation is correct. If they trust it uncritically, flawed feedback can shape their practice. If they remain skeptical, they must second-guess the coaching while trying to learn from it. The format creates accountability, but assumes the designer is already skilled enough to recognize when the AI is wrong.
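One way to picture "anchored to a real moment" is as a check that every note quotes words that appear in the designer's answer. The following is a minimal Python sketch under that assumption. The `FeedbackNote` fields mirror the "When you said X, you did Y well, but Z needs work. Try this instead." format, but the class, the function names, and the matching rule (case- and whitespace-insensitive substring) are hypothetical and not taken from the prototype.

```python
from dataclasses import dataclass

LEVELS = {"Strong", "Adequate", "Weak"}


@dataclass(frozen=True)
class FeedbackNote:
    quote: str        # X: the designer's own words
    did_well: str     # Y: what worked
    needs_work: str   # Z: what to fix
    try_instead: str  # suggested rewording
    level: str        # one of LEVELS


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so small differences still match."""
    return " ".join(text.lower().split())


def is_anchored(note: FeedbackNote, transcript: str) -> bool:
    """True when the quoted words appear in what the designer said."""
    quote = normalize(note.quote)
    return bool(quote) and quote in normalize(transcript)


def validate(note: FeedbackNote, transcript: str) -> list:
    """Return a list of problems; an empty list means the note passes."""
    problems = []
    if note.level not in LEVELS:
        problems.append(f"unknown level: {note.level!r}")
    if not is_anchored(note, transcript):
        problems.append("quote not found in transcript")
    return problems
```

A check like this catches feedback that cites something the designer never said. It cannot tell whether the evaluation attached to a real quote is correct, which is the judgment the designer still has to make.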

Two-skill architecture with explicit dependencies

Why

A check-context-validation skill runs the verification check; a design-stakeholder-coach skill runs the simulation and cannot run until validation completes. That dependency is written into the file, not assumed. Two files means two responsibilities; the check can’t be skipped or overridden by coaching logic.

Tradeoff

Two files mean more to maintain and more places for inconsistency. An index file exists to manage that: a reference catalog of what exists and where.

Tension

The guardrails are written in at three levels: the system-prompt.md, the design-stakeholder-coach.md, and the check-context-validation.md, so skipping validation requires overriding all of them. The redundancy is the protection. But that also means the architecture is more robust to the AI than to the human: a designer who's impatient can simply say "the context is fine, let's go" without actually reading the summary, and the system accepts it. The hard stop is real. What it can't enforce is whether the confirmation is genuine.
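The dependency between the two skills can be sketched as a tiny graph. This is an illustrative Python version only: the skill names come from the case study, while the `REQUIRES` map, `can_run`, and `run_order` are hypothetical, since the prototype states the dependency in prose inside the skill files.

```python
from graphlib import TopologicalSorter

# Each skill maps to the skills that must finish before it may start.
# Skill names are from the case study; this map format is an assumption.
REQUIRES = {
    "check-context-validation": [],
    "design-stakeholder-coach": ["check-context-validation"],
}


def can_run(skill: str, completed: set) -> bool:
    """A skill may run only when all of its dependencies have completed."""
    return all(dep in completed for dep in REQUIRES[skill])


def run_order() -> list:
    """Return the skills in an order that respects every dependency."""
    return list(TopologicalSorter(REQUIRES).static_order())
```

With this map, `can_run("design-stakeholder-coach", set())` is false until validation is added to the completed set. As the tension above notes, what "completed" means still rests on the designer's confirmation being genuine.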

05 — THE HUMAN OWNERSHIP MODEL

Design what the human never gives up.

When I mapped the workflow using top-down decomposition, one pattern stood out. For AI to do something well, it needs two things: something concrete to work with, and a way to check whether it did the job. Take analyzing your context document: you give it the file, it reads it, and you can tell whether it understood correctly. Input is the file. Output is the summary. You can evaluate it. In contrast, AI struggles with tasks like understanding the context you didn't write down, knowing your relationships, and reading the room, because they require being embedded within the situation.

Own The Room is built on that distinction. The human owns everything that requires situational knowledge, not as a safety guardrail added later, but because those tasks are structurally unfit for AI delegation. AI cannot know what the designer left out of the document, the history between them and their CTO, or when they have practiced enough to present for real.

So the tool doesn't try to own those things. The designer decides what context to share. The designer names who the stakeholders are — the AI researches what each role actually cares about, drawn from the company's publicly available information and trusted industry sources, not forums or social media. The designer approves the AI's context summary before coaching begins. The designer decides which feedback to act on. The designer decides when they're done.

The human owns everything that requires being inside the situation: judgment, insider knowledge, and accountability. This is not a guardrail bolted on afterward, but a recognition that these tasks are structurally unfit for AI. That is not a compromise. That is the design.

06 — THE MOMENT BEFORE THE BUILD

I handed my own skill files to a fresh AI and asked where they fail.

The most important skill wasn't knowing how to write a skill file. It was knowing how to review one.

The Self-Imposed Critique That Caught Three Failures

1 — The memory promise. The skill claimed to "maintain continuous memory" across sessions; each session actually starts fresh. The critic was right about the problem but misunderstood my intent. The designer should never have to prompt, “refer to last time”—that would defeat the purpose. The tool should automatically learn from past sessions, including how the designer communicates, recurring patterns, and areas of difficulty, so each rehearsal builds on the last instead of starting from zero. How that history is stored and retrieved is still open, but the required behavior is clear: the system should already know.

2 — Inferring unwritten dynamics. The critic identified a real problem: AI cannot reliably infer unwritten rules from a brief description. I addressed this by placing that responsibility with the designer. Instead of only asking what stakeholders care about, the prompt asks why they care, encouraging the designer to surface the underlying dynamics rather than expecting the AI to guess. The AI coaches within the context provided; the designer owns the political context.

3 — Undefined scoring rubric. The critic identified another real issue: without defined criteria, numerical scores would be arbitrary and inconsistent, meaning two designers could receive very different scores for similar answers. I replaced the scale with three outcome-based levels tied to likely stakeholder reactions:

Strong — convinced or engaged

Adequate — understands but may still have doubts

Weak — confused or unconvinced

This makes the feedback easier to interpret and more consistent across rehearsals.
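The consistency argument is easier to see when the rubric is written as a fixed lookup: the same predicted stakeholder reaction always produces the same level. Below is a small Python sketch under that assumption. The three levels and the reaction wording come from the rubric above; the `Level` enum, the lookup table, and the `grade` function are hypothetical names for illustration.

```python
from enum import Enum


class Level(Enum):
    STRONG = "Strong"
    ADEQUATE = "Adequate"
    WEAK = "Weak"


# Reaction wording taken from the three-level rubric.
REACTION_TO_LEVEL = {
    "convinced": Level.STRONG,
    "engaged": Level.STRONG,
    "understands but may still have doubts": Level.ADEQUATE,
    "confused": Level.WEAK,
    "unconvinced": Level.WEAK,
}


def grade(reaction: str) -> Level:
    """Map a predicted stakeholder reaction to exactly one rubric level."""
    key = reaction.strip().lower()
    if key not in REACTION_TO_LEVEL:
        raise ValueError(f"reaction not covered by the rubric: {reaction!r}")
    return REACTION_TO_LEVEL[key]
```

Rejecting reactions outside the rubric, rather than guessing a level, keeps two designers with similar answers from drifting toward different scores.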

07 — THE BUILD

A system prompt and two skill files, testable the same day.

Own The Room runs on Claude.ai Projects. The fastest path from idea to working tool was a single document I could test the day it was written. The interactive web tool builds on this foundation.

system-prompt.md

The full behavior spec, pasted directly into the Claude Project prompt field.

master-instructions.md

The governing rules: tone, data rules, what the human controls, hard stops.

design-stakeholder-coach.md

The coaching simulation skill, its sequence, and what it must never do.

check-context-validation.md

The verification check: five-part summary, the hard stop, the limitation reminder.

output-first-prompt-sheet.md

The artifact spec that preceded the build: what good looks like, what to avoid.

SKILLS_INDEX.md

A reference catalog of every skill and where to find it.

08 — THE TESTING

I ran the full session as a designer would and it caught something I didn't anticipate.

I tested the tool directly in Claude Chat by setting up a Project with the system prompt and running the full session as a designer would. I uploaded my Grab a Seat case study as design context, told the tool my company was Google, named the CEO, CTO, and PM as my stakeholders, and said I wanted to present the research and ideation phases.

Before any coaching, the tool produced a full five-part summary: the design problem, the key decisions, a profile of each stakeholder grounded in what their role at Google typically cares about, all tied to Google's publicly available priorities. Then it stopped, and flagged a gap it couldn't fill:

"Is this a portfolio review at Google? Or are you pitching this as a potential Google product? The answer changes everything — including what these stakeholders would ask."

The Context Validation Check, Before a Single Coaching Question

It also recognized that Google already operates in health, which would shape how stakeholders evaluate the pitch. It then asked me to confirm that the revised summary accurately reflected my work, the company, and the people in the room.

THE GAP

That two-step catch (surface the ambiguity and update the context, then confirm before proceeding) is exactly the behavior the validation check skill was built to enforce. A coaching session built on the wrong premise trains you for the wrong conversation.

Live Simulation Video

Then the simulation genuinely challenged me. The CEO leaned in: "A board game is a nice proof of concept. What I'm trying to see is what this actually looks like at Google. What are we building?" My first answer was vague: I said "a health tool," mentioned Google Health, and contradicted myself by calling the game "not necessarily educational" right after describing it as awareness-building. The CEO flagged both on the spot.

It kept pushing on the harder question, "Why does Google specifically need to build this?", offering three framings and asking which was actually in my head. Then came the metrics question, where I didn't have a clean answer. I described distribution; it flagged that distribution isn't a success metric and made me choose: reach, behavior change, or brand association. I picked behavior change, and it immediately raised the next problem, self-report bias: how would you actually validate that claim?

Scorecard: five dimensions, anchored to what I said

The scores reflected exactly where I fell apart:

Clarity — Adequate. I landed on a coherent vision, but the CEO had to pull it out of me exchange by exchange instead of me leading with it.

Evidence — Weak. Solid research, but I never anchored the pitch to it — never showed why the board-game format specifically solves the barrier the research found.

Stakeholder alignment — Adequate. Connecting to Google's mission was smart, but I never addressed why Google should own a standalone cultural product versus partnering.

Handling objections — Adequate. I pivoted when pushed, but didn't push back or show conviction.

Persuasiveness — Weak. The CEO would understand the idea. They wouldn't be convinced it's a Google product yet.

Then three specific fixes — lead with the research insight, not the product; connect the game's mechanics to the exact barrier research identified; get specific on measurement before the CEO asks. Finally it offered a choice: run the CEO round again, move to the CTO, or pause and reflect. That structure — score, specific language to fix it, then your choice of what to do next — is what makes it different from rehearsing alone.

WHAT BROKE

Voice was the intended delivery mode — it makes the simulation feel like a real meeting. The AI voice held for a few minutes, then degraded into static, like a walkie-talkie losing signal. The simulation was working; the voice couldn't sustain it. That's a platform limitation, not a concept limitation.

09 — VISUAL CONCEPT

Dark editorial — focused, high-stakes, mirroring the pressure.

The tool was tested as a functional prototype first; the functionality had to work before the interface was designed. Amber as the only warm accent. Syne for headings, DM Sans for UI. Clean, modern, no noise.

The four key screens — hi-fi in progress, built on the tested foundation.

Setup: upload context · choose phases · define stakeholders

Simulation: one stakeholder at a time · no preview · live indicator

Feedback: five dimensions · tips anchored to moments

Moodboard: color · type · tone direction

10 — WHAT I LEARNED

Test with other designers earlier.

I tested the tool on myself, with my own context — a biased sample. I knew the context too well to notice what the AI might miss. Testing with a designer whose work I don't know would surface gaps in the validation check that self-testing can't find.

And voice is a platform limitation, not a design flaw. The concept is right; the current platform can't sustain it. When a voice tool capable of holding a live simulation exists, that's the first thing worth revisiting.

11 — NEXT STEPS

More rounds with different designers. The second round is underway — a different case study, a designer who wasn't involved in building it. The goal: find where the validation check fails for someone who doesn't already know how to write a good context document.

Hi-fi screens. Full interaction states — loading while the AI reads context, the confirmation screen, the simulation in active state.

Deeper personalization over time. Each session building on the last automatically — testing whether that actually changes how designers perform.

Voice, revisited. The concept is right; the technical execution isn't. Once the core is stable, voice on a platform that can sustain it is the next meaningful upgrade.

12 — STRATEGIC REFLECTION

The Five Capabilities, applied honestly.

A course-developed analytical lens from HCDE 561: Strategic Human-Centered AI, for evaluating AI-integration decisions, used here to find where the tool works, where it introduces new risk, and what it means at scale.

Extraction

AI pulls knowledge out of people · grounded in Polanyi on tacit knowledge

Constructive

Uploading my context felt like a collaborator reading my work carefully and reflecting it back — catching gaps in my rationale before I walked into a simulation. It augmented my thinking rather than replacing it.

Critical

The tool only works if you share everything — full rationale, stakeholder context, company information. The more you give, the better it works. For sensitive or competitive projects, that transparency has a cost worth examining.

Tension

AI needs enough context to coach effectively, but trust is earned through repeated use, and each iteration invites the designer to share more. The unresolved question is where to draw the line between “enough context to be useful” and “more than I should disclose.”

Workflow Adaptation

AI reshapes how work gets done · grounded in Pfeffer & Sutton’s knowing-doing gap

Constructive

Before this, I prepared entirely from a designer’s perspective — what I wanted to say, not what each person cared about. Describing each stakeholder and being challenged by questions I hadn’t anticipated made me think from multiple perspectives at once. A different cognitive move.

Critical

It only works if the human has done the hard work first. Skip the context, the stakeholders, the phases — or do them shallowly — and the simulation is shallow. The tool doesn’t reduce that upfront work, it just makes the practice that follows more useful.

Tension

Am I getting better at communicating design decisions, or better at passing a simulation? The real test is whether the clarity transfers to a real meeting with people who don’t follow the script.

Compliance & Safety

AI intersects with risk and responsibility · grounded in Perrow and Vaughan

Constructive

The tool's upfront statement of its limits made me feel safer. Framing the experience as rehearsal rather than prediction helped me use the feedback without over-trusting it. If the real meeting unfolds differently, I will not feel blindsided or assume the tool failed. By managing expectations instead of claiming certainty, the tool built a more durable form of trust.

Critical

A disclaimer only works if internalized. After enough successful sessions, confidence can start to feel like certainty — the disclaimer fades to background, and you walk in believing the simulation was more accurate than it was.

Tension

The tool is honest about what it can’t do — but confidence built through repetition might override that honesty in the moment it matters most.

Model Operations

Governing and managing AI technologies · grounded in Simon’s bounded rationality

Constructive

Building on Claude.ai Projects with a system prompt gave me speed — no code and no one else involved. A working tool the same day. That low barrier meant I could test ideas quickly and fix things in real time.

Critical

The system prompt is the only thing controlling behavior, and I’m the only one watching it. If the AI shifts after a model update, there’s no alert, no review, no team to catch it. Fine for one person; not fine for a team.

Tension

The same thing that made it easy to build — one person, one file, no process — is what makes it fragile at any larger scale.

Functional Replacement

AI as replacement of roles · grounded in Hutchins’ distributed cognition

Constructive

No formal role is replaced. What the tool gives is availability — practice at midnight before a big presentation, without finding a colleague who’s free and explaining the whole context from scratch. Always ready, always patient.

Critical

Something less visible is replaced. Asking a colleague “what will my CTO ask?” prepared you and built a relationship. Own The Room replaces that conversation with a simulation — the knowledge transfer works, the relationship doesn’t.

Tension

The tool is better at being available; a colleague is better at actually knowing you. If every designer practices alone with AI, the informal coaching culture between colleagues slowly disappears — efficiency gained, quiet trust lost.


You made it to the bottom. Let’s have a quick chat!

Talk to Gracia →
