Prompt engineering as design work

Two years ago, “prompt” was an engineering term. Today, in teams shipping AI, designers write prompts shoulder to shoulder with data science. It’s not coding. It’s not copywriting. It’s a new discipline, with its own aesthetic, and it leverages exactly the kind of thinking designers already do.

I’ll cover five patterns worth knowing, plus a reverse-engineering technique that changes the way I start any new prompt.

Why prompts are design work

The prompt is the interface between human intent and system behaviour. It defines tone, format, limits. It’s the agent’s microcopy, but with more weight: a label’s microcopy lives on a screen; a prompt defines the universe of possible responses.

When data science writes the prompt alone, it usually comes out correct and soulless. When design writes alone, it usually comes out with soul and without technical precision. The good work happens at the intersection. More and more I see teams where the designer writes the first version, takes it to the OpenAI Playground or equivalent, tests, and only then hands off to data science to tune.

The five core patterns

1. Few-shot prompting. Show examples before the task. Instead of describing in words what you want, show 2 to 5 input → output examples. The model extracts the pattern. Works very well for tone, format, and structure.

Input: "I want Thai food near me"
Output: "Found 12 Thai restaurants within 2 km. Want to see the top-rated ones first or those with fastest delivery?"

Input: "vegetarian lunch under 10€"
Output: "Found 8 vegetarian options under 10€ in your area. Want to filter by gluten-free or do you have another preference?"

From these two examples, the model infers that a reply should confirm what the search found, offer a relevant next filter as a question, and stay conversational without overdoing it.
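If you want to try this outside a playground, here is a minimal sketch of how those pairs could be assembled into a prompt. The helper name `build_few_shot_prompt` and the third query are my own illustrations, not part of any library; the two example pairs are the ones above.

```python
# Hypothetical helper: turns input/output pairs into a few-shot prompt.
EXAMPLES = [
    ("I want Thai food near me",
     "Found 12 Thai restaurants within 2 km. Want to see the top-rated "
     "ones first or those with fastest delivery?"),
    ("vegetarian lunch under 10€",
     "Found 8 vegetarian options under 10€ in your area. Want to filter "
     "by gluten-free or do you have another preference?"),
]


def build_few_shot_prompt(examples, new_input):
    """Render each pair as Input/Output, then leave the last Output open."""
    blocks = [f'Input: "{user}"\nOutput: "{reply}"' for user, reply in examples]
    # The final block has no answer: the model is expected to complete it.
    blocks.append(f'Input: "{new_input}"\nOutput:')
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # "sushi open after 22:00" is an invented query for illustration.
    print(build_few_shot_prompt(EXAMPLES, "sushi open after 22:00"))
```

Keeping the examples in a list rather than pasted into one long string makes it easy to swap a pair and see how the tone of the answers shifts.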

2. Chain of thought. Ask for step-by-step reasoning before the answer. “Think out loud before answering” is, literally, part of the prompt. Output quality tends to rise noticeably on tasks that involve several decisions or intermediate steps.

3. ReAct. Cycles of reasoning + acting (the name combines the two). The agent thinks, picks an action (e.g. calls a tool), observes the result, adjusts its reasoning, and picks the next action. It’s the dominant pattern in autonomous agents like Claude Code.

4. Reflection. After producing an output, the agent looks at its own output and adjusts. “Critique this answer. Is it clear? Does it match the tone asked? Rewrite if needed.” Simple, powerful trick.

5. Role priming. Define the agent’s role at the start of the prompt. “You are an experienced food concierge, good-humoured, focused on families with kids.” Not magic, but it works.
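Of the five, the ReAct cycle (pattern 3) is the one I find easiest to understand as code. Below is a minimal sketch of the think, act, observe loop. Everything in it is an assumption for illustration: `fake_model` stands in for a real model call, `search_tool` returns hard-coded data, and the restaurant names are made up.

```python
def fake_model(transcript):
    """Stand-in for a real model call (hypothetical).

    Returns a (thought, action, argument) tuple based on the transcript so far.
    """
    if "Observation:" not in transcript:
        return ("I need restaurant data first.", "search", "thai")
    return ("I have results, so I can answer.", "finish",
            "Found 2 Thai restaurants nearby.")


def search_tool(query):
    """Hypothetical tool backed by hard-coded, invented data."""
    data = {"thai": ["Baan Thai", "Siam Garden"]}
    return data.get(query, [])


TOOLS = {"search": search_tool}


def react_loop(question, model=fake_model, max_steps=5):
    """Alternate reasoning and acting until the model chooses 'finish'."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        thought, action, argument = model(transcript)
        transcript += f"Thought: {thought}\nAction: {action}[{argument}]\n"
        if action == "finish":
            return argument, transcript
        # Run the chosen tool and feed the result back as an observation.
        observation = TOOLS[action](argument)
        transcript += f"Observation: {observation}\n"
    return None, transcript  # no answer within max_steps


if __name__ == "__main__":
    answer, transcript = react_loop("I want Thai food near me")
    print(transcript)
    print("Answer:", answer)
```

The `max_steps` cap is a design choice worth noticing: without it, an agent that never decides to finish would loop forever.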

The technique that changes everything: reverse engineering

The most useful thing I’ve learned about writing prompts is not to start with the prompt.

I start with the ideal message I want to see on screen. I mock it in a text file. I make variations: what would it look like if the user asked something simple, what if they asked something ambiguous, what if the system failed. I end up with 5 to 10 mocked outputs in plain text.

Only then do I ask: what prompt would produce these?

The advantage is twofold. First, I write the mocks with a designer’s instinct (tone, rhythm, copy decisions), where I’m strongest. Second, when I hand it to data science, I’m not explaining the prompt in the abstract. I’m showing the desired output. They get it in seconds.

This method also solves a classic problem: we tend to write prompts for one situation in our heads and the agent generalises poorly. By mocking 5 to 10 outputs across different situations, the prompt that comes out covers real variation.
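One way to keep yourself honest about that variation is to tag each mock with the situation it covers and check for gaps before writing the prompt. This is only a sketch under my own assumptions: the `Mock` structure, the three situation labels, and the "ambiguous" mock text are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Mock:
    situation: str     # assumed labels: "simple", "ambiguous", "failure"
    user_input: str
    ideal_output: str


REQUIRED_SITUATIONS = ("simple", "ambiguous", "failure")


def missing_situations(mocks, required=REQUIRED_SITUATIONS):
    """Return the situations that have no mocked output yet."""
    counts = Counter(m.situation for m in mocks)
    return [s for s in required if counts[s] == 0]


MOCKS = [
    Mock("simple", "I want Thai food near me",
         "Found 12 Thai restaurants within 2 km. Want to see the top-rated "
         "ones first or those with fastest delivery?"),
    # Invented example of an ambiguous request and a possible ideal reply.
    Mock("ambiguous", "something light",
         "Light as in a salad, or light on the wallet? I can search either way."),
]

if __name__ == "__main__":
    # The failure case has not been mocked yet, so it is reported as a gap.
    print(missing_situations(MOCKS))
```

The same list of mocks can later double as few-shot examples or as a reference when reviewing real outputs.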

Boundaries: the part many forget

A prompt without boundaries is a prompt that will break your heart in production. Users will ask things you didn’t think of. The agent will make trade-offs you didn’t want.

Real example: in AI-driven restaurant discovery, when there are no exact results, the agent relaxes filters. It expands the radius from 1 km to 2 km. It lifts the budget from 10€ to 15€. Reasonable. But if the user asked for “gluten-free” or “no peanuts”, that’s a dietary restriction with medical risk. It must never be relaxed.

I solved this by adding a hard rule to the prompt: “Never relax allergy or intolerance filters. If there are no results matching the requested restriction, tell the user clearly and offer to expand the location or time slot instead.”

Boundaries like this become essential parts of the prompt. They aren’t polish; they’re what separates a usable product from one that can hurt.
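The same boundary can also be expressed in the search logic itself, so it does not rest on the prompt alone. Here is a minimal sketch of that idea, not the actual product code: the `Query` and `Restaurant` types, the function names, and the sample data are all hypothetical. Only the radius and budget steps (1 km to 2 km, 10€ to 15€) come from the example above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Query:
    radius_km: float
    budget_eur: float
    dietary: frozenset = frozenset()   # hard filters, e.g. {"gluten-free"}


@dataclass(frozen=True)
class Restaurant:
    name: str
    distance_km: float
    price_eur: float
    safe_for: frozenset = frozenset()


def matches(r, q):
    """A restaurant matches only if it satisfies every dietary restriction."""
    return (r.distance_km <= q.radius_km
            and r.price_eur <= q.budget_eur
            and q.dietary <= r.safe_for)


def relaxed_versions(q):
    """Yield the query, then relaxed copies. Only soft filters ever change."""
    yield q
    wider = replace(q, radius_km=max(q.radius_km, 2.0))
    yield wider
    yield replace(wider, budget_eur=max(q.budget_eur, 15.0))


def search(q, restaurants):
    """Return (hits, query_used); the dietary set is never touched."""
    for attempt in relaxed_versions(q):
        hits = [r for r in restaurants if matches(r, attempt)]
        if hits:
            return hits, attempt
    return [], q  # caller should tell the user clearly and offer alternatives


# Invented sample data for illustration.
RESTAURANTS = [
    Restaurant("A", 1.5, 9.0),
    Restaurant("B", 1.8, 14.0, frozenset({"gluten-free"})),
]

if __name__ == "__main__":
    hits, used = search(Query(1.0, 10.0, frozenset({"gluten-free"})), RESTAURANTS)
    print([r.name for r in hits], used.radius_km, used.budget_eur)
```

Because `relaxed_versions` never modifies `dietary`, a request with a restriction nobody can satisfy returns an empty list rather than an unsafe suggestion.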

How to fold this into your flow

Three concrete changes to your process:

1. Add “mock the output” as a step before any prompt. Even if it’s just three sentences in a text file.
2. Get access to the OpenAI Playground or your data science team’s equivalent. Without testing prompts, it’s theory.
3. Treat boundaries as deliverables. List them in Figma, in Notion, wherever it fits. They become part of the hand-off.

For more on why this shift is happening, see the Design for AI guide. For how the output stops being fixed and what that asks of the method, see From deterministic to probabilistic design.