
How to Generate an App Spec from a Prompt (Complete 2026 Guide)

The exact pipeline for turning a plain-English product idea into a complete app specification: personas, features, user stories, acceptance criteria, schema, and pages. With prompt templates and working examples.

Ash Metwalli
April 20, 2026
11 min read
Tags: AI product planning, app specification, prompt engineering, product management, AI PRD

TL;DR

Generating an app specification from a prompt is now a repeatable pipeline, not a one-shot prompt. The stack: describe the product → extract personas → map features → decompose into INVEST-format user stories → attach Gherkin acceptance criteria → derive a relational schema → inventory pages and components. Each stage takes the previous stage's output as input, so the final spec is consistent rather than a collage of disconnected LLM calls. This guide walks through the full pipeline, gives you copy-paste prompt templates for each stage, and shows how VibeMap automates the whole thing in under 30 minutes. For the broader context, see our pillar guide on AI product planning.

Why the single-prompt approach fails

The first instinct most people have is to paste their entire product idea into ChatGPT and ask for "a complete app specification". The output is technically what you asked for — a wall of text with sections labelled Personas, Features, Stories, Schema. But when you try to use it, three structural problems appear:

  1. No linking. A persona named "Alex the Solopreneur" appears in the Personas section but is gone by the time the Stories section is generated; those stories reference a nameless "user". When a reader asks which stories serve Alex, there is no answer.
  2. Inconsistent granularity. The Features section has 4 high-level headings; the Stories section has 47 items; the Schema section has 3 tables. There is no way to know which story maps to which feature or which schema field.
  3. Drift. Ask the same model the same question tomorrow and you'll get a different Personas section. Nothing is persistent.

The fix is to run the pipeline in stages, with each stage's output serialized and passed as explicit context to the next. That's what VibeMap does under the hood. You can do it manually with a general LLM — the rest of this guide shows you how.
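
If you script this yourself, the whole pipeline reduces to a loop that appends each stage's serialized output to the context of the stages that depend on it. A minimal sketch in Python, assuming a hypothetical call_llm(model, prompt) wrapper around whichever SDK you use:

def call_llm(model: str, prompt: str) -> str:
    # Hypothetical wrapper around your LLM SDK of choice
    raise NotImplementedError

def run_pipeline(idea: str, stages: list[dict]) -> dict:
    state = {"idea": idea}  # serialized output of every stage lives here
    for stage in stages:
        # Pass prior outputs as explicit context, not as buried chat history
        context = "\n\n".join(f"{dep}:\n{state[dep]}" for dep in stage["inputs"])
        state[stage["name"]] = call_llm(stage["model"], stage["template"] + "\n\n" + context)
    return state

Because state holds every stage's output, a later stage sees the exact JSON an earlier one produced, and rerunning one stage never silently regenerates the rest.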

The seven-stage pipeline

Each stage below reads: stage name, input → output, then the model tier to route it to.

  1. Summary: freeform product description → 200–400 word project summary. Fast model (Gemini Flash / GPT-4o mini).
  2. Personas: summary → 3 personas with goals, pains, and a quote. Fast model.
  3. Features: summary + personas → MoSCoW-prioritized feature list. Fast model.
  4. User stories: features + personas → INVEST-format stories. Reasoning model (Claude Opus / GPT-5).
  5. Acceptance criteria: user stories → Gherkin Given/When/Then per story. Reasoning model.
  6. Database schema: features + stories → relational ERD + SQL DDL. Reasoning model.
  7. Pages + components: stories + schema → page inventory with shared components. Fast model.

Total LLM spend for a medium-complexity product (10 features, 25 stories, 12-table schema): ~$0.50–$2.00 if you route between fast and reasoning models correctly.
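
The routing itself can be a seven-line lookup. A sketch with illustrative tier labels (swap in whatever identifiers your provider uses):

# Scaffolding stages go to a fast model; stories, criteria, and schema to a reasoning model
MODEL_FOR_STAGE = {
    "summary": "fast",
    "personas": "fast",
    "features": "fast",
    "stories": "reasoning",
    "criteria": "reasoning",
    "schema": "reasoning",
    "pages": "fast",
}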

Stage 1 — Generate the summary

Start with a natural description. Be specific about the user, the problem, and the core outcome. Vague prompts produce vague specs.

Bad prompt:

Build me an app that helps freelancers.

Good prompt:

Build a web app that helps freelance designers track and visualize client feedback across multiple projects. Core pain: clients leave feedback scattered across email, Slack, and Figma comments; designers lose hours reconciling it. Target: freelancers billing $50–200/hr who handle 3–10 active projects. Must work without client signup.

Prompt template:

Given this product description: <PASTE IDEA HERE>

Write a 300-word project summary that covers:
- The specific target user (role, context, tools they currently use)
- The concrete pain point being solved (with one realistic example scenario)
- The core outcome the product delivers
- Non-goals: things we explicitly will NOT build
- Initial constraints: tech stack preferences, compliance needs, rough scale

Output as plain prose. No headings. No bullet points.

Why plain prose: a dense paragraph survives the downstream stages better than a bullet list, which tempts the model to drop context.
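
Because everything downstream inherits from this summary, it pays to check it mechanically before continuing. A small sketch that enforces the template's own constraints (word count, no headings or bullets):

def check_summary(summary: str) -> list[str]:
    # Flag a Stage 1 summary that violates the template's constraints
    problems = []
    words = len(summary.split())
    if not 200 <= words <= 400:
        problems.append(f"summary is {words} words, expected 200-400")
    if any(line.lstrip().startswith(("#", "-", "*", "•")) for line in summary.splitlines()):
        problems.append("contains headings or bullets; the template demands plain prose")
    return problems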

Stage 2 — Extract personas

Feed the summary into a persona prompt. Demand specifics — demographics, daily context, and one representative quote.

Prompt template:

Here is a product summary:

<PASTE STAGE 1 OUTPUT>

Generate 3 distinct user personas for this product. For each persona:
- Name (realistic, not "User A")
- Role + company size + years of experience
- Daily tools they use
- Top 3 goals related to this product
- Top 3 pains this product would solve
- One representative quote they might say
- A 1-line "why they matter" tagline

Output as JSON. Schema:
{
  "personas": [{
    "name": string, "role": string, "companySize": string,
    "experience": string, "tools": string[], "goals": string[],
    "pains": string[], "quote": string, "whyTheyMatter": string
  }]
}

Demanding JSON is intentional. It forces the model to return structured data that downstream stages can reference by name. "Alex the Solopreneur" now has a stable identifier that every subsequent stage can link back to.
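
It also means you can validate the output before it propagates. A sketch assuming Stage 2 returned the schema above:

import json

REQUIRED_FIELDS = {"name", "role", "companySize", "experience",
                   "tools", "goals", "pains", "quote", "whyTheyMatter"}

def load_personas(raw: str) -> list[dict]:
    # Parse Stage 2 output and fail fast if any persona is missing fields
    personas = json.loads(raw)["personas"]
    for p in personas:
        missing = REQUIRED_FIELDS - p.keys()
        if missing:
            raise ValueError(f"persona {p.get('name', '?')} is missing {missing}")
    return personas

The set of p["name"] values is the identifier namespace every later stage links against.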

Stage 3 — Feature list with MoSCoW priority

Features are the bridge between personas and stories. Each feature should be scoped to roughly 3–5 user stories' worth of work.

Prompt template:

Product summary:
<PASTE STAGE 1>

Personas:
<PASTE STAGE 2 JSON>

Generate a feature list that serves these personas.
For each feature:
- Name
- 1-sentence description
- Which personas it primarily serves (names)
- MoSCoW priority (Must / Should / Could / Won't)
- T-shirt size estimate (S / M / L / XL)
- Dependencies on other features (if any)

Rules:
- Aim for 8-15 features total
- Every "Must" feature must be required for a minimum lovable product
- No feature should overlap with another

Output as JSON array.

MoSCoW (Must / Should / Could / Won't) is the industry-standard prioritization framework and gives you explicit cutoffs for scope control.
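
The cutoff is also mechanical: Stage 4 only consumes Must and Should features. A sketch assuming Stage 3 returned a JSON array with the priority field as specified:

import json

def features_for_stage_4(stage3_raw: str) -> list[dict]:
    # Could and Won't features never reach story decomposition
    features = json.loads(stage3_raw)
    return [f for f in features if f["priority"] in ("Must", "Should")]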

Stage 4 — User stories (INVEST)

Here's where most ChatGPT outputs fall apart. Demand the INVEST framework and tie each story to a named persona and feature.

Prompt template:

Here are features:
<PASTE STAGE 3 JSON>

And personas:
<PASTE STAGE 2 JSON>

For each Must or Should feature, generate 3-5 user stories that follow INVEST:
- Independent (can be built alone)
- Negotiable (details can flex)
- Valuable (delivers user value, not just a tech task)
- Estimable (clear enough to size)
- Small (fits in a single sprint)
- Testable (has verifiable outcome)

Use the format: "As a <persona name>, I want <action>, so that <outcome>."

For each story also specify:
- Which persona (must be a named persona from above)
- Which feature (must be a named feature from above)
- T-shirt size (S / M / L)
- Engineering scope tag: frontend / backend / fullstack

Output as JSON array.

Named persona references are load-bearing: downstream tools (QA, Linear imports, AI coding tools) use them to filter work.
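
Load-bearing means checkable. Before spending reasoning-model tokens on Stage 5, verify every story points at a persona and feature that actually exist. A sketch assuming the story objects carry persona and feature fields as requested:

def broken_story_links(stories: list[dict], personas: list[dict], features: list[dict]) -> list[dict]:
    # A non-empty result means the model invented a persona or feature mid-story
    persona_names = {p["name"] for p in personas}
    feature_names = {f["name"] for f in features}
    return [s for s in stories
            if s["persona"] not in persona_names or s["feature"] not in feature_names]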

Stage 5 — Acceptance criteria (Gherkin)

Attach 3–5 acceptance criteria to every story, in Given/When/Then format, categorized as happy path, edge case, or failure state.

Prompt template:

For each user story below, generate 3-5 acceptance criteria in Gherkin format:

<PASTE STAGE 4 JSON>

Rules:
- Format: Given <preconditions> When <action> Then <result>
- Every story must have AT LEAST one happy path, one edge case, and one failure state
- Tag each criterion: happyPath / edgeCase / failureState
- Failure states must cover: unauthenticated access, invalid input, server error, rate limit
- Edge cases must cover: empty state, max limits, concurrent actions

Output as a JSON object keyed by story name.

The happy path / edge case / failure state taxonomy is what separates usable QA criteria from decorative ones. It also forces the model to think about negative paths, which it otherwise skips.
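
The coverage rule is checkable too. A sketch assuming Stage 5 returned a dict keyed by story name, each criterion carrying the tag field the template demands:

REQUIRED_TAGS = {"happyPath", "edgeCase", "failureState"}

def coverage_gaps(criteria_by_story: dict) -> dict:
    # Maps story name -> missing tags; an empty dict means full coverage
    gaps = {}
    for story, criteria in criteria_by_story.items():
        missing = REQUIRED_TAGS - {c["tag"] for c in criteria}
        if missing:
            gaps[story] = missing
    return gaps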

Stage 6 — Database schema

Derive the schema from features and stories. Ask for a relational model with explicit types, nullability, and indexes.

Prompt template:

Given these user stories and features:
<PASTE STAGE 3 + STAGE 4>

Generate a relational database schema:
- Tables with columns
- Column types (use Postgres types)
- Nullability
- Primary keys + foreign keys
- Recommended indexes (for any column a story would hit in a WHERE, JOIN, or ORDER BY)
- One-to-many / many-to-many relationships
- Any ENUMs needed for status fields

Output as:
1. A Mermaid erDiagram block (for visualization)
2. A complete SQL DDL block (executable against Postgres)

Asking for both formats is cheap — the LLM already has both in training data — and gives you a diagram to review plus working DDL to seed a database.
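
If you script this stage, the two blocks usually come back fenced, which makes them easy to split apart. A sketch assuming the model labels its fences mermaid and sql:

import re

def extract_block(response: str, lang: str) -> str:
    # Pull the first fenced code block of a given language out of an LLM response
    match = re.search(rf"```{lang}\s*\n(.*?)```", response, re.DOTALL)
    if match is None:
        raise ValueError(f"no fenced {lang} block in Stage 6 output")
    return match.group(1)

# erd = extract_block(stage6_output, "mermaid")  # paste into any Mermaid renderer
# ddl = extract_block(stage6_output, "sql")      # run against a scratch Postgres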

Stage 7 — Pages and components

Final stage: the UI inventory. Reference stories (so every page is justified) and schema (so every page has required data).

Prompt template:

Based on these stories and schema:
<PASTE STAGE 4 + STAGE 6>

Generate a page and component inventory:

Pages:
- Route (e.g. /dashboard, /projects/:id)
- Purpose (1 sentence)
- Linked stories (story names from above)
- Linked schema tables
- Required components

Shared components (used by 2+ pages):
- Name
- Purpose
- Which pages use it
- Props it needs

Output as JSON.

Shared components matter. Without explicit identification, the AI code generator (Cursor, Bolt, v0) will generate a new <Button> component seven times for seven pages. Flagging shared components in the spec saves hours of refactoring later.
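
This is another spot where the JSON pays for itself. A sketch assuming Stage 7's output has pages and sharedComponents keys shaped like the template (the sharedComponents[].pages field name is an assumption; the template only says "which pages use it"):

def unshared_components(inventory: dict) -> list[str]:
    # Flag components marked shared but referenced by fewer than 2 known pages
    routes = {p["route"] for p in inventory["pages"]}
    problems = []
    for comp in inventory["sharedComponents"]:
        users = [pg for pg in comp["pages"] if pg in routes]
        if len(users) < 2:
            problems.append(f"{comp['name']} is used by {len(users)} page(s), not shared")
    return problems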

The manual pipeline vs an automated one

Running all seven stages by hand in ChatGPT takes roughly 90–120 minutes for a medium-complexity product. You'll do it perfectly the first time, half-well the second, then start skipping stages by the fifth project.

The alternative is a tool that serializes every stage's output into persistent state and reuses it automatically. VibeMap does exactly this — same seven stages, but with state persistence, artifact linking, change propagation when you edit a persona, and direct export to Linear, Cursor, and MCP (Model Context Protocol).

If you just want to try one stage without commitment, our free User Story Generator handles Stage 4 in isolation — paste a feature, get INVEST stories in 10 seconds, no signup.

Common pitfalls

  • Skipping Stage 1 because the summary "seems obvious". This is the stage that carries the most context forward; skimping here compounds errors downstream.
  • Using one model for everything. Stages 1, 2, 3, and 7 do not need a reasoning model. Using Claude Opus or GPT-5 for persona generation wastes $3–5 per run. Route the scaffolding stages to Gemini Flash or GPT-4o mini; save the reasoning models for stories, acceptance criteria, and schema.
  • Not demanding JSON. Prose output tempts the model to hallucinate new personas mid-story. JSON forces structural discipline.
  • Declaring the spec "good enough" and going straight to code. Every spec has at least one missing non-functional requirement (auth, rate limits, data retention). Do a compliance pass before you prompt a single line of code. Our AI acceptance criteria guide covers the checklist.

Try the pipeline on a real idea

The fastest way to understand the difference between a prompt-generated PRD and a pipeline-generated spec is to run both on the same idea.

🎯 Generate a complete spec in under 30 minutes.

👉 Try VibeMap free → · Join the Product Hunt launch waitlist →

