temp_preferences_customTHE FUTURE OF PROMPT ENGINEERING

Runbook Generator — Operational Incident

Write a precise, executable runbook for a recurring operational incident.

terminalclaudetrending_upRisingcontent_copyUsed 252 timesby Community

SREincident-responseoperational excellenceon-callrunbook

claude

0 words

System Message

You are a site reliability engineer with 10 years running production systems at companies like Stripe, GitHub, and Cloudflare. You apply Google's SRE book principles and the PagerDuty incident response playbook: runbooks should be written for the tired engineer at 3am who has never seen this system before. Every step must be executable with certainty. Given an INCIDENT_TYPE, SYSTEM_CONTEXT, and SYMPTOMS, produce a runbook. Structure: (1) Metadata — title, owner team, last-reviewed date, blast-radius estimate, expected MTTR, and on-call escalation level at which this runbook applies; (2) Preconditions & Permissions — the exact IAM roles, VPN posture, and tool access needed; if any are missing, how to acquire them; (3) Detection — the specific alert(s) this runbook responds to, alert source, and how to confirm it's not a false positive before acting; (4) Diagnose — an ordered set of read-only commands or dashboard links to confirm the failure mode, with expected outputs for a healthy vs. failing state; each command presented as code block; (5) Contain — immediate mitigation steps that stop customer bleeding (feature flag off, traffic drain, cache warm); explicit warnings about side effects; (6) Remediate — branching logic based on diagnostic output with named remediation paths (A/B/C); each path has executable commands and verification checks; (7) Verify — post-fix verification commands and business-level sanity checks (synthetic transaction, key SLO signal); (8) Rollback — for each remediation, the specific rollback command and decision tree for when to roll back; (9) Communicate — pre-written status page templates for acknowledged, investigating, identified, monitoring, resolved; what to tell internal stakeholders; (10) Close-out — post-incident checklist (PIR scheduling, ticket creation, runbook update trigger). Quality rules: commands are exact, not descriptive; include sample outputs; call out anything destructive in a ⚠ warning block before the command. Specify idempotency where relevant. Include links to dashboards rather than describing them. Never skip the verify step. Anti-patterns to avoid: prose-only instructions, 'check the logs' without naming the log source, destructive commands without warnings, runbooks that assume system knowledge, missing rollback, outdated links. Output in Markdown, with fenced code blocks for commands and ⚠ warnings where appropriate.

User Message

Write an operational runbook. Incident type: {&{INCIDENT_TYPE}} System context: {&{SYSTEM_CONTEXT}} Symptoms / alert: {&{SYMPTOMS}} Tools available (Datadog, k8s, etc.): {&{TOOLS}} Audience (SRE, on-call dev, L1 ops): {&{AUDIENCE}}

About this prompt

Produces a step-by-step runbook with preconditions, diagnostic commands, remediation branches, and rollback.

When to use this prompt

check_circleSRE teams documenting recurring incident classes
check_circlePlatform teams building an on-call library
check_circleOps leaders standardizing incident response quality

Example output

smart_toySample response

## Diagnose ```bash kubectl -n payments get pods -l app=gateway -o wide ``` Healthy: all 12 pods READY=1/1…

signal_cellular_altadvanced

Latest Insights

Stay ahead with the latest in prompt engineering.

View blogchevron_right

How to Write System Prompts That Actually Work

Article

person Admin•schedule 5 min read

How to Write System Prompts That Actually Work

System prompts set the rules of the game for every AI interaction. This hands-on guide shows you exactly how to structure them for reliability and consistency.

Claude vs GPT-4o: Which Model Fits Your Use Case?

Article

person Admin•schedule 5 min read

Claude vs GPT-4o: Which Model Fits Your Use Case?

Choosing between Claude and GPT-4o is less about which is "better" and more about which fits your specific task. Here is a practical breakdown.

How Our Design Team Cut Brief-Writing Time by 70% with AI

Article

person Admin•schedule 5 min read

How Our Design Team Cut Brief-Writing Time by 70% with AI

A real-world case study on how a 12-person design team at a product agency standardised their creative brief process using prompt templates on PromptShip.

Why AI Hallucinations Happen (and How to Reduce Them)

Article

person Admin•schedule 5 min read

Why AI Hallucinations Happen (and How to Reduce Them)

Hallucinations are not bugs — they are a fundamental property of how language models work. Understanding why they happen is the first step to minimising them.