Best Practices

Tips for designing effective scenarios and troubleshooting common issues.

Scenario Design

Start with the End in Mind

Before building, define:

Learning objective: What should participants be able to do afterward?
Prerequisites: What should they already know?
Time estimate: How long should it take?

Keep It Focused

Do	Don't
One clear objective per scenario	Multiple unrelated topics
3-5 components max for beginners	Overwhelming amounts of infrastructure
15-30 minute completion time	Multi-hour marathons

Progressive Complexity

Structure scenarios from simple to complex:

Level	Scope
Beginner	1 component, basic operations
Intermediate	2-3 components, interactions
Advanced	Full stack, complex workflows

Real-World Relevance

Use realistic scenarios:

Actual use cases from production
Industry-standard configurations
Meaningful sample data

Instruction Writing

The WHAT-WHY-HOW Pattern

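For each instruction, state what to do, why it matters, and how to do it, then give participants a way to verify the result: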

markdown

## Create a Kafka Topic

**What:** Create a topic called `orders` with 3 partitions.

**Why:** Partitions allow parallel processing. Three partitions
let us scale to three consumers.

**How:** Open the [Kafka terminal](tab:kafka) and run:

\`\`\`bash
kafka-topics --create --topic orders \
  --bootstrap-server localhost:9092 \
  --partitions 3
\`\`\`

**Verify:** List topics to confirm creation:

\`\`\`bash
kafka-topics --list --bootstrap-server localhost:9092
\`\`\`

You should see `orders` in the output.

Show Expected Output

Always tell participants what success looks like:

markdown

You should see output similar to:

\`\`\`
Created topic orders.
\`\`\`

Include Checkpoints

Add verification steps throughout:

markdown

**Checkpoint:** Before continuing, verify:
- [ ] Topic `orders` exists
- [ ] You can produce a test message
- [ ] Consumer receives the message

Anticipate Mistakes

Add troubleshooting tips where errors are common:

markdown

::: tip Troubleshooting
If you see "Connection refused", wait 30 seconds for Kafka
to finish starting, then try again.
:::

Resource Management

Right-Size Components

Start minimal, increase if needed:

Component	Start With (CPU / memory)	Increase If
PostgreSQL	250m / 512Mi	Query timeouts
Kafka	500m / 1Gi	Slow message processing
Elasticsearch	500m / 1Gi	Search timeouts
Simple app	100m / 128Mi	OOM kills

Calculate Total Resources

Sum all components and add 20% headroom (the memory headroom below is rounded up from 358.4Mi to 360Mi):

Component	CPU	Memory
PostgreSQL	250m	512Mi
Kafka	500m	1Gi
API	200m	256Mi
Subtotal	950m	1792Mi
+ 20% headroom	190m	360Mi
Total request	1140m	2152Mi
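
The arithmetic above can be scripted so the totals stay correct when components change. The following is a minimal sketch, not a platform feature: `total_with_headroom` is a hypothetical helper name, and it assumes CPU values are given in millicores and memory values in Mi (so 1Gi is entered as 1024).

```shell
# Sum component requests and add a percentage of headroom.
# Usage: total_with_headroom PERCENT VALUE...
# The body runs in a subshell so its variables do not leak into the caller.
total_with_headroom() (
  percent=$1
  shift
  subtotal=0
  for value in "$@"; do
    subtotal=$((subtotal + value))
  done
  # Integer ceiling division, so the headroom is never rounded away.
  echo $(( (subtotal * (100 + percent) + 99) / 100 ))
)

total_with_headroom 20 250 500 200    # CPU in millicores: prints 1140
total_with_headroom 20 512 1024 256   # memory in Mi: prints 2151
```

The memory result is 2151 rather than the 2152Mi shown in the table because the table rounds the headroom itself up to 360Mi before adding it. Either value is a reasonable request.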

Watch for Resource Contention

If scenarios fail under load:

Check if pods are being OOM-killed
Look for CPU throttling
Consider reducing replica counts
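
If you have `kubectl` access to the namespace a scenario runs in (an assumption, since this guide does not say whether authors do), the first two checks might be scripted roughly as follows. The function names are hypothetical, and the throttling check assumes the node uses cgroup v2 and that the container image includes `cat`.

```shell
# Print "<pod> <last termination reasons>" for every pod in a namespace.
list_termination_reasons() {
  kubectl get pods -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'
}

# Print the names of pods with a container last terminated as OOMKilled.
list_oom_killed() {
  list_termination_reasons "$1" |
    awk '{ for (i = 2; i <= NF; i++) if ($i == "OOMKilled") { print $1; break } }'
}

# Print how many CPU periods a pod's container was throttled in.
# Assumes cgroup v2, where the counter lives in /sys/fs/cgroup/cpu.stat.
cpu_throttle_count() {
  kubectl exec -n "$1" "$2" -- cat /sys/fs/cgroup/cpu.stat |
    awk '$1 == "nr_throttled" { print $2 }'
}
```

For example, `list_oom_killed my-namespace` lists pods that likely need a higher memory limit, and a `cpu_throttle_count` value that keeps growing between runs suggests the CPU limit is too low.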

Component Configuration

Use Clear Labels

Labels appear in the UI and in instructions, so choose names that describe the component:

Good	Bad
`postgres`	`db1`
`kafka`	`component-2`
`api-server`	`my-app`

Set Appropriate Start Order

Order	Components
0	Databases (postgres, redis)
1	Message brokers (kafka)
2	Backend services (api)
3	Frontend (web)

Configure Health Checks

Ensure components report ready status correctly:

Helm charts usually include probes
Custom apps need explicit configuration

Testing

Test the Happy Path

Run through your scenario as a participant:

Follow every instruction literally
Copy-paste all commands
Verify all expected outputs

Test Error Recovery

Intentionally break things:

Run commands out of order
Make typos in commands
Skip steps and see what happens

Test Time Limits

Ensure the scenario fits the TTL:

Time yourself completing the lab
Add buffer for slower participants
Set TTL to 1.5x your completion time
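
The 1.5x rule is simple enough to compute by hand, but a small helper avoids rounding a TTL down by accident. This is an illustrative sketch; `suggested_ttl` is a hypothetical name and works in whole minutes.

```shell
# Suggested TTL in minutes: 1.5x the author's completion time, rounded up.
# The body runs in a subshell so its variable stays local.
suggested_ttl() (
  completion_minutes=$1
  echo $(( (completion_minutes * 3 + 1) / 2 ))
)

suggested_ttl 20   # prints 30
suggested_ttl 25   # prints 38 (37.5 rounded up)
```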

Test with Fresh Eyes

Have someone unfamiliar with the topic:

Follow your instructions
Note where they get confused
Refine based on feedback

Common Patterns

Database Initialization

markdown

Wait for PostgreSQL to be ready, then create the schema:

\`\`\`bash
# Wait for database
until pg_isready; do sleep 1; done

# Create tables
psql -U postgres -d mydb -f /scripts/schema.sql
\`\`\`
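
The `until` loop above waits forever if the database never becomes ready. One possible variation is a bounded retry, sketched below. `wait_for` is a hypothetical helper, not something the platform provides.

```shell
# Retry a command until it succeeds or the attempt budget runs out.
# Usage: wait_for MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# The body runs in a subshell so its variables do not leak into the caller.
wait_for() (
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after $attempt attempts: $*" >&2
      exit 1   # leaves only the subshell, so the function returns 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
)

# Example: wait up to 60 seconds for PostgreSQL, checking once per second.
# wait_for 60 1 pg_isready
```

A bounded wait lets the instructions tell participants what to do when the budget is exhausted, instead of leaving them at a terminal that never returns.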

Service Dependencies

markdown

Before starting the API, verify Kafka is ready:

\`\`\`bash
kafka-topics --list --bootstrap-server kafka:9092
\`\`\`

If this command succeeds, Kafka is ready.

Data Flow Verification

markdown

Let's trace a message through the system:

1. **Produce** a message:
   \`\`\`bash
   echo "test" | kafka-console-producer --topic orders \
     --bootstrap-server kafka:9092
   \`\`\`

2. **Consume** to verify:
   \`\`\`bash
   kafka-console-consumer --topic orders \
     --from-beginning --max-messages 1 \
     --bootstrap-server kafka:9092
   \`\`\`

You should see `test` in the output.

Troubleshooting Guide

Scenario Won't Start

Symptom	Likely Cause	Solution
Stuck on "Provisioning"	Resource quota exceeded	Reduce component resources
Pod stays "Pending"	No available nodes	Check cluster capacity
Multiple pods failing	Shared dependency issue	Check start order

Component Issues

Symptom	Likely Cause	Solution
`ImagePullBackOff`	Image doesn't exist	Verify image name and tag
`CrashLoopBackOff`	Container keeps crashing	Check logs for errors
`OOMKilled`	Out of memory	Increase memory limit
`Evicted`	Node pressure	Reduce resource usage

Connectivity Issues

Symptom	Likely Cause	Solution
"Connection refused"	Service not ready	Wait, check readiness
"Name not resolved"	Wrong hostname	Use component label
"Timeout"	Firewall/network	Check service ports

Instruction Issues

Symptom	Likely Cause	Solution
Tab link doesn't work	Wrong label	Match component label exactly
Code not highlighted	Missing language	Add language to fence
Images not showing	Wrong path	Use `/images/filename.png`

Studio Issues

Symptom	Likely Cause	Solution
Canvas won't load	Large scenario	Reduce components
Auto-save failing	Network issue	Check connection
Changes not reflecting	Cache issue	Refresh browser

Anti-Patterns to Avoid

Don't: Assume Prior Knowledge

markdown

❌ Bad: "Configure Kafka as usual"
✅ Good: "Set the following Kafka configuration..."

Don't: Skip Verification Steps

markdown

❌ Bad: "Create the topic and continue"
✅ Good: "Create the topic, then verify with..."

Don't: Use Hardcoded Values

markdown

❌ Bad: "Connect to 192.168.1.100:9092"
✅ Good: "Connect to kafka:9092"

Don't: Forget Cleanup Instructions

markdown

❌ Bad: (scenario ends abruptly)
✅ Good: "Your lab will automatically clean up in X minutes.
         To stop early, click Stop Lab."

Don't: Over-Engineer

markdown

❌ Bad: 15 components for a basic demo
✅ Good: Minimum viable infrastructure

Checklist Before Publishing

[ ] All instructions tested step-by-step
[ ] Expected outputs documented
[ ] Tab navigation links work
[ ] Code blocks have language hints
[ ] Resources calculated and appropriate
[ ] Start order configured correctly
[ ] TTL covers expected completion time plus buffer (about 1.5x your own time)
[ ] Troubleshooting tips included
[ ] Scenario has clear name and description
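
The language-hint item can be checked mechanically if you keep your instructions in local Markdown files (an assumption about your workflow). The sketch below uses a hypothetical helper, `find_unlabeled_fences`, that reports opening three-backtick fences with no language. It treats every fence line as a simple open/close toggle, so it does not handle nested fences.

```shell
# Report opening code fences that have no language hint.
# Usage: find_unlabeled_fences FILE...
find_unlabeled_fences() {
  awk '
    BEGIN { fence = sprintf("%c%c%c", 96, 96, 96) }   # three backticks
    FNR == 1 { in_block = 0 }                         # reset for each file
    {
      line = $0
      sub(/^[ \t]+/, "", line)                        # ignore indentation
      if (index(line, fence) != 1) next               # not a fence line
      if (!in_block) {
        label = substr(line, 4)                       # text after the fence
        gsub(/[ \t]/, "", label)
        if (label == "") print FILENAME ":" FNR ": fence without language"
      }
      in_block = !in_block
    }
  ' "$@"
}
```

Running `find_unlabeled_fences instructions.md` (a placeholder file name) prints one line per unlabeled fence, and prints nothing when every block has a hint.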

Getting Help

If you're stuck:

Check component logs in the Interfaces view
Review pod events for deployment issues
Test components individually to isolate problems
Ask a colleague in your organization to review the scenario
