Panic Recovery
Panics happen.
They may come from:
- Programmer errors
- Unexpected nil dereferences
- Third-party library bugs
- Assumptions violated under rare conditions
What matters is how your system reacts.
In production systems, a panic inside a single request must not crash the entire process.
Plumego does not enable panic recovery by default.
This is intentional.
Instead, panic recovery is implemented explicitly as middleware.
Design Principles
Before implementing panic recovery, it is important to clarify intent:
- Panics represent bugs, not expected behavior
- Recovery should prevent process crashes, not hide problems
- Panics must be observable (logged, traced)
- Normal error handling must remain explicit
Panic recovery is a last line of defense, not a control-flow mechanism.
Where Panic Recovery Belongs
Panic recovery belongs in middleware.
Reasons:
- It applies uniformly to all requests
- It must wrap handler execution
- It should run early in the request lifecycle
Placing recovery logic elsewhere risks incomplete coverage.
Basic Panic Recovery Middleware
Below is a conceptual example of a panic recovery middleware.
func RecoveryMiddleware(logger Logger) plumego.Middleware {
    return func(ctx *plumego.Context, next plumego.NextFunc) {
        // recover only has an effect inside a deferred function.
        defer func() {
            if r := recover(); r != nil {
                // Log the panic together with the request's trace ID.
                logger.Error("panic recovered",
                    "trace_id", ctx.Get("trace_id"),
                    "panic", r,
                )
                // Return a generic error without internal details.
                ctx.JSON(
                    http.StatusInternalServerError,
                    map[string]string{
                        "error": "internal server error",
                    },
                )
            }
        }()
        next()
    }
}
Key characteristics:
- recover() is called in a deferred function
- Panic information is logged
- A safe error response is returned
- The process continues running
Middleware Order Matters
Panic recovery middleware should be registered early.
Recommended registration order, outermost first:
- Trace ID middleware
- Logging middleware
- Panic recovery middleware
- Authentication / Authorization
- Handlers
This ensures:
- Panics are correlated with a Trace ID
- Panic details are logged
- Authentication logic does not bypass recovery
Incorrect ordering can leave gaps.
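The effect of ordering can be sketched with the standard library's net/http instead of Plumego's own types. Everything here (Middleware, Chain, TraceID, Recovery, the fixed "trace-123" value) is a hypothetical stand-in, not Plumego's API. The sketch only shows that a recovery middleware can see the trace ID when the trace middleware wraps it, and cannot when the order is reversed.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware is a standard-library stand-in, not Plumego's middleware type.
type Middleware func(http.Handler) http.Handler

// Chain applies middlewares so that the first one listed is the outermost.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

type traceKey struct{}

// TraceID stores a fixed, hypothetical trace ID in the request context.
func TraceID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ctx := context.WithValue(r.Context(), traceKey{}, "trace-123")
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// Recovery recovers panics and records the trace ID it could see into *seen.
func Recovery(seen *string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			defer func() {
				if rec := recover(); rec != nil {
					id, _ := r.Context().Value(traceKey{}).(string)
					*seen = id
					http.Error(w, "internal server error", http.StatusInternalServerError)
				}
			}()
			next.ServeHTTP(w, r)
		})
	}
}

// traceSeenOnPanic runs a panicking handler behind the chosen ordering and
// returns the response status and the trace ID visible at recovery time.
func traceSeenOnPanic(traceFirst bool) (int, string) {
	var seen string
	panicking := http.HandlerFunc(func(http.ResponseWriter, *http.Request) { panic("boom") })
	var h http.Handler
	if traceFirst {
		h = Chain(panicking, TraceID, Recovery(&seen))
	} else {
		h = Chain(panicking, Recovery(&seen), TraceID)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code, seen
}

func main() {
	code, id := traceSeenOnPanic(true)
	fmt.Printf("trace first:    status=%d trace_id=%q\n", code, id)
	code, id = traceSeenOnPanic(false)
	fmt.Printf("recovery first: status=%d trace_id=%q\n", code, id)
}
```

In both orderings the client gets a 500, but only the first ordering lets the panic be correlated with a trace ID.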
What Panic Recovery Should Not Do
Panic recovery middleware must not:
- Suppress logging
- Retry the handler
- Attempt partial recovery of state
- Swallow panics silently
A recovered panic should still be treated as a serious incident.
Panic vs Error: Reinforcing the Boundary
A critical distinction:
- Errors are expected failure modes
- Panics are programming failures
If you find yourself relying on panic recovery during normal operation, your error handling boundaries are likely incorrect.
Panic recovery exists to prevent catastrophic failure, not to normalize broken behavior.
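As an illustration of the boundary, an expected failure such as a missing record can be returned as an error and mapped to a status code explicitly, so the recovery path is never involved. This is a minimal sketch using net/http rather than Plumego; ErrNotFound, findUser, userHandler, and the in-memory users map are hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ErrNotFound is an expected failure mode, so it is an error value, not a panic.
var ErrNotFound = errors.New("user not found")

// users is a hypothetical in-memory store used only for this sketch.
var users = map[string]string{"1": "alice"}

func findUser(id string) (string, error) {
	name, ok := users[id]
	if !ok {
		return "", ErrNotFound
	}
	return name, nil
}

// userHandler maps each expected error to an explicit response.
func userHandler(w http.ResponseWriter, r *http.Request) {
	name, err := findUser(r.URL.Query().Get("id"))
	switch {
	case errors.Is(err, ErrNotFound):
		http.Error(w, "user not found", http.StatusNotFound)
	case err != nil:
		http.Error(w, "internal server error", http.StatusInternalServerError)
	default:
		fmt.Fprintln(w, name)
	}
}

// statusFor issues an in-memory request and returns the response status.
func statusFor(id string) int {
	rec := httptest.NewRecorder()
	userHandler(rec, httptest.NewRequest(http.MethodGet, "/user?id="+id, nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor("1"), statusFor("42")) // prints "200 404"
}
```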
Ensuring the Response Is Safe
When recovering from a panic:
- Do not leak internal details
- Do not expose stack traces to clients
- Return a generic error message
Detailed diagnostics belong in logs, not responses.
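The split between log detail and client response can be sketched as follows. The example uses net/http and the standard log package as stand-ins for Plumego and its logger. The names recoverSafely, demo, and leaks, and the fake secret in the panic message, are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

// recoverSafely logs the full panic value but sends only a fixed body.
func recoverSafely(logger *log.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if rec := recover(); rec != nil {
				// Full detail goes to the log only.
				logger.Printf("panic recovered: %v", rec)
				// The client sees a generic, constant message.
				w.Header().Set("Content-Type", "application/json")
				w.WriteHeader(http.StatusInternalServerError)
				_ = json.NewEncoder(w).Encode(map[string]string{
					"error": "internal server error",
				})
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// demo triggers a panic whose message contains a fake secret and returns
// the response body and the captured log output.
func demo() (body, logged string) {
	var buf bytes.Buffer
	logger := log.New(&buf, "", 0)
	h := recoverSafely(logger, http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		panic("db password=hunter2 rejected")
	}))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Body.String(), buf.String()
}

// leaks reports whether s contains the fake secret.
func leaks(s string) bool { return strings.Contains(s, "hunter2") }

func main() {
	body, logged := demo()
	fmt.Println("body leaks secret:", leaks(body))
	fmt.Println("log has detail:   ", leaks(logged))
}
```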
Capturing Stack Traces
In production systems, capturing stack traces is often useful.
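Example (conceptual):
// Inside the deferred function, after recover() returned a non-nil r.
// Requires the runtime/debug package.
stack := debug.Stack()
logger.Error("panic recovered",
    "trace_id", ctx.Get("trace_id"),
    "panic", r,
    "stack", string(stack),
)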
Stack traces can reveal internal details, so write them only to logs that are stored securely and access-controlled.
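A self-contained sketch of the same idea, independent of Plumego: debug.Stack is called inside the deferred function, where the frames of the panicking code are still on the goroutine's stack. The helper names capture and buggy are hypothetical.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// capture runs fn and, if it panics, returns the panic value and the stack
// trace of the panicking goroutine. Both are zero values if fn returns normally.
func capture(fn func()) (value any, stack string) {
	defer func() {
		if r := recover(); r != nil {
			value = r
			// Must be called here, before the deferred function returns,
			// so the trace still includes the code that panicked.
			stack = string(debug.Stack())
		}
	}()
	fn()
	return nil, ""
}

// buggy simulates a programmer error: writing to a nil map panics.
func buggy() {
	var m map[string]int
	m["x"] = 1
}

func main() {
	value, stack := capture(buggy)
	fmt.Println("panic:", value)
	fmt.Println(stack)
}
```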
Interaction with Logging Middleware
Panic recovery middleware typically works with logging middleware.
Two common patterns:
- Recovery middleware logs the panic itself
- Logging middleware logs request completion, even after panic recovery
Ensure that:
- Panic recovery does not short-circuit logging unintentionally
- Logging accurately reflects request failure
Testing middleware interaction is essential.
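One way to picture the second pattern is a logging middleware that wraps the recovery middleware and records the final status code. The sketch below uses net/http and the standard log package, not Plumego. The names statusRecorder, logRequests, recoverPanics, and logOutput are hypothetical. Because recovery happens inside the logging middleware, the completion log still runs and reports the 500.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// statusRecorder remembers the status code written by inner handlers.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (s *statusRecorder) WriteHeader(code int) {
	s.status = code
	s.ResponseWriter.WriteHeader(code)
}

// logRequests logs method, path, and final status once the inner chain returns.
func logRequests(logger *log.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		sr := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(sr, r)
		logger.Printf("%s %s -> %d", r.Method, r.URL.Path, sr.status)
	})
}

// recoverPanics logs the panic itself and writes a generic 500.
func recoverPanics(logger *log.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if rec := recover(); rec != nil {
				logger.Printf("panic recovered: %v", rec)
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// logOutput serves one panicking request and returns everything that was logged.
func logOutput() string {
	var buf bytes.Buffer
	logger := log.New(&buf, "", 0)
	boom := http.HandlerFunc(func(http.ResponseWriter, *http.Request) { panic("boom") })
	h := logRequests(logger, recoverPanics(logger, boom))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/orders", nil))
	return buf.String()
}

func main() {
	fmt.Print(logOutput())
}
```

If the order were reversed, with recovery outside logging, the panic would unwind through the logging middleware and skip its completion log unless that log call were deferred.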
Testing Panic Recovery
You should test panic recovery explicitly.
Typical tests include:
- Handler that panics
- Middleware that panics
- Nested panic scenarios
Verify that:
- The process does not crash
- A 500 response is returned
- Logs contain panic information
- Trace IDs are present
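The scenarios above can be exercised in memory with net/http/httptest. This sketch covers only the status code check and uses net/http stand-ins rather than Plumego. In a real project these checks would normally live in a _test.go file using the testing package, and would also assert on log contents and trace IDs. All names here (recoverPanics, panickyMiddleware, statusOf, and the sample handlers) are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// recoverPanics is the middleware under test.
func recoverPanics(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if rec := recover(); rec != nil {
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// panickyMiddleware simulates a bug in a middleware registered after recovery.
func panickyMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		panic("middleware bug")
	})
}

var (
	okHandler = http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	})
	panicHandler = http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		panic("handler bug")
	})
	// nestedPanic panics again while the first panic is unwinding.
	nestedPanic = http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		defer func() { panic("second panic") }()
		panic("first panic")
	})
)

// statusOf wraps h in the recovery middleware and returns the response status.
func statusOf(h http.Handler) int {
	rec := httptest.NewRecorder()
	recoverPanics(h).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println("no panic:         ", statusOf(okHandler))
	fmt.Println("handler panic:    ", statusOf(panicHandler))
	fmt.Println("middleware panic: ", statusOf(panickyMiddleware(okHandler)))
	fmt.Println("nested panic:     ", statusOf(nestedPanic))
}
```

Reaching the end of main without the program terminating is itself the check that the process does not crash.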
Avoiding Over-Reliance on Recovery
If panic recovery is triggered frequently:
- Treat it as a signal
- Investigate root causes
- Improve error handling discipline
A healthy system rarely triggers panic recovery.
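One possible way to make that signal visible is to count recovered panics so the number can feed a metric or alert. This is only a sketch under assumptions: it uses net/http instead of Plumego, a process-wide atomic counter instead of a real metrics library, and hypothetical names (recovered, countingRecovery, serveN, boom).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// recovered counts panics caught by the middleware since process start.
var recovered atomic.Int64

// countingRecovery recovers panics and increments the counter each time.
func countingRecovery(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if rec := recover(); rec != nil {
				recovered.Add(1)
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// boom is a handler that always panics, used to drive the example.
var boom = http.HandlerFunc(func(http.ResponseWriter, *http.Request) { panic("boom") })

// serveN sends n in-memory requests to h.
func serveN(h http.Handler, n int) {
	for i := 0; i < n; i++ {
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	}
}

func main() {
	serveN(countingRecovery(boom), 3)
	fmt.Println("recovered panics:", recovered.Load())
}
```

A counter that grows steadily in production would be the cue to investigate root causes rather than rely on the middleware.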
Summary
In Plumego:
- Panic recovery is explicit middleware
- It prevents process crashes
- It preserves observability
- It does not hide bugs
- It reinforces correct error boundaries
Recovery is about resilience, not convenience.
Next
With crash safety in place, the next production concern is:
→ Authentication and JWT
That section explains how to implement access control without polluting core logic.