Panic Recovery

Panics happen.

They may come from:

Programmer errors
Unexpected nil dereferences
Third-party library bugs
Assumptions violated under rare conditions

What matters is how your system reacts.

In production systems, a panic inside a single request must not crash the entire process.

Plumego does not enable panic recovery by default.
This is intentional.

Instead, panic recovery is implemented explicitly as middleware.

Design Principles

Before implementing panic recovery, it is important to clarify intent:

Panics represent bugs, not expected behavior
Recovery should prevent process crashes, not hide problems
Panics must be observable (logged, traced)
Normal error handling must remain explicit

Panic recovery is a last line of defense, not a control-flow mechanism.

Where Panic Recovery Belongs

Panic recovery belongs in middleware.

Reasons:

It applies uniformly to all requests
It must wrap handler execution
It should run early in the request lifecycle

Placing recovery logic elsewhere risks incomplete coverage.

Basic Panic Recovery Middleware

Below is a conceptual example of a panic recovery middleware.

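// RecoveryMiddleware converts a panic in any later middleware or handler
// into a logged 500 response. Logger is assumed to be a structured logger
// provided by the application.
func RecoveryMiddleware(logger Logger) plumego.Middleware {
	return func(ctx *plumego.Context, next plumego.NextFunc) {
		// The deferred function runs even when next() panics.
		defer func() {
			if r := recover(); r != nil {
				// Record the panic together with the request's trace ID.
				logger.Error("panic recovered",
					"trace_id", ctx.Get("trace_id"),
					"panic", r,
				)

				// Reply with a generic message; details stay in the logs.
				ctx.JSON(
					http.StatusInternalServerError,
					map[string]string{
						"error": "internal server error",
					},
				)
			}
		}()

		next()
	}
}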

Key characteristics:

recover() is called in a deferred function
Panic information is logged
A safe error response is returned
The process continues running

Middleware Order Matters

Panic recovery middleware should be registered early.

Recommended order:

Trace ID middleware
Logging middleware
Panic recovery middleware
Authentication / Authorization
Handlers

This ensures:

Panics are correlated with a Trace ID
Panic details are logged
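Panics in authentication or authorization code are also recovered

Incorrect ordering can leave gaps. Recovery only covers middleware and handlers registered after it, so anything placed before it (here, the Trace ID and logging middleware) runs unprotected.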

What Panic Recovery Should Not Do

Panic recovery middleware must not:

Suppress logging
Retry the handler
Attempt partial recovery of state
Swallow panics silently

A recovered panic should still be treated as a serious incident.

Panic vs Error: Reinforcing the Boundary

A critical distinction:

Errors are expected failure modes
Panics are programming failures

If you find yourself relying on panic recovery during normal operation,
your error-handling boundaries are likely drawn in the wrong place.

Panic recovery exists to prevent catastrophic failure,
not to normalize broken behavior.

Ensuring the Response Is Safe

When recovering from a panic:

Do not leak internal details
Do not expose stack traces to clients
Return a generic error message

Detailed diagnostics belong in logs, not responses.

Capturing Stack Traces

In production systems, capturing stack traces is often useful.

Example (conceptual):

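// Inside the deferred function, after recover() has returned a non-nil r.
// debug.Stack comes from the standard library package runtime/debug.
stack := debug.Stack()

logger.Error("panic recovered",
	"trace_id", ctx.Get("trace_id"),
	"panic", r,
	"stack", string(stack),
)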

Stack traces can reveal internal details, so store these logs securely and restrict access to them.
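
The fragment above only makes sense inside the deferred function. A runnable sketch of that placement, independent of Plumego, might look like this; runGuarded and explode are hypothetical helpers used only for the demonstration.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"strings"
)

// runGuarded calls fn and, if it panics, returns the panic value and the
// stack of the current goroutine as captured inside the deferred function.
func runGuarded(fn func()) (panicValue any, stack string) {
	defer func() {
		if v := recover(); v != nil {
			panicValue = v
			// Capture the stack here, while the frames of the
			// panicking call are still present.
			stack = string(debug.Stack())
		}
	}()
	fn()
	return nil, ""
}

// explode contains a deliberate bug: writing to a nil map panics.
func explode() {
	var m map[string]int
	m["k"] = 1
}

func main() {
	v, stack := runGuarded(explode)
	fmt.Println("panic:", v)
	fmt.Println("stack mentions explode:", strings.Contains(stack, "explode"))
}
```

Capturing the stack after the deferred function has returned would be too late, because the frames that caused the panic are gone by then.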

Interaction with Logging Middleware

Panic recovery middleware typically works with logging middleware.

Two common patterns:

Recovery middleware logs the panic itself
Logging middleware logs request completion, even after panic recovery

Ensure that:

Panic recovery does not short-circuit logging unintentionally
Logging accurately reflects request failure

Testing middleware interaction is essential.

Testing Panic Recovery

You should test panic recovery explicitly.

Typical tests include:

Handler that panics
Middleware that panics
Nested panic scenarios

Verify that:

The process does not crash
A 500 response is returned
Logs contain panic information
Trace IDs are present
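
The checks above can be sketched with the standard net/http/httptest package. This example is written as a small program so it runs on its own; in a real project the body of checkRecovery would normally live in a _test.go file as a func TestRecover(t *testing.T). Reading the trace ID from an X-Trace-Id header is an assumption of this sketch.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Recover logs a recovered panic with the request's trace ID and replies 500.
// Taking the trace ID from the X-Trace-Id header is an assumption here.
func Recover(logger *log.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if v := recover(); v != nil {
				logger.Printf("panic recovered trace_id=%s panic=%v", r.Header.Get("X-Trace-Id"), v)
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// checkRecovery sends one request to a panicking handler and verifies the
// properties listed above. It returns nil when all of them hold.
func checkRecovery() error {
	var logs bytes.Buffer
	boom := http.HandlerFunc(func(http.ResponseWriter, *http.Request) { panic("secret detail") })

	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("X-Trace-Id", "trace-123")
	rec := httptest.NewRecorder()
	Recover(log.New(&logs, "", 0), boom).ServeHTTP(rec, req)

	switch {
	case rec.Code != http.StatusInternalServerError:
		return fmt.Errorf("status = %d, want 500", rec.Code)
	case strings.Contains(rec.Body.String(), "secret detail"):
		return fmt.Errorf("response leaks panic value")
	case !strings.Contains(logs.String(), "secret detail"):
		return fmt.Errorf("log is missing panic value")
	case !strings.Contains(logs.String(), "trace-123"):
		return fmt.Errorf("log is missing trace ID")
	}
	return nil
}

func main() {
	if err := checkRecovery(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

Reaching the end of checkRecovery at all shows that the panic did not escape the middleware.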

Avoiding Over-Reliance on Recovery

If panic recovery is triggered frequently:

Treat it as a signal
Investigate root causes
Improve error handling discipline

A healthy system rarely triggers panic recovery.

Summary

In Plumego:

Panic recovery is explicit middleware
It prevents process crashes
It preserves observability
It does not hide bugs
It reinforces correct error boundaries

Recovery is about resilience, not convenience.

With crash safety in place, the next production concern is:

→ Authentication and JWT

That section explains how to implement access control without polluting core logic.