Backend Development

Designing a High-Performance, Multi-Goroutine Socket Server in Go

High-performance networking is one of Go’s strengths. With lightweight goroutines, a rich net package, and strong concurrency primitives, Go is a great fit for …

Leeting Yan 2025-11-25 8 min read 1624 words

High-performance networking is one of Go’s strengths. With lightweight goroutines, a rich net package, and strong concurrency primitives, Go is a great fit for building custom TCP servers, game backends, proxies, and internal protocols.

In this article, we’ll design and implement a high-performance, multi-goroutine socket server in Go, with an architecture you can evolve into a real-world production system.

We’ll cover:

Architecture and design goals
A baseline TCP server
A multi-goroutine concurrency model
Connection limits and backpressure
Request handling with worker pools
Graceful shutdown and observability

1. Goals and Design Principles

We’ll design a server that:

Accepts many concurrent TCP connections.
Handles each connection in dedicated goroutines.
Uses a worker pool for heavier request processing.
Implements backpressure and connection limits.
Supports graceful shutdown (finish in-flight work).
Is easy to extend with custom protocols.

We’ll assume a simple binary protocol:

Each message is length-prefixed with a 4-byte big-endian uint32.
After the length comes that many bytes of payload.
The server echoes the payload back with the same framing.
In a real system, the payload would be JSON, protobuf, etc.

2. Project Layout

A simple layout:

socket-server/
  go.mod
  cmd/server/main.go
  internal/server/config.go
  internal/server/tcp_server.go
  internal/server/worker_pool.go
  internal/protocol/frame.go

For brevity, we’ll inline most code here, but you can easily split into files/modules for your project.

3. Configuration and Constants

Start with some configuration defaults.

package server

import "time"

type Config struct {
    ListenAddr         string        // e.g. ":9000"
    MaxConnections     int           // maximum concurrent clients
    ReadTimeout        time.Duration // per read
    WriteTimeout       time.Duration // per write
    IdleTimeout        time.Duration // overall idle
    WorkerPoolSize     int           // number of workers
    MaxMessageSize     int           // limit message size
    Backlog            int           // optional: backlog for channels
}

func DefaultConfig() Config {
    return Config{
        ListenAddr:     ":9000",
        MaxConnections: 10000,
        ReadTimeout:    10 * time.Second,
        WriteTimeout:   10 * time.Second,
        IdleTimeout:    60 * time.Second,
        WorkerPoolSize: 64,
        MaxMessageSize: 1 << 20, // 1MB
        Backlog:        1024,
    }
}

4. Protocol Framing: Length-Prefixed Messages

Let’s build a simple framing helper so net.Conn operations are clean.

package protocol

import (
    "bufio"
    "encoding/binary"
    "fmt"
    "io"
)

const headerSize = 4 // 4 bytes length prefix

// ReadFrame reads a single length-prefixed message from r.
func ReadFrame(r *bufio.Reader, maxSize int) ([]byte, error) {
    header := make([]byte, headerSize)
    if _, err := io.ReadFull(r, header); err != nil {
        return nil, err
    }

    length := binary.BigEndian.Uint32(header)
    if length == 0 {
        return nil, fmt.Errorf("empty frame")
    }
    if int(length) > maxSize {
        return nil, fmt.Errorf("frame too large: %d > %d", length, maxSize)
    }

    payload := make([]byte, length)
    if _, err := io.ReadFull(r, payload); err != nil {
        return nil, err
    }
    return payload, nil
}

// WriteFrame writes a length-prefixed message to w.
func WriteFrame(w io.Writer, payload []byte) error {
    header := make([]byte, headerSize)
    binary.BigEndian.PutUint32(header, uint32(len(payload)))

    if _, err := w.Write(header); err != nil {
        return err
    }
    _, err := w.Write(payload)
    return err
}

5. Worker Pool for Request Processing

We don’t want each connection goroutine to do heavy CPU or IO work. Instead, we’ll dispatch tasks to a shared worker pool.

package server

import (
    "context"
)

type Task struct {
    ConnID  uint64
    Payload []byte
    ResultC chan []byte
    ErrC    chan error
}

type WorkerPool struct {
    cfg    Config
    tasks  chan Task
    cancel context.CancelFunc
}

func NewWorkerPool(ctx context.Context, cfg Config) *WorkerPool {
    ctx, cancel := context.WithCancel(ctx)
    wp := &WorkerPool{
        cfg:    cfg,
        tasks:  make(chan Task, cfg.Backlog),
        cancel: cancel,
    }
    for i := 0; i < cfg.WorkerPoolSize; i++ {
        go wp.workerLoop(ctx, i)
    }
    return wp
}

func (wp *WorkerPool) Submit(task Task) {
    wp.tasks <- task
}

func (wp *WorkerPool) Close() {
    wp.cancel()
}

func (wp *WorkerPool) workerLoop(ctx context.Context, workerID int) {
    for {
        select {
        case <-ctx.Done():
            return
        case task := <-wp.tasks:
            // In a real server: parse payload, route, call business logic, DB, etc.
            // Here we "echo" with a small modification to show processing.
            processed := append([]byte{}, task.Payload...)
            // Example: prefix with worker ID
            result := append([]byte("worker-"), []byte(string('0'+workerID))...)
            result = append(result, ':')
            result = append(result, processed...)

            select {
            case task.ResultC <- result:
            case <-ctx.Done():
                task.ErrC <- ctx.Err()
            }
        }
    }
}

This is intentionally simple: each Task includes a response channel so the connection goroutine can resume once processing is done.

6. Connection Handling: One Goroutine per Connection

Now let’s implement the TCP server: accept connections, limit them, and handle each in a goroutine.

package server

import (
    "bufio"
    "context"
    "fmt"
    "log"
    "net"
    "sync"
    "sync/atomic"
    "time"

    "example.com/socket-server/internal/protocol"
)

type TCPServer struct {
    cfg        Config
    listener   net.Listener
    activeConn int64 // atomic
    nextID     uint64
    wp         *WorkerPool

    mu      sync.Mutex
    closing bool
    wg      sync.WaitGroup
}

func NewTCPServer(cfg Config) *TCPServer {
    return &TCPServer{
        cfg: cfg,
    }
}

func (s *TCPServer) Serve(ctx context.Context) error {
    // Setup listener with optional advanced options via net.ListenConfig
    ln, err := net.Listen("tcp", s.cfg.ListenAddr)
    if err != nil {
        return err
    }
    s.listener = ln
    log.Printf("Listening on %s", s.cfg.ListenAddr)

    // Start worker pool
    s.wp = NewWorkerPool(ctx, s.cfg)

    // Accept loop
    for {
        conn, err := ln.Accept()
        if err != nil {
            if s.isClosing() {
                return nil
            }
            log.Printf("Accept error: %v", err)
            continue
        }

        // Connection limit
        if atomic.LoadInt64(&s.activeConn) >= int64(s.cfg.MaxConnections) {
            log.Printf("Too many connections, rejecting new client")
            conn.Close()
            continue
        }

        atomic.AddInt64(&s.activeConn, 1)
        s.wg.Add(1)

        connID := atomic.AddUint64(&s.nextID, 1)
        go s.handleConn(ctx, connID, conn)
    }
}

func (s *TCPServer) handleConn(ctx context.Context, connID uint64, conn net.Conn) {
    defer func() {
        conn.Close()
        atomic.AddInt64(&s.activeConn, -1)
        s.wg.Done()
        log.Printf("Conn %d closed", connID)
    }()

    log.Printf("Conn %d accepted from %s", connID, conn.RemoteAddr())

    // Buffers
    reader := bufio.NewReader(conn)

    // Idle timeout management
    if s.cfg.IdleTimeout > 0 {
        _ = conn.SetDeadline(time.Now().Add(s.cfg.IdleTimeout))
    }

    for {
        // Reset read deadline
        if s.cfg.ReadTimeout > 0 {
            _ = conn.SetReadDeadline(time.Now().Add(s.cfg.ReadTimeout))
        }

        payload, err := protocol.ReadFrame(reader, s.cfg.MaxMessageSize)
        if err != nil {
            if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
                log.Printf("Conn %d read timeout", connID)
            } else {
                log.Printf("Conn %d read error: %v", connID, err)
            }
            return
        }

        // Prepare task
        resultC := make(chan []byte, 1)
        errC := make(chan error, 1)

        task := Task{
            ConnID:  connID,
            Payload: payload,
            ResultC: resultC,
            ErrC:    errC,
        }

        // Submit to worker pool (this may block if backlog is full – backpressure)
        select {
        case <-ctx.Done():
            return
        case s.wp.tasks <- task: // direct access to channel for performance
        }

        // Wait for worker response or context cancellation
        select {
        case <-ctx.Done():
            return
        case err := <-errC:
            if err != nil {
                log.Printf("Conn %d worker error: %v", connID, err)
                return
            }
        case result := <-resultC:
            if s.cfg.WriteTimeout > 0 {
                _ = conn.SetWriteDeadline(time.Now().Add(s.cfg.WriteTimeout))
            }
            if err := protocol.WriteFrame(conn, result); err != nil {
                log.Printf("Conn %d write error: %v", connID, err)
                return
            }
        }
    }
}

func (s *TCPServer) isClosing() bool {
    s.mu.Lock()
    defer s.mu.Unlock()
    return s.closing
}

// Shutdown initiates graceful shutdown: stop accepting new connections,
// close listener, and wait for active connections.
func (s *TCPServer) Shutdown(ctx context.Context) error {
    s.mu.Lock()
    s.closing = true
    s.mu.Unlock()

    if s.listener != nil {
        if err := s.listener.Close(); err != nil {
            return err
        }
    }
    if s.wp != nil {
        s.wp.Close()
    }

    done := make(chan struct{})
    go func() {
        s.wg.Wait()
        close(done)
    }()

    select {
    case <-ctx.Done():
        return ctx.Err()
    case <-done:
        log.Printf("Graceful shutdown complete")
        return nil
    }
}

Note: For simplicity, we directly use s.wp.tasks in handleConn. You can wrap it with a method if you prefer.

7. Wiring It Up in `main.go`

A simple entry point that supports graceful shutdown with signals:

package main

import (
    "context"
    "log"
    "os"
    "os/signal"
    "syscall"
    "time"

    "example.com/socket-server/internal/server"
)

func main() {
    cfg := server.DefaultConfig()
    srv := server.NewTCPServer(cfg)

    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    // Signal handling for graceful shutdown
    sigC := make(chan os.Signal, 1)
    signal.Notify(sigC, syscall.SIGINT, syscall.SIGTERM)

    go func() {
        sig := <-sigC
        log.Printf("Received signal: %v, shutting down...", sig)
        shutdownCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
        defer cancel()
        if err := srv.Shutdown(shutdownCtx); err != nil {
            log.Printf("Shutdown error: %v", err)
        }
        cancel() // stop main context
    }()

    if err := srv.Serve(ctx); err != nil {
        log.Printf("Server error: %v", err)
    }
}

Compile and run:

go mod init example.com/socket-server
go mod tidy
go run ./cmd/server

8. Performance Considerations

8.1 Goroutine Model

We use:

One goroutine per connection (read + write handling).
Fixed-size worker pool for heavier processing.

This gives:

Simple code.
Good performance for thousands of connections.
Predictable CPU usage via pool size.

For ultra-high scale (hundreds of thousands of connections), you might explore:

Sharded event loops.
More advanced scheduling patterns.
Connection batching.

But for most backends, the “goroutine per connection + worker pool” pattern is more than enough.

8.2 Backpressure and Overload

Our design introduces backpressure via:

Bounded worker task channel (Backlog).
MaxConnections limit.

When overloaded:

Newly accepted connections can be rejected.
Writes to tasks block, slowing down producers (connection goroutines).

You can also add:

Load shedding: detect queue saturation and send “server busy” responses.
Prioritization: high priority tasks in separate channels.

8.3 Deadlines and Timeouts

We used:

ReadTimeout and WriteTimeout: limit individual operations.
IdleTimeout: via overall SetDeadline, avoids stale connections.

Tuning depends on your use case:

Low-latency APIs: use smaller timeouts.
Streaming: longer timeouts and heartbeat pings.

8.4 Keep-Alive and Socket Options

For long-lived connections, you can tune TCP options:

if tcpConn, ok := conn.(*net.TCPConn); ok {
    tcpConn.SetKeepAlive(true)
    tcpConn.SetKeepAlivePeriod(30 * time.Second)
}

You can also use net.ListenConfig to set advanced socket options like SO_REUSEADDR or SO_REUSEPORT for multi-process scenarios.

9. Observability: Metrics and Logging

To make this production-ready, add:

Prometheus metrics:
- active_connections
- requests_total
- request_duration_seconds
- worker_queue_length
Structured logging with request IDs.
Tracing (e.g., OpenTelemetry) at protocol message boundaries.

Example (conceptual) metric update:

// After accepting a connection
atomic.AddInt64(&s.activeConn, 1)
// Also update a gauge:
// activeConnectionsGauge.Inc()

10. Where to Go Next

You now have a high-performance, multi-goroutine TCP server architecture in Go, with:

A clear connection model
Worker pool processing
Backpressure and limits
Timeouts and graceful shutdown
A clean, extensible protocol framing layer

You can evolve this into:

A custom game server (rooms, sessions, match logic).
A binary RPC protocol.
A gateway/proxy for HTTP, WebSocket, or gRPC.
A streaming server for events and logs.

Keep Reading

Follow the engineering thread

Get the next practical Birdor note, or browse the archive for related systems, tooling, and architecture work.

Join newsletter Browse articles