Distributed Transactions in Go: Sagas, Outbox, and 2PC

The hardest problem in distributed systems isn’t consensus — it’s money changing hands across service boundaries. Here’s how real Go backends handle it.

Why You Can’t Use Database Transactions Across Services

// This looks reasonable. It is not.
func PlaceOrder(ctx context.Context, order Order) error {
    // Service A: deduct inventory
    if err := inventoryClient.Reserve(ctx, order.Items); err != nil {
        return err
    }
    // Service B: charge payment — what if this fails?
    if err := paymentClient.Charge(ctx, order.Total); err != nil {
        // NOW WHAT? Inventory is reserved, payment failed.
        // You just created an inconsistency.
        inventoryClient.Release(ctx, order.Items) // this can also fail!
        return err
    }
    return nil
}

This is the dual-write problem. No distributed transaction can make this atomic without significant tradeoffs.

Pattern 1: Saga (Choreography)

Each service publishes events; downstream services react and publish their own compensating events on failure.

// Order service publishes an event
type OrderPlacedEvent struct {
    OrderID   string    `json:"order_id"`
    Items     []Item    `json:"items"`
    Total     float64   `json:"total"`
    CreatedAt time.Time `json:"created_at"`
}

func (s *OrderService) PlaceOrder(ctx context.Context, req PlaceOrderRequest) error {
    // 1. Save order in PENDING state (local DB transaction)
    tx, _ := s.db.BeginTx(ctx, nil)
    order, err := s.repo.CreateOrder(ctx, tx, req)
    if err != nil { tx.Rollback(); return err }

    // 2. Publish event in same transaction (Outbox pattern — see below)
    err = s.outbox.Publish(ctx, tx, "order.placed", OrderPlacedEvent{
        OrderID: order.ID,
        Items:   order.Items,
        Total:   order.Total,
    })
    if err != nil { tx.Rollback(); return err }

    return tx.Commit()
}

Inventory service listens and either reserves stock (publishing inventory.reserved) or publishes inventory.failed which triggers compensation.

Pattern 2: The Transactional Outbox

The critical piece that makes sagas reliable: write the event to the DB in the same transaction as your state change, then a relay picks it up and publishes to the broker.

type OutboxEvent struct {
    ID          string    `db:"id"`
    AggregateID string    `db:"aggregate_id"`
    Topic       string    `db:"topic"`
    Payload     []byte    `db:"payload"`
    PublishedAt *time.Time `db:"published_at"` // nil = not yet published
    CreatedAt   time.Time `db:"created_at"`
}

// Outbox relay — runs as a background goroutine
func (r *OutboxRelay) Run(ctx context.Context) {
    ticker := time.NewTicker(500 * time.Millisecond)
    defer ticker.Stop()

    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            r.publishPending(ctx)
        }
    }
}

func (r *OutboxRelay) publishPending(ctx context.Context) {
    events, err := r.repo.FindUnpublished(ctx, 100)
    if err != nil { return }

    for _, evt := range events {
        if err := r.broker.Publish(ctx, evt.Topic, evt.Payload); err != nil {
            continue // retry next tick
        }
        r.repo.MarkPublished(ctx, evt.ID)
    }
}

Schema:

CREATE TABLE outbox_events (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    aggregate_id TEXT NOT NULL,
    topic        TEXT NOT NULL,
    payload      JSONB NOT NULL,
    published_at TIMESTAMPTZ,
    created_at   TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_outbox_unpublished ON outbox_events (created_at)
    WHERE published_at IS NULL;

Pattern 3: Two-Phase Commit (and why you probably shouldn’t)

2PC requires all participants to agree before committing. In Go, this means coordinating a distributed lock across services:

// Phase 1: Prepare (all services lock their resources)
// Phase 2: Commit (coordinator tells everyone to finalize)

Why to avoid it: blocking protocol, coordinator SPOF, terrible latency. Use it only for XA transactions with your DB and message broker (Kafka supports it).

Choosing the Right Pattern

Scenario	Recommended Pattern
E-commerce checkout	Saga + Outbox
Bank transfer within one DB	Local ACID transaction
Cross-bank transfer	Saga with idempotent steps
Batch job across services	Orchestration saga (e.g. Temporal)
Strong consistency required	Reconsider service boundary

Compensating Transactions

Every forward step must have a compensating step:

type SagaStep struct {
    Name       string
    Execute    func(ctx context.Context) error
    Compensate func(ctx context.Context) error
}

func RunSaga(ctx context.Context, steps []SagaStep) error {
    executed := make([]SagaStep, 0, len(steps))
    
    for _, step := range steps {
        if err := step.Execute(ctx); err != nil {
            // Roll back in reverse order
            for i := len(executed) - 1; i >= 0; i-- {
                executed[i].Compensate(ctx) // best-effort, log failures
            }
            return fmt.Errorf("saga failed at step %s: %w", step.Name, err)
        }
        executed = append(executed, step)
    }
    return nil
}

The saga pattern doesn’t give you ACID — it gives you eventual consistency with compensations. Design your UX accordingly: show “pending” states, support manual review queues for stuck sagas.