Type-Safe Boundary Enforcement in Distributed Systems

Leveraging Zod and custom utility types to ensure runtime validity across service boundaries in TypeScript microservices.

Ioannis Karasavvaidis | 8 min read | November 2, 2024
TypeScript Zod Security Distributed Systems

The Runtime Gap

TypeScript’s type system is a compile-time construct. The moment data crosses a service boundary (an API call, a message queue, a database read), those guarantees evaporate. Your User type says email: string, but the API might return null, an empty string, or a number where a string was expected.

In distributed systems, this gap between compile-time types and runtime reality is where production incidents live.
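A minimal sketch of the gap, assuming a hypothetical response body in which the upstream service sent null for email. The User interface and the payload are illustrative, not taken from a real service:

```typescript
interface User {
  id: string;
  email: string;
}

// Simulated response body: the upstream service sent null for email.
const body = '{"id": "42", "email": null}';

// The cast compiles, but nothing is checked at runtime.
const user = JSON.parse(body) as User;

let failure = "none";
try {
  // Type-checks as a string method call, fails at runtime because email is null.
  user.email.toLowerCase();
} catch (err) {
  failure = err instanceof TypeError ? "TypeError" : "other";
}

console.log(failure);
```

The cast tells the compiler what to believe. It does nothing to the data, so the failure surfaces wherever the field is first used rather than where it entered the system.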

Zod as the Boundary Guard

Zod bridges the gap by providing runtime validation from which TypeScript types are inferred. Instead of defining types and hoping the data matches, you define schemas and derive types from them:

import { z } from "zod";

const UserSchema = z.object({
  id: z.string().uuid(),
  email: z.string().email(),
  role: z.enum(["admin", "member", "viewer"]),
  metadata: z.record(z.string(), z.unknown()).optional(),
  createdAt: z.string().datetime(),
});

// Type is derived FROM the schema — single source of truth
type User = z.infer<typeof UserSchema>;

The schema is the contract. The type is a consequence.
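As a rough illustration of the contract in use, the sketch below parses one valid and one invalid payload. The schema is a trimmed copy of the one above so the block stands alone, and the payload values are made up:

```typescript
import { z } from "zod";

// Trimmed copy of the schema above so the block stands alone.
const UserSchema = z.object({
  id: z.string().uuid(),
  email: z.string().email(),
  role: z.enum(["admin", "member", "viewer"]),
  createdAt: z.string().datetime(),
});

type User = z.infer<typeof UserSchema>;

// Illustrative payloads; the values are invented for this example.
const valid = UserSchema.safeParse({
  id: "3f2b8c1e-9d4a-4e6b-8a1f-2c7d5e9b0a34",
  email: "ada@example.com",
  role: "admin",
  createdAt: "2024-11-02T09:30:00Z",
});

const invalid = UserSchema.safeParse({
  id: "3f2b8c1e-9d4a-4e6b-8a1f-2c7d5e9b0a34",
  email: null, // wrong type
  role: "owner", // not one of the allowed roles
  createdAt: "2024-11-02T09:30:00Z",
});

// Collect the names of the fields that failed validation.
const failedFields: string[] = invalid.success
  ? []
  : invalid.error.issues.map((issue) => issue.path.join("."));

if (valid.success) {
  const user: User = valid.data; // typed as User only after parsing succeeded
  console.log(user.role);
}
console.log(failedFields);
```

Each issue carries the path of the offending field, which is what makes the failure easy to trace back to its source.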

Building a Service Client

Every external service call should parse its response through a schema. No exceptions:

// ServiceError and SchemaViolationError are application-defined Error subclasses (not shown).
class ServiceClient {
  async get<T extends z.ZodType>(url: string, schema: T): Promise<z.infer<T>> {
    const response = await fetch(url);

    if (!response.ok) {
      throw new ServiceError(
        `${url} responded with ${response.status}`,
        response.status,
      );
    }

    // Treat the body as unknown until the schema has accepted it.
    const data: unknown = await response.json();
    const result = schema.safeParse(data);

    if (!result.success) {
      // Log the validation error for debugging
      console.error("Schema validation failed:", {
        url,
        errors: result.error.flatten(),
      });
      throw new SchemaViolationError(url, result.error);
    }

    return result.data;
  }
}

The safeParse method is intentional — it doesn’t throw, giving you control over error handling. In production, you want to log the specific validation failures before surfacing a generic error to the caller.
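The client refers to ServiceError and SchemaViolationError without defining them. One possible shape for those classes is sketched here; the field names, the parseOrThrow helper, HealthSchema, and the example URL are all assumptions made for illustration, not the article's actual code:

```typescript
import { z } from "zod";

// Hypothetical definitions: the client above uses these names without showing them.
class ServiceError extends Error {
  readonly status: number;
  constructor(message: string, status: number) {
    super(message);
    this.name = "ServiceError";
    this.status = status;
  }
}

class SchemaViolationError extends Error {
  readonly url: string;
  readonly issues: z.ZodError["issues"];
  constructor(url: string, error: z.ZodError) {
    // Generic message for callers; the detailed issues stay on the instance for logs.
    super(`Response from ${url} did not match the expected schema`);
    this.name = "SchemaViolationError";
    this.url = url;
    this.issues = error.issues;
  }
}

// The validation step of the client, isolated so it runs without a network call.
function parseOrThrow<T extends z.ZodType>(
  url: string,
  schema: T,
  data: unknown,
): z.infer<T> {
  const result = schema.safeParse(data);
  if (!result.success) {
    throw new SchemaViolationError(url, result.error);
  }
  return result.data;
}

// Hypothetical schema and URL, used only to trigger a violation.
const HealthSchema = z.object({ ok: z.boolean() });

let caught: unknown;
try {
  parseOrThrow("https://example.internal/health", HealthSchema, { ok: "yes" });
} catch (err) {
  caught = err;
}

console.log((caught as Error).name, (caught as Error).message);
```

Keeping the Zod issues on the error object while exposing only a generic message lets the logging layer record the detail without leaking it to the caller.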

Discriminated Unions for API Responses

A pattern that has saved us from countless bugs is to model API responses as discriminated unions rather than as objects with optional fields:

const ApiResponseSchema = z.discriminatedUnion("status", [
  z.object({
    status: z.literal("success"),
    data: UserSchema,
  }),
  z.object({
    status: z.literal("error"),
    code: z.string(),
    message: z.string(),
  }),
  z.object({
    status: z.literal("pending"),
    retryAfter: z.number(),
  }),
]);

type ApiResponse = z.infer<typeof ApiResponseSchema>;

function handleResponse(response: ApiResponse) {
  switch (response.status) {
    case "success":
      // TypeScript knows response.data is User
      return processUser(response.data);
    case "error":
      // TypeScript knows response.code exists
      return handleError(response.code, response.message);
    case "pending":
      // TypeScript knows response.retryAfter exists
      return scheduleRetry(response.retryAfter);
  }
}

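The compiler can enforce exhaustive handling, provided the switch is written so that a missing case is a type error: either declare an explicit return type on the function (under strict null checks) or add a default branch that passes the value to a function accepting never. With that in place, adding a new status makes every such switch over ApiResponse fail to compile until it is updated.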
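A common way to make the exhaustiveness check explicit is a never-typed default branch. The sketch below redeclares ApiResponse as plain types so it stands alone; assertNever and summarize are illustrative names, not part of Zod or of the code above:

```typescript
// Plain-type stand-in for z.infer<typeof ApiResponseSchema>, with a reduced data shape.
type ApiResponse =
  | { status: "success"; data: { id: string } }
  | { status: "error"; code: string; message: string }
  | { status: "pending"; retryAfter: number };

// Accepts only values the compiler has narrowed to never.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

function summarize(response: ApiResponse): string {
  switch (response.status) {
    case "success":
      return `user ${response.data.id}`;
    case "error":
      return `${response.code}: ${response.message}`;
    case "pending":
      return `retry in ${response.retryAfter}s`;
    default:
      // If a new status is added to ApiResponse, response is no longer
      // narrowed to never here and this call stops compiling.
      return assertNever(response);
  }
}

console.log(summarize({ status: "pending", retryAfter: 30 }));
```

The default branch also acts as a runtime backstop if an unvalidated value ever reaches the switch.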

Message Queue Contracts

The same pattern applies to event-driven architectures. Every message published to a queue should have a schema:

const OrderEventSchema = z.discriminatedUnion("type", [
  z.object({
    type: z.literal("order.created"),
    orderId: z.string().uuid(),
    items: z.array(
      z.object({
        sku: z.string(),
        quantity: z.number().int().positive(),
      }),
    ),
    timestamp: z.string().datetime(),
  }),
  z.object({
    type: z.literal("order.cancelled"),
    orderId: z.string().uuid(),
    reason: z.string(),
    timestamp: z.string().datetime(),
  }),
]);

The consumer validates every message before processing. Invalid messages go to a dead letter queue with the validation error attached — not silently swallowed, not crashing the consumer.
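The consumer side could look roughly like the following. An in-memory array stands in for a real dead letter queue, and the DeadLetter shape and consume function are assumptions for illustration; the schema is a reduced copy of OrderEventSchema:

```typescript
import { z } from "zod";

// Reduced copy of OrderEventSchema so the block stands alone.
const OrderEventSchema = z.discriminatedUnion("type", [
  z.object({
    type: z.literal("order.created"),
    orderId: z.string().uuid(),
    items: z.array(
      z.object({ sku: z.string(), quantity: z.number().int().positive() }),
    ),
  }),
  z.object({
    type: z.literal("order.cancelled"),
    orderId: z.string().uuid(),
    reason: z.string(),
  }),
]);

type OrderEvent = z.infer<typeof OrderEventSchema>;

// Hypothetical dead-letter record: the raw message plus the reasons it was rejected.
interface DeadLetter {
  raw: unknown;
  issues: z.ZodError["issues"];
}

const processed: OrderEvent[] = [];
const deadLetters: DeadLetter[] = []; // in-memory stand-in for a real DLQ

function consume(raw: unknown): void {
  const result = OrderEventSchema.safeParse(raw);
  if (!result.success) {
    // Route the message aside with the validation error attached.
    deadLetters.push({ raw, issues: result.error.issues });
    return;
  }
  processed.push(result.data);
}

const orderId = "3f2b8c1e-9d4a-4e6b-8a1f-2c7d5e9b0a34";
consume({ type: "order.cancelled", orderId, reason: "customer request" });
consume({ type: "order.created", orderId, items: [{ sku: "A-1", quantity: 0 }] });

console.log(processed.length, deadLetters.length);
```

The second message is rejected because a quantity of 0 is not positive, and the consumer keeps running.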

The Cost of Not Validating

In a system I inherited, an upstream service changed a date field from ISO 8601 to Unix timestamps. Without boundary validation, the value was passed through three downstream services before causing a cryptic “Invalid Date” error in a PDF generation service, four hops away from the source.

With Zod at the boundary, the first service to receive the changed format would have caught it immediately, logged the exact schema violation, and returned a clear error to the upstream service.
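To make that concrete, here is a small sketch of the same kind of format change hitting a schema. InvoiceSchema and the issuedAt field are hypothetical names; the original incident's actual field is not given:

```typescript
import { z } from "zod";

// Hypothetical boundary schema that expects an ISO 8601 datetime string.
const InvoiceSchema = z.object({
  issuedAt: z.string().datetime(),
});

// Before the upstream change: an ISO 8601 string.
const before = InvoiceSchema.safeParse({ issuedAt: "2024-11-02T09:30:00Z" });

// After the upstream change: a Unix timestamp (a number).
const after = InvoiceSchema.safeParse({ issuedAt: 1730539800 });

// Field path of the first reported issue, or an empty string if parsing succeeded.
const rejectedField = after.success ? "" : after.error.issues[0].path.join(".");

console.log(before.success, after.success, rejectedField);
```

The rejection names the exact field at the first hop, instead of leaving a malformed value to travel downstream.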

The cost of runtime validation is measured in microseconds. The cost of not validating is measured in incident response hours.

Practical Guidelines

  1. Validate at every boundary — API responses, queue messages, database reads, file imports
  2. Derive types from schemas — never maintain parallel type definitions
  3. Use safeParse — control your error handling, don’t let Zod throw
  4. Log validation failures — they’re the canary in the coal mine
  5. Version your schemas — breaking changes should be explicit, not discovered in production
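One way to make guideline 5 explicit is to carry a version field in the payload and discriminate on it. This is only a sketch under that assumption; schemaVersion, the two schema names, and createdAtMillis are hypothetical, and other versioning schemes (topic names, headers, URL prefixes) work as well:

```typescript
import { z } from "zod";

// Version 1: createdAt is an ISO 8601 string.
const UserEventV1 = z.object({
  schemaVersion: z.literal(1),
  createdAt: z.string().datetime(),
});

// Version 2: createdAt became Unix seconds. The breaking change gets a new version.
const UserEventV2 = z.object({
  schemaVersion: z.literal(2),
  createdAt: z.number().int(),
});

const UserEventSchema = z.discriminatedUnion("schemaVersion", [
  UserEventV1,
  UserEventV2,
]);

type UserEvent = z.infer<typeof UserEventSchema>;

// Normalize both versions to epoch milliseconds.
function createdAtMillis(event: UserEvent): number {
  switch (event.schemaVersion) {
    case 1:
      return Date.parse(event.createdAt);
    case 2:
      return event.createdAt * 1000;
  }
}

const iso = "2024-11-02T09:30:00Z";
const v1 = UserEventSchema.parse({ schemaVersion: 1, createdAt: iso });
const v2 = UserEventSchema.parse({
  schemaVersion: 2,
  createdAt: Date.parse(iso) / 1000,
});

// An unknown version is rejected instead of being guessed at.
const v3 = UserEventSchema.safeParse({ schemaVersion: 3, createdAt: iso });

console.log(createdAtMillis(v1) === createdAtMillis(v2), v3.success);
```

Consumers can then support old and new producers side by side and retire a version deliberately.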