Let's Build a Customer Support AI Copilot: An Event-Driven Agent with LangGraph, Go, pgvector & Redis Streams [Part 6]

Part 6 — Deployment: `docker compose up` and the Full Local Stack

The goal is simple: clone the repo, run docker compose up, and have a fully working AI agent stack — Postgres with pgvector, Redis Streams, Ollama serving local models, the Go API, the Python worker, and the Next.js console — all healthy and talking to each other, with no API keys required.

This post walks through every layer: the database migrations, all five Dockerfiles, the compose file service by service, the environment template, the optional profiles (distributed tracing, on-demand pipeline tools), and the model setup.

The Repo Layout

text

resolver_code/
├── services/api/           Go API: Dockerfile, go.mod
├── workers/agent/          Python LangGraph worker: Dockerfile, requirements.txt
├── apps/web/               Next.js console: Dockerfile
├── pipeline/               Bitext ingest + eval: Dockerfile, eval/Dockerfile
├── packages/
│   ├── graphql/            shared schema.graphql (used by API codegen + web codegen)
│   └── events/             events.schema.json (typed event contract)
├── db/migrations/          versioned SQL (migrate/migrate applies them)
├── deploy/
│   └── docker-compose.yml  the full stack
├── data/                   golden.jsonl, HuggingFace cache (gitignored)
├── .env.example            config template: copy to .env, zero keys needed
└── Makefile                up / models / ingest / eval / test / gqlgen

Step 1: Database Schema

Two migrations apply in order. The first creates all tables; the second adds the ANN and FTS indexes on the KB.

db/migrations/000001_init.up.sql

sql

-- db/migrations/000001_init.up.sql
CREATE EXTENSION IF NOT EXISTS vector;       -- pgvector: cosine ANN
CREATE EXTENSION IF NOT EXISTS pgcrypto;     -- gen_random_uuid()

CREATE TABLE conversations (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  status      TEXT NOT NULL DEFAULT 'OPEN'
                CHECK (status IN ('OPEN','ESCALATED','RESOLVED')),
  created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE messages (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  conversation_id UUID NOT NULL REFERENCES conversations(id) ON DELETE CASCADE,
  role            TEXT NOT NULL CHECK (role IN ('CUSTOMER','AGENT','SYSTEM')),
  body            TEXT NOT NULL,
  created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE kb_documents (
  id        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  source    TEXT NOT NULL,         -- 'bitext' | 'policy' | 'manual'
  intent    TEXT NOT NULL,
  category  TEXT NOT NULL,
  title     TEXT NOT NULL,
  content   TEXT NOT NULL,
  embedding VECTOR(768) NOT NULL   -- nomic-embed-text dimensions
);

CREATE TABLE drafts (
  id               UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  message_id       UUID NOT NULL REFERENCES messages(id) ON DELETE CASCADE,
  intent           TEXT NOT NULL,
  category         TEXT NOT NULL,
  sentiment        TEXT NOT NULL CHECK (sentiment IN ('POSITIVE','NEUTRAL','NEGATIVE')),
  urgency          TEXT NOT NULL CHECK (urgency IN ('LOW','NORMAL','HIGH')),
  answer           TEXT NOT NULL,
  citations        JSONB NOT NULL DEFAULT '[]'::jsonb,  -- [{kb_id, title, snippet}]
  confidence       NUMERIC NOT NULL CHECK (confidence >= 0 AND confidence <= 1),
  status           TEXT NOT NULL
                     CHECK (status IN ('PENDING','SUGGESTED','ESCALATED','SENT','REJECTED')),
  guard            JSONB NOT NULL DEFAULT '{}'::jsonb,  -- {grounded, tone, policy, reasons[]}
  model            TEXT NOT NULL,
  tokens           INT NOT NULL DEFAULT 0,
  cost_cents       NUMERIC NOT NULL DEFAULT 0,
  latency_ms       INT,
  created_at       TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE eval_runs (
  id               UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  groundedness     NUMERIC NOT NULL,
  routing_accuracy NUMERIC NOT NULL,
  answer_score     NUMERIC NOT NULL,
  safety_violations INT NOT NULL DEFAULT 0,
  created_at       TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE audit_log (
  id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  draft_id   UUID NOT NULL,
  actor      TEXT NOT NULL,
  action     TEXT NOT NULL,
  before     JSONB,
  after      JSONB,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

db/migrations/000002_indexes.up.sql

sql

-- db/migrations/000002_indexes.up.sql

-- Conversation timeline reads
CREATE INDEX idx_messages_conv_created ON messages (conversation_id, created_at);

-- Queue filtering by status
CREATE INDEX idx_drafts_status ON drafts (status);

-- Hybrid retrieval pre-filter
CREATE INDEX idx_kb_intent_cat ON kb_documents (intent, category);

-- ANN: HNSW with cosine distance (low-latency approximate nearest-neighbour)
CREATE INDEX idx_kb_embedding ON kb_documents
  USING hnsw (embedding vector_cosine_ops);

-- FTS: GIN index for the keyword half of hybrid retrieval
CREATE INDEX idx_kb_content_fts ON kb_documents
  USING gin (to_tsvector('english', title || ' ' || content));

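The HNSW index (vector_cosine_ops) trades a small amount of recall for significantly lower query latency than an exact scan. The GIN index over to_tsvector('english', title || ' ' || content) serves the @@ plainto_tsquery(...) predicate in the keyword retrieval path; because it is an expression index, the query has to use the same to_tsvector('english', ...) expression for the planner to pick it up. The hybrid retriever runs both searches and fuses the results with Reciprocal Rank Fusion (RRF).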
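To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of KB ids. The function name rrf_fuse, the constant k=60 (the value commonly used for RRF), and the sample ids are assumptions for illustration; the series' actual retriever may differ in detail.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=5):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Each list contributes 1 / (k + rank) for every document it returned.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:top_k]]


# Hypothetical results from the vector (HNSW) and keyword (GIN) searches.
vector_hits = ["kb-3", "kb-1", "kb-7"]
keyword_hits = ["kb-1", "kb-9", "kb-3"]
fused = rrf_fuse([vector_hits, keyword_hits])
```

Documents that appear in both lists accumulate two contributions, so they rise above documents that only one search returned. RRF only needs ranks, which is why cosine distances and ts_rank scores never have to be put on a common scale.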

drafts.guard and drafts.citations are stored as JSONB — they're write-once, read-many structures that don't need relational joins. The check constraints on status, sentiment, urgency, and role mirror the GraphQL enums exactly, enforcing contract consistency at the database level without an extra validation layer.
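One way to keep the enums and the check constraints from drifting apart is a small test that renders the expected CHECK clause from an application-side enum. The sketch below assumes a Python Enum named DraftStatus and a helper check_clause; both are hypothetical names, and the values are copied from the drafts.status constraint in the migration.

```python
from enum import Enum


class DraftStatus(str, Enum):
    # Values copied from the CHECK constraint on drafts.status.
    PENDING = "PENDING"
    SUGGESTED = "SUGGESTED"
    ESCALATED = "ESCALATED"
    SENT = "SENT"
    REJECTED = "REJECTED"


def check_clause(column, enum_cls):
    """Render the CHECK clause that corresponds to an enum's values."""
    values = ",".join(f"'{member.value}'" for member in enum_cls)
    return f"CHECK ({column} IN ({values}))"


clause = check_clause("status", DraftStatus)
```

A test can then assert that the rendered clause appears in the migration file, so adding an enum value without a matching migration fails fast.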

Step 2: The Five Dockerfiles

Go API — multi-stage, distroless runtime:

services/api/Dockerfile

dockerfile

# services/api/Dockerfile
# Build context: repo root, because gqlgen needs packages/graphql for schema access.

FROM golang:1.25-alpine AS build
WORKDIR /src

COPY services/api/go.mod services/api/go.sum ./services/api/
WORKDIR /src/services/api
RUN go mod download

COPY packages /src/packages
COPY services/api /src/services/api

# Generated code is not committed, so regenerate it from the schema before compiling.
RUN go run github.com/99designs/gqlgen generate
RUN CGO_ENABLED=0 GOOS=linux go build -o /out/server ./cmd/server

# distroless/static: no shell, no package manager, nonroot user.
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/server /server
EXPOSE 8080
USER nonroot:nonroot
ENTRYPOINT ["/server"]

Three things are worth noting. First, the build context is the repo root, not services/api, because gqlgen needs to read packages/graphql/schema.graphql. Second, go run github.com/99designs/gqlgen generate regenerates the typed resolvers inside the build; generated files aren't committed, so every build starts from the schema as the source of truth. Third, the runtime image is distroless/static-debian12:nonroot: no shell, no package manager, and it runs as UID 65532. An attacker who compromises the server process finds no shell or tooling inside the container to pivot with.

Python worker — single-stage:

workers/agent/Dockerfile

dockerfile

# workers/agent/Dockerfile
FROM python:3.12-slim

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

WORKDIR /app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# Drop privileges at runtime; no root needed.
RUN useradd --create-home --uid 10001 worker
USER worker

CMD ["python", "main.py"]

The worker is a long-running process that blocks in XREADGROUP on the messages stream. Single-stage is fine here because there is no compile step. PYTHONDONTWRITEBYTECODE=1 stops the interpreter from writing .pyc files at runtime, and PYTHONUNBUFFERED=1 makes log lines show up in docker logs immediately.

Next.js console — three-stage, standalone output:

apps/web/Dockerfile

dockerfile

# apps/web/Dockerfile
# Build context: repo root, because codegen needs packages/graphql.

FROM node:22-alpine AS deps
WORKDIR /app/apps/web
COPY apps/web/package.json apps/web/package-lock.json ./
RUN npm ci

FROM node:22-alpine AS build
WORKDIR /app
ENV NEXT_TELEMETRY_DISABLED=1
COPY --from=deps /app/apps/web/node_modules ./apps/web/node_modules
COPY apps/web ./apps/web
# codegen reads the shared schema from here
# (Dockerfile comments must sit on their own line; a trailing # after COPY is parsed as an argument)
COPY packages/graphql ./packages/graphql
WORKDIR /app/apps/web
# the prebuild script runs codegen before next build
RUN npm run build

FROM node:22-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production \
    NEXT_TELEMETRY_DISABLED=1 \
    PORT=3000 \
    HOSTNAME=0.0.0.0

RUN addgroup -S nodejs && adduser -S nextjs -G nodejs

# next.config.js output: 'standalone' yields a self-contained server.js + static files
COPY --from=build /app/apps/web/.next/standalone ./
COPY --from=build /app/apps/web/.next/static ./apps/web/.next/static
COPY --from=build /app/apps/web/public ./apps/web/public

USER nextjs
EXPOSE 3000
CMD ["node", "apps/web/server.js"]

output: 'standalone' in next.config.js tells Next.js to bundle a minimal Node.js server and only the files it needs into .next/standalone. The runner stage copies that output plus the static and public assets, not the full node_modules tree or the source. The prebuild npm script runs graphql-codegen before next build, so the TypeScript types are always regenerated from the schema before compilation needs them.

Pipeline (ingest) — single-stage:

pipeline/Dockerfile

dockerfile

# pipeline/Dockerfile
FROM python:3.12-slim
WORKDIR /src/pipeline

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . /src/pipeline

# HuggingFace datasets cache under the mounted data dir to avoid re-downloads.
ENV HF_HOME=/src/data/.hf_cache

Eval harness — builds from repo root, reuses worker code:

pipeline/eval/Dockerfile

dockerfile

# pipeline/eval/Dockerfile
# Build context: repo root, so the image can bundle workers/agent and the eval runs the real graph.
FROM python:3.12-slim

# PYTHONPATH lets the eval import directly from the worker.
# (The comment sits above ENV because a trailing # inside ENV is a syntax error.)
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    AGENT_PATH=/app/workers/agent \
    PYTHONPATH=/app/workers/agent

WORKDIR /app
COPY workers/agent/requirements.txt ./requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

COPY workers/agent ./workers/agent
COPY pipeline/eval ./pipeline/eval

WORKDIR /app/pipeline
CMD ["python", "eval/run_eval.py"]

PYTHONPATH=/app/workers/agent makes the eval harness import graph, rag, llm, schemas, and policy directly from the production worker package. The eval runs the same code that production runs — not a reimplementation.

Step 3: The Docker Compose File

Service by service.

deploy/docker-compose.yml

yaml

# deploy/docker-compose.yml
name: resolver

services:

  postgres:
    image: pgvector/pgvector:pg16           # pg16 + vector extension pre-installed
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-resolver}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-resolver}
      POSTGRES_DB: ${POSTGRES_DB:-resolver}
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-resolver} -d ${POSTGRES_DB:-resolver}"]
      interval: 5s
      timeout: 3s
      retries: 10

  redis:
    image: redis:7.4-alpine
    command: ["redis-server", "--appendonly", "yes"]   # persist the stream to disk
    ports:
      - "6379:6379"
    volumes:
      - redisdata:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 10

--appendonly yes enables Redis AOF persistence, so the event stream survives a container restart. A message that a worker had read but not yet acknowledged stays in the consumer group's pending list, and a worker can pick it up again via XAUTOCLAIM.
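As a rough sketch of that recovery path, the helper below asks Redis for messages that have sat unacknowledged for longer than a threshold. It assumes a redis-py style client object that exposes xautoclaim; the helper name reclaim_stale, the consumer name worker-1, and the 60-second idle threshold are illustrative choices, not values from the series.

```python
def reclaim_stale(client, stream, group, consumer, min_idle_ms=60_000, count=10):
    """Claim messages another consumer read but never acknowledged.

    `client` is expected to behave like redis.Redis (redis-py), whose
    xautoclaim method wraps the Redis XAUTOCLAIM command.
    """
    reply = client.xautoclaim(
        stream, group, consumer, min_idle_ms, start_id="0-0", count=count
    )
    # The reply starts with the next cursor id; the claimed messages follow
    # as (message_id, fields) pairs.
    claimed = reply[1]
    return [(message_id, fields) for message_id, fields in claimed]


# Usage against a real server (assumes redis-py is installed):
#   import redis
#   client = redis.Redis.from_url("redis://redis:6379/0", decode_responses=True)
#   for message_id, fields in reclaim_stale(client, "messages", "agent-workers", "worker-1"):
#       ...  # process, then client.xack("messages", "agent-workers", message_id)
```

Acknowledging only after processing succeeds is what makes the reclaim meaningful: a crash between read and ack leaves the entry pending instead of losing it.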

yaml

  ollama:
    image: ollama/ollama:0.5.7
    ports:
      - "11434:11434"
    volumes:
      - ollamadata:/root/.ollama
    # GPU passthrough (NVIDIA). Docker Desktop on WSL2 exposes the GPU automatically.
    # On a CPU-only host, drop this deploy block; Ollama falls back to CPU.
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

The GPU deploy block is optional. If the host has no NVIDIA driver or Docker GPU support, remove it and Ollama runs on CPU: slower (minutes per draft with 7b) but functional. Models live in a named volume, so docker compose down doesn't delete them unless you pass -v.

yaml

  migrate:
    image: migrate/migrate:v4.18.1
    depends_on:
      postgres:
        condition: service_healthy       # waits for pg_isready, not just container start
    volumes:
      - ../db/migrations:/migrations:ro
    command:
      - "-path=/migrations"
      - "-database=postgres://${POSTGRES_USER:-resolver}:${POSTGRES_PASSWORD:-resolver}@postgres:5432/${POSTGRES_DB:-resolver}?sslmode=disable"
      - "up"
    restart: on-failure

migrate is a one-shot service. It runs golang-migrate up, applies all pending migrations, then exits with code 0. condition: service_completed_successfully in the services that depend on it means they won't start until the schema is ready. Re-running docker compose up is idempotent — golang-migrate tracks applied versions in a schema_migrations table.
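The idempotency comes from comparing file versions against the recorded version. The sketch below is a simplified model of that selection, assuming only the NNNNNN_name.up.sql naming seen in db/migrations; the function pending_migrations is hypothetical and is not golang-migrate's actual code.

```python
import re

# Matches "000001_init.up.sql" and captures the numeric version prefix.
MIGRATION_RE = re.compile(r"^(\d+)_.+\.up\.sql$")


def pending_migrations(filenames, applied_version):
    """Return the up-migrations newer than the recorded version, in order.

    `applied_version` models the version stored in schema_migrations;
    None means nothing has been applied yet.
    """
    versions = []
    for name in filenames:
        match = MIGRATION_RE.match(name)
        if match:
            versions.append((int(match.group(1)), name))
    versions.sort()
    current = applied_version or 0
    return [name for version, name in versions if version > current]


files = ["000002_indexes.up.sql", "000001_init.up.sql"]
first_run = pending_migrations(files, None)
second_run = pending_migrations(files, 2)
```

On the second run nothing is pending, which is why the migrate container exits 0 without touching the schema and the dependent services start straight away.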

yaml

  api:
    build:
      context: ..                              # repo root for schema access
      dockerfile: services/api/Dockerfile
    env_file: ../.env
    environment:
      # Pinned in-network names; these win over anything .env sets for the same keys.
      DATABASE_URL: postgres://${POSTGRES_USER:-resolver}:${POSTGRES_PASSWORD:-resolver}@postgres:5432/${POSTGRES_DB:-resolver}?sslmode=disable
      REDIS_URL: redis://redis:6379/0
      OTEL_EXPORTER_OTLP_ENDPOINT: ${OTEL_EXPORTER_OTLP_ENDPOINT:-http://jaeger:4318}
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
      migrate:
        condition: service_completed_successfully
    ports:
      - "8080:8080"

  worker:
    build:
      context: ../workers/agent
    env_file: ../.env
    environment:
      DATABASE_URL: postgres://${POSTGRES_USER:-resolver}:${POSTGRES_PASSWORD:-resolver}@postgres:5432/${POSTGRES_DB:-resolver}?sslmode=disable
      REDIS_URL: redis://redis:6379/0
      OLLAMA_HOST: http://ollama:11434
      OTEL_EXPORTER_OTLP_ENDPOINT: ${OTEL_EXPORTER_OTLP_ENDPOINT:-http://jaeger:4318}
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
      ollama:
        condition: service_started         # Ollama has no healthcheck; started is enough
      migrate:
        condition: service_completed_successfully
    restart: on-failure

  web:
    build:
      context: ..                          # repo root for schema/codegen access
      dockerfile: apps/web/Dockerfile
    depends_on:
      - api
    ports:
      - "3000:3000"

The API and worker both read ../.env via env_file, and the environment block then pins the connection URLs to in-network service names (postgres, redis, ollama). Values under environment take precedence over env_file, so the containers always use the service names, even if you have edited .env to point at localhost for running a service outside Docker.
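The precedence rule is easy to model: start from the env_file values and let the environment block overwrite matching keys. This is an illustrative model of the merge for a single service, with a hypothetical helper name effective_env and made-up .env contents.

```python
def effective_env(env_file, environment):
    """Model Compose's per-service merge: environment wins over env_file."""
    merged = dict(env_file)
    merged.update(environment)
    return merged


# A .env that was edited for host-side development (hypothetical values).
env_file = {"REDIS_URL": "redis://localhost:6379/0", "LOG_LEVEL": "info"}
# What the compose file pins for the container.
environment = {"REDIS_URL": "redis://redis:6379/0"}

resolved = effective_env(env_file, environment)
```

Keys that only appear in .env, such as LOG_LEVEL here, pass through untouched, so the compose file only needs to list the values that must differ inside the network.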

yaml

  # profiles: ["tools"] means the service only starts when explicitly requested.
  # Run via: docker compose --profile tools run --rm pipeline python ingest_bitext.py
  pipeline:
    build:
      context: ../pipeline
    profiles: ["tools"]
    env_file: ../.env
    environment:
      DATABASE_URL: postgres://...
      OLLAMA_HOST: http://ollama:11434
      GOLDEN_PATH: /src/data/golden.jsonl
    volumes:
      - ../data:/src/data           # writes golden.jsonl here
    depends_on:
      postgres:
        condition: service_healthy
      ollama:
        condition: service_started

  eval:
    build:
      context: ..
      dockerfile: pipeline/eval/Dockerfile
    profiles: ["tools"]
    env_file: ../.env
    environment:
      DATABASE_URL: postgres://...
      OLLAMA_HOST: http://ollama:11434
      GOLDEN_PATH: /src/data/golden.jsonl
      EVAL_REPORT: /out/REPORT.md
    volumes:
      - ../data:/src/data
      - ../pipeline/eval/reports:/out    # REPORT.md written here, visible on host
    depends_on:
      postgres:
        condition: service_healthy
      ollama:
        condition: service_started

  # profiles: ["observability"] makes the distributed tracing UI opt-in.
  jaeger:
    image: jaegertracing/all-in-one:1.62.0
    profiles: ["observability"]
    environment:
      COLLECTOR_OTLP_ENABLED: "true"
    ports:
      - "16686:16686"   # Jaeger UI
      - "4318:4318"     # OTLP/HTTP receiver

volumes:
  pgdata:
  redisdata:
  ollamadata:

Figure: Docker Compose startup order.

Step 4: Environment Configuration

.env.example is the config contract. Copy it to .env — every default works for local development with docker compose up, no keys required.

.env.example

bash

# .env.example (abridged)

# Postgres
POSTGRES_USER=resolver
POSTGRES_PASSWORD=resolver
POSTGRES_DB=resolver
DATABASE_URL=postgres://resolver:resolver@postgres:5432/resolver?sslmode=disable
PG_POOL_MAX_CONNS=20

# Redis Streams
REDIS_URL=redis://redis:6379/0
STREAM_MESSAGES=messages
STREAM_DRAFTS=drafts
STREAM_DEADLETTER=dead-letter
CONSUMER_GROUP=agent-workers

# LLM (default: Ollama, $0, local)
LLM_PROVIDER=ollama
OLLAMA_HOST=http://ollama:11434

# Model tiering: cheap for triage, stronger for drafting
TRIAGE_MODEL=qwen2.5:3b
DRAFT_MODEL=qwen2.5:7b
JUDGE_MODEL=qwen2.5:7b
EMBED_MODEL=nomic-embed-text
EMBED_DIM=768

# Hosted provider switch (only used if LLM_PROVIDER=openai)
GEMINI_API_KEY=
OPENAI_API_KEY=

# Safety caps
MAX_GRAPH_STEPS=12
MAX_REPAIR_RETRIES=1
CONFIDENCE_THRESHOLD=0.6
RETRIEVAL_TOP_K=5

# OTel: console=spans in logs, otlp=send to jaeger, none=disabled
OTEL_TRACES_EXPORTER=console
OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4318
OTEL_SERVICE_NAME=resolver
LOG_LEVEL=info

# API
API_PORT=8080
API_HOST=0.0.0.0
CORS_ALLOWED_ORIGINS=http://localhost:3000

The DATABASE_URL and REDIS_URL in .env point at in-network service names (postgres, redis). This works inside Docker networking. For local development outside containers, override them to localhost in a local .env.local or via shell exports.
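For host-side runs, the only part of each URL that changes is the host. A small helper can derive the localhost variants from the in-network defaults; to_host_url is a hypothetical name, and the sketch assumes the compose file publishes the same port numbers on the host, as the ports sections shown earlier do.

```python
from urllib.parse import urlsplit, urlunsplit


def to_host_url(url, host="localhost"):
    """Swap the hostname of a service URL, keeping credentials, port, and path."""
    parts = urlsplit(url)
    # netloc looks like "user:pass@hostname:port"; userinfo and port are optional.
    userinfo, _, hostport = parts.netloc.rpartition("@")
    _, sep, port = hostport.partition(":")
    netloc = (userinfo + "@" if userinfo else "") + host + sep + port
    return urlunsplit(parts._replace(netloc=netloc))


db_url = to_host_url("postgres://resolver:resolver@postgres:5432/resolver?sslmode=disable")
redis_url = to_host_url("redis://redis:6379/0")
ollama_url = to_host_url("http://ollama:11434")
```

The resulting values are what you would export in the shell or place in .env.local before starting a service outside Docker.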

Step 5: Model Tiering

Two chat models cover three roles at different cost/quality points, plus one embedding model. The LLM interface is provider-agnostic: one env var switches the backend.

text

TRIAGE_MODEL=qwen2.5:3b       fast, cheap; classifies intent + category only
DRAFT_MODEL=qwen2.5:7b        stronger; generates the cited customer reply
JUDGE_MODEL=qwen2.5:7b        same quality; scores candidate against reference
EMBED_MODEL=nomic-embed-text  768-dim; matches vector(768) in kb_documents
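Because the column is declared VECTOR(768), an embedding of any other length is rejected at insert time. A cheap guard in the ingest path catches a mismatched EMBED_MODEL earlier and with a clearer message. The helper to_pgvector_literal below is a hypothetical sketch; it renders the bracketed text form that pgvector accepts as input.

```python
import os

# Mirrors EMBED_DIM from .env; 768 matches nomic-embed-text.
EMBED_DIM = int(os.environ.get("EMBED_DIM", "768"))


def to_pgvector_literal(embedding, dim=EMBED_DIM):
    """Validate the dimension and render a pgvector text literal like [0.1,0.2]."""
    if len(embedding) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(embedding)}")
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


literal = to_pgvector_literal([0.5, 1, -2], dim=3)
```

Switching to an embedding model with a different output size therefore means a new migration for the column and a re-ingest, not just an env change.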

To switch to a hosted provider:

bash

# .env overrides (no code change)
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.openai.com/v1   # or any compatible endpoint
TRIAGE_MODEL=gpt-4o-mini
DRAFT_MODEL=gpt-4o

The worker's chat_from_env(cfg) reads LLM_PROVIDER and returns either OllamaChat or OpenAIChat — both implement the ChatLLM Protocol. Same graph, same nodes, no change needed anywhere else.
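A minimal sketch of that factory pattern follows. The names ChatLLM, OllamaChat, OpenAIChat, and chat_from_env come from the text above; the method name complete, the constructor arguments, and treating cfg as a plain mapping are assumptions, and the network calls are deliberately left out.

```python
from typing import Mapping, Protocol


class ChatLLM(Protocol):
    """Structural interface; the method name `complete` is hypothetical."""

    def complete(self, model: str, prompt: str) -> str: ...


class OllamaChat:
    def __init__(self, host: str) -> None:
        self.host = host

    def complete(self, model: str, prompt: str) -> str:
        raise NotImplementedError("HTTP call to Ollama omitted in this sketch")


class OpenAIChat:
    def __init__(self, base_url: str, api_key: str) -> None:
        self.base_url = base_url
        self.api_key = api_key

    def complete(self, model: str, prompt: str) -> str:
        raise NotImplementedError("HTTP call to the hosted API omitted in this sketch")


def chat_from_env(cfg: Mapping[str, str]) -> ChatLLM:
    """Pick a backend from LLM_PROVIDER; graph nodes only see ChatLLM."""
    provider = cfg.get("LLM_PROVIDER", "ollama")
    if provider == "ollama":
        return OllamaChat(cfg.get("OLLAMA_HOST", "http://ollama:11434"))
    if provider == "openai":
        return OpenAIChat(
            cfg.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
            cfg["OPENAI_API_KEY"],
        )
    raise ValueError(f"unsupported LLM_PROVIDER: {provider!r}")
```

Because Protocol is structural, neither class has to inherit from ChatLLM; any object with a matching complete method satisfies the type, which also makes it easy to pass a stub in tests.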

Step 6: Observability — Traces Across the Bus

The most architecturally interesting part of the stack: a single OTel trace spans across a process boundary via the Redis event.

text

API receives mutation
  → starts HTTP span (otelhttp)
  → extracts W3C traceparent
  → serialises trace_id into message.created event on Redis
  → span ends

Worker reads event
  → reads trace_id from event
  → creates child span continuing the same trace
  → runs LangGraph graph inside that span
  → all node calls, LLM calls, DB queries become child spans

This means a Jaeger search for a single trace_id shows the whole journey: the API's HTTP handler, the Redis publish, the worker's graph execution, and every LLM call — one waterfall view.
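The hand-off can be sketched with nothing but the W3C traceparent string format (version, 32-hex trace id, 16-hex parent span id, 2-hex flags). The sketch below is a stand-in for what the OTel SDK propagators do, not the series' code; the event field name traceparent and the helper names are assumptions. Carrying the whole header rather than only the trace id also gives the worker the parent span id to attach to.

```python
import re
import secrets

# version "00", 32-hex trace id, 16-hex parent span id, 2-hex flags
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def make_traceparent(trace_id, span_id, sampled=True):
    """Format a W3C traceparent header value."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(header):
    """Split a traceparent value into its fields, rejecting malformed input."""
    match = TRACEPARENT_RE.match(header)
    if not match:
        raise ValueError(f"malformed traceparent: {header!r}")
    trace_id, parent_span_id, flags = match.groups()
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "sampled": bool(int(flags, 16) & 1),
    }


def child_span(parent):
    """Start a span in the same trace, pointing back at the publisher's span."""
    return {
        "trace_id": parent["trace_id"],
        "parent_span_id": parent["parent_span_id"],
        "span_id": secrets.token_hex(8),  # 8 random bytes -> 16 hex chars
    }


# API side: attach the context to the event before publishing it.
header = make_traceparent("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
event = {"type": "message.created", "traceparent": header}

# Worker side: continue the trace from the event it just read.
worker_span = child_span(parse_traceparent(event["traceparent"]))
```

In the real services the OTel SDK would create and export the spans; the point here is only that the event payload is the carrier, playing the role an HTTP header plays between two web services.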

To enable:

bash

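# Start the Jaeger UI alongside the default stack
docker compose -f deploy/docker-compose.yml --profile observability up -d jaeger

# In .env: tell both services to ship spans to it
OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4318

# Re-run up so api and worker are recreated with the new values
docker compose -f deploy/docker-compose.yml up -d api worker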

Then open http://localhost:16686 and search for service resolver.

Step 7: Quickstart

bash

# After cloning, enter the repo and configure (defaults need no edits)
cd resolver
cp .env.example .env

# Build and boot the full stack
docker compose -f deploy/docker-compose.yml up -d

# First run: pull the three models (roughly 7GB in total, one-time)
docker compose -f deploy/docker-compose.yml exec ollama ollama pull nomic-embed-text
docker compose -f deploy/docker-compose.yml exec ollama ollama pull qwen2.5:3b
docker compose -f deploy/docker-compose.yml exec ollama ollama pull qwen2.5:7b

# Seed the knowledge base and write the golden set
docker compose -f deploy/docker-compose.yml --profile tools run --rm pipeline python ingest_bitext.py

# Run the eval gate (exits 0 on pass, 1 on fail)
docker compose -f deploy/docker-compose.yml --profile tools run --rm eval

# Open the console (macOS; use xdg-open on Linux)
open http://localhost:3000

What each make target does:

bash

make up       # docker compose up -d (build + start)
make models   # pull the three Ollama models
make ingest   # docker compose run --rm pipeline python ingest_bitext.py
make eval     # docker compose run --rm eval
make test     # go test ./... (API) + python tests (worker + eval)
make gqlgen   # go run gqlgen generate (regenerate Go types from schema)
make dev      # run API + worker + web in foreground (outside Docker, for fast iteration)

Step 8: What `docker compose up` Actually Does

In order, with timing:

1. postgres starts → healthcheck polls pg_isready every 5s → healthy after ~10s.
2. redis starts → healthcheck polls redis-cli ping → healthy after ~5s.
3. ollama starts immediately (service_started, no healthcheck).
4. migrate starts once postgres is healthy → applies 000001 and 000002 → exits 0.
5. api starts once postgres healthy + redis healthy + migrate completed → Go binary up in ~1s.
6. worker starts once postgres healthy + redis healthy + ollama started + migrate completed → Python process enters XREADGROUP blocking loop.
7. web starts once api is started → Next.js standalone server up in ~2s.

Total cold-start time on a modern machine: ~30 seconds. Subsequent docker compose up runs take ~5 seconds, because the images are already built and cached and the postgres/redis data persists in named volumes.
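The ordering falls out of the depends_on graph. As an illustration, the sketch below feeds the default stack's dependencies (copied from the compose file) into the standard library's topological sorter and groups services into waves that can start in parallel. It ignores the health and completion conditions and only models who waits for whom; the helper name startup_waves is hypothetical.

```python
from graphlib import TopologicalSorter

# service -> services it depends on, taken from the compose file's depends_on blocks
DEPENDS_ON = {
    "postgres": set(),
    "redis": set(),
    "ollama": set(),
    "migrate": {"postgres"},
    "api": {"postgres", "redis", "migrate"},
    "worker": {"postgres", "redis", "ollama", "migrate"},
    "web": {"api"},
}


def startup_waves(depends_on):
    """Group services into waves; everything in a wave can start in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = startup_waves(DEPENDS_ON)
```

The critical path is postgres, then migrate, then api, then web, so the postgres health check and the migration run are what dominate the wait before the console is reachable.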

What We Have

text

deploy/docker-compose.yml      10 services, 2 opt-in profiles, 3 named volumes
services/api/Dockerfile        multi-stage Go, distroless/nonroot runtime
workers/agent/Dockerfile       Python 3.12-slim, nonroot user
apps/web/Dockerfile            3-stage Node, standalone Next.js output
pipeline/Dockerfile            Python 3.12-slim, HF cache mounted
pipeline/eval/Dockerfile       builds from repo root, reuses worker code on PYTHONPATH
db/migrations/                 000001 schema + 000002 indexes, applied by golang-migrate
.env.example                   full config contract, zero required keys for local run

Three design decisions that carry through everything:

One .env.example, zero required secrets. Every default points at an in-network service. Clone, copy, up. No paid API, no hosted service, no configuration ceremony.

Profiles keep the default stack clean. docker compose up boots only the always-on services. pipeline and eval run on demand via --profile tools. jaeger is --profile observability. The default stack is small and fast.

Traces cross the bus. The trace_id in the Redis event is what makes this observable. Without it, you'd need two separate Jaeger searches to understand what happened to one customer message. With it, one trace shows the entire path.
