Skip to content

Capstone: Notification Service

This is where the whole guide collapses into one program. We build a Notification Service: a real async microservice that accepts notification requests over a REST API, validates them, persists them to Postgres, publishes an event to Kafka, and a separate worker actually “delivers” them (email/SMS/push, simulated) with retries and a dead-letter queue. It rate-limits with Redis, authenticates with JWT, and is observable (structured logs, traces, metrics) and tested end to end. Then it ships as a uv-built multi-stage Docker image you can run on Kubernetes.

This page is the architecture and the map — what the system looks like and how each subsystem ties back to the module that taught it. The full, runnable build — every key file, the worker, the Dockerfile, the compose stack — lives in the practice project linked at the bottom.

A notification service is the textbook accept-fast, deliver-later backend. The HTTP request must return in milliseconds, but actually sending an email or pushing to a device is slow, flaky, and rate-limited by third parties. So you split the system in two:

  • A synchronous API that validates, persists a durable PENDING row, publishes one event, and returns 202 Accepted. It never blocks on a provider.
  • An asynchronous worker that consumes events, simulates delivery per channel, updates the row to SENT/FAILED, retries transient failures, and parks poison messages on a DLQ.

If you’ve built this in TypeScript (an Express API + a BullMQ worker) or Go (an HTTP handler + a goroutine pool reading a channel), the shape is identical. The Python difference is that both halves are asyncio — the API and the worker share the same async DB engine, Kafka client, and Redis client, no thread pools.

Notification Service architecture
Rendering diagram…

The API and the worker are the same package, started by two different entry points (uv run app vs uv run worker) — one container image, two commands. That is the standard production shape: scale the API and the worker independently, but build and ship them together.

Every subsystem in this capstone is something a specific module taught you:

SubsystemModuleWhat you carry over
Project, uv, pyproject.toml, ruff/ty01 Dev Environmentuv init/add/run/sync, lockfile, tooling config
Modern typing, StrEnum, Protocol, PEP 69502 TypingChannel/status enums, typed repositories
Dataclasses & FP for plain data03 Data Structures & FPDomain value objects, mapping functions
Pydantic v2 + pydantic-settings04 PydanticRequest/response models, discriminated channel union, Settings
asyncio, TaskGroup, lifespans05 Async & ConcurrencyEverything is async; structured startup/shutdown
Async generators & backpressure06 StreamsThe consumer’s async for poll loop
FastAPI app, dependencies, OpenAPI07 FastAPIEndpoints, DI, validation errors, docs
async SQLAlchemy 2.0 + Alembic09 PostgresAsyncSession, the repository, migrations
redis.asyncio, cache-aside, rate limit10 Redis & CachingSliding-window limiter, preference cache
aiokafka producer/consumer, DLQ11 Kafka EventsPublish on create, worker + retries + DLQ
pytest, pytest-asyncio, Testcontainers12 TestingUnit + integration suite against real infra
JWT (pyjwt), argon2, FastAPI security13 Security & AuthBearer auth dependency, RBAC, password hashing
structlog + Prometheus + OpenTelemetry15 ObservabilityJSON logs, /metrics, traced spans
uv multi-stage Docker, K8s17 DeploymentSlim image, non-root, compose + manifests

The two modules conspicuously not load-bearing here are 08 Litestar (we picked FastAPI as the primary) and 14 API Design / 16 Advanced Python (GraphQL and metaprogramming would be enhancements, not the spine). That’s the honest version of “uses everything”: a real service uses what it needs.

The package is organized by boundary, not by framework. The domain (models, the Notification lifecycle) knows nothing about FastAPI, Kafka, or Redis. The infrastructure (db, events, auth, observability) is wiring. The two edges (api, workers) are the entry points. This is the clean-architecture discipline that keeps the worker and the API sharing logic without sharing globals.

  • Directorynotification-service/
    • pyproject.toml uv project, deps, ruff + ty config, scripts
    • alembic.ini migration config
    • docker-compose.yml app + worker + Postgres + Redis + Kafka
    • Dockerfile multi-stage uv build, runs api or worker
    • Directorysrc/app/
      • config.py Settings (pydantic-settings) — one source of config
      • Directorymodels/ Pydantic API models + the domain (Modules 02, 03, 04)
        • schemas.py request/response models, discriminated channel union
        • domain.py NotificationStatus StrEnum, lifecycle rules
      • Directorydb/
        • engine.py async engine + session factory (Module 09)
        • tables.py SQLAlchemy 2.0 ORM models
        • repository.py NotificationRepository (the only thing that touches SQL)
        • Directorymigrations/ Alembic revisions
      • Directoryevents/
        • producer.py aiokafka producer, started in lifespan (Module 11)
        • consumer.py the worker’s poll loop, retry, DLQ (Module 11)
        • schemas.py the Kafka event envelope (versioned)
      • Directoryauth/
        • jwt.py sign/verify with pyjwt, argon2 hashing (Module 13)
        • deps.py FastAPI CurrentUser dependency + RBAC
      • Directorycache/
        • redis.py client + sliding-window rate limiter (Module 10)
      • Directoryobservability/
        • logging.py structlog config (Module 15)
        • metrics.py Prometheus counters/histograms
        • tracing.py OpenTelemetry setup
      • Directoryapi/
        • main.py FastAPI app, lifespan, router mount, /metrics, /health
        • routes.py the notification endpoints
        • service.py NotificationService — orchestration across layers
      • Directoryworkers/
        • main.py the worker entry point (uv run worker)
    • Directorytests/
      • test_service.py unit: rate-limit and validation logic (mocked infra)
      • test_api.py integration: API + Postgres via Testcontainers

The one rule that makes this layout pay off: only repository.py writes SQL, and only service.py orchestrates. Routes are thin (parse, authorize, delegate); the worker reuses the same repository and service the API uses.

The NotificationService.create() method is the seam where every module meets. Read it top to bottom and you see the whole guide:

src/app/api/service.py (the orchestration spine)
async def create(self, req: CreateNotificationRequest, caller: CurrentUser) -> Notification:
# 1. Authorization (Module 13): only admins send on behalf of others
if caller.role != "admin" and req.user_id != caller.user_id:
raise Forbidden("cannot send notifications for other users")
# 2. Rate limit (Module 10): sliding window in Redis, fail closed
if not await self.limiter.allow(req.user_id):
raise RateLimited(f"rate limit exceeded for {req.user_id}")
# 3. Persist a durable PENDING row FIRST (Module 09) — durability before delivery
notification = await self.repo.create(req) # Pydantic -> ORM row
# 4. Publish the event (Module 11). If Kafka is down the row still exists,
# so a sweeper/retry can pick it up later. We never lose the request.
try:
await self.producer.publish(NotificationQueued.from_(notification))
await self.repo.set_status(notification.id, NotificationStatus.QUEUED)
except KafkaError:
log.error("queue_failed", notification_id=str(notification.id)) # stays PENDING
metrics.notifications_created.labels(channel=req.channel.type).inc() # Module 15
return notification

The ordering is deliberate and is the single most important design decision in the service: persist before you publish. If you publish first and the DB write fails, you’ve promised to deliver something you have no record of. Persisting first means the worst case is a PENDING row that never got queued — recoverable, and visible.

The service needs Postgres, Redis, and Kafka. Use the guide’s shared infra for the backing stores, then run the two halves as separate processes:

  1. Start the backing stores (Postgres 17, Redis 8, Kafka 4 in KRaft mode):

    Terminal window
    cd shared-infra && docker compose up -d
    # postgres :5432 (dev/dev/app), redis :6379, kafka :29092, kafka-ui :8090
  2. Apply migrations and start the API (terminal 1):

    Terminal window
    cd notification-service
    uv run alembic upgrade head
    uv run app # FastAPI on :8000, docs at /docs, metrics at /metrics
  3. Start the worker (terminal 2) — same codebase, different entry point:

    Terminal window
    uv run worker # consumes the notifications topic, delivers, retries
  4. Mint a dev token, send a notification, watch the worker deliver it:

    Terminal window
    TOKEN=$(uv run app-token --user user-1 --role user)
    curl -s -X POST http://localhost:8000/v1/notifications \
    -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
    -d '{"user_id":"user-1","channel":{"type":"email","to":"a@b.com","subject":"Hi"},"body":"Welcome!"}' | jq
    # -> 202 with status "QUEUED"; the worker logs "delivered" and flips it to SENT

Or run the lot — API, worker, and infra — from the project’s own docker-compose.yml with one docker compose up. Both are shown in full in the practice project.

What’s next / what we’d add for real scale

Section titled “What’s next / what we’d add for real scale”

This is production-shaped, not production-complete. An honest list of what a real deployment adds:

  • A reconciliation sweeper. Rows stuck in PENDING (Kafka was down at publish time) need a periodic job that re-publishes them. We persist-before-publish precisely so this is possible; we don’t ship the sweeper.
  • Real providers behind a Protocol. deliver() is simulated. In production each channel is an adapter (SendGrid, Twilio, FCM) behind a typed Protocol (Module 02), with per-provider timeouts via asyncio.timeout and circuit breakers.
  • Idempotency keys. Clients should send an Idempotency-Key; we’d dedupe in Redis so a retried POST doesn’t send twice. The Kafka producer is already configured idempotent, but the HTTP edge isn’t.
  • Outbox pattern. The persist-then-publish gap is a small window of inconsistency. The bulletproof version writes the event to an outbox table in the same transaction as the notification, and a relay publishes from there.
  • Templating & localization, quiet hours, batching/digest, and a delivery-status webhook back to the caller.
  • Horizontal scale: more API replicas behind an LB, more worker replicas in the same consumer group (Kafka rebalances partitions across them), Postgres read replicas for the list endpoint, and Redis for hot preference reads.

The point of the capstone isn’t a finished product — it’s that you can now reason about where each of those goes, because you built the skeleton they hang off.

Build it. The project walks you through every key file — pyproject.toml, settings, Pydantic models, SQLAlchemy + Alembic, the FastAPI app with auth, the Kafka producer in the lifespan, the separate worker with retry/DLQ, the Redis rate limiter, the structlog/Prometheus/OTel wiring, a couple of tests, and the uv multi-stage Dockerfile — then runs it end to end.