Capstone: Notification Service
This is where the whole guide collapses into one program. We build a Notification
Service: a real async microservice that accepts notification requests over a REST
API, validates them, persists them to Postgres, publishes an event to Kafka, and a
separate worker actually “delivers” them (email/SMS/push, simulated) with retries
and a dead-letter queue. It rate-limits with Redis, authenticates with JWT, and is
observable (structured logs, traces, metrics) and tested end to end. Then it ships
as a uv-built multi-stage Docker image you can run on Kubernetes.
This page is the architecture and the map — what the system looks like and how each subsystem ties back to the module that taught it. The full, runnable build — every key file, the worker, the Dockerfile, the compose stack — lives in the practice project linked at the bottom.
The mental model
Section titled “The mental model”A notification service is the textbook accept-fast, deliver-later backend. The HTTP request must return in milliseconds, but actually sending an email or pushing to a device is slow, flaky, and rate-limited by third parties. So you split the system in two:
- A synchronous API that validates, persists a durable
PENDINGrow, publishes one event, and returns202 Accepted. It never blocks on a provider. - An asynchronous worker that consumes events, simulates delivery per channel,
updates the row to
SENT/FAILED, retries transient failures, and parks poison messages on a DLQ.
If you’ve built this in TypeScript (an Express API + a BullMQ worker) or Go (an
HTTP handler + a goroutine pool reading a channel), the shape is identical. The
Python difference is that both halves are asyncio — the API and the worker
share the same async DB engine, Kafka client, and Redis client, no thread pools.
flowchart TB
client["client<br/>(Bearer JWT)"]
subgraph API["FastAPI app"]
direction TB
auth["JWT auth"]
ratelimit["rate-limit"]
validate["validate (Pydantic)"]
persist["persist PENDING"]
publish["publish"]
auth --> ratelimit --> validate --> persist --> publish
end
subgraph Worker["Worker (separate process, same codebase)"]
direction TB
consume["async for msg in consumer"]
deliver["deliver(channel)"]
retry["on transient fail: retry (bounded)"]
consume --> deliver
deliver -. on transient fail .-> retry
retry -. retry .-> consume
end
redis[("Redis")]
postgres[("Postgres")]
topic["notifications topic"]
dlq["notifications.DLQ"]
client -->|"POST /v1"| auth
ratelimit --> redis
persist --> postgres
publish --> topic
topic --> consume
deliver -->|"SENT (update Postgres)"| postgres
deliver -->|"on poison / exhausted"| dlq
retry -->|"exhausted"| dlq
The API and the worker are the same package, started by two different entry
points (uv run app vs uv run worker) — one container image, two commands. That
is the standard production shape: scale the API and the worker independently, but
build and ship them together.
Module reference map
Section titled “Module reference map”Every subsystem in this capstone is something a specific module taught you:
| Subsystem | Module | What you carry over |
|---|---|---|
Project, uv, pyproject.toml, ruff/ty | 01 Dev Environment | uv init/add/run/sync, lockfile, tooling config |
Modern typing, StrEnum, Protocol, PEP 695 | 02 Typing | Channel/status enums, typed repositories |
| Dataclasses & FP for plain data | 03 Data Structures & FP | Domain value objects, mapping functions |
| Pydantic v2 + pydantic-settings | 04 Pydantic | Request/response models, discriminated channel union, Settings |
asyncio, TaskGroup, lifespans | 05 Async & Concurrency | Everything is async; structured startup/shutdown |
| Async generators & backpressure | 06 Streams | The consumer’s async for poll loop |
| FastAPI app, dependencies, OpenAPI | 07 FastAPI | Endpoints, DI, validation errors, docs |
| async SQLAlchemy 2.0 + Alembic | 09 Postgres | AsyncSession, the repository, migrations |
redis.asyncio, cache-aside, rate limit | 10 Redis & Caching | Sliding-window limiter, preference cache |
aiokafka producer/consumer, DLQ | 11 Kafka Events | Publish on create, worker + retries + DLQ |
| pytest, pytest-asyncio, Testcontainers | 12 Testing | Unit + integration suite against real infra |
JWT (pyjwt), argon2, FastAPI security | 13 Security & Auth | Bearer auth dependency, RBAC, password hashing |
| structlog + Prometheus + OpenTelemetry | 15 Observability | JSON logs, /metrics, traced spans |
uv multi-stage Docker, K8s | 17 Deployment | Slim image, non-root, compose + manifests |
The two modules conspicuously not load-bearing here are 08 Litestar (we picked FastAPI as the primary) and 14 API Design / 16 Advanced Python (GraphQL and metaprogramming would be enhancements, not the spine). That’s the honest version of “uses everything”: a real service uses what it needs.
Project layout
Section titled “Project layout”The package is organized by boundary, not by framework. The domain
(models, the Notification lifecycle) knows nothing about FastAPI, Kafka, or
Redis. The infrastructure (db, events, auth, observability) is wiring. The
two edges (api, workers) are the entry points. This is the clean-architecture
discipline that keeps the worker and the API sharing logic without sharing globals.
Directorynotification-service/
- pyproject.toml uv project, deps, ruff + ty config, scripts
- alembic.ini migration config
- docker-compose.yml app + worker + Postgres + Redis + Kafka
- Dockerfile multi-stage uv build, runs api or worker
Directorysrc/app/
- config.py
Settings(pydantic-settings) — one source of config Directorymodels/ Pydantic API models + the domain (Modules 02, 03, 04)
- schemas.py request/response models, discriminated channel union
- domain.py
NotificationStatusStrEnum, lifecycle rules
Directorydb/
- engine.py async engine + session factory (Module 09)
- tables.py SQLAlchemy 2.0 ORM models
- repository.py
NotificationRepository(the only thing that touches SQL) Directorymigrations/ Alembic revisions
- …
Directoryevents/
- producer.py aiokafka producer, started in lifespan (Module 11)
- consumer.py the worker’s poll loop, retry, DLQ (Module 11)
- schemas.py the Kafka event envelope (versioned)
Directoryauth/
- jwt.py sign/verify with pyjwt, argon2 hashing (Module 13)
- deps.py FastAPI
CurrentUserdependency + RBAC
Directorycache/
- redis.py client + sliding-window rate limiter (Module 10)
Directoryobservability/
- logging.py structlog config (Module 15)
- metrics.py Prometheus counters/histograms
- tracing.py OpenTelemetry setup
Directoryapi/
- main.py FastAPI app, lifespan, router mount,
/metrics,/health - routes.py the notification endpoints
- service.py
NotificationService— orchestration across layers
- main.py FastAPI app, lifespan, router mount,
Directoryworkers/
- main.py the worker entry point (
uv run worker)
- main.py the worker entry point (
- config.py
Directorytests/
- test_service.py unit: rate-limit and validation logic (mocked infra)
- test_api.py integration: API + Postgres via Testcontainers
The one rule that makes this layout pay off: only repository.py writes SQL, and
only service.py orchestrates. Routes are thin (parse, authorize, delegate); the
worker reuses the same repository and service the API uses.
How the pieces connect
Section titled “How the pieces connect”The NotificationService.create() method is the seam where every module meets. Read
it top to bottom and you see the whole guide:
async def create(self, req: CreateNotificationRequest, caller: CurrentUser) -> Notification: # 1. Authorization (Module 13): only admins send on behalf of others if caller.role != "admin" and req.user_id != caller.user_id: raise Forbidden("cannot send notifications for other users")
# 2. Rate limit (Module 10): sliding window in Redis, fail closed if not await self.limiter.allow(req.user_id): raise RateLimited(f"rate limit exceeded for {req.user_id}")
# 3. Persist a durable PENDING row FIRST (Module 09) — durability before delivery notification = await self.repo.create(req) # Pydantic -> ORM row
# 4. Publish the event (Module 11). If Kafka is down the row still exists, # so a sweeper/retry can pick it up later. We never lose the request. try: await self.producer.publish(NotificationQueued.from_(notification)) await self.repo.set_status(notification.id, NotificationStatus.QUEUED) except KafkaError: log.error("queue_failed", notification_id=str(notification.id)) # stays PENDING
metrics.notifications_created.labels(channel=req.channel.type).inc() # Module 15 return notificationThe ordering is deliberate and is the single most important design decision in the
service: persist before you publish. If you publish first and the DB write
fails, you’ve promised to deliver something you have no record of. Persisting first
means the worst case is a PENDING row that never got queued — recoverable, and
visible.
Running the whole thing locally
Section titled “Running the whole thing locally”The service needs Postgres, Redis, and Kafka. Use the guide’s shared infra for the backing stores, then run the two halves as separate processes:
-
Start the backing stores (Postgres 17, Redis 8, Kafka 4 in KRaft mode):
Terminal window cd shared-infra && docker compose up -d# postgres :5432 (dev/dev/app), redis :6379, kafka :29092, kafka-ui :8090 -
Apply migrations and start the API (terminal 1):
Terminal window cd notification-serviceuv run alembic upgrade headuv run app # FastAPI on :8000, docs at /docs, metrics at /metrics -
Start the worker (terminal 2) — same codebase, different entry point:
Terminal window uv run worker # consumes the notifications topic, delivers, retries -
Mint a dev token, send a notification, watch the worker deliver it:
Terminal window TOKEN=$(uv run app-token --user user-1 --role user)curl -s -X POST http://localhost:8000/v1/notifications \-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \-d '{"user_id":"user-1","channel":{"type":"email","to":"a@b.com","subject":"Hi"},"body":"Welcome!"}' | jq# -> 202 with status "QUEUED"; the worker logs "delivered" and flips it to SENT
Or run the lot — API, worker, and infra — from the project’s own docker-compose.yml
with one docker compose up. Both are shown in full in the practice project.
What’s next / what we’d add for real scale
Section titled “What’s next / what we’d add for real scale”This is production-shaped, not production-complete. An honest list of what a real deployment adds:
- A reconciliation sweeper. Rows stuck in
PENDING(Kafka was down at publish time) need a periodic job that re-publishes them. We persist-before-publish precisely so this is possible; we don’t ship the sweeper. - Real providers behind a
Protocol.deliver()is simulated. In production each channel is an adapter (SendGrid, Twilio, FCM) behind a typedProtocol(Module 02), with per-provider timeouts viaasyncio.timeoutand circuit breakers. - Idempotency keys. Clients should send an
Idempotency-Key; we’d dedupe in Redis so a retried POST doesn’t send twice. The Kafka producer is already configured idempotent, but the HTTP edge isn’t. - Outbox pattern. The persist-then-publish gap is a small window of
inconsistency. The bulletproof version writes the event to an
outboxtable in the same transaction as the notification, and a relay publishes from there. - Templating & localization, quiet hours, batching/digest, and a delivery-status webhook back to the caller.
- Horizontal scale: more API replicas behind an LB, more worker replicas in the same consumer group (Kafka rebalances partitions across them), Postgres read replicas for the list endpoint, and Redis for hot preference reads.
The point of the capstone isn’t a finished product — it’s that you can now reason about where each of those goes, because you built the skeleton they hang off.
Practice
Section titled “Practice”Build it. The project walks you through every key file — pyproject.toml, settings,
Pydantic models, SQLAlchemy + Alembic, the FastAPI app with auth, the Kafka producer
in the lifespan, the separate worker with retry/DLQ, the Redis rate limiter, the
structlog/Prometheus/OTel wiring, a couple of tests, and the uv multi-stage
Dockerfile — then runs it end to end.