Architecture & Request Flow
What you will learn: the five long-running services (plus a test binary) and what each one is responsible for, how cells and the topology store organize the cluster, how gRPC connects the services, and the exact path a client query takes from the wire to PostgreSQL and back.
Prerequisites: you’ve read the language track. This page leans on interfaces & composition (the request path is glued together with interfaces), context (a ctx is threaded through every hop), and concurrency (results stream back through callbacks and gRPC server streams). New to the whole guide? Start at the orientation.
The one-paragraph mental model
Multigres is Vitess for Postgres: a set of small Go services that sit in front of real PostgreSQL servers and add routing, connection pooling, and automated failover. A client speaks the ordinary PostgreSQL wire protocol to a multigateway; the gateway is a stateless proxy that turns each query into a gRPC call to a multipooler; the pooler runs the query on a pooled SQL connection to a real PostgreSQL process. Everything else (lifecycle management, consensus/failover, admin) happens off to the side and is not on the query path.
The five services (plus a test binary)
Each service is a separate binary under go/cmd/<svc>/main.go. The descriptions below are taken from the package doc-comments and cobra Short strings in those files: the system’s own words for what each one does.
| Service | Binary location | On the query path? | One-line responsibility |
|---|---|---|---|
| multigateway | go/cmd/multigateway/main.go | Yes (hop 1) | “a stateless proxy responsible for accepting requests from applications and routing them to the appropriate multipooler server(s) … It speaks both the PostgreSQL Protocol and a gRPC protocol.” |
| multipooler | go/cmd/multipooler/main.go | Yes (hop 2) | “provides connection pooling and communicates with pgctld via gRPC to serve queries from multigateway instances.” |
| pgctld | go/cmd/pgctld/main.go | No (control plane) | “manages PostgreSQL server instances … providing lifecycle management and configuration control.” |
| multiorch | go/services/multiorch/init.go | No (control plane) | Consensus + failover: builds a consensus.NewCoordinator and a recovery.NewEngine, watches topology, and drives recovery. |
| multiadmin | go/cmd/multiadmin/main.go | No (admin/observability) | “provides administrative services for the multigres cluster, exposing both HTTP and gRPC endpoints” + an HTTP reverse proxy to per-cell status pages. |
| portpoolserver | go/cmd/portpoolserver/main.go | No (tests only) | “a cross-process port allocation service for integration tests.” |
multigres itself is the operator CLI (cluster management), not a long-running service.
Cells and the topology store
A cell is an availability zone. Each cell runs its own full stack of services (gateway, pooler, orch, pgctld + postgres). The cluster’s metadata lives in a topology store: etcd in production (etcdtopo), an in-process fake (memorytopo) in tests. The Go package that reads and writes it is go/common/topoclient.
The store is split into two logically separate connections:
flowchart TB
Global["Global Topology (static metadata)<br/>- Databases<br/>- Cell locations"]
CellA["Cell Topo A (dynamic data)<br/>- Gateways<br/>- Poolers<br/>- Orch state"]
CellB["Cell Topo B (dynamic data)<br/>- Gateways<br/>- Poolers<br/>- Orch state"]
Global --> CellA
Global --> CellB
- Global topology holds static cluster-level metadata: Database records and Cell locations. A Cell is just `{ name, server_addresses, root }`, which is how to connect to that cell’s topology service.
- Cell topology holds dynamic per-cell catalogs: the MultiPooler, MultiGateway, and MultiOrch records for components living in that cell.
The global and cell topologies may run on the same etcd cluster, but they remain separate in terms of naming and client management: topoclient keeps one connection to the global topology service and one connection to each cell topology service.
Every component is uniquely named by an `ID{component, cell, name}`. That is how the gateway, given a routing decision, names the exact pooler it must reach.
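To make the two-level split concrete, here is a minimal in-memory sketch of the model described above. It is not the topoclient API: the `Topo` type, its methods, and the addresses are hypothetical and exist only for illustration. The `ID` and `CellLocation` fields follow the `{component, cell, name}` and `{ name, server_addresses, root }` shapes from the text.

```go
package main

import "fmt"

// ID mirrors the ID{component, cell, name} triple described above.
type ID struct {
	Component string // e.g. "multipooler"
	Cell      string // e.g. "zone1"
	Name      string // unique within (component, cell)
}

// CellLocation mirrors the Cell record: how to reach one cell's topology service.
type CellLocation struct {
	Name            string
	ServerAddresses []string
	Root            string
}

// Topo is a toy two-level store (hypothetical, not the topoclient API):
// a global map of cell locations plus one dynamic catalog per cell.
type Topo struct {
	global map[string]CellLocation
	cells  map[string]map[ID]string // cell name -> component ID -> address
}

func NewTopo() *Topo {
	return &Topo{
		global: make(map[string]CellLocation),
		cells:  make(map[string]map[ID]string),
	}
}

// AddCell records static metadata at the global level and opens an empty cell catalog.
func (t *Topo) AddCell(loc CellLocation) {
	t.global[loc.Name] = loc
	if _, ok := t.cells[loc.Name]; !ok {
		t.cells[loc.Name] = make(map[ID]string)
	}
}

// Register writes a component into the catalog of its own cell.
func (t *Topo) Register(id ID, addr string) error {
	catalog, ok := t.cells[id.Cell]
	if !ok {
		return fmt.Errorf("unknown cell %q", id.Cell)
	}
	catalog[id] = addr
	return nil
}

// Lookup resolves an ID to an address by consulting only that ID's cell.
func (t *Topo) Lookup(id ID) (string, error) {
	catalog, ok := t.cells[id.Cell]
	if !ok {
		return "", fmt.Errorf("unknown cell %q", id.Cell)
	}
	addr, ok := catalog[id]
	if !ok {
		return "", fmt.Errorf("no %s named %q in cell %q", id.Component, id.Name, id.Cell)
	}
	return addr, nil
}

func main() {
	topo := NewTopo()
	topo.AddCell(CellLocation{Name: "zone1", ServerAddresses: []string{"localhost:2379"}, Root: "/zone1"})
	pooler := ID{Component: "multipooler", Cell: "zone1", Name: "pooler-a"}
	if err := topo.Register(pooler, "10.0.0.5:15100"); err != nil {
		fmt.Println("register:", err)
		return
	}
	addr, err := topo.Lookup(pooler)
	fmt.Println(addr, err)
}
```

The point of the sketch is that a lookup never touches another cell's catalog: the cell name inside the ID selects which catalog to read.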
Topology diagram (two cells, the real deployment shape)
This combines the service list, the cell split, and the concrete container layout from a typical local deployment (etcd + multiadmin + per-cell stacks, exposing zone1’s gateway on PG port 15432).
flowchart TB
Client(["psql / app (PostgreSQL wire protocol)"])
Client --> GW1
subgraph zone1["zone1 (cell)"]
GW1["multigateway (PG wire in)"]
POOL1["multipooler"]
PG1["PostgreSQL"]
PGCTL1["pgctld (lifecycle)"]
ORCH1["multiorch (consensus + failover)"]
GW1 -->|"gRPC: MultiPoolerService"| POOL1
POOL1 -->|"pooled SQL"| PG1
POOL1 -->|"gRPC: PgCtld"| PGCTL1
PGCTL1 -->|"lifecycle"| PG1
ORCH1 -->|"controls recovery"| PG1
end
subgraph zone2["zone2 (cell)"]
Z2["full stack: gateway, pooler,<br/>multiorch, pgctld + PostgreSQL"]
end
ETCD[("etcd: GLOBAL topology<br/>(Databases, Cell locations)<br/>+ cell topo zone1<br/>+ cell topo zone2")]
GW1 -.->|"watches cell topo"| ETCD
zone1 --- ETCD
zone2 --- ETCD
ADMIN["multiadmin (admin/observability front door;<br/>off the query path)"]
ADMIN -.->|"HTTP reverse proxy /proxy/<type>/<cell>/<name>"| zone1
Three different role vocabularies overlap around the idea of a “primary”; keep them distinct:
- leader / follower: consensus roles owned by multiorch.
- primary / standby / replica: PostgreSQL recovery-mode state (pg_is_in_recovery()).
- PoolerType PRIMARY / REPLICA: a topology routing label (writes vs. reads) set by the gateway.
How gRPC wires the services together
The inter-service contract is protobuf. Each service has a *.proto in proto/ that generates Go into `go/pb/<name>`. The contracts that matter for the big picture:
- MultiPoolerService (proto/multipoolerservice.proto): the gateway-to-pooler contract, with ExecuteQuery, StreamExecute, PortalStreamExecute, plus Describe, GetAuthCredentials, CopyBidiExecute, ConcludeTransaction, etc.
- PgCtld (proto/pgctldservice.proto): pooler-to-pgctld lifecycle control.
- MultiOrchService (proto/multiorchservice.proto): recovery/failover control RPCs.
- MultiGatewayService (proto/multigatewayservice.proto): cross-gateway CancelQuery only.
Tracing a query end-to-end
Let’s follow a single SELECT 1 from the client socket to Postgres and back. Each step names the real file and what the request looks like at that boundary.
sequenceDiagram
autonumber
participant Client
participant Gateway as multigateway
participant Pooler as multipooler
participant Postgres
Client->>Gateway: PG 'Q' message (raw SQL)
Note over Gateway: handler.HandleQuery
Note over Gateway: executor.StreamExecute (plan)
Note over Gateway: scatterconn.StreamExecute (buildTarget)
Gateway->>Pooler: gRPC StreamExecuteRequest{Query, Target}
Note over Pooler: grpcpoolerservice + executor
Pooler->>Postgres: pooled SQL (QueryStreaming)
Postgres-->>Pooler: rows
Pooler-->>Gateway: stream of StreamExecuteResponse
Gateway-->>Client: PG DataRow / CommandComplete
Step 1 — PG wire arrives at the gateway handler
In go/services/multigateway/handler/handler.go, HandleQuery receives a simple-protocol Q message carrying the raw SQL string. The handler parses it (syntactic only) and runs the executor under a statement-timeout context:

    func (h *MultiGatewayHandler) HandleQuery(ctx context.Context, conn *server.Conn, queryStr string, callback func(ctx context.Context, result *sqltypes.Result) error) error {
        // ...
        asts, err := parser.ParseSQL(queryStr)
        // ...
        result, err = h.executor.StreamExecute(ctx, conn, st, queryStr, asts[0], countingCallback)

Notice the callback parameter: results are pushed back, not collected then returned. The handler wraps the caller’s callback in a countingCallback that tallies rows. The extended query protocol (Parse/Bind/Execute/Describe/Close/Sync) has sibling handlers in the same file that ultimately funnel into the same executor.
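The callback-wrapping idea can be sketched in isolation. The following is an illustration only, not the handler's actual code: `Result` is a hypothetical stand-in for sqltypes.Result, and `countRows`, `streamBatches`, and `tally` are names invented for this example.

```go
package main

import (
	"context"
	"fmt"
)

// Result is a hypothetical stand-in for sqltypes.Result: one batch of rows.
type Result struct{ Rows [][]string }

// Callback receives each batch as it is produced.
type Callback func(ctx context.Context, r *Result) error

// countRows wraps next so every batch is tallied into *total before being forwarded.
func countRows(next Callback, total *int) Callback {
	return func(ctx context.Context, r *Result) error {
		*total += len(r.Rows)
		return next(ctx, r)
	}
}

// streamBatches plays the role of the executor: it pushes each batch into cb
// and stops at the first error or when the context is done.
func streamBatches(ctx context.Context, batches []*Result, cb Callback) error {
	for _, b := range batches {
		if err := ctx.Err(); err != nil {
			return err
		}
		if err := cb(ctx, b); err != nil {
			return err
		}
	}
	return nil
}

// tally streams batches through a counting wrapper and reports how many rows
// were counted and how many batches reached the inner callback.
func tally(batches []*Result) (rows, forwarded int, err error) {
	sink := func(ctx context.Context, r *Result) error {
		forwarded++
		return nil
	}
	err = streamBatches(context.Background(), batches, countRows(sink, &rows))
	return rows, forwarded, err
}

func main() {
	batches := []*Result{
		{Rows: [][]string{{"1"}, {"2"}}},
		{Rows: [][]string{{"3"}}},
	}
	rows, forwarded, err := tally(batches)
	fmt.Println(rows, forwarded, err)
}
```

Because the wrapper has the same signature as the callback it wraps, the producer does not need to know that counting is happening.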
Step 2 — the gateway executor plans
In go/services/multigateway/executor/executor.go, StreamExecute consults the plan cache + planner, then routes through an interface, not a concrete type:

    type Executor struct {
        planner   *planner.Planner
        exec      engine.IExecute
        logger    *slog.Logger
        planCache *plancache.PlanCache
    }

exec is the engine.IExecute interface; the concrete implementation injected at startup is ScatterConn. This interface seam is classic Go decoupling; see interfaces & composition.
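A stripped-down sketch of the same seam shows why it helps testing. The `Execer` interface here is a hypothetical, much smaller stand-in for engine.IExecute, and `fakeExec` and `collect` are invented for the example; the real interface has a different method set.

```go
package main

import (
	"context"
	"fmt"
)

// Execer is a hypothetical, simplified stand-in for engine.IExecute.
type Execer interface {
	StreamExecute(ctx context.Context, sql string, emit func(row string) error) error
}

// Executor depends only on the interface, so any implementation can be injected.
type Executor struct{ exec Execer }

func NewExecutor(exec Execer) *Executor { return &Executor{exec: exec} }

// StreamExecute forwards to whatever implementation was injected.
func (e *Executor) StreamExecute(ctx context.Context, sql string, emit func(row string) error) error {
	return e.exec.StreamExecute(ctx, sql, emit)
}

// fakeExec records the queries it sees and replays canned rows; no network involved.
type fakeExec struct {
	seen []string
	rows []string
}

func (f *fakeExec) StreamExecute(ctx context.Context, sql string, emit func(row string) error) error {
	f.seen = append(f.seen, sql)
	for _, r := range f.rows {
		if err := emit(r); err != nil {
			return err
		}
	}
	return nil
}

// collect runs sql through e and gathers every emitted row.
func collect(e *Executor, sql string) ([]string, error) {
	var rows []string
	err := e.StreamExecute(context.Background(), sql, func(row string) error {
		rows = append(rows, row)
		return nil
	})
	return rows, err
}

func main() {
	fake := &fakeExec{rows: []string{"1"}}
	rows, err := collect(NewExecutor(fake), "SELECT 1")
	fmt.Println(rows, fake.seen, err)
}
```

Swapping `fakeExec` for a network-backed implementation would not require touching `Executor` at all, which is the property the seam is there to provide.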
Step 3 — scatterconn decides read vs write
In go/services/multigateway/scatterconn/scatter_conn.go, buildTarget converts the connection state into a query.Target:

    func (sc *ScatterConn) buildTarget(tableGroup, shard string, state *handler.MultiGatewayConnectionState) *querypb.Target {
        poolerType := clustermetadatapb.PoolerType_PRIMARY
        if state.TargetReplica() {
            poolerType = clustermetadatapb.PoolerType_REPLICA
        }
        return &querypb.Target{
            TableGroup: tableGroup,
            Shard:      shard,
            PoolerType: poolerType,
        }
    }

This is the place read-vs-write routing is decided. Target (proto/query.proto) is `{ table_group, shard, pooler_type }`, and PoolerType is PRIMARY / REPLICA / DRAINED.
Step 4 — poolergateway resolves a concrete pooler
In go/services/multigateway/poolergateway/pooler_gateway.go, PoolerGateway implements queryservice.QueryService. QueryServiceByID resolves a Target/ID to a live pooler connection; StreamExecute forwards the call. The actual gRPC client is created in go/services/multigateway/poolergateway/pooler_connection.go via grpccommon.NewClient(addr, ...) then:

    client: multipoolerservice.NewMultiPoolerServiceClient(conn),

That single line is the proof the gateway-to-pooler hop is gRPC. Which pooler gets picked is driven by service discovery: go/services/multigateway/discovery.go watches the cell topology for MultiPooler records and feeds a poolergateway.LoadBalancer.
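As a rough picture of how discovery can feed a picker, here is a toy balancer that a topology watcher would update and a router would query. Everything in it is an assumption made for illustration: the `Balancer` type, its `Upsert`/`Remove`/`Pick` methods, and the round-robin policy are not taken from poolergateway.LoadBalancer, whose actual selection logic is not described in this page.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// PoolerType is a simplified routing label (hypothetical, not the generated proto enum).
type PoolerType int

const (
	Primary PoolerType = iota
	Replica
)

// Target mirrors the { table_group, shard, pooler_type } shape.
type Target struct {
	TableGroup string
	Shard      string
	Type       PoolerType
}

// Pooler is one discovered pooler record.
type Pooler struct {
	Name   string
	Addr   string
	Target Target
}

var ErrNoPooler = errors.New("no pooler serves this target")

// Balancer holds the current set of poolers and a round-robin cursor per target.
type Balancer struct {
	mu      sync.Mutex
	poolers map[string]Pooler
	next    map[Target]int
}

func NewBalancer() *Balancer {
	return &Balancer{poolers: make(map[string]Pooler), next: make(map[Target]int)}
}

// Upsert is what a topology watcher would call when a record appears or changes.
func (b *Balancer) Upsert(p Pooler) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.poolers[p.Name] = p
}

// Remove is what a topology watcher would call when a record disappears.
func (b *Balancer) Remove(name string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	delete(b.poolers, name)
}

// Pick returns the poolers matching t in round-robin order (sorted by name for determinism).
func (b *Balancer) Pick(t Target) (Pooler, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	var matches []Pooler
	for _, p := range b.poolers {
		if p.Target == t {
			matches = append(matches, p)
		}
	}
	if len(matches) == 0 {
		return Pooler{}, ErrNoPooler
	}
	sort.Slice(matches, func(i, j int) bool { return matches[i].Name < matches[j].Name })
	i := b.next[t] % len(matches)
	b.next[t] = i + 1
	return matches[i], nil
}

func main() {
	b := NewBalancer()
	reads := Target{TableGroup: "default", Shard: "0", Type: Replica}
	b.Upsert(Pooler{Name: "r1", Addr: "10.0.0.1:15100", Target: reads})
	b.Upsert(Pooler{Name: "r2", Addr: "10.0.0.2:15100", Target: reads})
	for i := 0; i < 3; i++ {
		p, err := b.Pick(reads)
		fmt.Println(p.Name, err)
	}
}
```

The separation worth noticing is that the watcher side only calls `Upsert` and `Remove`, while the query path only calls `Pick`; neither needs to know about the other.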
Step 5 — the wire call
In go/services/multigateway/poolergateway/grpc_query_service.go, StreamExecute builds the proto request and pumps the server stream into the callback:

    req := &multipoolerservice.StreamExecuteRequest{
        Query:              sql,
        Target:             target,
        Options:            options,
        ReservationOptions: reservationOptions,
    }
    stream, err := g.client.StreamExecute(ctx, req)
    // ...
    for { response, err := stream.Recv(); ... }

This is the boundary where in-process Go calls become bytes on the network. ctx (carrying the statement timeout from Step 1) rides along; see context.
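The elided receive loop follows a common shape for gRPC server streams in Go: call Recv until it returns io.EOF, which signals a clean end of stream. The sketch below shows that shape against a fake stream so it runs without a network. `Response`, `RecvStream`, `sliceStream`, `pump`, and `drain` are hypothetical names for this example, not the generated types.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Response is a hypothetical stand-in for StreamExecuteResponse.
type Response struct{ Row string }

// RecvStream is the receive half of a server stream; generated gRPC client
// streams expose a Recv method with this general shape.
type RecvStream interface {
	Recv() (*Response, error)
}

// errBroken simulates a transport failure in the middle of a stream.
var errBroken = errors.New("stream broken")

// pump drains stream into callback. io.EOF from Recv means the server finished
// cleanly; any other error, or a callback error, aborts the loop.
func pump(stream RecvStream, callback func(*Response) error) error {
	for {
		resp, err := stream.Recv()
		if errors.Is(err, io.EOF) {
			return nil
		}
		if err != nil {
			return err
		}
		if err := callback(resp); err != nil {
			return err
		}
	}
}

// sliceStream is a fake stream backed by a slice; once the items run out it
// returns final if set, otherwise io.EOF.
type sliceStream struct {
	items []*Response
	final error
}

func (s *sliceStream) Recv() (*Response, error) {
	if len(s.items) == 0 {
		if s.final != nil {
			return nil, s.final
		}
		return nil, io.EOF
	}
	r := s.items[0]
	s.items = s.items[1:]
	return r, nil
}

// drain collects every row delivered before the stream ends or fails.
func drain(stream RecvStream) ([]string, error) {
	var rows []string
	err := pump(stream, func(r *Response) error {
		rows = append(rows, r.Row)
		return nil
	})
	return rows, err
}

func main() {
	rows, err := drain(&sliceStream{items: []*Response{{Row: "1"}}})
	fmt.Println(rows, err)
}
```

Rows that arrived before a failure have already been handed to the callback, so a caller of this pattern has to be prepared for partial results followed by an error.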
Step 6 — pooler server side
In go/services/multipooler/grpcpoolerservice/service.go, StreamExecute runs admission control (StartRequest, which lets a graceful drain keep serving single queries while rejecting new transactions), fetches its executor, and streams results back via stream.Send:

    func (s *poolerService) StreamExecute(req *multipoolerpb.StreamExecuteRequest, stream multipoolerpb.MultiPoolerService_StreamExecuteServer) error {
        if err := s.pooler.StartRequest(req.Target, admissionKind(...)); err != nil {
            return mterrors.ToGRPC(err)
        }
        executor, err := s.pooler.Executor()
        // ...
        reservedState, err := executor.StreamExecute(stream.Context(), req.Target, req.Query, req.Options, ...)

Step 7 — the pooler runs real SQL
In go/services/multipooler/internal/executor/executor.go, the pooler pulls a pooled Postgres connection, stamps a multigres_vpid, and runs the query with retry:

    conn, err := e.poolManager.GetRegularConnWithSettings(ctx, settings, user, clientKey, serverKey)
    // ...
    if err := conn.Conn.QueryStreamingWithRetry(ctx, sql, callback); err != nil {
        return nil, wrapQueryError(err)
    }

This is the multipooler -> postgres hop. It is a real pooled SQL connection (pgx/libpq-style), not gRPC and not via pgctld. Rows stream back up the same chain: pooler stream → gateway callback → PG DataRow/CommandComplete to the client.
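The page does not spell out what QueryStreamingWithRetry retries on, so the following is only one plausible policy, sketched under stated assumptions: retry when the connection turned out to be closed and no row has been streamed yet, so the caller never sees duplicate rows. `ErrConnClosed`, `queryFunc`, `queryWithRetry`, and `runFlaky` are all hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrConnClosed is a hypothetical error marking a pooled connection that went stale.
var ErrConnClosed = errors.New("connection closed")

// queryFunc runs sql on one connection and pushes rows into emit.
type queryFunc func(ctx context.Context, sql string, emit func(row string) error) error

// queryWithRetry tries up to attempts times. It retries only when the failure
// is ErrConnClosed and no row has reached emit, so rows are never duplicated.
func queryWithRetry(ctx context.Context, attempts int, run queryFunc, sql string, emit func(row string) error) error {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		if err := ctx.Err(); err != nil {
			return err // statement timeout or cancellation wins over retrying
		}
		sent := false
		err := run(ctx, sql, func(row string) error {
			sent = true
			return emit(row)
		})
		if err == nil {
			return nil
		}
		lastErr = err
		if sent || !errors.Is(err, ErrConnClosed) {
			return err
		}
	}
	return lastErr
}

// runFlaky simulates a connection that fails `failures` times before producing
// any row, then yields the single row "1". It reports rows, call count, and error.
func runFlaky(failures, attempts int) (rows []string, calls int, err error) {
	run := func(ctx context.Context, sql string, emit func(row string) error) error {
		calls++
		if calls <= failures {
			return ErrConnClosed
		}
		return emit("1")
	}
	err = queryWithRetry(context.Background(), attempts, run, "SELECT 1", func(row string) error {
		rows = append(rows, row)
		return nil
	})
	return rows, calls, err
}

func main() {
	rows, calls, err := runFlaky(1, 3)
	fmt.Println(rows, calls, err)
}
```

Checking the context before each attempt keeps the retry loop inside the statement timeout that was attached back at the gateway.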
Why it is built this way
- Stateless gateway, stateful pooler. Pooling and per-connection reservation state live in the pooler (close to Postgres), so gateways can be scaled and replaced freely. The gateway only holds routing + plan caches.
- Interface seams at every hop (engine.IExecute, queryservice.QueryService) make each layer unit-testable in isolation and let the gateway swap a real gRPC client for a fake. See interfaces & composition.
- Topology in etcd, not config files. The sample config/*.yaml files are tiny (mostly http-port, log-level). Real cell/pooler wiring is discovered from the topology store at runtime, which is what makes failover and live rebalancing possible.
- Strict dependency direction keeps it all decoupled: cmd/ may depend on anything; services/ may not depend on cmd/ or other services; common/ may not depend on cmd/ or services/; tools/ may not depend on any repo code outside tools/. This is enforced by package layout, and it is why the gateway reaches the pooler over a generated gRPC client (a common/pb boundary) rather than importing the pooler package.
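The dependency rules in the last bullet are simple enough to express as a predicate. This is a sketch of the rules as stated, not the project's actual enforcement mechanism; the `allowed` and `split` functions are hypothetical, and the handling of a leading go/ prefix is an assumption about how paths are written.

```go
package main

import (
	"fmt"
	"strings"
)

// split returns the top-level layer of a repo-relative package path and, for
// layers with sub-units such as services/, the first element below it.
// A leading "go/" is dropped so "go/services/x" and "services/x" are treated alike.
func split(pkg string) (layer, sub string) {
	parts := strings.Split(strings.TrimPrefix(pkg, "go/"), "/")
	layer = parts[0]
	if len(parts) > 1 {
		sub = parts[1]
	}
	return layer, sub
}

// allowed reports whether importer may depend on imported under the stated rules.
func allowed(importer, imported string) bool {
	fromLayer, fromSub := split(importer)
	toLayer, toSub := split(imported)
	switch fromLayer {
	case "cmd":
		return true // cmd/ may depend on anything
	case "services":
		if toLayer == "cmd" {
			return false
		}
		if toLayer == "services" {
			return fromSub == toSub // same service only
		}
		return true
	case "common":
		return toLayer != "cmd" && toLayer != "services"
	case "tools":
		return toLayer == "tools"
	}
	return true // layers the rules do not mention are unconstrained in this sketch
}

func main() {
	fmt.Println(allowed("go/services/multigateway/poolergateway", "go/services/multipooler/internal/executor"))
	fmt.Println(allowed("go/services/multigateway/poolergateway", "go/pb/multipoolerservice"))
}
```

The two calls in main mirror the point made above: the gateway cannot import the pooler's packages directly, but it can import the generated protobuf package that both sides share.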
Checkpoints
Which hops in the latency-sensitive query path are gRPC, and which is not?
Only multigateway -> multipooler is gRPC (MultiPoolerService, via multipoolerservice.NewMultiPoolerServiceClient). The multipooler -> postgres hop is a real pooled SQL connection (conn.Conn.QueryStreamingWithRetry). pgctld is not on the path at all.

You connect to the gateway’s replica-reads port. Trace how that becomes PoolerType_REPLICA in the gRPC request.
init.go builds a second listener whose handler has SetTargetReplica(true). That makes state.TargetReplica() true, so scatterconn.buildTarget sets poolerType = REPLICA, which is placed in the query.Target carried by the StreamExecuteRequest. The decision is made by which TCP port you connected to, not by inspecting the SQL.

What is the difference between global topology and cell topology, and where is each stored?
Global topology holds static metadata (Database records, Cell locations); cell topology holds dynamic per-cell catalogs (MultiPooler/MultiGateway/MultiOrch records). topoclient keeps one connection to global and one per cell; both may live on the same etcd cluster but are separate connections/namespaces (go/common/topoclient/store.go).

Why can’t this cluster run with a single cell?
The shard bootstraps with an AtLeastN(2) durability policy, so a single pooler can never form a quorum to elect a leader. The MULTIGRES_NUM_CELLS minimum is 2.

Exercises
1. Trace the flow yourself. Open these files in order and write down, at each, what the request looks like (PG message? Go args? proto message? SQL?): go/services/multigateway/handler/handler.go (HandleQuery) → go/services/multigateway/executor/executor.go (StreamExecute) → go/services/multigateway/scatterconn/scatter_conn.go (buildTarget + StreamExecute) → go/services/multigateway/poolergateway/pooler_gateway.go (StreamExecute / QueryServiceByID) → go/services/multigateway/poolergateway/grpc_query_service.go → go/services/multipooler/grpcpoolerservice/service.go (StreamExecute) → go/services/multipooler/internal/executor/executor.go.
2. Name the seven cmd binaries from their own docs. Open each go/cmd/<svc>/main.go (and the doc-comments) and copy the one-sentence responsibility from the package comment or cobra Short. Then mark each as query-path, control-plane, or test-only. Check yourself against the table above.
3. Find where read/write routing is decided. Run grep -rn "TargetReplica\|buildTarget" go/services/multigateway/ and locate the two server.NewListener calls in go/services/multigateway/init.go. Explain in two sentences how connecting to the replica port ends up as PoolerType_REPLICA on the wire.
4. Map the topology model. Open proto/clustermetadata.proto and sort its messages into “global topology” vs “per-cell topology”, cross-checking against the topology diagram above. Then connect the MULTIGRES_NUM_CELLS minimum of 2 back to the durability requirement.
Continue to cmd & cobra to see how each main.go builds its command and runs Init then RunDefault — the assembly point where the request pipeline you just traced is wired together.