The Testing Workflow

The companion language page, Testing, covers what Go tests look like — table-driven cases, testify, the race detector, golden files. This page is the operator’s manual: how to actually run them in a large project, which command and flag, and when.

The codebase we’re reading — multigres, a set of small Go services in front of real PostgreSQL — wraps its tests rather than calling go test directly. That’s worth dwelling on, because it’s a pattern any sizable Go project eventually grows into: a thin wrapper that builds the right binaries first and hands tests a coordination point so parallel runs don’t trample each other. Bypassing the wrapper breaks in two specific ways:

Integration suites silently skip their build step, so they launch stale or missing binaries.
Concurrent test runs collide on TCP ports, producing flaky failures that look like real bugs.

The wrapper builds the binaries and wires up a port-pool socket up front; the project’s make targets cover CI and full-tree runs. We’ll refer to the wrapper command generically as the dev test runner below.
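
The port collision itself is easy to reproduce outside the project. The following is a minimal sketch using only Go's standard `net` package; it is not multigres code, and `bindTwice` is a hypothetical name. It binds a loopback port, then tries to bind the same address a second time, which is what happens when two components pick the same raw port.

```go
package main

import (
	"fmt"
	"net"
)

// bindTwice opens a listener on an OS-assigned loopback port, then tries to
// bind the same address again. It returns the error from the second attempt,
// or nil if the second bind succeeded.
func bindTwice() error {
	first, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return fmt.Errorf("first listen failed: %w", err)
	}
	defer first.Close()

	// Simulates a second component that chose the same port on its own.
	second, err := net.Listen("tcp", first.Addr().String())
	if err == nil {
		second.Close()
		return nil
	}
	return err
}

func main() {
	if err := bindTwice(); err != nil {
		fmt.Println("second bind failed:", err)
	} else {
		fmt.Println("second bind unexpectedly succeeded")
	}
}
```

In a test run this error surfaces inside whichever service lost the race, which is why it can look like a bug in that service rather than a scheduling accident.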

The two test modes

Tests split into two modes with very different costs and prerequisites. Knowing which one you’re invoking is the whole game — one finishes in seconds, the other launches a real cluster.

Mode	What runs	Prerequisites	Speed
unit	individual functions and packages, no external services	none — no build, no Postgres	seconds
integration	end-to-end: real multigateway, multipooler, pgctld, PostgreSQL, etcd	builds binaries first; needs the port-pool socket	minutes

The dividing line in the code is testing.Short(): any test gated behind if testing.Short() { t.Skip() } is a slow/integration test. Unit runs pass -short so those skip; integration runs don’t. (See the Testing page for how testing.Short() and t.Skip work in the test bodies themselves.)

Running unit tests

Unit selection uses ordinary Go package paths — the same ./go/... style you’d pass to go test by hand.

Run the whole short suite. The runner expands this to go test -short ./go/...:
```
dev-test unit all
```
Narrow to one package subtree while iterating on it:
```
dev-test unit ./go/services/multipooler/...
```

Narrow further to a single test. The name becomes a -run argument, and the value is a regex, so quote any pattern that contains shell metacharacters such as `*`:

dev-test unit ./go/services/multipooler TestConnectionPool
dev-test unit ./go/services/multigateway 'Test.*Route.*'

Running integration tests

Integration selection uses short names that map to ./go/test/endtoend/<name>/.... Each name picks one end-to-end suite:

Name	Directory under `go/test/endtoend/`	Exercises
`all`	everything	the full end-to-end matrix
`multipooler`	`multipooler/`	connection pooling, pool lifecycle
`multiorch`	`multiorch/`	failover, leader election, consensus
`queryserving`	`queryserving/`	query routing, execution, transactions
`localprovisioner`	`localprovisioner/`	local cluster provisioning
`shardsetup`	`shardsetup/`	shard configuration
`pgregresstest`	`pgregresstest/`	the PostgreSQL regression suite (opt-in)

dev-test integration all                          # the whole matrix
dev-test integration multipooler                  # one suite
dev-test integration multiorch TestFixReplication # one test in one suite
dev-test integration all TestBootstrap            # one test name across all suites
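
The name-to-directory mapping can be sketched as a small lookup. `integrationPkg` is a hypothetical helper written for illustration; the suite names come from the table above, and rejecting unknown names is an assumption about how a runner might behave, not documented behavior.

```go
package main

import "fmt"

// Suite names listed in the table above.
var suites = map[string]bool{
	"multipooler":      true,
	"multiorch":        true,
	"queryserving":     true,
	"localprovisioner": true,
	"shardsetup":       true,
	"pgregresstest":    true,
}

// integrationPkg maps a short suite name to its go test package pattern.
// Rejecting unknown names is an illustrative choice, not confirmed behavior.
func integrationPkg(name string) (string, error) {
	if name == "all" {
		return "./go/test/endtoend/...", nil
	}
	if !suites[name] {
		return "", fmt.Errorf("unknown integration suite %q", name)
	}
	return "./go/test/endtoend/" + name + "/...", nil
}

func main() {
	for _, name := range []string{"all", "multiorch", "nosuchsuite"} {
		pkg, err := integrationPkg(name)
		fmt.Println(name, "->", pkg, err)
	}
}
```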

What an integration run actually does

The reason integration tests need a wrapper at all is that three things must happen before go test even starts. Skip any one and you get a confusing failure rather than a clean one.

Integration test run

flowchart LR
Start(["dev-test integration"]) --> Build["build binaries"]
Build --> Pool["start port-pool server"]
Pool --> Env["export port-pool socket addr"]
Env --> Test["go test ./go/test/endtoend/..."]
Test --> PG[("real PostgreSQL")]

Under the hood the run is roughly:

build && portpool-start \
  && MULTIGRES_PORT_POOL_ADDR=/tmp/multigres-port-pool.sock \
     go test [flags] [-run TestName] ./go/test/endtoend/<package>/...

The three pieces, and what each one prevents:

Build the binaries. Integration suites launch real service binaries from bin/. Stale or missing binaries fail in confusing ways, so the build runs first. (See Build & Make for what the build produces.)
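Start the port-pool server. This gives concurrent runs one shared coordination point for port allocation instead of each run picking ports on its own. The step is idempotent: it starts the pool server once and reuses the socket, so it’s safe to call on every run.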
Point tests at the pool via the MULTIGRES_PORT_POOL_ADDR environment variable. Without it, parallel components grab raw ports and collide, giving you intermittent failures.

Useful flags

These pass straight through to go test, whether you go through the dev runner or a make target:

Flag	Meaning	When to reach for it
`-v`	verbose; prints each test name as it runs	finding which test hangs or fails
`-race`	enable the data-race detector (slower)	before committing concurrency changes
`-count=1`	bypass the test cache, force a fresh run	after an env/build change the cache can’t see
`-count=10`	run the test 10 times	flake detection — same flag, different intent
`-cover`	print coverage percentage	a quick coverage check
`-coverprofile=coverage.out`	write a coverage profile	feeding reporting tools
`-timeout=30m`	raise the timeout (default 10m)	long integration suites
`-p=1`	run packages sequentially	suites that conflict on shared resources
`-parallel=N`	N parallel tests within a package	tuning throughput
`-short`	skip long-running tests	implicit in unit runs; rarely passed by hand

A few combinations in practice:

dev-test unit ./go/services/multipooler/... -v -race    # race-check one package
dev-test integration multipooler TestConnCache -count=10 # hunt a flake
dev-test integration all -p=1 -timeout=45m               # serialized, long timeout

The make targets

The dev runner is the day-to-day driver; the make targets are the full-tree / CI form. Four core targets cover most needs:

make test           # full local run including integration; starts its own port-pool server
make test-short     # go test -short -v ./...  — fast whole-tree unit pass
make test-race      # go test -short -v -race ./...  — race-check the short suite
make test-coverage  # coverage including subprocess coverage from launched binaries

Target	What it does	When
`test`	starts its own port-pool server (with a `trap` to kill it), then runs `go test ./...` over the whole tree	full local run including integration; needs binaries built first
`test-short`	`go test -short -v ./...`	fast full-tree unit pass — no port-pool server, no build prerequisite
`test-race`	`go test -short -v -race ./...`	race-check the whole short suite
`test-coverage`	builds binaries instrumented with `-cover`, then runs tests under the port-pool socket	coverage that includes the subprocess binaries it launches

Patch-based conformance suites

Three suites verify the system against real PostgreSQL output using an interesting trick: instead of asserting on exact output, they store the known divergences as patch files under testdata/. The test runs the real workload, applies the patches to the expected output, and asserts the result matches. They’re double-gated by environment variables, which the make targets set for you.

Suite	Verify	Regenerate patches	“On” gate	Mode variable
pgregress	`make pgregress`	`make pgregress-update-patches`	`RUN_PGREGRESS=1`	`PGREGRESS_PATCH_MODE`
pgexternal	`make pgexternal`	`make pgexternal-update-patches`	`RUN_PGEXTERNAL=1`	`PGREGRESS_PATCH_MODE`
pgproto	`make pgproto`	`make pgproto-update-patches`	`RUN_EXTENDED_QUERY_SERVING_TESTS=1`	`PGPROTO_PATCH_MODE`

Each pair runs the same test; the verify and regenerate targets differ only in the patch-mode variable:

pgregress runs the PostgreSQL regression suite. verify mode reports residual diffs as failures; generate mode absorbs them by rewriting the patch files. It clones and builds PostgreSQL from source on first run.
pgexternal clones and builds third-party extensions (pgvector, pg_cron, pg_partman) as PGXS modules against that from-source PostgreSQL, then runs their shipped regression suites. It shares pgregress’s test name and mode variable; the RUN_* gate is what selects which workload runs.
pgproto is wire-protocol conformance. Its fixtures are *.pgproto scripts (simple_query.pgproto, extended_query.pgproto, copy.pgproto). This is the suite behind the PG Wire & SQL Types page.

Running against a packaged Postgres image

There’s a separate integration target that runs the suite inside a container built on the supabase Postgres base image, rather than against a from-source PostgreSQL. One command, make test-integration-supabase, builds everything it needs, because each target depends on the previous one. The chain, in order:

Build the base Postgres image:
```
make docker-supabase-postgres
```
Build the test image on top of it:
```
make docker-supabase-postgres-test
```
Run the suite in that image (this sets TEST_PRINT_LOGS=1 so failure logs print to stdout):
```
make test-integration-supabase
```

It mounts the repo into the container and exists separately from a normal integration run because it validates against the packaged supabase Postgres image, not a locally compiled PostgreSQL.

When an integration test fails

Integration logs are preserved only on failure — a passing test cleans up its temp directory. The failure message points you at the directory it kept:

==== TEST LOGS PRESERVED ====
Logs available at: /tmp/shardsetup_test_XXXXXXXXXX
Set TEST_PRINT_LOGS=1 to print log contents
===========================

The layout has one log per component: multigateway.log at the top, temp-multiorch/multiorch.log for the bootstrap orchestrator, and a pooler-N/ directory per multipooler instance holding multipooler.log, pgctld.log, and data/pg_data/postgresql.log. The service logs are JSON, so pipe them through jq.

To print the logs inline instead of digging through the temp directory — which is what CI and the container target do — set the print-logs variable on the run:

TEST_PRINT_LOGS=1 dev-test integration shardsetup TestName

What to run when

A quick decision table for the common situations:

Situation	Command
Edited one package, want fast feedback	`dev-test unit ./go/<pkg>/... -v`
About to commit	`dev-test unit all`, then `dev-test integration all`
Suspect a concurrency bug	add `-race`
Test passes only sometimes	`... -count=10` (flake hunt)
Cache returning a stale PASS after a build change	add `-count=1`
Whole-tree short pass without the dev runner	`make test-short`
Coverage including launched binaries	`make test-coverage`
Recording new PG divergences on purpose	`make pgregress-update-patches` (review the diff)
Validate against the supabase Postgres image	`make test-integration-supabase`

Checkpoints

Why does the “all unit” run add -short, and what breaks if you drop it?

Because go test ./go/... also traverses the end-to-end packages under go/test/endtoend/.... The -short flag makes every integration test hit its if testing.Short() { t.Skip() } guard and skip, keeping a “quick unit run” quick. Drop -short and a plain go test ./go/... launches the heavy integration suites, which start real services and Postgres without the wrapper’s build step or port pool. That is slow, prone to confusing failures, and usually not what you wanted.

Name the three things an integration run sets up before go test, and what each prevents.

First, it builds the service binaries so the suites launch fresh binaries from bin/ instead of stale or missing ones. Second, it starts the port-pool server (idempotent — once, reused) so runs share a coordination point. Third, it exports MULTIGRES_PORT_POOL_ADDR pointing at the pool socket, without which parallel components grab raw TCP ports and collide into flaky failures.

What is the difference between -count=1 and -count=10?

-count=1 is the cache-buster: it forces a single fresh run that ignores a cached PASS, which matters after an env or build change the test cache can’t see. -count=10 is the flake hunt: it runs the test ten times looking for an intermittent failure. Same flag, opposite intent — pick deliberately.

What flips between make pgproto and make pgproto-update-patches, and what’s the risk?

Both run the same wire-protocol conformance test; the only change is the PGPROTO_PATCH_MODE variable flipping from verify to generate. In generate mode, any divergence from PostgreSQL is written into the patch files as “expected” rather than failing — so running it by accident silently absorbs real regressions. Always review the resulting patch diff before merging.
