The Testing Workflow
The companion language page, Testing, covers what Go tests look like — table-driven cases, testify, the race detector, golden files. This page is the operator’s manual: how to actually run them in a large project, which command and flag, and when.
The codebase we’re reading — multigres, a set of small Go services in front of real PostgreSQL — wraps its tests rather than calling go test directly. That’s worth dwelling on, because it’s a pattern any sizable Go project eventually grows into: a thin wrapper that builds the right binaries first and hands tests a coordination point so parallel runs don’t trample each other. Bypassing the wrapper breaks in two specific ways:
- Integration suites silently skip their build step, so they launch stale or missing binaries.
- Concurrent test runs collide on TCP ports, producing flaky failures that look like real bugs.
The wrapper builds the binaries and wires up a port-pool socket up front; the project’s make targets cover CI and full-tree runs. We’ll refer to the wrapper command generically as the dev test runner below.
The two test modes
Tests split into two modes with very different costs and prerequisites. Knowing which one you’re invoking matters: one finishes in seconds, the other launches a real cluster.
| Mode | What runs | Prerequisites | Speed |
|---|---|---|---|
| unit | individual functions and packages, no external services | none — no build, no Postgres | seconds |
| integration | end-to-end: real multigateway, multipooler, pgctld, PostgreSQL, etcd | builds binaries first; needs the port-pool socket | minutes |
The dividing line in the code is testing.Short(): any test gated behind if testing.Short() { t.Skip() } is a slow/integration test. Unit runs pass -short so those skip; integration runs don’t. (See the Testing page for how testing.Short() and t.Skip work in the test bodies themselves.)
Running unit tests
Unit selection uses ordinary Go package paths, the same ./go/... style you’d pass to go test by hand.
1. Run the whole short suite. The runner expands this to go test -short ./go/...:

       # run all unit tests
       dev-test unit all

2. Narrow to one package subtree while iterating on it:

       # one package subtree
       dev-test unit ./go/services/multipooler/...

3. Narrow further to a single test. The name becomes a -run argument, and the value is a regex:

       # one test, or a pattern
       dev-test unit ./go/services/multipooler TestConnectionPool
       dev-test unit ./go/services/multigateway Test.*Route.*
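Because the -run value is a regex, and go test matches it unanchored against top-level test names, a plain name can select more tests than intended. The sketch below reproduces that matching with Go's regexp package. Apart from TestConnectionPool, the test names are hypothetical and exist only to illustrate the behavior.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchTests returns the names selected by an unanchored regex match,
// which is how go test treats a -run value for top-level test names.
func matchTests(pattern string, names []string) []string {
	re := regexp.MustCompile(pattern)
	var selected []string
	for _, name := range names {
		if re.MatchString(name) {
			selected = append(selected, name)
		}
	}
	return selected
}

func main() {
	// Hypothetical test names, for illustration only.
	names := []string{
		"TestConnectionPool",
		"TestConnectionPoolClose",
		"TestRouteQuery",
		"TestShardRouteLookup",
	}

	// A bare name also matches any test that contains it.
	fmt.Println(matchTests("TestConnectionPool", names))
	// Anchoring with ^ and $ narrows the run to exactly one test.
	fmt.Println(matchTests("^TestConnectionPool$", names))
	// A pattern selects every name that fits it.
	fmt.Println(matchTests("Test.*Route.*", names))
}
```

When a name is a prefix of other test names, anchoring it (for example '^TestConnectionPool$', quoted for the shell) keeps the run to the single test you meant.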
Running integration tests
Integration selection uses short names that map to ./go/test/endtoend/<name>/.... Each name picks one end-to-end suite:
| Name | Directory under go/test/endtoend/ | Exercises |
|---|---|---|
| all | everything | the full end-to-end matrix |
| multipooler | multipooler/ | connection pooling, pool lifecycle |
| multiorch | multiorch/ | failover, leader election, consensus |
| queryserving | queryserving/ | query routing, execution, transactions |
| localprovisioner | localprovisioner/ | local cluster provisioning |
| shardsetup | shardsetup/ | shard configuration |
| pgregresstest | pgregresstest/ | the PostgreSQL regression suite (opt-in) |
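The selection rules for both modes can be sketched as a small argument translator. This is a hypothetical helper, not the runner's actual code: goTestArgs and its signature are assumptions, and only the mappings themselves (-short for unit runs, the go/test/endtoend/ layout, the optional -run argument) come from this page.

```go
package main

import (
	"fmt"
	"strings"
)

// endToEndRoot is where the integration suites live, per the table above.
const endToEndRoot = "./go/test/endtoend"

// goTestArgs sketches how a runner could turn (mode, target, test name)
// into a go test command line. Hypothetical helper, for illustration only.
func goTestArgs(mode, target, testName string, flags ...string) ([]string, error) {
	args := []string{"go", "test"}
	var pkg string
	switch mode {
	case "unit":
		// Unit runs pass -short so tests gated on testing.Short() skip.
		args = append(args, "-short")
		pkg = target
		if target == "all" {
			pkg = "./go/..."
		}
	case "integration":
		pkg = endToEndRoot + "/" + target + "/..."
		if target == "all" {
			pkg = endToEndRoot + "/..."
		}
	default:
		return nil, fmt.Errorf("unknown mode %q", mode)
	}
	args = append(args, flags...)
	if testName != "" {
		args = append(args, "-run", testName)
	}
	return append(args, pkg), nil
}

func main() {
	cases := [][3]string{
		{"unit", "all", ""},
		{"unit", "./go/services/multipooler", "TestConnectionPool"},
		{"integration", "multiorch", "TestFixReplication"},
		{"integration", "all", "TestBootstrap"},
	}
	for _, c := range cases {
		args, err := goTestArgs(c[0], c[1], c[2])
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(strings.Join(args, " "))
	}
}
```

The sketch only builds the command line. A real integration run also has to build the binaries and set up the port pool before go test starts.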
    dev-test integration all                            # the whole matrix
    dev-test integration multipooler                    # one suite
    dev-test integration multiorch TestFixReplication   # one test in one suite
    dev-test integration all TestBootstrap              # one test name across all suites

What an integration run actually does
The reason integration tests need a wrapper at all is that three things must happen before go test even starts. Skip any one and you get a confusing failure rather than a clean one.
flowchart LR
Start(["dev-test integration"]) --> Build["build binaries"]
Build --> Pool["start port-pool server"]
Pool --> Env["export port-pool socket addr"]
Env --> Test["go test ./go/test/endtoend/..."]
Test --> PG[("real PostgreSQL")]
Under the hood the run is roughly:
    build && portpool-start \
      && MULTIGRES_PORT_POOL_ADDR=/tmp/multigres-port-pool.sock \
         go test [flags] [-run TestName] ./go/test/endtoend/<package>/...

The three pieces, and what each one prevents:

1. Build the binaries. Integration suites launch real service binaries from bin/. Stale or missing binaries fail in confusing ways, so the build runs first. (See Build & Make for what the build produces.)
2. Start the port-pool server. This is idempotent: it starts the pool server once and reuses the socket, so it’s safe to call on every run.
3. Point tests at the pool via the MULTIGRES_PORT_POOL_ADDR environment variable. Without it, parallel components grab raw ports and collide, giving you intermittent failures.
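To see why a single coordination point removes port collisions, here is a minimal in-process sketch of a port pool. It is not the multigres implementation: the real pool is a separate server reached over a Unix socket, and its protocol is not described on this page. The portPool type, its methods, and the port range are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// portPool hands out each port in [lo, hi] to at most one holder at a time.
// Hypothetical type, for illustration only.
type portPool struct {
	mu     sync.Mutex
	lo, hi int
	leased map[int]bool
}

func newPortPool(lo, hi int) *portPool {
	return &portPool{lo: lo, hi: hi, leased: make(map[int]bool)}
}

// Lease returns a port nobody else currently holds.
func (p *portPool) Lease() (int, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for port := p.lo; port <= p.hi; port++ {
		if !p.leased[port] {
			p.leased[port] = true
			return port, nil
		}
	}
	return 0, errors.New("port pool exhausted")
}

// Release makes a port available to later callers.
func (p *portPool) Release(port int) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.leased, port)
}

func main() {
	pool := newPortPool(20000, 20099) // arbitrary example range

	var wg sync.WaitGroup
	var mu sync.Mutex
	seen := make(map[int]int)

	// Fifty concurrent "components" each ask the pool for a port.
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			port, err := pool.Lease()
			if err != nil {
				return
			}
			mu.Lock()
			seen[port]++
			mu.Unlock()
		}()
	}
	wg.Wait()

	// prints: distinct ports leased: 50
	fmt.Println("distinct ports leased:", len(seen))
}
```

Every caller goes through the same lock, so no two components can be handed the same port. Components that pick raw ports on their own have no such guarantee.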
Useful flags
These pass straight through to go test, whether you go through the dev runner or a make target:
| Flag | Meaning | When to reach for it |
|---|---|---|
| -v | verbose; prints each test name as it runs | finding which test hangs or fails |
| -race | enable the data-race detector (slower) | before committing concurrency changes |
| -count=1 | bypass the test cache, force a fresh run | after an env/build change the cache can’t see |
| -count=10 | run the test 10 times | flake detection (same flag, different intent) |
| -cover | print coverage percentage | a quick coverage check |
| -coverprofile=coverage.out | write a coverage profile | feeding reporting tools |
| -timeout=30m | raise the timeout (default 10m) | long integration suites |
| -p=1 | run packages sequentially | suites that conflict on shared resources |
| -parallel=N | N parallel tests within a package | tuning throughput |
| -short | skip long-running tests | implicit in unit runs; rarely passed by hand |
A few combinations in practice:
    dev-test unit ./go/services/multipooler/... -v -race        # race-check one package
    dev-test integration multipooler TestConnCache -count=10    # hunt a flake
    dev-test integration all -p=1 -timeout=45m                  # serialized, long timeout

The make targets
The dev runner is the day-to-day driver; the make targets are the full-tree / CI form. Four core targets cover most needs:
    make test            # full local run including integration; starts its own port-pool server
    make test-short      # go test -short -v ./... (fast whole-tree unit pass)
    make test-race       # go test -short -v -race ./... (race-check the short suite)
    make test-coverage   # coverage including subprocess coverage from launched binaries

| Target | What it does | When |
|---|---|---|
| test | starts its own port-pool server (with a trap to kill it), then runs go test ./... over the whole tree | full local run including integration; needs binaries built first |
| test-short | go test -short -v ./... | fast full-tree unit pass; no port-pool server, no build prerequisite |
| test-race | go test -short -v -race ./... | race-check the whole short suite |
| test-coverage | builds binaries instrumented with -cover, then runs tests under the port-pool socket | coverage that includes the subprocess binaries it launches |
Patch-based conformance suites
Three suites verify the system against real PostgreSQL output using an interesting trick: instead of asserting on exact output, they store the known divergences as patch files under testdata/. The test runs the real workload, applies the patches to the expected output, and asserts the result matches. They’re double-gated by environment variables, which the make targets set for you.
| Suite | Verify | Regenerate patches | “on” gate | Mode variable |
|---|---|---|---|---|
| pgregress | make pgregress | make pgregress-update-patches | RUN_PGREGRESS=1 | PGREGRESS_PATCH_MODE |
| pgexternal | make pgexternal | make pgexternal-update-patches | RUN_PGEXTERNAL=1 | PGREGRESS_PATCH_MODE |
| pgproto | make pgproto | make pgproto-update-patches | RUN_EXTENDED_QUERY_SERVING_TESTS=1 | PGPROTO_PATCH_MODE |
Each pair runs the same test; the verify and regenerate targets differ only in the patch-mode variable:
- pgregress runs the PostgreSQL regression suite. verify mode reports residual diffs as failures; generate mode absorbs them by rewriting the patch files. It clones and builds PostgreSQL from source on first run.
- pgexternal clones and builds third-party extensions (pgvector, pg_cron, pg_partman) as PGXS modules against that from-source PostgreSQL, then runs their shipped regression suites. It shares pgregress’s test name and mode variable; the RUN_* gate is what selects which workload runs.
- pgproto is wire-protocol conformance. Its fixtures are *.pgproto scripts (simple_query.pgproto, extended_query.pgproto, copy.pgproto). This is the suite behind the PG Wire & SQL Types page.
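The verify/generate split can be sketched in a few lines. This is a simplified model, not the suites' code: real patch files are diffs on disk, while the sketch stores a known divergence as a line index mapped to a replacement line. The patch type, the apply and check functions, and the sample output lines are all made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// patch records known divergences: a line index in the expected output
// mapped to the line the system is known to produce instead.
// Simplified stand-in for a patch file.
type patch map[int]string

// apply returns a copy of expected with the known divergences substituted.
func apply(expected []string, p patch) []string {
	out := append([]string(nil), expected...)
	for i, line := range p {
		if i >= 0 && i < len(out) {
			out[i] = line
		}
	}
	return out
}

// check compares actual output with the patched expectations.
// "verify" reports any residual difference as an error.
// "generate" absorbs every difference into a fresh patch for review.
func check(mode string, expected, actual []string, p patch) (patch, error) {
	if len(expected) != len(actual) {
		return nil, errors.New("sketch only handles outputs of equal length")
	}
	switch mode {
	case "verify":
		patched := apply(expected, p)
		for i := range actual {
			if patched[i] != actual[i] {
				return nil, fmt.Errorf("line %d: got %q, want %q", i, actual[i], patched[i])
			}
		}
		return p, nil
	case "generate":
		next := patch{}
		for i := range actual {
			if actual[i] != expected[i] {
				next[i] = actual[i]
			}
		}
		return next, nil
	default:
		return nil, fmt.Errorf("unknown patch mode %q", mode)
	}
}

func main() {
	// Made-up output lines, for illustration only.
	expected := []string{"SELECT 1", "ERROR: relation missing", "COMMIT"}
	actual := []string{"SELECT 1", "ERROR: table missing", "ROLLBACK"}
	known := patch{1: "ERROR: table missing"}

	// verify: line 1 is a known divergence, line 2 is a new one and fails.
	_, err := check("verify", expected, actual, known)
	fmt.Println("verify:", err)

	// generate: both differences are recorded as expected from now on.
	next, _ := check("generate", expected, actual, known)
	fmt.Println("generate absorbed", len(next), "divergences")
}
```

In the example, generate mode records the new ROLLBACK divergence alongside the known one without complaint, which is why a regenerated patch deserves a careful diff review.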
Running against a packaged Postgres image
There’s a separate integration target that runs the suite inside a container built on the supabase Postgres base image, rather than against a from-source PostgreSQL. One command builds everything it needs, because each step depends on the previous:
1. Build the base Postgres image:

       # build base image
       make docker-supabase-postgres

2. Build the test image on top of it:

       # build test image
       make docker-supabase-postgres-test

3. Run the suite in that image (this sets TEST_PRINT_LOGS=1 so failure logs print to stdout):

       # run in container
       make test-integration-supabase
It mounts the repo into the container and exists separately from a normal integration run because it validates against the packaged supabase Postgres image, not a locally compiled PostgreSQL.
When an integration test fails
Integration logs are preserved only on failure; a passing test cleans up its temp directory. The failure message points you at the directory it kept:
    ==== TEST LOGS PRESERVED ====
    Logs available at: /tmp/shardsetup_test_XXXXXXXXXX
    Set TEST_PRINT_LOGS=1 to print log contents
    ===========================

The layout has one log per component: multigateway.log at the top, temp-multiorch/multiorch.log for the bootstrap orchestrator, and a pooler-N/ directory per multipooler instance holding multipooler.log, pgctld.log, and data/pg_data/postgresql.log. The service logs are JSON, so pipe them through jq.
To print the logs inline instead of digging through the temp directory — which is what CI and the container target do — set the print-logs variable on the run:
    TEST_PRINT_LOGS=1 dev-test integration shardsetup TestName

What to run when
A quick decision table for the common situations:
| Situation | Command |
|---|---|
| Edited one package, want fast feedback | dev-test unit ./go/<pkg>/... -v |
| About to commit | dev-test unit all, then dev-test integration all |
| Suspect a concurrency bug | add -race |
| Test passes only sometimes | ... -count=10 (flake hunt) |
| Cache returning a stale PASS after a build change | add -count=1 |
| Whole-tree short pass without the dev runner | make test-short |
| Coverage including launched binaries | make test-coverage |
| Recording new PG divergences on purpose | make pgregress-update-patches (review the diff) |
| Validate against the supabase Postgres image | make test-integration-supabase |
Checkpoints
Why does the “all unit” run add -short, and what breaks if you drop it?

Because go test ./go/... also traverses the end-to-end packages under go/test/endtoend/.... The -short flag makes every integration test hit its if testing.Short() { t.Skip() } guard and skip, keeping a “quick unit run” quick. Drop -short and a plain go test ./go/... launches the heavy integration suites, which start real services and PostgreSQL. That is slow and usually not what you wanted.

Name the three things an integration run sets up before go test, and what each prevents.

First, it builds the service binaries so the suites launch fresh binaries from bin/ instead of stale or missing ones. Second, it starts the port-pool server (idempotent: started once, then reused) so runs share a coordination point. Third, it exports MULTIGRES_PORT_POOL_ADDR pointing at the pool socket, without which parallel components grab raw TCP ports and collide into flaky failures.

What is the difference between -count=1 and -count=10?

-count=1 is the cache-buster: it forces a single fresh run that ignores a cached PASS, which matters after an env or build change the test cache can’t see. -count=10 is the flake hunt: it runs the test ten times looking for an intermittent failure. Same flag, opposite intent, so pick deliberately.

What flips between make pgproto and make pgproto-update-patches, and what’s the risk?

Both run the same wire-protocol conformance test; the only change is the PGPROTO_PATCH_MODE variable flipping from verify to generate. In generate mode, any divergence from PostgreSQL is written into the patch files as “expected” rather than failing, so running it by accident silently absorbs real regressions. Always review the resulting patch diff before merging.