Modules, Dependencies & Code Generation
What you will learn: how a large Go project organizes itself as a single module, how it adds and pins dependencies, the four different ways it generates code, and the CI gate that ties dependencies and codegen together so nothing drifts.
If you’re new to Go, here’s the one-paragraph mental model. A Go module is a versioned unit of code identified by an import path, with its dependency graph pinned in two files: go.mod declares requirements, go.sum records cryptographic hashes of every version. We’ll ground all of this in a real production system — multigres (“Vitess for Postgres”) — which is one module containing many packages and many binaries. For the language-level fundamentals of import paths and go.mod/go.sum semantics, see Packages, Modules & Imports. This page is about the operational side: how a project this size adds dependencies, generates code, and verifies the whole thing in CI.
The single module
Everything begins with one go.mod at the repo root, and it’s short:

    module github.com/multigres/multigres
    go 1.25

Every package lives under go/ and shares this one import-path prefix, for example github.com/multigres/multigres/go/common/parser/ast. There is no nested go.mod for the individual services: multigateway, multipooler, pgctld, and multiorch are all packages in the same module. The companion go.sum holds hundreds of lines, each a module/version/hash triple, covering every direct and transitive dependency.
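To make the "module, version, hash" shape of go.sum concrete, here is a minimal sketch that splits go.sum text into triples. The type and function names (sumEntry, parseGoSum) are hypothetical, and the hashes in the sample are placeholders rather than real digests. Each module version normally appears twice: once for the full module contents and once with a /go.mod suffix for its go.mod file alone.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sumEntry is one go.sum line: module path, version, and hash.
type sumEntry struct {
	Module  string
	Version string // carries a "/go.mod" suffix when the hash covers only go.mod
	Hash    string // "h1:" followed by a base64 digest
}

// parseGoSum splits go.sum text into entries, skipping lines that do
// not have exactly three whitespace-separated fields.
func parseGoSum(text string) []sumEntry {
	var entries []sumEntry
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 3 {
			continue
		}
		entries = append(entries, sumEntry{Module: fields[0], Version: fields[1], Hash: fields[2]})
	}
	return entries
}

func main() {
	// The hashes below are placeholders, not real digests.
	sample := "gopkg.in/yaml.v3 v3.0.1 h1:PLACEHOLDER_ZIP_HASH=\n" +
		"gopkg.in/yaml.v3 v3.0.1/go.mod h1:PLACEHOLDER_GOMOD_HASH=\n"
	for _, e := range parseGoSum(sample) {
		fmt.Printf("%s | %s | %s\n", e.Module, e.Version, e.Hash)
	}
}
```

This is only a reading aid; the go command itself verifies these hashes on download, so nothing in a normal workflow needs to parse go.sum by hand.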
No vendoring
There is no vendor/ directory. Builds resolve dependencies from the Go module cache (downloaded via the module proxy on first build) rather than from a checked-in copy of third-party source.
One consequence is worth internalizing: you cannot build a clean checkout fully offline, because the first build needs network access to populate the module cache. Reproducibility comes from two mechanisms working together:
- go.sum hash-pins every Go module version, so a tampered download is rejected.
- SHA256-pinned native tool downloads (covered in Supply-chain pinning below).
That’s the trade-off versus vendoring: a smaller repo with no committed third-party source, but a dependency on the proxy and cache being reachable.
Dependencies: direct vs indirect
In go.mod, require blocks list dependencies. A trailing // indirect comment marks a dependency that no package in the module imports directly; it is pulled in transitively, or it is a tool dependency:

    require (
        github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
        github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
        gopkg.in/yaml.v3 v3.0.1
    )

Note that yaml.v3 has no // indirect marker: it’s imported directly somewhere under go/.
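The direct-versus-indirect reading can be sketched in a few lines of Go. This is a simplified, string-based check applied to the three require lines quoted above; isIndirect is a hypothetical helper, and a real tool would more likely use a proper go.mod parser such as the golang.org/x/mod/modfile package.

```go
package main

import (
	"fmt"
	"strings"
)

// isIndirect reports whether a go.mod require line carries the
// trailing "// indirect" comment.
func isIndirect(line string) bool {
	_, comment, found := strings.Cut(line, "//")
	return found && strings.TrimSpace(comment) == "indirect"
}

func main() {
	lines := []string{
		"github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect",
		"github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect",
		"gopkg.in/yaml.v3 v3.0.1",
	}
	for _, l := range lines {
		kind := "direct"
		if isIndirect(l) {
			kind = "indirect"
		}
		// The first whitespace-separated field is the module path.
		fmt.Printf("%-8s %s\n", kind, strings.Fields(l)[0])
	}
}
```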
The everyday commands

    go get github.com/some/dep@v1.2.3   # add or bump a specific dependency
    go mod tidy                         # recompute require blocks + // indirect markers,
                                        # prune unused deps, sync go.sum

go mod tidy is the one to internalize. It walks every import under go/, adds anything missing, drops anything unused, and re-derives the // indirect annotations. After any go get, run go mod tidy and commit both go.mod and go.sum. The CI gate (below) runs go mod tidy for you and fails if it produces a diff, so an un-tidied module is a build-breaker even when no code changed.
Build/codegen tools pinned as module deps
Here’s a modern idiom worth knowing. The tool (...) directive (Go 1.24+) pins the binaries a project runs during development and codegen as module dependencies, so their versions are recorded in go.mod and hash-checked through go.sum like any other dep:

    tool (
        github.com/mfridman/tparse
        github.com/quasilyte/go-ruleguard/cmd/ruleguard
        github.com/wadey/gocovmerge
        github.com/yoheimuta/protolint/cmd/protolint
        golang.org/x/tools/cmd/goimports
        google.golang.org/grpc/cmd/protoc-gen-go-grpc
        google.golang.org/protobuf/cmd/protoc-gen-go
        mvdan.cc/gofumpt
    )

These are invoked with go tool <name>, for example go tool goimports and go tool gofumpt (you’ll see exactly that in the go:generate lines below). The version is whatever go.mod requires, verified against go.sum: no separate install step, no PATH drift between contributors. The payoff is that a generator and the formatter that cleans up after it are reproducible from the module alone.
Code generation
This codebase generates code four different ways. All generated files carry a // Code generated ... DO NOT EDIT. banner and are committed to the repo. The golden rule: never hand-edit a generated file; trace the bug to its source (the .y grammar, the AST type, the OTel instrument call) and regenerate.
Before the details, here’s the whole picture — each source of truth, the generator that reads it, and the output it produces:

    flowchart LR
      subgraph proto["proto layer"]
        P1["proto/*.proto"] -->|"protoc + plugins"| P2["go/pb/..."]
      end
      subgraph parser["parser layer"]
        G1["postgres.y"] -->|goyacc| G2["postgres.go"]
        R1["grammar.y"] -->|goyacc| R2["replparser/grammar.go"]
        A1["AST node types in ast/"] -->|asthelpergen| A2["ast_clone.go + ast_rewrite.go"]
      end
      subgraph metrics["metrics layer"]
        M1["OTel instrument calls across go/"] -->|metricsgen| M2["catalog_generated.go"]
      end

go generate and //go:generate
go generate scans Go files for //go:generate <command> comment directives and runs them. It is not part of go build; it runs only when explicitly invoked, here through Makefile targets.
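As a rough illustration of what "scanning for directives" means, the sketch below pulls the command out of every //go:generate line in a piece of source text. generateDirectives is a hypothetical helper and a simplification of what the go command does: it only recognizes the directive when it starts the line with no space after the slashes, which matches the documented form.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// generateDirectives returns the command of every //go:generate line in src.
// The directive must start at the beginning of the line, with no space
// between "//" and "go:generate".
func generateDirectives(src string) []string {
	const prefix = "//go:generate "
	var cmds []string
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		if cmd, ok := strings.CutPrefix(sc.Text(), prefix); ok {
			cmds = append(cmds, strings.TrimSpace(cmd))
		}
	}
	return cmds
}

func main() {
	// Sample input: the three parser directives quoted in this page.
	src := `package parser

//go:generate go run ./goyacc -f -o postgres.go postgres.y
//go:generate go tool goimports -w postgres.go
//go:generate go tool gofumpt -w postgres.go
`
	for i, cmd := range generateDirectives(src) {
		fmt.Printf("%d: %s\n", i+1, cmd)
	}
}
```

Directives run in the order they appear in the file, which is why a generator line can safely be followed by formatter lines that operate on its output.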
A small but instructive pattern shows up in every directive: run the generator, then format the output with the pinned goimports and gofumpt. Generated Go is rarely gofmt-clean on its own, so formatting is part of generation, not an afterthought.
Parser (goyacc on postgres.y):

    //go:generate go run ./goyacc -f -o postgres.go postgres.y
    //go:generate go tool goimports -w postgres.go
    //go:generate go tool gofumpt -w postgres.go

The generated postgres.go even records the exact invocation in its banner: // Code generated by goyacc -f -o postgres.go postgres.y. DO NOT EDIT.
Replication parser (a second, smaller goyacc grammar):

    //go:generate go run ../goyacc -p replYy -o grammar.go grammar.y
    //go:generate go tool goimports -w grammar.go
    //go:generate go tool gofumpt -w grammar.go

This grammar deliberately omits goyacc’s -f flag; the file comment explains it caused a hang in ParseReplicationCommand under go test. A good reminder that generator flags are load-bearing.
AST helpers (asthelpergen → clone + rewrite):

    //go:generate go run ../../../tools/asthelpergen/main --in . --iface github.com/multigres/multigres/go/common/parser/ast.Node

This walks every AST node type implementing the Node interface and emits ast_clone.go (deep-clone helpers) and ast_rewrite.go (a tree-walker). Both carry // Code generated by asthelpergen. DO NOT EDIT.
The deep dive on what this generated code actually does — grammar productions, AST shape, the rewriter — is in The SQL Parser: Lexer, goyacc & AST Codegen.
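The DO NOT EDIT banners follow the standard Go convention for marking generated files, so a tool can detect them mechanically. A minimal sketch, assuming the conventional pattern `^// Code generated .* DO NOT EDIT\.$`; isGenerated is a hypothetical helper, and it simplifies the "before the first non-comment text" rule by stopping at the package clause.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// generatedRE is the conventional marker line for generated Go files.
var generatedRE = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGenerated reports whether src carries the marker before its
// package clause.
func isGenerated(src string) bool {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "package ") {
			return false // reached code without seeing the marker
		}
		if generatedRE.MatchString(line) {
			return true
		}
	}
	return false
}

func main() {
	generated := "// Code generated by goyacc -f -o postgres.go postgres.y. DO NOT EDIT.\n\npackage parser\n"
	handWritten := "// Package parser holds the grammar.\npackage parser\n"
	fmt.Println(isGenerated(generated))
	fmt.Println(isGenerated(handWritten))
}
```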
The metric catalog (metricsgen)
A separate Go program scans the codebase for OpenTelemetry instrument constructors and writes a Prometheus metric catalog:

    go run ./go/tools/metricsgen/main           # write the catalog
    go run ./go/tools/metricsgen/main -verify   # fail if it is stale

The output is go/observability/metriccatalog/catalog_generated.go, with the banner // Code generated by go run ./go/tools/metricsgen/main. DO NOT EDIT. The -verify flag is a drift check (fail-if-stale), distinct from the default write mode. See mterrors & observability for the instruments it scans.
The Makefile targets that drive it all
A project this size wraps every generator behind make targets rather than asking you to remember each invocation:

    make proto       # protobuf + grpc + grpc-gateway -> go/pb
    make parser      # go generate ./go/common/parser/... (goyacc + asthelpergen + format)
    make metrics     # go run ./go/tools/metricsgen/main -> catalog_generated.go
    make generate    # alias for: parser metrics
    make build-all   # proto + parser + metrics + binaries (the full regen + build)

The four generated-artifact families
| Artifact | Tool | Triggered by | Source of truth |
|---|---|---|---|
| go/pb/... (proto + gRPC + grpc-gateway) | protoc + plugins | make proto | proto/*.proto |
| go/common/parser/postgres.go | goyacc | make parser | postgres.y |
| go/common/parser/replparser/grammar.go | goyacc | make parser | grammar.y |
| ast_clone.go + ast_rewrite.go | asthelpergen | make parser / make build-all | AST node types in ast/ |
| go/observability/metriccatalog/catalog_generated.go | metricsgen | make metrics | OTel instrument calls across go/ |
The validate-generated-files CI gate
This is the gate that ties dependencies and codegen together. It rebuilds everything from source, re-tidies the module, and fails if the working tree changed, i.e. if anyone forgot to regenerate or commit:

    validate-generated-files: clean build-all ## Validate that generated files match source.
    	go mod tidy
    	echo ""
    	echo "Checking files modified during build..."
    	MODIFIED_FILES=$$(git status --porcelain | grep "^ M" | awk '{print $$2}') ; \
    	if [ -n "$$MODIFIED_FILES" ]; then \
    		echo "Modified files found:"; \
    		...
    		echo "Please run 'make build-all && go mod tidy' and commit the changes"; \
    		exit 1; \
    	else \
    		echo "Generated files are up-to-date."; \
    	fi

What can make it fail (effectively your pre-push checklist):
- You changed a .proto but didn’t run make proto and commit go/pb/....
- You changed postgres.y or grammar.y but didn’t commit the regenerated .go.
- You added or removed an AST node type but didn’t run make build-all to update ast_clone.go / ast_rewrite.go.
- You added or changed an OTel instrument but didn’t regenerate catalog_generated.go.
- You ran go get but forgot to go mod tidy and commit go.mod / go.sum. This trips the gate even when no generated .go changed, because go mod tidy runs as part of the target.
The fix is always the message it prints: make build-all && go mod tidy, then commit the diff.
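The core of the gate is the pipeline `git status --porcelain | grep "^ M" | awk '{print $2}'`. The sketch below mirrors that filter in Go on sample porcelain text, to show exactly which lines it selects; modifiedFiles is a hypothetical helper, not code from the repository.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// modifiedFiles returns the paths of porcelain lines that start with
// " M", i.e. tracked files modified in the working tree but not staged.
// It mirrors: grep "^ M" | awk '{print $2}'.
func modifiedFiles(porcelain string) []string {
	var files []string
	sc := bufio.NewScanner(strings.NewReader(porcelain))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, " M") {
			continue
		}
		if fields := strings.Fields(line); len(fields) >= 2 {
			files = append(files, fields[1]) // second whitespace-separated field
		}
	}
	return files
}

func main() {
	// Sample status output; the paths are illustrative.
	status := " M go.mod\n M go.sum\n?? scratch.txt\n M go/pb/example.pb.go\n"
	files := modifiedFiles(status)
	if len(files) > 0 {
		fmt.Println("Modified files found:")
		for _, f := range files {
			fmt.Println("  " + f)
		}
	} else {
		fmt.Println("Generated files are up-to-date.")
	}
}
```

Note that this filter only matches modified tracked files. Untracked files, which porcelain output marks with `??`, pass through unflagged, so a brand-new generated file that was never added would not be caught by this particular pattern.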
Supply-chain pinning: tools and downloads
The native tools (the ones that aren’t Go modules) get their own hardening. A setup script sources two helpers:

    source "$SCRIPT_DIR/safe_download.sh"
    source "$SCRIPT_DIR/tool_checksums.sh"

The checksum table is a hardcoded SHA256 lookup keyed by tool, version, platform, and arch, covering protoc, etcd, sqllogictest, and pgproto. It errors out for any unknown combination rather than guessing:

    get_sha256() {
      local tool="$1" version="$2" platform="$3" arch="$4" ext="$5"
      case "$tool" in
        "protoc")
          case "$version-$platform-$arch" in
            "25.1-linux-x86_64") echo "ed8fca87a11c888fed329d6a59c34c7d436165f662a2c875246ddb1ac2b6dd50" ;;
            ...
            *) echo "ERROR: no SHA256 hash available for protoc $version-$platform-$arch" >&2; exit 1 ;;

The download wrapper is a hardened curl: it validates the expected hash is 64 hex chars, retries up to 3× with exponential backoff, verifies the SHA256 after download, and deletes the file on mismatch:

    # Usage: safe_download <url> <output_file> <sha256>
    ...
    if [ "$actual_sha256" != "$expected_sha256" ]; then
      echo "DO NOT USE THIS FILE. It may be malicious or corrupted." >&2
      rm -f "$output_file"
      return 1
    fi

The setup script also go installs the proto plugins, including the gateway plugin at a pinned version that lives separately from the tool block:

    GOBIN=$MTROOT/bin go install google.golang.org/protobuf/cmd/protoc-gen-go google.golang.org/grpc/cmd/protoc-gen-go-grpc
    GOBIN=$MTROOT/bin go install github.com/grpc-ecosystem/grpc-gateway/v2/protoc-gen-grpc-gateway@v2.27.4

The proto pipeline that consumes these plugins is detailed in gRPC & Protobuf.
Automated dependency management: who owns what
Three bots, deliberately scoped so they don’t collide:
| Tool | Owns | Config |
|---|---|---|
| Dependabot | Go module deps (root, monthly, grouped minor+patch, 7-day cooldown); npm deps for the web admin | .github/dependabot.yml |
| Renovate | The go directive bump in go.mod; GitHub Actions SHA pinning. Explicitly disables Go module datasource updates (left to Dependabot) | renovate.json |
| govulncheck | Vulnerability scan, daily + on PRs touching go.mod/go.sum | .github/workflows/govulncheck.yaml |
Renovate’s split is intentional and encoded in config:

    "enabledManagers": ["gomod", "github-actions"],
    "postUpdateOptions": ["gomodTidy"],
    ...
    {
      "description": "Leave Go module dependency updates to Dependabot",
      "matchManagers": ["gomod"],
      "matchDatasources": ["go"],
      "enabled": false
    },
    {
      "description": "Disable major Go version bumps",
      "matchDatasources": ["golang-version"],
      "matchUpdateTypes": ["major"],
      "enabled": false
    }

GitHub Actions are pinned to exact commit SHAs, not tags. Renovate keeps them pinned, and a separate CI job enforces it:

    - name: Verify action version specificity with pinact
      uses: suzuki-shunsuke/pinact-action@cf51507d80d4d6522a07348e3d58790290eaf0b6 # v2.0.0

Notice the action itself is SHA-pinned, with the human-readable tag in a trailing comment, which is exactly the pattern it enforces on everyone else.
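What "pinned to an exact commit SHA" means can be checked with a simple pattern: the part after the @ must be a full 40-character hex commit hash. The sketch below illustrates that rule only; isPinned is a hypothetical helper, it is not how pinact itself is implemented, and the actions/checkout@v4 reference is just an example of a tag-style ref.

```go
package main

import (
	"fmt"
	"regexp"
)

// pinnedRE matches "owner/repo@<40 lowercase hex chars>".
var pinnedRE = regexp.MustCompile(`^[^@\s]+@[0-9a-f]{40}$`)

// isPinned reports whether a workflow "uses:" reference ends in a
// full commit SHA rather than a tag or branch name.
func isPinned(ref string) bool {
	return pinnedRE.MatchString(ref)
}

func main() {
	refs := []string{
		"suzuki-shunsuke/pinact-action@cf51507d80d4d6522a07348e3d58790290eaf0b6",
		"actions/checkout@v4", // illustrative tag-style reference
	}
	for _, r := range refs {
		fmt.Printf("%-75s pinned=%v\n", r, isPinned(r))
	}
}
```

A tag can be moved to point at different code after review; a commit SHA cannot, which is the reason for pinning in the first place.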
And govulncheck installs a pinned scanner and runs against the module on any change to its dependency files:

    pull_request:
      paths:
        - "go.mod"
        - "go.sum"
    ...
    - name: Install govulncheck
      run: go install golang.org/x/vuln/cmd/govulncheck@v1.3.0

What to run when
| Situation | Run | Then |
|---|---|---|
| Added/bumped a Go dependency | go get pkg@ver | go mod tidy, commit go.mod + go.sum |
| Changed a .proto | make proto | commit go/pb/... |
| Changed postgres.y / grammar.y | make parser | commit regenerated .go |
| Added/removed an AST node type | make build-all | commit ast_clone.go + ast_rewrite.go |
| Added/changed an OTel instrument | make metrics | commit catalog_generated.go |
| Before pushing (catch-all sanity) | make validate-generated-files | fix anything it flags |
| First-time setup / new machine | make tools | populates SHA-verified native tools |
Where the generated code lands
Putting the layout next to the pipelines makes the source-to-output mapping concrete:
- go.mod: one module, declared at the root
- go.sum: dependency hashes
- go/
  - common/
    - parser/
      - postgres.y: goyacc source
      - postgres.go: generated, DO NOT EDIT
      - replparser/: second goyacc grammar
        - …
      - ast/: AST node types (the asthelpergen source)
        - ast_clone.go: generated, DO NOT EDIT
        - ast_rewrite.go: generated, DO NOT EDIT
  - pb/: generated proto + gRPC, from proto/*.proto
    - …
  - observability/
    - metriccatalog/
      - catalog_generated.go: generated, DO NOT EDIT
  - tools/: the generators themselves (asthelpergen, metricsgen, …)
    - …
- proto/: .proto source for the pb layer
  - …
Checkpoints
- You can explain why there’s no vendor/ directory and what provides reproducibility instead.
- You can read a go.mod require line and say whether a dep is direct or indirect, and what go mod tidy would do to it.
- You can name the four generated-artifact families, the tool that produces each, and the make target that triggers it.
- You know why make parser alone is insufficient after an AST type change, and what to run instead.
- You can list at least four distinct ways the validate-generated-files gate fails, including the non-codegen one (go mod tidy drift).
- You can describe the two separate tool-pinning mechanisms (tool (...) block vs the native checksum table) and which tool falls into which.
- You can say which bot owns Go module updates, which owns the go directive bump, and which owns Action SHA pinning.
Checkpoint questions
Why can’t you build a clean checkout fully offline, and what gives you reproducibility instead?
There’s no vendor/ directory, so the first build must download dependencies from the module proxy to populate the cache. Reproducibility comes from go.sum hash-pinning every Go module version plus SHA256-pinned native tool downloads, not from committed third-party source.
You ran go get to bump a dependency but changed no .go code. Can the validate-generated-files gate still fail?
Yes. The gate runs go mod tidy, and if you forgot to tidy and commit go.mod/go.sum, that produces a diff in the working tree and the job fails, even though no generated .go file changed.
After adding a new AST node type, why is make parser insufficient, and what do you run instead?
make parser runs goyacc but doesn’t reliably regenerate the asthelpergen output for the new type; you must run make build-all and commit the regenerated ast_clone.go and ast_rewrite.go, or the validate-generated-files CI job fails.
A project pins gofumpt in its go.mod tool (…) block but protoc in a separate SHA256 table. Why the two mechanisms?
gofumpt is a Go module, so the tool (...) directive pins its version in go.mod, with its hash recorded in go.sum like any dependency, and runs it via go tool. protoc is a native (non-Go) binary downloaded as a release artifact, so it’s pinned by SHA256 in a checksum table and verified on download; Go’s module machinery can’t pin it.
Exercises
All grounded in a read-only checkout of the codebase; never modify it.
- Map the directives. Run grep -rn "go:generate" go/ and match each directive to the artifact it produces (postgres.go, grammar.go, ast_clone.go, ast_rewrite.go) and the make target that triggers it.
- Read the gate. Walk the validate-generated-files target line by line and write the contributor checklist of everything that would make it fail in CI.
- Classify deps. Open go.mod, pick five require lines, label each direct vs indirect using the // indirect markers, and explain what go mod tidy would do to them.
- Trace protoc’s supply chain. Find the protoc version in the Makefile, the matching SHA256 in the checksum table, and the safe-download call in the setup script. Describe exactly what happens when the checksum doesn’t match.
- Compare the bots. Read renovate.json and .github/dependabot.yml and write one paragraph: who owns Go module updates, who owns the go directive bump, who owns Action SHA pinning.
Debugging & Profiling — when the generated code or a dependency misbehaves at runtime.
See also
- Build & make: the make tools / build / build-all entrypoints that wrap codegen.
- Testing workflow: the test targets that run after build/codegen; uses tparse / gocovmerge from the tool block.
- Lint & format: gofumpt / goimports / protolint / golangci-lint, also in the tool block and in the go:generate format steps.
- The SQL Parser: Lexer, goyacc & AST Codegen: what goyacc and asthelpergen actually produce.
- gRPC & Protobuf: the proto -> go/pb generation pipeline.
- mterrors & observability: the OTel instruments metricsgen scans.
- Packages, Modules & Imports: Go module fundamentals for the new-to-Go reader.
- Glossary: goyacc, asthelpergen, govulncheck, Renovate, Dependabot.
- Orientation: where these tools sit in the overall map.