Modules, Dependencies & Code Generation

What you will learn: how a large Go project organizes itself as a single module, how it adds and pins dependencies, the four different ways it generates code, and the CI gate that ties dependencies and codegen together so nothing drifts.

If you’re new to Go, here’s the one-paragraph mental model. A Go module is a versioned unit of code identified by an import path, with its dependency graph pinned in two files: go.mod declares requirements, go.sum records cryptographic hashes of every version. We’ll ground all of this in a real production system — multigres (“Vitess for Postgres”) — which is one module containing many packages and many binaries. For the language-level fundamentals of import paths and go.mod/go.sum semantics, see Packages, Modules & Imports. This page is about the operational side: how a project this size adds dependencies, generates code, and verifies the whole thing in CI.


Everything begins with one go.mod at the repo root, and it’s short:

go.mod
module github.com/multigres/multigres
go 1.25

Every package lives under go/ and shares this one import-path prefix, for example github.com/multigres/multigres/go/common/parser/ast. There is no nested go.mod for the individual services: multigateway, multipooler, pgctld, and multiorch are all packages in the same module. The companion go.sum runs to hundreds of lines, each a (module, version, hash) triple, covering every direct and transitive dependency.

There is no vendor/ directory. Builds resolve dependencies from the Go module cache — downloaded via the module proxy on first build — rather than from a checked-in copy of third-party source.

One consequence is worth internalizing: you cannot build a clean checkout fully offline, because the first build needs network access to populate the module cache. Reproducibility comes from two mechanisms working together:

  1. go.sum hash-pins every Go module version, so a tampered download is rejected.
  2. SHA256-pinned native tool downloads (covered in Supply chain below).

That’s the trade-off versus vendoring: a smaller repo with no committed third-party source, but a dependency on the proxy and cache being reachable.
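As a rough illustration of what go.sum stores, the sketch below splits one go.sum line into its three fields. The sumEntry type and parseSumLine function are hypothetical names for this example, and the hash used in main is a placeholder rather than a real checksum. The actual verification is done by the go command itself, not by project code.

```go
package main

import (
	"fmt"
	"strings"
)

// sumEntry is one go.sum line: module path, version, and hash.
// (Hypothetical type for illustration only.)
type sumEntry struct {
	Module  string
	Version string // ends in "/go.mod" when the hash covers only the go.mod file
	Hash    string // "h1:" prefix names the hash algorithm
}

// parseSumLine splits a go.sum line into its three whitespace-separated fields.
func parseSumLine(line string) (sumEntry, error) {
	f := strings.Fields(line)
	if len(f) != 3 {
		return sumEntry{}, fmt.Errorf("malformed go.sum line: %q", line)
	}
	if !strings.HasPrefix(f[2], "h1:") {
		return sumEntry{}, fmt.Errorf("unexpected hash prefix in %q", f[2])
	}
	return sumEntry{Module: f[0], Version: f[1], Hash: f[2]}, nil
}

func main() {
	// PLACEHOLDER is not a real checksum; it only shows the line shape.
	e, err := parseSumLine("gopkg.in/yaml.v3 v3.0.1 h1:PLACEHOLDER=")
	fmt.Println(e.Module, e.Version, err)
}
```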


In go.mod, require blocks list dependencies. A trailing // indirect comment marks a dependency you don’t import directly — it’s pulled in transitively, or it’s a tool dependency:

go.mod
require (
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
gopkg.in/yaml.v3 v3.0.1
)

Note that yaml.v3 has no // indirect marker: it’s imported directly somewhere under go/.

adding and tidying dependencies
go get github.com/some/dep@v1.2.3 # add or bump a specific dependency
go mod tidy # recompute require blocks + // indirect markers,
# prune unused deps, sync go.sum

go mod tidy is the one to internalize. It walks every import under go/, adds anything missing, drops anything unused, and re-derives the // indirect annotations. After any go get, run go mod tidy and commit both go.mod and go.sum. The CI gate (below) runs go mod tidy for you and fails if it produces a diff — so an un-tidied module is a build-breaker even when no code changed.
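To make the direct/indirect distinction concrete, here is a minimal sketch that classifies the entries of a require block by their trailing // indirect marker. classifyRequires is a hypothetical helper and a deliberate simplification: it only understands parenthesized require blocks, whereas real tooling would use a proper go.mod parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// classifyRequires scans go.mod text and splits required module paths into
// direct and indirect lists, based on the trailing "// indirect" comment.
// It only handles lines inside a "require ( ... )" block.
func classifyRequires(gomod string) (direct, indirect []string) {
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(gomod))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case strings.HasPrefix(line, "require ("):
			inBlock = true
		case line == ")":
			inBlock = false
		case inBlock && line != "" && !strings.HasPrefix(line, "//"):
			path := strings.Fields(line)[0]
			if strings.HasSuffix(line, "// indirect") {
				indirect = append(indirect, path)
			} else {
				direct = append(direct, path)
			}
		}
	}
	return direct, indirect
}

func main() {
	const gomod = `require (
	github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
	github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
	gopkg.in/yaml.v3 v3.0.1
)`
	direct, indirect := classifyRequires(gomod)
	fmt.Println("direct:", direct)
	fmt.Println("indirect:", indirect)
}
```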

Here’s a modern idiom worth knowing. The tool (...) directive (Go 1.24+) records the binaries a project runs during development and codegen as module dependencies, so their versions are pinned in go.mod and their hashes in go.sum like any other dep:

go.mod
tool (
github.com/mfridman/tparse
github.com/quasilyte/go-ruleguard/cmd/ruleguard
github.com/wadey/gocovmerge
github.com/yoheimuta/protolint/cmd/protolint
golang.org/x/tools/cmd/goimports
google.golang.org/grpc/cmd/protoc-gen-go-grpc
google.golang.org/protobuf/cmd/protoc-gen-go
mvdan.cc/gofumpt
)

These are invoked with go tool <name>, for example go tool goimports and go tool gofumpt (you’ll see exactly that in the go:generate lines below). The version is whatever go.mod requires, with go.sum verifying the download: no separate install step, no PATH drift between contributors. The payoff is that a generator and the formatter that cleans up after it are reproducible from the module alone.


This codebase generates code four different ways. All generated files carry a // Code generated ... DO NOT EDIT. banner and are committed to the repo. The golden rule: never hand-edit a generated file — trace the bug to its source (the .y grammar, the AST type, the OTel instrument call) and regenerate.

Before the details, here’s the whole picture — each source of truth, the generator that reads it, and the output it produces:

[Diagram: The four code-generation pipelines]

go generate scans Go files for //go:generate <command> comment directives and runs them. It is not part of go build — it only runs when explicitly invoked, here through Makefile targets.

A small but instructive pattern shows up in the goyacc directives: run the generator, then format the output with the pinned goimports and gofumpt. Generated Go is rarely gofmt-clean on its own, so formatting is part of generation, not an afterthought.
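The scanning step can be sketched in a few lines. This is an illustration of the directive format, not the go command’s implementation: generateDirectives is a hypothetical helper that collects every line starting with exactly //go:generate followed by a space (no leading whitespace, no space after the slashes), in file order.

```go
package main

import (
	"fmt"
	"strings"
)

// generateDirectives returns the command part of every //go:generate line
// in src, in the order they appear. A directive must start at column zero.
func generateDirectives(src string) []string {
	var cmds []string
	for _, line := range strings.Split(src, "\n") {
		if cmd, ok := strings.CutPrefix(line, "//go:generate "); ok {
			cmds = append(cmds, strings.TrimSpace(cmd))
		}
	}
	return cmds
}

func main() {
	const src = `package parser

//go:generate go run ./goyacc -f -o postgres.go postgres.y
//go:generate go tool goimports -w postgres.go
//go:generate go tool gofumpt -w postgres.go
`
	for i, cmd := range generateDirectives(src) {
		fmt.Printf("%d: %s\n", i+1, cmd)
	}
}
```

Because directives run in file order, the two formatting steps always see the freshly generated file.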

Parser (goyacc on postgres.y):

go/common/parser/generate.go
//go:generate go run ./goyacc -f -o postgres.go postgres.y
//go:generate go tool goimports -w postgres.go
//go:generate go tool gofumpt -w postgres.go

The generated postgres.go even records the exact invocation in its banner: // Code generated by goyacc -f -o postgres.go postgres.y. DO NOT EDIT.

Replication parser (a second, smaller goyacc grammar):

go/common/parser/replparser/generate.go
//go:generate go run ../goyacc -p replYy -o grammar.go grammar.y
//go:generate go tool goimports -w grammar.go
//go:generate go tool gofumpt -w grammar.go

This grammar deliberately omits goyacc’s -f flag; the file comment explains it caused a hang in ParseReplicationCommand under go test. A good reminder that generator flags are load-bearing.

AST helpers (asthelpergen → clone + rewrite):

go/common/parser/ast/generate.go
//go:generate go run ../../../tools/asthelpergen/main --in . --iface github.com/multigres/multigres/go/common/parser/ast.Node

This walks every AST node type implementing the Node interface and emits ast_clone.go (deep-clone helpers) and ast_rewrite.go (a tree-walker). Both carry // Code generated by asthelpergen. DO NOT EDIT.

The deep dive on what this generated code actually does — grammar productions, AST shape, the rewriter — is in The SQL Parser: Lexer, goyacc & AST Codegen.

A separate Go program scans the codebase for OpenTelemetry instrument constructors and writes a Prometheus metric catalog:

generating and verifying the metric catalog
go run ./go/tools/metricsgen/main # write the catalog
go run ./go/tools/metricsgen/main -verify # fail if it is stale

The output is go/observability/metriccatalog/catalog_generated.go, with the banner // Code generated by go run ./go/tools/metricsgen/main. DO NOT EDIT. The -verify flag is a drift check (fail-if-stale), distinct from the default write mode. See mterrors & observability for the instruments it scans.

A project this size wraps every generator behind make targets rather than asking you to remember each invocation:

Makefile codegen targets
make proto # protobuf + grpc + grpc-gateway -> go/pb
make parser # go generate ./go/common/parser/... (goyacc + asthelpergen + format)
make metrics # go run ./go/tools/metricsgen/main -> catalog_generated.go
make generate # alias for: parser metrics
make build-all # proto + parser + metrics + binaries (the full regen + build)

| Artifact | Tool | Triggered by | Source of truth |
| --- | --- | --- | --- |
| go/pb/... (proto + gRPC + grpc-gateway) | protoc + plugins | make proto | proto/*.proto |
| go/common/parser/postgres.go | goyacc | make parser | postgres.y |
| go/common/parser/replparser/grammar.go | goyacc | make parser | grammar.y |
| ast_clone.go + ast_rewrite.go | asthelpergen | make parser / make build-all | AST node types in ast/ |
| go/observability/metriccatalog/catalog_generated.go | metricsgen | make metrics | OTel instrument calls across go/ |

This is the gate that ties dependencies and codegen together. It rebuilds everything from source, re-tidies the module, and fails if the working tree changed — i.e. if anyone forgot to regenerate or commit:

Makefile
validate-generated-files: clean build-all ## Validate that generated files match source.
go mod tidy
echo ""
echo "Checking files modified during build..."
MODIFIED_FILES=$$(git status --porcelain | grep "^ M" | awk '{print $$2}') ; \
if [ -n "$$MODIFIED_FILES" ]; then \
echo "Modified files found:"; \
...
echo "Please run 'make build-all && go mod tidy' and commit the changes"; \
exit 1; \
else \
echo "Generated files are up-to-date."; \
fi

What can make it fail — effectively your pre-push checklist:

  1. You changed a .proto but didn’t run make proto and commit go/pb/....
  2. You changed postgres.y or grammar.y but didn’t commit the regenerated .go.
  3. You added or removed an AST node type but didn’t run make build-all to update ast_clone.go/ast_rewrite.go.
  4. You added or changed an OTel instrument but didn’t regenerate catalog_generated.go.
  5. You ran go get but forgot to go mod tidy and commit go.mod/go.sum. This trips the gate even when no generated .go changed, because go mod tidy runs as part of the target.

The fix is always the message it prints: make build-all && go mod tidy, then commit the diff.
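The detection step in the gate (the grep "^ M" plus awk pipeline) can be mirrored in Go as a rough sketch. modifiedFiles is a hypothetical helper that takes the text of git status --porcelain and returns tracked files modified in the working tree. Like the grep, it ignores untracked (??) and staged-only entries.

```go
package main

import (
	"fmt"
	"strings"
)

// modifiedFiles extracts paths whose porcelain status is " M": tracked,
// modified in the working tree, and not staged.
func modifiedFiles(porcelain string) []string {
	var files []string
	for _, line := range strings.Split(porcelain, "\n") {
		if path, ok := strings.CutPrefix(line, " M "); ok {
			files = append(files, path)
		}
	}
	return files
}

func main() {
	// Sample output for illustration; these paths are made up.
	out := " M go.sum\n M go/pb/example.pb.go\n?? scratch.txt\n"
	files := modifiedFiles(out)
	if len(files) > 0 {
		fmt.Println("Modified files found:")
		for _, f := range files {
			fmt.Println("  " + f)
		}
		return
	}
	fmt.Println("Generated files are up-to-date.")
}
```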


The native tools (the ones that aren’t Go modules) get their own hardening. A setup script sources two helpers:

tools/setup_build_tools.sh
source "$SCRIPT_DIR/safe_download.sh"
source "$SCRIPT_DIR/tool_checksums.sh"

The checksum table is a hardcoded SHA256 lookup keyed by tool, version, platform, and arch, with entries for protoc, etcd, sqllogictest, and pgproto. It errors out for any unknown combination rather than guessing:

tools/tool_checksums.sh
get_sha256() {
local tool="$1" version="$2" platform="$3" arch="$4" ext="$5"
case "$tool" in
"protoc")
case "$version-$platform-$arch" in
"25.1-linux-x86_64")
echo "ed8fca87a11c888fed329d6a59c34c7d436165f662a2c875246ddb1ac2b6dd50" ;;
...
*) echo "ERROR: no SHA256 hash available for protoc $version-$platform-$arch" >&2; exit 1 ;;
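The same fail-closed lookup, expressed in Go for comparison. This is a sketch rather than project code: toolKey, checksums, and sha256For are hypothetical names, and the table holds only the single protoc entry shown in the script excerpt.

```go
package main

import "fmt"

// toolKey identifies one downloadable artifact.
type toolKey struct {
	Tool, Version, Platform, Arch string
}

// checksums maps each known artifact to its expected SHA256 (hex).
var checksums = map[toolKey]string{
	{Tool: "protoc", Version: "25.1", Platform: "linux", Arch: "x86_64"}: "ed8fca87a11c888fed329d6a59c34c7d436165f662a2c875246ddb1ac2b6dd50",
}

// sha256For returns the pinned hash, or an error for any combination that
// is not in the table. It never falls back to a default.
func sha256For(k toolKey) (string, error) {
	sum, ok := checksums[k]
	if !ok {
		return "", fmt.Errorf("no SHA256 hash available for %s %s-%s-%s",
			k.Tool, k.Version, k.Platform, k.Arch)
	}
	return sum, nil
}

func main() {
	fmt.Println(sha256For(toolKey{"protoc", "25.1", "linux", "x86_64"}))
	// A version that is not in the table is an error, not a guess.
	fmt.Println(sha256For(toolKey{"protoc", "0.0", "linux", "x86_64"}))
}
```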

The download wrapper is a hardened curl: it validates the expected hash is 64 hex chars, retries up to 3× with exponential backoff, verifies the SHA256 after download, and deletes the file on mismatch:

tools/safe_download.sh
# Usage: safe_download <url> <output_file> <sha256>
...
if [ "$actual_sha256" != "$expected_sha256" ]; then
echo "DO NOT USE THIS FILE. It may be malicious or corrupted." >&2
rm -f "$output_file"
return 1
fi

The setup script also go installs the proto plugins, including the gateway plugin at a pinned version that lives separately from the tool block:

tools/setup_build_tools.sh
GOBIN=$MTROOT/bin go install google.golang.org/protobuf/cmd/protoc-gen-go google.golang.org/grpc/cmd/protoc-gen-go-grpc
GOBIN=$MTROOT/bin go install github.com/grpc-ecosystem/grpc-gateway/v2/protoc-gen-grpc-gateway@v2.27.4

The proto pipeline that consumes these plugins is detailed in gRPC & Protobuf.

Automated dependency management: who owns what

Three bots, deliberately scoped so they don’t collide:

| Tool | Owns | Config |
| --- | --- | --- |
| Dependabot | Go module deps (root, monthly, grouped minor+patch, 7-day cooldown); npm deps for the web admin | .github/dependabot.yml |
| Renovate | The go directive bump in go.mod; GitHub Actions SHA pinning. Explicitly disables Go module datasource updates (left to Dependabot) | renovate.json |
| govulncheck | Vulnerability scan, daily + on PRs touching go.mod/go.sum | .github/workflows/govulncheck.yaml |

Renovate’s split is intentional and encoded in config:

renovate.json
"enabledManagers": ["gomod", "github-actions"],
"postUpdateOptions": ["gomodTidy"],
...
{ "description": "Leave Go module dependency updates to Dependabot",
"matchManagers": ["gomod"], "matchDatasources": ["go"], "enabled": false },
{ "description": "Disable major Go version bumps",
"matchDatasources": ["golang-version"], "matchUpdateTypes": ["major"], "enabled": false }

GitHub Actions are pinned to exact commit SHAs, not tags. Renovate keeps them pinned, and a separate CI job enforces it:

.github/workflows/lint.yml
- name: Verify action version specificity with pinact
uses: suzuki-shunsuke/pinact-action@cf51507d80d4d6522a07348e3d58790290eaf0b6 # v2.0.0

Notice the action itself is SHA-pinned, with the human-readable tag in a trailing comment — exactly the pattern it enforces on everyone else.
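The pattern being enforced is easy to state as a check. The sketch below is not how pinact works internally. isSHAPinned is a hypothetical helper that accepts a workflow uses: line only when the reference after @ is a full 40-character commit SHA, optionally followed by a comment.

```go
package main

import (
	"fmt"
	"regexp"
)

// pinnedUses matches "uses: owner/repo@<40 hex chars>" with an optional
// leading list dash and an optional trailing "# tag" comment.
var pinnedUses = regexp.MustCompile(`^\s*-?\s*uses:\s*[^@\s]+@[0-9a-f]{40}(\s+#.*)?\s*$`)

// isSHAPinned reports whether a workflow line pins its action to a commit SHA.
func isSHAPinned(line string) bool {
	return pinnedUses.MatchString(line)
}

func main() {
	lines := []string{
		"uses: suzuki-shunsuke/pinact-action@cf51507d80d4d6522a07348e3d58790290eaf0b6 # v2.0.0",
		"uses: example-org/example-action@v2", // hypothetical tag-pinned action
	}
	for _, l := range lines {
		fmt.Printf("%-5t %s\n", isSHAPinned(l), l)
	}
}
```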

And govulncheck installs a pinned scanner and runs against the module on any change to its dependency files:

.github/workflows/govulncheck.yaml
pull_request:
paths:
- "go.mod"
- "go.sum"
...
- name: Install govulncheck
run: go install golang.org/x/vuln/cmd/govulncheck@v1.3.0

| Situation | Run | Then |
| --- | --- | --- |
| Added/bumped a Go dependency | go get pkg@ver | go mod tidy, commit go.mod + go.sum |
| Changed a .proto | make proto | commit go/pb/... |
| Changed postgres.y / grammar.y | make parser | commit regenerated .go |
| Added/removed an AST node type | make build-all | commit ast_clone.go + ast_rewrite.go |
| Added/changed an OTel instrument | make metrics | commit catalog_generated.go |
| Before pushing (catch-all sanity) | make validate-generated-files | fix anything it flags |
| First-time setup / new machine | make tools | populates SHA-verified native tools |

Putting the layout next to the pipelines makes the source-to-output mapping concrete:

  • go.mod: one module, declared at the root
  • go.sum: dependency hashes
  • go/
    • common/
      • parser/
        • postgres.y: goyacc source
        • postgres.go: generated, DO NOT EDIT
        • replparser/: second goyacc grammar
        • ast/: AST node types (the asthelpergen source)
          • ast_clone.go: generated, DO NOT EDIT
          • ast_rewrite.go: generated, DO NOT EDIT
    • pb/: generated proto + gRPC, from proto/*.proto
    • observability/
      • metriccatalog/
        • catalog_generated.go: generated, DO NOT EDIT
    • tools/: the generators themselves (asthelpergen, metricsgen, …)
  • proto/: .proto source for the pb layer

  • You can explain why there’s no vendor/ directory and what provides reproducibility instead.
  • You can read a go.mod require line and say whether a dep is direct or indirect, and what go mod tidy would do to it.
  • You can name the four generated-artifact families, the tool that produces each, and the make target that triggers it.
  • You know why make parser alone is insufficient after an AST type change, and what to run instead.
  • You can list at least four distinct ways the validate-generated-files gate fails — including the non-codegen one (go mod tidy drift).
  • You can describe the two separate tool-pinning mechanisms (tool (...) block vs the native checksum table) and which tool falls into which.
  • You can say which bot owns Go module updates, which owns the go directive bump, and which owns Action SHA pinning.

Q: Why can’t you build a clean checkout fully offline, and what gives you reproducibility instead?
A: There’s no vendor/ directory, so the first build must download dependencies from the module proxy to populate the cache. Reproducibility comes from go.sum hash-pinning every Go module version plus SHA256-pinned native tool downloads, not from committed third-party source.

Q: You ran go get to bump a dependency but changed no .go code. Can the validate-generated-files gate still fail?
A: Yes. The gate runs go mod tidy, and if you forgot to tidy and commit go.mod/go.sum, that produces a diff in the working tree and the job fails, even though no generated .go file changed.

Q: After adding a new AST node type, why is make parser insufficient, and what do you run instead?
A: make parser runs goyacc but doesn’t reliably regenerate the asthelpergen output for the new type; you must run make build-all and commit the regenerated ast_clone.go and ast_rewrite.go, or the validate-generated-files CI job fails.

Q: A project pins gofumpt in its go.mod tool (…) block but protoc in a separate SHA256 table. Why the two mechanisms?
A: gofumpt is a Go module, so the tool (...) directive pins its version in go.mod (with its hash in go.sum) like any dependency, and it runs via go tool. protoc is a native (non-Go) binary downloaded as a release artifact, so it’s pinned by SHA256 in a checksum table and verified on download; Go’s module machinery can’t pin it.

All grounded in a read-only checkout of the codebase — never modify it.

  1. Map the directives. Run grep -rn "go:generate" go/ and match each directive to the artifact it produces (postgres.go, grammar.go, ast_clone.go, ast_rewrite.go) and the make target that triggers it.
  2. Read the gate. Walk the validate-generated-files target line by line and write the contributor checklist of everything that would make it fail in CI.
  3. Classify deps. Open go.mod, pick five require lines, label each direct vs indirect using the // indirect markers, and explain what go mod tidy would do to them.
  4. Trace protoc’s supply chain. Find the protoc version in the Makefile, the matching SHA256 in the checksum table, and the safe-download call in the setup script. Describe exactly what happens when the checksum doesn’t match.
  5. Compare the bots. Read renovate.json and .github/dependabot.yml and write one paragraph: who owns Go module updates, who owns the go directive bump, who owns Action SHA pinning.

Debugging & Profiling — when the generated code or a dependency misbehaves at runtime.