Postgres Wire Protocol & SQL Types
What you will learn: how a Go service speaks the PostgreSQL V3 wire protocol on both sides of a connection, and how it represents and converts SQL values without losing the NULL-vs-empty distinction.
Prerequisites: pointers, values & memory (byte slices, the *[]byte pool idiom), interfaces & composition (the Handler seam, embedding), errors (errors.As, wrapping), sync & memory model (sync.Pool, mutexes), and stdlib & idioms (encoding/binary, bufio). It builds on the architecture/request flow and gRPC & protobuf.
This is the chapter where bytes meet meaning. A psql client opens a TCP socket to the gateway and starts speaking a binary, message-framed protocol that PostgreSQL has used since version 7.4. To accept that connection, multigres has to be a PostgreSQL server on the socket. On the far end, the pooler has to be a PostgreSQL client to a real backend database. Both jobs live in go/common/pgprotocol, and they are near-mirror-images of each other.
flowchart LR
Client(["psql / pgx (a client)"])
Client -->|"pg wire (server role)"| GW["multigateway<br/>(Handler)"]
GW -->|"gRPC (sqltypes ⇄ proto)"| Pool["multipooler"]
Pool -->|"pg wire (client role)"| PG[("PostgreSQL<br/>(a server)")]
1. The wire format: one byte, one length, one body
Section titled “1. The wire format: one byte, one length, one body”Every V3 message (after startup) has the same frame: a 1-byte type, a 4-byte length, and a body.
+--------+------------------+----------------------------+| type | length (Int32) | body (length-4 bytes) || 1 byte | big-endian | |+--------+------------------+----------------------------+ 'Q' includes itself, e.g. "SELECT 1\0" excludes type byteTwo facts are load-bearing and easy to get wrong:
- The length field includes its own 4 bytes but excludes the 1-byte type. So a
Qcarrying"SELECT 1\0"(9 bytes) sends length =4 + 9 = 13. - Startup, SSLRequest, CancelRequest, and GSSENCRequest packets have NO type byte — just a length and a body. That is why there is a separate
readStartupPacket.
The decode side reflects fact #1 exactly:
func (c *Conn) ReadMessageLength() (int, error) { hdr, err := c.bufferedReader.Peek(4) if err != nil { return 0, err } length := binary.BigEndian.Uint32(hdr) if _, err := c.bufferedReader.Discard(4); err != nil { return 0, err } if length < 4 { return 0, fmt.Errorf("invalid message length: %d", length) } return int(length - 4), nil // body length}The single source of truth: constants.go
Section titled “The single source of truth: constants.go”Every message-type byte is a named constant in go/common/pgprotocol/protocol/constants.go. Never write a raw 'T' in protocol code — use the constant. The groups:
| Group | Examples |
|---|---|
| Frontend (client) messages | MsgBind = 'B', MsgClose = 'C', MsgDescribe = 'D', MsgExecute = 'E', MsgParse = 'P', MsgQuery = 'Q', MsgSync = 'S', MsgTerminate = 'X', MsgFlush = 'H' |
| Backend (server) messages | MsgParseComplete = '1', MsgBindComplete = '2', MsgCommandComplete = 'C', MsgDataRow = 'D', MsgErrorResponse = 'E', MsgRowDescription = 'T', MsgReadyForQuery = 'Z', MsgNoData = 'n', MsgParameterDescription = 't' |
| Auth codes | AuthOk = 0, AuthSASL = 10, AuthSASLContinue = 11, AuthSASLFinal = 12 |
| Error/Notice field codes | FieldSeverity = 'S', FieldCode = 'C', FieldMessage = 'M', … FieldRoutine = 'R' |
| Transaction status | TxnStatusIdle = 'I', TxnStatusInBlock = 'T', TxnStatusFailed = 'E' |
| Format codes | FormatText = 0, FormatBinary = 1 |
| Special version codes | CancelRequestCode, SSLRequestCode, GSSENCRequestCode |
The protocol version is a typed integer with methods — a small but clean example of a defined type carrying behavior:
type ProtocolVersion uint32
func (v ProtocolVersion) Major() uint16 { return uint16(v >> 16) }func (v ProtocolVersion) Minor() uint16 { return uint16(v & 0xFFFF) }func (v ProtocolVersion) IsSupported() bool { return v == ProtocolVersionNumber } // only 3.02. Symmetric encode/decode: writeDataRow vs parseDataRow
Section titled “2. Symmetric encode/decode: writeDataRow vs parseDataRow”The cleanest way to internalize the protocol is to read an encoder and its matching decoder side by side. The canonical pair is DataRow (D).
The gateway sending rows back to psql:
func (c *Conn) writeDataRow(row *sqltypes.Row) error { bodyLen := 2 // column count (Int16) for _, value := range row.Values { bodyLen += 4 // each column's length prefix (Int32) if value != nil { bodyLen += len(value) } }
buf, pos := c.startPacket(protocol.MsgDataRow, bodyLen) pos = writeInt16At(buf, pos, int16(len(row.Values))) for _, value := range row.Values { if value == nil { pos = writeInt32At(buf, pos, -1) // NULL is signalled by length -1 } else { pos = writeInt32At(buf, pos, int32(len(value))) pos = writeBytesAt(buf, pos, value) } } return c.writePacket(buf, pos)}The pooler reading rows from a backend PostgreSQL:
func (c *Conn) parseDataRow(body []byte) (*sqltypes.Row, error) { reader := NewMessageReader(body) columnCount, err := reader.ReadInt16() // ... row := &sqltypes.Row{Values: make([]sqltypes.Value, columnCount)} for i := range columnCount { value, err := reader.ReadByteString() // returns nil for length -1 if err != nil { return nil, fmt.Errorf("failed to read column value: %w", err) } // nil for NULL, []byte{} for empty string - preserved correctly row.Values[i] = value } return row, nil}Notice the symmetry: the encoder writes -1 for a nil value; the decoder’s ReadByteString reads -1 back as nil. The RowDescription (T) pair works the same way — writeRowDescription writes name\0 | tableOID(4) | colNum(2) | typeOID(4) | typeSize(2) | typeMod(4) | format(2) per field, and parseRowDescription reads those exact fields back in order.
The in-place encoders and the MessageReader
Section titled “The in-place encoders and the MessageReader”The writeXxxAt helpers all take (buf, pos), write at buf[pos:], and return the new pos. They never allocate; the buffer is sized once by startPacket. One subtle correctness rule:
// writeStringAt writes s followed by a single null terminator.// Caller is responsible for ensuring s contains no embedded NULs.func writeStringAt(buf []byte, pos int, s string) int { n := copy(buf[pos:], s) buf[pos+n] = 0 return pos + n + 1}On the read side, MessageReader is a tiny cursor over a []byte. It is returned by value but its methods take pointer receivers to mutate pos:
func NewMessageReader(buf []byte) MessageReader { return MessageReader{buf: buf, pos: 0} }3. Simple vs extended query, and the dispatch loop
Section titled “3. Simple vs extended query, and the dispatch loop”PostgreSQL has two query protocols. The simple one is a single message; the extended one is a typed pipeline used by every modern driver.
Simple query — one Q message carrying a SQL string. The server replies (per statement) with RowDescription → DataRow* → CommandComplete, then a single ReadyForQuery at the end.
sequenceDiagram participant Client as psql participant GW as gateway Client->>GW: 'Q' "SELECT id FROM t" GW-->>Client: 'T' RowDescription GW-->>Client: 'D' DataRow (repeated) GW-->>Client: 'C' CommandComplete GW-->>Client: 'Z' ReadyForQuery
Extended query — a pipeline of typed messages (pgx, JDBC, libpq prepared statements):
sequenceDiagram participant Client participant GW as gateway Client->>GW: 'P' Parse GW-->>Client: '1' ParseComplete Client->>GW: 'B' Bind GW-->>Client: '2' BindComplete Client->>GW: 'D' Describe GW-->>Client: 't' ParameterDescription / 'T' RowDescription (or 'n' NoData) Client->>GW: 'E' Execute GW-->>Client: 'D' DataRow* / 's' PortalSuspended Client->>GW: 'S' Sync GW-->>Client: 'Z' ReadyForQuery (ONLY here)
Both protocols funnel through one loop. serve() reads the type byte, opens a write-buffering window, dispatches, and flushes only at Sync/Query boundaries:
switch msgType {case protocol.MsgSync, protocol.MsgQuery: if err := c.endWriterBuffering(); err != nil { // flush bufio to socket // ... }}handleMessage is the big switch on the type byte that routes each message to a handleXxx method, which in turn decodes the body and calls the Handler.
4. The Handler interface: the seam between framing and execution
Section titled “4. The Handler interface: the seam between framing and execution”The protocol layer knows nothing about gRPC, routing, or query planning. It decodes a message and calls a method on a Handler. This is the classic interface-as-seam idiom: pgprotocol/server depends on the interface; the gateway provides the implementation that routes to the pooler over gRPC.
type Handler interface { HandleQuery(ctx context.Context, conn *Conn, query string, callback func(ctx context.Context, result *sqltypes.Result) error) error HandleParse(ctx context.Context, conn *Conn, name, queryStr string, paramTypes []uint32) error HandleBind(ctx context.Context, conn *Conn, portalName, stmtName string, params [][]byte, paramFormats, resultFormats []int16) error HandleExecute(ctx context.Context, conn *Conn, portalName string, maxRows int32, includeDescribe bool, callback func(ctx context.Context, result *sqltypes.Result) error) error HandleDescribe(ctx context.Context, conn *Conn, typ byte, name string) (*query.StatementDescription, error) HandleClose(ctx context.Context, conn *Conn, typ byte, name string) error HandleSync(ctx context.Context, conn *Conn) error ConnectionClosed(conn *Conn) GetPreparedStatementInfo(connID uint32, name string) *preparedstatement.PreparedStatementInfo}Two design notes worth absorbing:
- The streaming callback contract.
HandleQuery/HandleExecutepush results in chunks. An emptyresult.CommandTagmeans “more chunks of this result set are coming”; settingCommandTag(e.g."SELECT 42") signals the last chunk and triggersCommandComplete. This lets a huge result set flow to the client without buffering it all in memory. includeDescribefolding. When aDescribe('P')is immediately followed byExecuteon the same portal, the protocol layer folds them into one backend call. The handler fetches theRowDescriptionalongside the rows so the first chunk can carryFields.
CancelHandler is a separate interface, because only the gateway needs cross-gateway cancel routing — a nice example of keeping interfaces small and composing capability via optional interface assertion (ConnectionEstablishedHandler is another).
5. sqltypes: representing SQL values as byte slices
Section titled “5. sqltypes: representing SQL values as byte slices”The internal value model deliberately avoids Go scalars. A column value is just bytes:
// Value represents a nullable column value.// nil means NULL, []byte{} means empty string.type Value []byte
func (v Value) IsNull() bool { return v == nil }Why bytes and not int64/string/bool? Because PostgreSQL sends most values in text format (format code 0): integers arrive as ASCII digits, booleans as t/f, arrays as {a,b}. sqltypes keeps the raw bytes and interprets them only when needed:
Value.IsTruemirrors PostgreSQL’sboolin()spellings (t/true/y/yes/on/1, case-insensitive).Value.SQLLiteralrenders a value as a quoted, escaped SQL literal (doubling embedded'), or the keywordNULL.ParseTextArrayparses{foo,bar}literals into[]string, honoring quoting/backslash escapes and rejecting NULL elements and multi-dimensional arrays.
func ParseTextArray(s string) ([]string, error) { if len(s) < 2 || s[0] != '{' || s[len(s)-1] != '}' { return nil, fmt.Errorf("not a PostgreSQL array literal: %q", s) } // ...}Type identity does not travel in the value bytes — it travels as OIDs in query.Field.DataTypeOid. The client decoder resolves the OID to a human-readable name via the parser’s OID table:
field.DataTypeOid = dataTypeOIDfield.Type = ast.Oid(dataTypeOID).String()ast.Oid and the OID constants (BOOLOID, TEXTOID, INT4OID, …) live in go/common/parser/ast/oids.go — see the parser chapter. Reuse them; don’t redefine uint32 = 25 for TEXT.
The NULL-vs-empty invariant, end to end
Section titled “The NULL-vs-empty invariant, end to end”This is the single most important correctness property in the whole subsystem: a nil []byte is SQL NULL; []byte{} (length 0, non-nil) is the empty string. They are different values. A newcomer who writes `if len(v) == 0 { /* no value */ }` conflates them.
The trouble is the gRPC hop. Protobuf cannot distinguish a nil repeated field from an empty one — both serialize as “absent”. So sqltypes carries the distinction explicitly through a -1/0/>0 length encoding:
// Row.ToProtofor i, v := range r.Values { if v == nil { lengths[i] = -1 // NULL } else { lengths[i] = int64(len(v)) // 0 = empty string, >0 = real length totalLen += len(v) }}// RowFromProtoswitch length {case -1: values[i] = nil // NULLcase 0: values[i] = []byte{} // empty string, not NULLdefault: values[i] = pr.Values[offset : offset+int(length)] offset += int(length)}The proto field itself documents the scheme (proto/query.proto): // -1 means NULL, 0 means empty string, >0 means actual length. The same -1/0/>0 scheme protects bind parameters in ParamsToProto/ParamsFromProto. And at the Result level, HasFields exists for the same reason: to tell “no result set” apart from “zero-column result set” across protobuf.
Trace one NULL all the way: wire -1 → parseDataRow ReadByteString returns nil → Row.ToProto writes lengths[i] = -1 → gRPC → RowFromProto restores nil → writeDataRow writes -1 to the client. Collapse nil and []byte{} at any of those hops and a NULL silently becomes an empty string (or vice versa) in the client’s result set.
6. Prepared statements, portals, and consolidation
Section titled “6. Prepared statements, portals, and consolidation”The extended protocol’s Parse/Bind state is modeled by two types that embed their proto carriers plus a parsed AST (embedding is the composition idiom):
type PortalInfo struct { *querypb.Portal *PreparedStatementInfo}
type PreparedStatementInfo struct { *querypb.PreparedStatement astStruct ast.Stmt}NewPreparedStatementInfo parses the SQL up front and maps syntax errors to SQLSTATE 42601:
asts, err := parser.ParseSQL(ps.Query)if err != nil { // ParseSQL only does syntactic parsing, so any error here is a syntax error. return nil, mterrors.NewParseError(err.Error()) // 42601}The Consolidator: one backend statement for many client names
Section titled “The Consolidator: one backend statement for many client names”Two different client connections often prepare the same SQL with the same parameter types under different statement names. The Consolidator dedups them so a single backend prepared statement is reused, reference-counted across connections:
type Consolidator struct { mu sync.Mutex stmts map[string]*PreparedStatementInfo // (query, paramTypes) -> canonical stmt incoming map[uint32]map[string]*PreparedStatementInfo // connID -> name -> stmt usageCount map[*PreparedStatementInfo]int // refcount lastUsedID int}AddPreparedStatement computes dedupKey(queryStr, paramTypes); if a canonical statement already exists it bumps usageCount and reuses it, otherwise it mints stmt0, stmt1, … This is a textbook map-of-maps + mutex pattern — see sync & memory model. The mu sync.Mutex guards every field; all access goes through Lock() / defer Unlock().
Decoding a bind parameter: bytes + OID + format
Section titled “Decoding a bind parameter: bytes + OID + format”When the planner needs the concrete value of a $N parameter, lookupBind recombines three pieces: the raw bytes (from the portal’s packed param array), the declared OID, and the per-parameter wire format:
func lookupBind(portalInfo *PortalInfo, pr *ast.ParamRef, callSite string) (raw []byte, oid ast.Oid, format int32, err error) { // ... params := sqltypes.ParamsFromProto(portalInfo.Portal.ParamLengths, portalInfo.Portal.ParamValues) slot := pr.Number - 1 // ... bounds check, NULL check ... format = paramFormatFor(portalInfo.Portal.ParamFormats, slot) oid = paramOidFor(portalInfo.PreparedStatementInfo.PreparedStatement.GetParamTypes(), slot) return raw, oid, format, nil}paramFormatFor encodes the Bind message’s rule precisely: an empty format list means “all text”, a single element “applies to all”, otherwise one code per parameter. DecodeBindAsBool then branches on format: binary is a single byte (raw[0] != 0), text mirrors boolin spellings. Change the client to send a bool in binary and you take the format == 1 branch — the raw bytes and the decode path both change.
7. Errors are wire-shaped: ErrorResponse and NoticeResponse
Section titled “7. Errors are wire-shaped: ErrorResponse and NoticeResponse”ErrorResponse (E) and NoticeResponse (N) have an identical field-coded body; only the type byte differs. So multigres models both with one type, mterrors.PgDiagnostic, which captures all 14 PostgreSQL diagnostic fields (Severity, Code/SQLSTATE, Message, Detail, Hint, Position, …).
The encode path uses errors.As to recover the structured diagnostic from a wrapped Go error:
func (c *Conn) writeError(err error) error { if err == nil { return nil } rootErr := mterrors.RootCause(err) var diag *mterrors.PgDiagnostic if errors.As(rootErr, &diag) { return c.writePgDiagnosticResponse(protocol.MsgErrorResponse, diag) } // Generic Go error: synthesize an INTERNAL diagnostic. return c.writePgDiagnosticResponse(protocol.MsgErrorResponse, mterrors.NewPgError("ERROR", mterrors.PgSSInternalError, err.Error(), ""))}For SQLSTATE codes, NewParseError (42601), and the full error→wire flow, see mterrors & observability.
8. Memory: bufpool and zero-alloc packet writes
Section titled “8. Memory: bufpool and zero-alloc packet writes”The query path is latency-sensitive, so the packet machinery is built to avoid per-message allocation.
bufpool — a bucketed sync.Pool of *[]byte
Section titled “bufpool — a bucketed sync.Pool of *[]byte”func (p *Pool) Get(size int) *[]byte { sp := p.findPool(size) if sp == nil { // size > maxSize: no pooling, allocate directly buf := make([]byte, size) return &buf } buf := sp.pool.Get().(*[]byte) *buf = (*buf)[:size] return buf}The fast path: write straight into bufio’s buffer
Section titled “The fast path: write straight into bufio’s buffer”startPacket reserves space and writePacket commits it. The fast path writes the body directly into the buffered writer’s internal slice via AvailableBuffer(), so the commit is a self-copy that just advances the cursor — no separate buffer, no pool round-trip, no allocation:
func (c *Conn) startPacket(msgType byte, bodyLen int) ([]byte, int) { totalLen := 5 + bodyLen c.bufMu.Lock() if c.bufferedWriter != nil { avail := c.bufferedWriter.AvailableBuffer() if cap(avail) >= totalLen { buf := avail[:totalLen] buf[0] = msgType binary.BigEndian.PutUint32(buf[1:5], uint32(4+bodyLen)) return buf, 5 // pos points at first body byte } } // fallback: borrow from listener.bufPool, or make() if no listener // ...}9. Conformance: the pgproto gate, and pgx as a test client
Section titled “9. Conformance: the pgproto gate, and pgx as a test client”How do you prove your wire implementation actually behaves like PostgreSQL? Differential testing.
The make pgproto target runs pgpool-II’s data-file-driven pgproto tool against two targets — a real PostgreSQL build (the oracle) and the gateway backed by a 2-pooler cluster — and diffs the response traces:
pgproto: build RUN_EXTENDED_QUERY_SERVING_TESTS=1 PGPROTO_PATCH_MODE=verify \ go test -v -timeout 30m -run TestPgProtoConformance ./go/test/endtoend/queryserving/pgproto/...Known, accepted divergences are recorded as patches under the test’s testdata/patches/*.patch. In verify mode this is a CI gate; a separate generate mode regenerates them. The .pgproto data files are line-oriented directives mapping directly to wire messages — Q Simple Query, P Parse, B Bind, D Describe, E Execute, S Sync, H Flush, plus reader directives Y/y.
Separately, the gateway is also exercised as a server by real Go drivers: tests drive it with jackc/pgx. pgx speaks pg-wire to the gateway — confirming the server role end to end. (Multigres wrote its own SCRAM-SHA-256 server because pgx is client-only and offers no server-side SCRAM.)
Checkpoints
Section titled “Checkpoints”For a Q message carrying the 9-byte payload “SELECT 1\0”, what Int32 length goes on the wire, and what does ReadMessageLength return?
The length field is 4 + 9 = 13 (it includes its own 4 bytes but not the Q type byte). ReadMessageLength returns the body length 13 - 4 = 9.A column value comes back as []byte (non-nil, length 0). Is it NULL? Trace what Row.ToProto writes for it.
It is not NULL — it is the empty string. Row.ToProto takes the else branch and writes lengths[i] = 0. Only nil produces -1 (NULL). On the way back, RowFromProto restores 0 as []byte{} and -1 as nil.Why does writePacket panic instead of returning an error when pos != len(buf)?
A size mismatch means the bodyLen pre-pass disagreed with the encode loop, so the buffer is either too short (would have already panicked on an out-of-range write) or too long (trailing garbage). Putting a mis-framed packet on the wire would desync the client mid-session, so panicking surfaces the bug loudly in tests rather than corrupting a live connection.When is ReadyForQuery (Z) sent in the extended query protocol, and why not after each message?
Only after Sync (the serve loop flushes at the Sync/Query boundary). Sending it after Parse/Bind/Describe/Execute — especially after an error — desyncs strict drivers like pgx. After an error mid-batch, the connection drains messages until Sync via the discardingUntilSync flag.Exercises
Section titled “Exercises”-
Trace the simple-query reply path. Starting at
serve()→handleMessage()for an incomingQ, follow intohandler.HandleQuery, then back through the reply writerswriteRowDescription/writeDataRow/writeCommandComplete/writeReadyForQuery. Draw the frontend→backend message sequence and mark exactly where thebufiowriter is flushed (the Sync/Query boundary inserve()). -
Trace the extended-query pipeline. Open
go/common/pgprotocol/server/extended_query_test.go, pick one Parse+Bind+Describe+Execute+Sync scenario, and map each Go call to wire bytes:handleParse(decode stmt name, query, param-type OIDs),handleBind(decode formats + params viaReadByteString), then theParseComplete/BindComplete/ParameterDescription/RowDescriptionreplies. Confirm from the code whyReadyForQueryappears only afterSync. -
Prove the NULL-vs-empty invariant on paper. Follow a NULL column value from
writeDataRow(-1length) →parseDataRowReadByteString(returnsnil) →Row.ToProto(lengths[i] = -1) →RowFromProto(nil). List every spot where collapsingniland[]byte{}would corrupt a result, includingResult.HasFields. -
Run the gate. Run
make pgproto(verify mode), then inspect the recordedtestdata/patches/*.patchand the directive table in that dir’s README. Pick one patched divergence and explain in prose what the gateway does differently from real PostgreSQL and why recording a patch (rather than changing code) was acceptable. -
Follow a bind parameter. Run
grep -rn 'ParamsFromProto' go/to find callers. Forpreparedstatement.lookupBind, explain how the raw bytes, declared OID (paramOidFor), and wire format (paramFormatFor) recombine to decode a single$N. Then state what changes inDecodeBindAsBoolif the client sent that parameter in binary format.
Continue to consensus & failover — how multiorch elects a leader and reroutes traffic when a PostgreSQL primary fails.
See also
Section titled “See also”- Architecture & request flow — where pg-wire sits in client → gateway → pooler → PG.
- gRPC & protobuf — the
query.protoField/Row/PreparedStatement/Portalcarriers thatToProto/FromProtobridge. - Service anatomy — how the gateway’s
Handlerimplementation wirespgprotocol/serverto gRPC routing. - Parser, lexer, AST & codegen —
ast.Oid/oids.gofor type-OID names;parser.ParseSQLused byNewPreparedStatementInfo. - mterrors & observability —
PgDiagnostic, SQLSTATE,NewParseError(42601), the error→ErrorResponseflow. - Language: interfaces, pointers/memory, errors, sync & memory model.
- Reference: glossary (OID, portal, prepared statement, simple vs extended query, SQLSTATE), idioms & gotchas (nil-vs-empty slice, sync.Pool of pointers, length-prefix framing).
- Tooling: testing workflow (running the pgproto suite /
RUN_EXTENDED_QUERY_SERVING_TESTS).