Pipelines, Webhooks & Observability
The chapters so far covered what Auther is — its authentication core, its permission engine, its machine-access story. This one covers how you extend and operate it: running your own logic at auth events without redeploying the server, notifying other systems when something happens, and seeing what the running system is actually doing. Four subsystems work together here — pipelines, webhooks, metrics, and the admin UI that ties them together.
The pipeline system
Pipelines let you run your own code at auth events. Most identity providers force a choice: either accept the built-in behavior, or fork the server to change it. Auther’s pipeline system is the escape hatch in between. Administrators attach custom Lua scripts to 16 auth lifecycle hooks, and those scripts run sandboxed at the moment an event fires, so you can geo-block a signup, enrich a token with custom claims, or fan out a side effect, all without touching the codebase.
Scripts are organized as a DAG — a directed acyclic graph, where each node is a script and the edges say “this one runs after that one,” with no cycles allowed — and executed under strict safety limits.
Lifecycle hooks
The hooks are defined in src/lib/pipelines/definitions.ts and fall into three
families. Each hook has a type that determines what a script attached to it can
do: a blocking hook can abort the operation, an async hook runs
side effects after the fact, and an enrichment hook contributes data back into
the flow.
The authentication lifecycle is where most policy lives:
| Hook | Type | Purpose |
|---|---|---|
| before_signup | Blocking | Can abort user registration (e.g. geo-blocking, email domain filtering) |
| after_signup | Async | Post-registration side effects (e.g. send a welcome Slack message) |
| before_signin | Blocking | Can abort sign-in (e.g. ban check, suspicious IP detection) |
| after_signin | Async | Post-login side effects (e.g. audit logging) |
| before_signout | Blocking | Can abort sign-out (rarely used) |
| token_build | Enrichment | Inject custom claims into JWTs |
The API-key lifecycle mirrors the create/exchange/revoke flow from the API Keys & Machine Access chapter:
| Hook | Type | Purpose |
|---|---|---|
| apikey_before_create | Blocking | Can abort API key creation |
| apikey_after_create | Async | Post-creation side effects |
| apikey_before_exchange | Blocking | Can abort the API key to JWT exchange |
| apikey_after_exchange | Async | Post-exchange side effects |
| apikey_before_revoke | Blocking | Can abort API key revocation |
And the OAuth client lifecycle covers registration and authorization:
| Hook | Type | Purpose |
|---|---|---|
| client_before_register | Blocking | Can abort client registration |
| client_after_register | Async | Post-registration side effects |
| client_before_authorize | Blocking | Can abort authorization |
| client_after_authorize | Async | Post-authorization side effects |
| client_access_change | Async | Access level changes |
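The three families and three hook types can be pictured as one lookup table. The sketch below is illustrative only: the `HookType` union, the `HOOKS` constant, and both helper functions are hypothetical names, not the contents of src/lib/pipelines/definitions.ts. Only the hook names and their types are taken from the tables above.

```typescript
// Hypothetical shape; the real definitions live in src/lib/pipelines/definitions.ts.
type HookType = "blocking" | "async" | "enrichment";

const HOOKS: Record<string, HookType> = {
  // Authentication lifecycle
  before_signup: "blocking",
  after_signup: "async",
  before_signin: "blocking",
  after_signin: "async",
  before_signout: "blocking",
  token_build: "enrichment",
  // API-key lifecycle
  apikey_before_create: "blocking",
  apikey_after_create: "async",
  apikey_before_exchange: "blocking",
  apikey_after_exchange: "async",
  apikey_before_revoke: "blocking",
  // OAuth client lifecycle
  client_before_register: "blocking",
  client_after_register: "async",
  client_before_authorize: "blocking",
  client_after_authorize: "async",
  client_access_change: "async",
};

// Only a blocking hook can abort the operation it guards.
function canAbort(hook: string): boolean {
  return HOOKS[hook] === "blocking";
}

// List every hook name of a given type.
function hooksOfType(type: HookType): string[] {
  return Object.keys(HOOKS).filter((name) => HOOKS[name] === type);
}
```

Counting the entries gives the 16 hooks mentioned earlier: six for authentication, five for API keys, and five for OAuth clients.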
Execution model
The PipelineEngine (src/lib/auth/pipeline-engine.ts) treats a pipeline as a
layered DAG. The key insight is that scripts in the same layer have no dependency on
each other, so they can run together, while later layers depend on the results of
earlier ones — so the engine walks the graph one layer at a time.
```mermaid
flowchart TB
  Start["Hook fires<br/>(e.g. before_signup)"] --> L1
  subgraph L1["Layer 1 (parallel via Promise.all)"]
    A["script A"]
    B["script B"]
  end
  L1 -->|"outputs become next layer's prev"| L2
  subgraph L2["Layer 2 (parallel)"]
    C["script C"]
    D["script D"]
  end
  L2 --> L3["Layer 3 …"]
  L3 --> Done["Aggregate result"]
  Done --> Decision{"any script<br/>returned allowed: false?"}
  Decision -->|"yes (blocking hook)"| Block["Abort the auth flow"]
  Decision -->|"no / error / timeout"| Pass["Continue (fail-open)"]
```
Four properties define the model:
- DAG execution. Scripts are organized into layers. Layers execute
  sequentially; scripts within a layer execute in parallel via Promise.all.
- Context propagation. Each layer’s output becomes the next layer’s prev
  input. An outputs map lets a script reach back and read the result of a
  specific named node.
- Safety limits. A pipeline is capped at MAX_CHAIN_DEPTH = 10 layers and
  MAX_PARALLEL_NODES = 5 scripts per layer.
- Fail-open. Script errors and timeouts do not block the auth flow. Only an
  explicit allowed: false return from a blocking hook stops the operation.
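The four properties above can be sketched as a small layered executor. This is a minimal illustration, not the actual PipelineEngine: the `ScriptNode`, `ScriptContext`, and `runPipeline` names are hypothetical, scripts are modeled as plain async functions instead of sandboxed Lua, timeouts are left out, and rejecting an oversized pipeline by throwing is an assumption about how the caps are enforced.

```typescript
// Hypothetical types for illustration; not the real PipelineEngine API.
type ScriptResult = { allowed?: boolean; data?: Record<string, unknown> };
type ScriptContext = {
  prev: Record<string, ScriptResult>; // results of the previous layer
  outputs: Record<string, ScriptResult>; // results of every node run so far
};
type ScriptNode = { id: string; run: (ctx: ScriptContext) => Promise<ScriptResult> };
type PipelineResult = { allowed: boolean; outputs: Record<string, ScriptResult> };

const MAX_CHAIN_DEPTH = 10;
const MAX_PARALLEL_NODES = 5;

async function runPipeline(layers: ScriptNode[][]): Promise<PipelineResult> {
  // Safety limits (assumed here to be enforced before anything runs).
  if (layers.length > MAX_CHAIN_DEPTH) throw new Error("too many layers");
  if (layers.some((layer) => layer.length > MAX_PARALLEL_NODES)) {
    throw new Error("too many scripts in one layer");
  }

  const outputs: Record<string, ScriptResult> = {};
  let prev: Record<string, ScriptResult> = {};
  let allowed = true;

  // Layers run one after another; scripts inside a layer run in parallel.
  for (const layer of layers) {
    const prevLayer = prev;
    const snapshot = { ...outputs };
    const results = await Promise.all(
      layer.map(async (node): Promise<[string, ScriptResult]> => {
        try {
          return [node.id, await node.run({ prev: prevLayer, outputs: snapshot })];
        } catch {
          return [node.id, {}]; // fail-open: a crash is not a denial
        }
      }),
    );

    // Context propagation: this layer's results become the next layer's prev.
    const next: Record<string, ScriptResult> = {};
    for (const [id, result] of results) {
      next[id] = result;
      outputs[id] = result;
      if (result.allowed === false) allowed = false; // only an explicit denial counts
    }
    prev = next;
  }
  return { allowed, outputs };
}
```

In this sketch the caller decides what to do with `allowed`; only a blocking hook would abort the operation when it is `false`.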
The script sandbox
Because pipeline scripts are user-authored and run on the auth server’s hot path,
the sandbox is doing real security work, not bookkeeping. Each Lua script runs
inside wasmoon (Lua-in-WASM) with the following protections:
| Protection | Detail |
|---|---|
| Disabled globals | os, io, package, require, loadfile, dofile |
| Instruction limit | 50,000 operations (enforced via debug.sethook) |
| Execution timeout | 10 seconds |
| SSRF protection | helpers.fetch() blocks private IPs, enforces HTTPS, applies a 3s timeout and a 1MB response limit |
| Secret encryption | Pipeline secrets are encrypted at rest with AES-256-GCM |
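To give a feel for the SSRF row, here is a rough sketch of the kind of pre-flight check an HTTPS-only fetch helper could run before making a request. The function names are hypothetical and this is not the actual helpers.fetch() implementation. It only inspects literal IPv4 hosts and two loopback names; a production check would also need to cover DNS resolution and IPv6 ranges, and the 3s timeout and 1MB response limit are not shown.

```typescript
// Hypothetical pre-flight check; the real helpers.fetch() may differ.
function isPrivateIPv4(host: string): boolean {
  const parts = host.split(".");
  if (parts.length !== 4 || parts.some((p) => !/^\d{1,3}$/.test(p))) return false;
  const a = Number(parts[0]);
  const b = Number(parts[1]);
  return (
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 169 && b === 254) // link-local
  );
}

function isFetchAllowed(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable URLs are rejected
  }
  if (url.protocol !== "https:") return false; // HTTPS only
  const host = url.hostname;
  if (host === "localhost" || host === "[::1]") return false;
  return !isPrivateIPv4(host);
}
```

For example, `isFetchAllowed("https://api.example.com/x")` passes, while `http://` URLs and `https://192.168.1.1/` are rejected.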
Rather than exposing raw Lua libraries, the sandbox hands scripts a curated
helpers table:
```lua
helpers.log(message)              -- Log to trace
helpers.now()                     -- Current Unix timestamp
helpers.hash(value)               -- SHA-256 hash
helpers.env(key)                  -- Read env var (restricted)
helpers.secret(key)               -- Read encrypted secret
helpers.fetch(url, options)       -- HTTPS-only HTTP client
helpers.matches(value, pattern)   -- Pattern matching
helpers.trace(name, fn)           -- Create nested span
helpers.metrics.count(name, val)  -- Emit counter metric
helpers.metrics.gauge(name, val)  -- Emit gauge metric
```
To make this concrete, here are two sketches that use the model end to end. The
first is an enrichment hook on token_build that adds a custom claim to every
JWT, the kind of thing you’d otherwise have to fork the server for:
```lua
-- token_build (Enrichment): tag every token with a plan tier
local tier = helpers.fetch("https://billing.internal.example/tier").body
helpers.metrics.count("token.enriched", 1)
return { claims = { tier = tier } }
```
The second is a blocking hook on before_signin that aborts a login. Returning
allowed: false is the only thing that stops the flow; anything else (including a
crash or timeout) lets sign-in proceed, which is the fail-open behavior described
above:
```lua
-- before_signin (Blocking): block sign-in when an external check says so
local banned = helpers.fetch("https://risk.internal.example/check").body
if helpers.matches(banned, "true") then
  helpers.metrics.count("auth.signin.blocked", 1)
  return { allowed = false, reason = "blocked by risk check" }
end
return { allowed = true }
```
Pipeline observability
Every pipeline run produces OpenTelemetry-compatible traces (the pipeline_traces
table) and spans (the pipeline_spans table). The helpers.trace(name, fn) helper
above is what lets a script carve its work into nested spans. The admin UI renders
these as a waterfall trace viewer, so you can see exactly which layer and which
script spent the time on any given execution.
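Nested spans are easiest to understand as a stack: whichever span is open when a new one starts becomes its parent. The sketch below shows that idea in isolation. The `Span` shape and `createTracer` function are hypothetical; the actual columns of pipeline_spans and the internals of helpers.trace are not described here.

```typescript
// Hypothetical span record; the real pipeline_spans columns may differ.
type Span = { id: number; parentId: number | null; name: string; durationMs: number };

function createTracer(now: () => number = Date.now) {
  const spans: Span[] = [];
  const stack: number[] = []; // ids of the spans that are currently open
  let nextId = 1;

  // Runs fn inside a new span whose parent is the innermost open span.
  function trace<T>(name: string, fn: () => T): T {
    const id = nextId++;
    const parentId = stack[stack.length - 1] ?? null;
    const start = now();
    stack.push(id);
    try {
      return fn();
    } finally {
      stack.pop();
      spans.push({ id, parentId, name, durationMs: now() - start });
    }
  }

  return { trace, spans };
}
```

A waterfall view can then be drawn from `spans` by following each `parentId` link. Inner spans are recorded before the spans that contain them, because a span is only written when it closes.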
The webhook system
Webhooks tell other apps what happened. Pipelines run logic inside Auther; webhooks are the outbound counterpart. They let external applications receive real-time notifications about auth events, so a downstream system can react when a user is created, a session is deleted, or an OAuth client is registered.
A concrete case: your billing service subscribes to user.created, and the moment
someone signs up Auther fires it an HTTP POST so the service can provision an account
— no polling, no shared database.
Architecture
Four database tables model the webhook lifecycle, and the split matters because it keeps four concerns apart: configuration lives in the endpoint, intent in the subscription, the record of what happened in the event, and each attempt to deliver it in the delivery.
| Table | Role |
|---|---|
| webhook_endpoint | A destination URL with its configuration: retry policy, delivery format, and encrypted signing secret. |
| webhook_subscription | Which event types each endpoint wants to receive (e.g. user.created, session.deleted). |
| webhook_event | A record that something happened, with a JSON payload snapshot. |
| webhook_delivery | A specific attempt to send one event to one endpoint, with status tracking. |
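One way to see how the four tables relate is to model them as plain records and derive deliveries from the other three. Everything in this sketch is an in-memory stand-in: the field names, the `active` flag, the status values other than pending, and the `fanOut` function are assumptions made for illustration, not the real schema.

```typescript
// Hypothetical in-memory stand-ins for the four tables; field names are illustrative.
type Endpoint = { id: string; url: string; active: boolean; retryPolicy: "none" | "standard" };
type Subscription = { endpointId: string; eventType: string };
type WebhookEvent = { id: string; type: string; payload: string }; // payload: JSON snapshot
type Delivery = { eventId: string; endpointId: string; status: "pending" | "success" | "failed" };

// One event fans out into one pending delivery per active, subscribed endpoint.
function fanOut(
  event: WebhookEvent,
  endpoints: Endpoint[],
  subscriptions: Subscription[],
): Delivery[] {
  const subscribed = new Set(
    subscriptions.filter((s) => s.eventType === event.type).map((s) => s.endpointId),
  );
  return endpoints
    .filter((e) => e.active && subscribed.has(e.id))
    .map((e): Delivery => ({ eventId: event.id, endpointId: e.id, status: "pending" }));
}
```

Keeping the event separate from its deliveries means a single user.created event can be retried or inspected per endpoint without duplicating the payload.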
Event types
The events an endpoint can subscribe to span the user, session, account, verification, client, and access lifecycles:
- user.created, user.updated, user.deleted, user.verified
- session.created, session.deleted
- account.linked, account.unlinked
- verification.sent, verification.completed
- client.created, client.updated, client.deleted
- access.granted, access.revoked
Delivery flow
Delivery is deliberately asynchronous: an auth event records what happened and hands the actual HTTP delivery to a queue, so a slow or unavailable recipient never slows down the auth flow that triggered it. The path from event to delivered payload runs in seven steps:
- An auth event occurs, for example a user signs up.
- emitWebhookEvent() in src/lib/webhooks/delivery-service.ts creates an event record.
- For each active endpoint subscribed to this event type, a pending delivery record is created.
- A WebhookDeliveryJob is enqueued to Upstash QStash, a hosted message queue that holds the job and retries delivery on Auther’s behalf.
- The QStash worker at /api/internal/queues/webhook-delivery receives the job, verifies the QStash signature, and calls deliverWebhook().
- The delivery service loads the event and endpoint, decrypts the signing secret, builds an HMAC-SHA256 signed payload, and makes the HTTP request.
- The delivery record is updated with the response status, body, and duration.
Retries are governed by the endpoint’s policy: none performs zero retries, while
standard allows three retries managed by QStash’s exponential backoff.
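The signing step in the delivery flow, and the matching check on the recipient side, can be sketched with Node's built-in crypto module. Treat this as an assumption-laden illustration: the `sign` and `verify` names are hypothetical, and the string being signed (timestamp, a dot, then the raw body) is a guess at a common convention, not something confirmed by src/lib/webhooks/signature.ts. Check that file for the actual format before relying on it.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMPTION: the signed string is `${timestamp}.${body}`. The real format is
// defined in src/lib/webhooks/signature.ts and may differ.
function sign(secret: string, timestamp: string, body: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${body}`).digest("hex");
}

// Recipient side: recompute the signature and compare in constant time.
function verify(secret: string, timestamp: string, body: string, signatureHex: string): boolean {
  const expected = Buffer.from(sign(secret, timestamp, body), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The constant-time comparison matters because a plain string comparison can return earlier for signatures that differ in their first bytes, which leaks information to an attacker probing the endpoint.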
Webhook signatures
A recipient needs to trust that a payload genuinely came from Auther and wasn’t
tampered with in transit. Payloads are therefore signed with HMAC-SHA256
(src/lib/webhooks/signature.ts), and recipients verify the signature using a
constant-time comparison to avoid timing attacks. Each request carries four headers:
```
x-webhook-id: <event_id>
x-webhook-signature: <hmac_sha256_hex>
x-webhook-timestamp: <unix_ms>
x-webhook-origin: better-auth
```
Metrics and observability
Metrics tell you what the running system is doing. You cannot operate what you
cannot see. The metrics system
(src/lib/services/metrics-service.ts) records counters, gauges, and histograms to
the project’s own SQLite database (the metrics table), so operational telemetry
lives alongside the data it describes rather than in a separate observability stack.
| Metric type | Method | Example |
|---|---|---|
| Counter | count(name, value, tags) | auth.login.attempt with { method: "email", status: "success" } |
| Gauge | gauge(name, value, tags) | jwks.active_key.age_ms |
| Histogram | histogram(name, value, tags) | authz.check.duration_ms with { result: "allowed" } |
| Measured | measure(name, fn, tags) | Wraps an async function and records its duration |
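The four methods in the table can be sketched as a small recorder. This is an in-memory illustration rather than the actual metrics service: the `MetricsRecorder` class and `MetricRow` shape are hypothetical, rows go into an array instead of the SQLite metrics table, and recording `measure` durations as histogram samples (including when the wrapped function throws) is an assumption.

```typescript
// Hypothetical in-memory recorder; the real service writes to the SQLite metrics table.
type Tags = Record<string, string>;
type MetricRow = {
  type: "counter" | "gauge" | "histogram";
  name: string;
  value: number;
  tags: Tags;
};

class MetricsRecorder {
  readonly rows: MetricRow[] = [];

  count(name: string, value = 1, tags: Tags = {}): void {
    this.rows.push({ type: "counter", name, value, tags });
  }

  gauge(name: string, value: number, tags: Tags = {}): void {
    this.rows.push({ type: "gauge", name, value, tags });
  }

  histogram(name: string, value: number, tags: Tags = {}): void {
    this.rows.push({ type: "histogram", name, value, tags });
  }

  // Runs fn and records its wall-clock duration in milliseconds as a
  // histogram sample, whether fn resolves or rejects (an assumption).
  async measure<T>(name: string, fn: () => Promise<T>, tags: Tags = {}): Promise<T> {
    const start = Date.now();
    try {
      return await fn();
    } finally {
      this.histogram(name, Date.now() - start, tags);
    }
  }
}
```

A call such as `metrics.count("auth.login.attempt", 1, { method: "email", status: "success" })` then produces one counter row, and wrapping a permission check in `measure("authz.check.duration_ms", ...)` produces one histogram row per call.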
The key metrics tracked across the system give a sense of what’s worth measuring in an IdP:
- auth.login.attempt: login attempts by method and status
- auth.register.success: successful registrations
- auth.session.created.count: new sessions by source
- oidc.{route}.request.count: OIDC endpoint requests
- oidc.{route}.latency_ms: OIDC endpoint latency
- authz.check.duration_ms: permission-check latency
- authz.decision.count: permission decisions by result and source
- authz.rebac.traversal_depth: ReBAC graph traversal depth
- jwks.rotate.triggered.count: key rotations
- apikey.resolve.duration_ms: API key permission resolution time
- webhook.delivery.duration_ms: webhook delivery latency
- lua.pool.active: active Lua engine count
- email.send.duration_ms: email-sending latency
The admin dashboard at /admin visualizes these metrics with Recharts.
The admin UI
Everything above (pipelines, webhooks, metrics) needs an operator-facing surface,
and that is the admin dashboard at /admin. It’s also the management console for the
identity data covered in earlier chapters. Each section maps to a subsystem you’ve
already met:
| Section | Path | Purpose |
|---|---|---|
| Dashboard | /admin | Metrics visualization, system overview |
| Users | /admin/users | User management (create, edit, ban, impersonate) |
| Groups | /admin/groups | User-group management with member and permission tabs |
| Clients | /admin/clients | OAuth client management (register, configure, access control) |
| Keys | /admin/keys | JWKS signing-key management and rotation |
| Sessions | /admin/sessions | Active session monitoring and revocation |
| Webhooks | /admin/webhooks | Webhook endpoint configuration and delivery logs |
| Pipelines | /admin/pipelines | Pipeline editor (DAG canvas), secret management, trace viewer |
| Access | /admin/access | Platform-level access-control management |
| Requests | /admin/requests | Permission-escalation request review |
| Settings | /admin/settings | Platform configuration |
| Profile | /admin/profile | Current user’s profile and session management |
Crucially, the admin UI is not a privileged bypass. Each section is protected by the
corresponding platform guard — /admin/users, for example, requires the
users:view permission — so the same ABAC engine described in
ABAC, Guards, Groups & Invites gates the console
itself.
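As a rough picture of what "protected by the corresponding platform guard" means, the sketch below maps a console path to the permission it requires. Only the /admin/users to users:view pairing comes from the text above. The map shape, the `canOpenSection` function, and the deny-by-default behavior for unlisted paths are all assumptions, and the real guards run through the ABAC engine rather than a simple set lookup.

```typescript
// Only the /admin/users -> users:view pairing is taken from the text;
// the map shape and function name are hypothetical.
const SECTION_PERMISSIONS: Record<string, string> = {
  "/admin/users": "users:view",
};

function canOpenSection(path: string, granted: ReadonlySet<string>): boolean {
  const required = SECTION_PERMISSIONS[path];
  // Deny by default (an assumption): a path with no known guard is not reachable.
  return required !== undefined && granted.has(required);
}
```

With this shape, an operator who holds users:view can open /admin/users, and one who does not is turned away even though they are already inside the console.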