Limits & scaling

This page summarizes practical limits and scaling considerations.

For hard limits (sizes and allowed characters), see Limits.

Key limits to keep in mind

Payload size: keep payloads small (the default maximum is 64 KB).
Channel and tenant length: use short, stable names.
Subscription count: prefer fewer broad subscriptions over many tiny ones.

Scaling patterns

Use channel roots and shard hot streams

If one channel becomes too hot, shard it into a fixed set of channels (64 in this example):

orders.shard.00.# … orders.shard.63.#

Each shard carries only part of the stream, which reduces fanout pressure per channel.
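
A minimal sketch of how a publisher might pick a shard, assuming messages carry a key such as an order id. The function names, the `suffix` argument, and the use of CRC32 are illustrative assumptions, not part of the service's API; any hash that is stable across processes would work.

```python
import zlib

SHARD_COUNT = 64  # fixed; changing it remaps keys to different shards


def shard_channel(root, key, suffix, shard_count=SHARD_COUNT):
    """Map a key (for example an order id) to one of the fixed shard channels."""
    # crc32 is stable across processes and runs, unlike Python's built-in hash().
    shard = zlib.crc32(key.encode("utf-8")) % shard_count
    return f"{root}.shard.{shard:02d}.{suffix}"


def shard_patterns(root, shard_count=SHARD_COUNT):
    """List the subscription patterns that together cover every shard."""
    return [f"{root}.shard.{i:02d}.#" for i in range(shard_count)]


# The same key always lands on the same shard.
print(shard_channel("orders", "order-1001", "created"))
print(shard_patterns("orders")[0], shard_patterns("orders")[-1])
```

Keeping the shard count fixed matters: if it changes, existing keys map to different shards and subscribers may miss messages during the transition.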

Keep fanout predictable

High fanout (1→N) is normal, but unbounded fanout, where N can grow without a cap, can become an incident.

Recommendations:

apply per-tenant quotas if possible,
keep an eye on “top channels by listeners”,
separate public and internal traffic roots (example: pub.# vs sys.#).
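
As an illustration of watching "top channels by listeners", the sketch below ranks channels from a listener-count snapshot and flags those over a fanout budget. The channel names, the counts, and the budget value are hypothetical; how you obtain listener counts depends on your monitoring setup.

```python
from collections import Counter


def top_channels_by_listeners(listener_counts, n=3):
    """Return the n channels with the most listeners, largest first."""
    return Counter(listener_counts).most_common(n)


def over_budget(listener_counts, max_listeners):
    """Return channels whose listener count exceeds the budget, sorted by name."""
    return sorted(ch for ch, count in listener_counts.items() if count > max_listeners)


# Hypothetical snapshot of listener counts per channel.
counts = {
    "pub.news.global": 48_000,
    "pub.scores.live": 12_500,
    "sys.heartbeat": 300,
    "pub.chat.room42": 75,
}

print(top_channels_by_listeners(counts, n=2))
print(over_budget(counts, max_listeners=10_000))
```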

Design for drops

Because delivery is best-effort and online-only, design payloads and consumers to tolerate loss:

send periodic “state snapshots” if intermediate updates can be missed,
include sequence numbers when ordering matters,
make consumers idempotent (deduplicate by message id if needed).

See Delivery semantics.
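
The three recommendations above can be combined in one consumer. This is a sketch under assumptions: the message shape (`id`, `seq`, `kind`, `data`) is hypothetical and not a documented wire format, and sequence numbers are assumed to increase by one per message on a channel.

```python
from collections import OrderedDict


class LossTolerantConsumer:
    """Applies updates idempotently and flags gaps until the next snapshot."""

    def __init__(self, max_remembered_ids=10_000):
        self.state = {}
        self.last_seq = None
        self.needs_snapshot = False
        self._seen_ids = OrderedDict()  # bounded memory of recent message ids
        self._max_ids = max_remembered_ids

    def _is_duplicate(self, msg_id):
        if msg_id in self._seen_ids:
            return True
        self._seen_ids[msg_id] = True
        if len(self._seen_ids) > self._max_ids:
            self._seen_ids.popitem(last=False)  # evict the oldest id
        return False

    def handle(self, msg):
        """Return True if the message changed state, False if it was skipped."""
        if self._is_duplicate(msg["id"]):
            return False
        seq = msg["seq"]
        if self.last_seq is not None and seq <= self.last_seq:
            return False  # stale or reordered message
        if msg["kind"] == "snapshot":
            # A snapshot replaces state wholesale, so earlier gaps stop mattering.
            self.state = dict(msg["data"])
            self.needs_snapshot = False
        else:
            if self.last_seq is not None and seq > self.last_seq + 1:
                self.needs_snapshot = True  # at least one update was missed
            self.state.update(msg["data"])
        self.last_seq = seq
        return True
```

While `needs_snapshot` is set, the consumer's state may be incomplete; the next periodic snapshot clears the flag and restores a consistent view.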

Capacity signals

Typical early warning signs:

rising disconnect rate (especially “slow consumer” patterns)
rising drops
a publish rate that grows without a matching rise in deliver rate
increasing p95/p99 latency

If you track these, you can scale before users notice.
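
One way to turn these signals into a check is to compare two consecutive sampling windows. The sketch below assumes you already collect the rates yourself; the `WindowStats` fields and the 1.2 growth factor are illustrative choices, not values defined by the service.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Per-second rates and latency observed over one sampling window."""
    publish_rate: float
    deliver_rate: float
    drop_rate: float
    disconnect_rate: float
    p99_latency_ms: float


def rose(prev, curr, factor=1.2):
    """True if curr exceeds prev by more than the given factor."""
    return curr > prev * factor


def warning_signs(prev, curr, factor=1.2):
    """Compare two windows and list the early warning signs that fired."""
    signs = []
    if rose(prev.disconnect_rate, curr.disconnect_rate, factor):
        signs.append("disconnect rate rising")
    if rose(prev.drop_rate, curr.drop_rate, factor):
        signs.append("drops rising")
    publish_up = rose(prev.publish_rate, curr.publish_rate, factor)
    deliver_up = rose(prev.deliver_rate, curr.deliver_rate, factor)
    if publish_up and not deliver_up:
        signs.append("publish rate outpacing deliver rate")
    if rose(prev.p99_latency_ms, curr.p99_latency_ms, factor):
        signs.append("p99 latency increasing")
    return signs


# Hypothetical numbers for two consecutive windows.
previous = WindowStats(1000, 5000, 1, 2, 40)
current = WindowStats(1500, 5100, 1, 5, 90)
print(warning_signs(previous, current))
```

A relative threshold keeps the check usable across tenants of very different sizes; absolute limits can be layered on top where hard budgets exist.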
