What is Nats?
Nats
A communication fabric for distributed applications.
Nats lets any service talk to any other service—without knowing where it lives, how many instances are running, or whether it's even online yet. One protocol covers pub/sub, request/reply, queue-based load distribution, persistent streaming, key-value storage, and object storage. Layer on multi-region clustering, edge deployments via leaf nodes, and built-in security with accounts and decentralized auth—all from a single binary.
But to understand what that means—and everything Nats can do—let's start with something everyone knows.
The Http Server
A process listening on a port.
Everyone is familiar with the Http server. It's just a process running on a machine. It binds to a port (usually 80 or 443), accepts connections, and speaks a protocol: Http.
A client connects, sends a request, gets a response. That's the whole model. Under the hood, it's all built on Tcp—Transmission Control Protocol handles the reliable delivery so the application doesn't have to.
This is the mental model to hold onto: a server is just a process, listening on a port, speaking a protocol over Tcp.
Why this matters: Nats is exactly the same idea. A process, a port, a protocol over Tcp. The difference is what happens once you're connected.
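That whole model fits in a few lines of code. A minimal sketch in Python, using plain sockets and hand-written response bytes rather than a real Http implementation: a process binds a port, a client connects over Tcp, sends a request, and gets a response.

```python
import socket
import threading

def handle_one(srv: socket.socket) -> None:
    """Accept a single connection and answer with a minimal canned Http response."""
    conn, _ = srv.accept()
    conn.recv(1024)  # read the request (contents ignored in this sketch)
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    conn.close()

def demo() -> str:
    # A server is just a process bound to a port, speaking a protocol.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    srv.listen(1)
    port = srv.getsockname()[1]
    t = threading.Thread(target=handle_one, args=(srv,))
    t.start()
    # A client connects, sends a request, gets a response. That's the model.
    cli = socket.create_connection(("127.0.0.1", port))
    cli.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
    reply = b""
    while True:                      # read until the server closes the connection
        chunk = cli.recv(1024)
        if not chunk:
            break
        reply += chunk
    cli.close()
    t.join()
    srv.close()
    return reply.split(b"\r\n")[0].decode()

if __name__ == "__main__":
    print(demo())  # -> HTTP/1.1 200 OK
```

Tcp carries the bytes reliably; the process only has to speak the protocol.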
It's a Single Binary
A server that listens on Tcp.
At its core, Nats is a process—just like an Http server. It binds to a Tcp port, accepts connections, and routes messages. Nothing exotic about that.
Where an Http server listens on port 80 and speaks Http, a Nats server listens on port 4222 and speaks the Nats protocol. Clients connect via Tcp, send messages, and receive messages. No magic, no special infrastructure—just a process on your machine.
A simple foundation—a single binary speaking a simple protocol over Tcp.
Why this matters: That's it. A process, a port, a protocol. The same mental model as the Http server you already know—just speaking a different language on the wire.
Why Not Use Http?
The Problem with Http
A server process on Tcp. So why not just use Http?
Http works. But as systems grow, you bolt on a message broker for async work, a service mesh for discovery, a load balancer for routing, and a cache for shared state. Each one adds operational burden, failure modes, and complexity.
The reason starts with what's underneath—Tcp itself—and extends to what Http was designed to do.
The Limits of Http
Client asks. Server answers. That's the whole protocol.
Http does exactly one thing: a client sends a request to a specific server, and the server responds. Everything else—push notifications, streaming, fan-out—is bolted on after the fact.
Why this matters: These aren't missing features—they're the model. Http was designed for documents, not distributed systems. Location dependence, point-to-point coupling, and synchronous blocking are baked into every request. You can work around them, but you're always fighting the protocol.
Still Http
Better protocols don't change the model underneath.
Each one improves something—encoding, query flexibility, full-duplex—but none of them introduce native many-to-many messaging. Every connection is still one client talking to one server.
REST — Http with conventions for resources and verbs. The most common API style, but still one request, one response.
GraphQL — Flexible queries through a single Http endpoint. Subscriptions bolt on WebSockets.
gRPC — Binary protobuf over Http/2 streams. Supports streaming, but each stream is still between two endpoints.
WebSockets — Upgrade past Http to a full-duplex Tcp channel. Bidirectional, but still one client talking to one server.
Why this matters: Some of these tools add streaming and bidirectional communication, but the topology stays the same: one client, one server. For pub/sub fan-out, queue-based load balancing, and location-transparent routing, you still end up needing a separate system.
Tcp's Trade-offs
Built for reliability. Not for millions of messages per second.
Tcp handles a lot for you—retransmissions, congestion control, ordered delivery. The kernel takes care of it so your application doesn't have to.
But Tcp was designed before anyone imagined moving millions of messages per second. At that scale, its “helpful” features start working against you.
Reliable. Ordered. But not built for millions of messages per second. These aren't bugs—they're trade-offs from a protocol designed before high-throughput messaging existed.
One lost packet stalls everything behind it
Kernel buffers can balloon, adding latency spikes
Tcp keepalives take minutes to detect dead peers
Why this matters: Understanding Tcp's limitations explains why high-throughput messaging systems can't just use raw sockets—they need an application-level protocol that actively manages flow control, detects failures fast, and avoids head-of-line blocking.
How Nats Builds on Tcp
The Nats protocol is human-readable text over Tcp—you can debug it with telnet.
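Because the protocol is plain text, building a client frame is just string formatting. A hedged sketch of two real protocol verbs, SUB and PUB; real clients also exchange INFO, CONNECT, and PING/PONG, which this omits:

```python
def pub_frame(subject: str, payload: bytes, reply: str = "") -> bytes:
    """Build a PUB frame: PUB <subject> [reply-to] <#bytes>\\r\\n<payload>\\r\\n"""
    head = "PUB " + subject + ((" " + reply) if reply else "") + f" {len(payload)}\r\n"
    return head.encode() + payload + b"\r\n"

def sub_frame(subject: str, sid: str, queue: str = "") -> bytes:
    """Build a SUB frame: SUB <subject> [queue-group] <sid>\\r\\n"""
    return f"SUB {subject} {(queue + ' ') if queue else ''}{sid}\r\n".encode()

print(sub_frame("greet.*", "1"))       # -> b'SUB greet.* 1\r\n'
print(pub_frame("greet.joe", b"hello"))  # -> b'PUB greet.joe 5\r\nhello\r\n'
```

Those are exactly the bytes you could type into a telnet session by hand.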
But Nats doesn't just speak text over Tcp. It takes control.
Tcp's trade-offs at scale—head-of-line blocking, unpredictable buffering, slow failure detection—don't go away just because you have a nice protocol on top. Nats handles them explicitly:
Keep Tcp's reliability. Replace what doesn't work at scale.
Own connection management, own buffering, own failure detection.
Why this matters: Nats doesn't just ride on Tcp—it compensates for Tcp's weaknesses. This is why Nats can deliver millions of messages per second with predictable latency, even when clients misbehave or networks hiccup.
Core
A Different Kind of Messaging
One sender, one receiver, tight coupling everywhere—you've seen what Http can't do. Nats flips every one of those constraints.
Instead of addressing servers by location, you address messages by subject. Instead of one-to-one, you get many-to-many. Instead of request/response only, you get fire-and-forget, fan-out, and load-balanced queues—all from the same primitive.
This is the foundation everything else in Nats builds on.
Core
The foundational data communication layer for distributed systems.
Core is the foundational layer that everything else in the Nats ecosystem builds on. At its heart is publish/subscribe—fire-and-forget messaging where any publisher can reach any subscriber through named subjects.
Problem: You must know the exact URL—host, port, path. Every move requires updating DNS, configs, or a service mesh.
Nats solution: Services subscribe to subjects, not endpoints. No service discovery, no load balancer config, no DNS games. Connect anywhere, reach everywhere.
Problem: Every request goes to exactly one server. Fan-out, event broadcasting, and audit trails all require separate infrastructure.
Nats solution: Many-to-many by default. Any publisher can reach any number of subscribers. Add an audit service? Just subscribe—zero changes to producers.
Why this matters: Everything that follows—subjects, pub/sub, request/reply, queue groups, Jetstream, Key Value stores—is built on top of Core. Understand this layer and the rest clicks into place.
Subject-Based Routing
Publishers and subscribers never need to know where each other are.
In Nats, you don't send messages to servers, IPs, or endpoints — you publish to subjects. A subject is just a string like orders.us.east. No admin CLI, no partition count, no pre-creation. Publish to it and it exists. Subscribers express interest in subjects, and Nats routes messages to them — regardless of where they are in the network.
The dot separator creates a natural hierarchy, and wildcards let subscribers listen to entire categories of messages without knowing every specific subject:
* — matches exactly one token
> — matches one or more tokens (must be last)

orders.us.east — exact match
orders.us.* — US orders (one level)
orders.*.east — east region orders
orders.> — all orders
> — everything

Why this matters: Location independence means services can move, scale, or be replaced without updating any routing configuration. A new instance just subscribes to the same subjects and starts receiving messages. No service discovery, no load balancer updates, no config changes.
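The matching rules are simple enough to sketch directly. An illustrative Python matcher for the two wildcards (not the server's actual implementation, which works on a trie of subscriptions):

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Match a dot-separated subject against a pattern.

    '*' matches exactly one token; '>' matches one or more trailing
    tokens and must be the last token of the pattern.
    """
    p, s = pattern.split("."), subject.split(".")
    for i, tok in enumerate(p):
        if tok == ">":
            return len(s) > i          # '>' needs at least one token left
        if i >= len(s) or (tok != "*" and tok != s[i]):
            return False
    return len(p) == len(s)            # no wildcard tail: lengths must agree

print(subject_matches("orders.us.*", "orders.us.east"))    # -> True
print(subject_matches("orders.>", "orders.us.east"))       # -> True
print(subject_matches("orders.us.*", "orders.us.east.1"))  # -> False
```

A subscription on `orders.>` is just this check applied to every published subject.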
Many-to-Many
Many-to-many, not point-to-point.
Publishers send to subjects. Any number of subscribers can listen. Adding a new consumer is a one-line change on the consumer side—publishers never know it happened. No producer modifications, no coordination, no redeployment.
Why this matters: Need audit logging across every service? Deploy a subscriber that listens to > (everything). Zero coordination, zero producer changes, instant visibility.
Publish & Subscribe
Fire-and-forget messaging.
Publish a message and Nats delivers it to all matching subscribers immediately. No disk writes. No acknowledgments. No broker consensus. Just memory-to-memory transfer.
This is at-most-once delivery—and that's a deliberate design choice, not a limitation. For real-time data—telemetry, metrics, live updates—you want the latest value, not a queue of stale ones. No redelivery storms, no message pile-ups. The system stays stable under failure.
Why this matters: Most messages don't need persistence guarantees. By making simple pub/sub the default, Nats keeps the majority of your traffic blazing fast. Save the heavyweight machinery for the messages that truly need it.
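Fire-and-forget is easy to model. A toy in-memory bus (exact-match subjects only, no wildcards) showing at-most-once delivery: a message published before anyone subscribes is simply dropped, and later publishes fan out to every subscriber.

```python
class Bus:
    """Minimal model of fire-and-forget pub/sub: memory-to-memory,
    no disk, no acks, no replay."""
    def __init__(self):
        self.subs = {}                         # subject -> list of callbacks

    def subscribe(self, subject, callback):
        self.subs.setdefault(subject, []).append(callback)

    def publish(self, subject, msg):
        for cb in self.subs.get(subject, []):  # fan-out to all subscribers
            cb(msg)                            # nobody listening? msg vanishes

bus = Bus()
seen = []
bus.publish("metrics.cpu", 0.91)           # no subscribers yet: dropped
bus.subscribe("metrics.cpu", seen.append)
bus.subscribe("metrics.cpu", seen.append)  # second subscriber: many-to-many
bus.publish("metrics.cpu", 0.42)
print(seen)  # -> [0.42, 0.42]
```

The dropped first message is the design choice in action: for live telemetry, the next value is coming anyway.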
Request & Reply
RPC without the service mesh.
Need a response? Nats creates a unique inbox subject for each request. Responders publish to that inbox, and only you receive the reply. No sidecars, no Envoy config, no Istio—just pub/sub with a clever addressing trick. Send one request and collect multiple replies (scatter-gather) to find the fastest responder or aggregate results from shards. No response? The request times out cleanly—unlike Http hanging connections, failed services don't cascade into client-side thread exhaustion.
Why this matters: gRPC needs protobuf schemas, generated stubs, and often a service mesh for load balancing. Nats request/reply gives you RPC semantics with zero ceremony. Services just subscribe to their name and respond.
No Responders
Instant failure feedback, not silent timeouts.
With Http, if a service is down your request hangs until a timeout fires—30 seconds of wasted time and a blocked thread. Nats knows the subscription table. If nobody is listening on a subject, the server tells you immediately with a no responders status. No guessing, no waiting, no cascading failures from accumulated hanging connections.
Why this matters: Circuit breakers exist because Http can't tell you "nobody is home" fast enough. Nats gives you that answer in microseconds, built into the protocol. One less library to configure, one less failure mode to handle.
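Both ideas, the per-request inbox and the instant no-responders status, can be modeled in a few lines. A toy sketch: the inbox token uses a uuid here (real clients use a NUID), and the no-responders signal is modeled as a return value where the real server sends a protocol status.

```python
import uuid

subscriptions = {}   # subject -> handler callback

def new_inbox() -> str:
    # Unique per-request reply subject, like _INBOX.<id> in real clients.
    return "_INBOX." + uuid.uuid4().hex

def publish(subject, msg, reply_to=None):
    handler = subscriptions.get(subject)
    if handler is None:
        return "no responders"   # server knows the subscription table
    handler(msg, reply_to)
    return "ok"

def request(subject, msg):
    """Request/reply is pub/sub plus a unique inbox per request."""
    inbox = new_inbox()
    replies = []
    subscriptions[inbox] = lambda m, _r: replies.append(m)  # only we listen here
    status = publish(subject, msg, reply_to=inbox)
    del subscriptions[inbox]
    return replies[0] if replies else status

# A responder just subscribes to its service name and answers on reply_to.
subscriptions["time.now"] = lambda m, reply_to: publish(reply_to, "12:00")

print(request("time.now", ""))      # -> 12:00
print(request("missing.svc", ""))   # -> no responders
```

The second call is the key contrast with Http: no timeout elapses, the answer "nobody is home" comes straight from the routing table.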
Queue Groups
Scaling without Kubernetes.
Add a queue group name to your subscription and Nats distributes messages across all subscribers in that group. No coordinator, no leader election, no split-brain scenarios.
Exactly one subscriber — each message goes to one subscriber in the group. Start a new worker and it gets messages immediately—stop one and others pick up instantly. No partition rebalancing delay.
Load balance — Nats distributes work evenly across workers. Scale from 1 to 1000 processes with zero configuration changes.
Fanning out — queue groups and regular subscribers coexist on the same subject. Load balance to workers while simultaneously sending to monitoring and analytics.
Why this matters: Traditional message brokers tie consumers to partitions or channels, making scaling disruptive. Nats queue groups scale from 1 to 1000 workers with zero configuration changes. Just start more processes.
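The delivery rule can be sketched in a few lines: plain subscribers each get a copy, while members of a queue group split the stream. This toy rotates through group members for clarity; the real server picks a member randomly.

```python
import itertools

class Server:
    """Toy model of queue-group delivery on a single subject."""
    def __init__(self):
        self.plain = []      # regular subscribers: every one gets a copy
        self.queues = {}     # group name -> (member callbacks, rotation counter)

    def subscribe(self, callback, queue=None):
        if queue is None:
            self.plain.append(callback)
        else:
            members, _ = self.queues.setdefault(queue, ([], itertools.count()))
            members.append(callback)

    def publish(self, msg):
        for cb in self.plain:                            # fan-out
            cb(msg)
        for members, counter in self.queues.values():
            members[next(counter) % len(members)](msg)   # exactly one per group

srv = Server()
worker_a, worker_b, monitor = [], [], []
srv.subscribe(worker_a.append, queue="workers")
srv.subscribe(worker_b.append, queue="workers")
srv.subscribe(monitor.append)        # plain subscriber coexists on the subject
for i in range(4):
    srv.publish(i)
print(worker_a, worker_b, monitor)   # -> [0, 2] [1, 3] [0, 1, 2, 3]
```

Adding `worker_c` is one more `subscribe` call; no partitions to rebalance, no coordinator to inform.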
Jetstream
When Things Go Down
In distributed systems, something is always down. Deploys, crashes, network blips—failure is the norm, not the exception.
Core is fast precisely because it makes no durability promises—at-most-once delivery, no persistence, no replay. If a subscriber isn't connected when a message is published, that message is gone.
So how do you keep the speed and simplicity of Nats while surviving the reality that things go down?
The Persistence Problem
Every messaging system has the same Achilles' heel.
A message broker sits between producers and consumers, holding messages in memory until they're delivered. This works beautifully—until something goes wrong.
What if a consumer crashes? Messages queue up. What if the broker itself crashes? Everything in memory vanishes. This isn't a bug in any particular system—it's a fundamental tension in distributed messaging.
Why this matters: The persistence problem isn't unique to any one tool. It's the central design tension in all messaging infrastructure. Understanding it helps you evaluate any system—and understand why Nats built Jetstream.
Jetstream
Still fast. Now durable.
Core is fire-and-forget. If no one is listening when a message is published, it vanishes. Jetstream adds persistence on top—same protocol, same subjects, but messages are stored and can be replayed.
Problem: Fire-and-forget means producers and consumers must be online at the same time. If a consumer is down, the message is gone.
Jetstream: Messages persist in streams until acknowledged. Publish now, consume later. Producers and consumers are fully decoupled in time.
Why this matters: Most messaging systems force a choice: fast-and-ephemeral or durable-and-heavy. Nats gives you both. Core for real-time fire-and-forget, Jetstream for when every message must be accounted for. One binary, one protocol, two modes.
Persistence
Publish now, consume later.
Jetstream offers a durability spectrum. More durability means more work per message—pick the right trade-off per stream.
Memory — fastest. Messages live in RAM and are lost on restart. Ideal for caches and ephemeral state.
Disk — durable. Messages survive server restarts. The default for most workloads.
Replicated — safest. Messages are written to multiple servers. Survives machine failures with no data loss.
Persistence enables temporal decoupling. Subscribers can go offline, come back, and catch up on missed messages automatically.
Why this matters: Most brokers force you to choose persistence upfront. Nats lets you mix: telemetry over Core (fast, ephemeral), orders through Jetstream (durable, guaranteed). One system, right tool for each job.
Streams
Append-only logs that capture messages by subject.
A stream is an append-only log bound to one or more subjects. Every matching message is stored in sequence. Retention policy controls when messages leave the log:
Limits-based — cap by max messages, max bytes, or max age. Oldest messages are discarded when limits are hit.
Interest-based — keep messages until all consumers have seen them. Nothing is discarded while there's still interest.
Work-queue — delete on ack. Each message is processed exactly once, then removed from the stream.
Why this matters: Streams decouple storage from delivery. Publishers fire messages into subjects as usual—streams silently capture them. No code changes on the publish side, no new API to learn.
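Limits-based retention is the easiest policy to sketch: an append-only log with monotonically increasing sequence numbers that discards from the head once a cap is hit. A toy model covering max-messages only; max-bytes and max-age work the same way, and the other policies change only the discard rule.

```python
from collections import deque

class Stream:
    """Toy append-only log with limits-based retention (max messages)."""
    def __init__(self, max_msgs: int):
        self.max_msgs = max_msgs
        self.log = deque()               # (sequence, message)
        self.seq = 0

    def append(self, msg):
        self.seq += 1                    # sequence numbers never reset
        self.log.append((self.seq, msg))
        while len(self.log) > self.max_msgs:
            self.log.popleft()           # limit hit: discard oldest first
        return self.seq

s = Stream(max_msgs=3)
for m in ["a", "b", "c", "d"]:
    s.append(m)
print(list(s.log))  # -> [(2, 'b'), (3, 'c'), (4, 'd')]
```

Publishers never see any of this; they publish to subjects as usual and the stream captures what matches.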
Consumers
Durable cursors that track delivery progress.
Consumers act as durable cursors, tracking delivery progress independently. Messages redeliver until acknowledged, guaranteeing at-least-once delivery. For critical paths, idempotent publishing with message deduplication provides exactly-once semantics. When a consumer disconnects, Jetstream remembers its position. On reconnect, missed messages replay automatically.
Why this matters: Each consumer is independent—add an analytics consumer alongside your processing consumer without affecting either. Multiple readers, one stream, zero interference.
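The cursor idea can be sketched directly: a delivery position plus a pending set of unacknowledged messages that would be redelivered after a reconnect. A toy model, not the real consumer API:

```python
class Consumer:
    """Toy durable cursor over a stream's log of (seq, msg) entries."""
    def __init__(self, log):
        self.log = log          # the stream's data, shared by all consumers
        self.next = 0           # index of the next message to deliver
        self.pending = {}       # seq -> msg: delivered but not yet acked

    def fetch(self):
        if self.next < len(self.log):
            seq, msg = self.log[self.next]
            self.next += 1
            self.pending[seq] = msg   # held until acked: at-least-once
            return seq, msg
        return None

    def ack(self, seq):
        self.pending.pop(seq, None)   # acked: never redelivered

    def redeliver(self):
        """What a reconnecting client receives: everything unacknowledged."""
        return sorted(self.pending.items())

log = [(1, "order-1"), (2, "order-2")]
c = Consumer(log)
seq, _ = c.fetch()
c.ack(seq)              # order-1 fully processed
c.fetch()               # order-2 delivered, but client crashes before acking
print(c.redeliver())    # -> [(2, 'order-2')]
```

A second `Consumer(log)` would start from position zero with its own pending set, which is the "multiple readers, one stream, zero interference" property.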
Data Stores
Beyond Messages
Streams solved durability. But your app needs more than a log of messages.
Configuration that services read on startup. Session data that changes mid-flight. ML models too large for a single message. Every distributed system eventually needs state and files alongside its event stream—and traditionally that means bolting on a separate Key Value store, object store, and another SDK.
What if the infrastructure you already have could handle all three?
Data Stores
Most platforms cover one or two of these well, but not all three. Message brokers add key-value APIs as an afterthought. Key-value stores bolt on pub/sub. Nobody covers all three over a single protocol.
Nats does. A key-value pair is just a subject with the latest message retained. A large file is a sequence of chunked messages in a stream. Same protocol, same connection, same replication—no new infrastructure.
Problem: Running a message broker, a Key Value store, and an object store means three separate systems to deploy, monitor, and keep consistent.
Nats: A Key Value pair is a subject with the latest message retained. A file is chunked messages in a stream. One protocol, one connection, one cluster—no extra infrastructure.
Why this matters: Instead of running a separate Key Value store and object store alongside your message broker, Nats gives you messages, state, and file storage from the same binary you're already running. One system to deploy, monitor, and reason about.
Key Value Store
Key-value storage over your existing Nats connection.
Nats Key Value is a key-value store built on top of Jetstream. Get, put, delete, and watch keys—all over your existing Nats connection.
Every Key Value bucket is backed by a Jetstream stream. Keys map to subjects, values to message payloads, and revisions to sequence numbers. A put("user.123.name", "Alice") becomes a publish to $KV.users.user.123.name. Watchers are just Jetstream consumers with subject filters.
Watch — subscribe to key patterns like user.123.> and get notified on every change in real-time. No polling. Build reactive UIs, configuration hot-reload, or distributed coordination.
TTL — keys expire automatically after a configured duration. Clean up stale state without manual intervention.
History — retain previous values per key with atomic compare-and-swap for safe concurrent updates.
An example bucket, where each key holds its latest value and revision: user.123.name = "Alice" (r1), user.123.status = "online" (r1), user.123.location = "NYC" (r1), config.theme = "dark" (r3). A watcher on user.123.> sees changes to the three user.123 keys and ignores config.theme.

Why this matters: A dedicated Key Value store means another cluster to manage. Nats Key Value gives you the same get/put/watch semantics over the connection you're already using for messaging. One system, fewer moving parts, and replication comes free from Jetstream.
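The mapping is mechanical enough to sketch: a put becomes a publish to $KV.&lt;bucket&gt;.&lt;key&gt;, and a stream sequence number doubles as the key's revision. A toy model of that bookkeeping (the real store keeps full per-key history in the stream; this keeps only the latest value):

```python
class KV:
    """Toy key-value bucket backed by a stream-style sequence counter."""
    def __init__(self, bucket: str):
        self.bucket = bucket
        self.seq = 0
        self.latest = {}                 # subject -> (revision, value)

    def subject(self, key: str) -> str:
        # Keys map to subjects under the bucket's $KV prefix.
        return f"$KV.{self.bucket}.{key}"

    def put(self, key, value):
        self.seq += 1                    # stream sequence = key revision
        self.latest[self.subject(key)] = (self.seq, value)
        return self.seq

    def get(self, key):
        return self.latest[self.subject(key)]

kv = KV("users")
kv.put("user.123.name", "Alice")
kv.put("user.123.status", "online")
print(kv.subject("user.123.name"))   # -> $KV.users.user.123.name
print(kv.get("user.123.name"))       # -> (1, 'Alice')
```

A watcher in this picture is just a subscriber on `$KV.users.user.123.>`, which is why watch needs no polling.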
Object Store
Large blob storage over Nats.
Store files, images, ML models—anything up to gigabytes. Nats automatically chunks large objects into Jetstream messages, handles replication across the cluster, and reassembles them on read. Like S3, but without the separate service.
Each object is split into fixed-size chunks (default 128 KB) and published to a dedicated Jetstream stream. Metadata—name, size, content type, checksum—is stored in a companion stream. On read, the client fetches chunks in order and verifies the checksum.
Watch for changes — subscribe to object updates just like Key Value watchers. Get notified when a model is updated, a config file changes, or a new artifact is published.
Replicated — inherits Jetstream replication. Configure R=3 and your objects survive server failures. No need for a separate distributed file system.
Why this matters: S3 is durable but adds latency and another SDK. Nats Object Store gives you blob storage over the same connection you're already using for messaging and Key Value. One system for messages, state, and files.
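The chunking scheme described above is easy to model: split the blob at a fixed chunk size, record size and checksum in metadata, and verify on reassembly. A sketch in which the 128 KB chunk size and SHA-256 checksum follow the description, while the stream plumbing is omitted:

```python
import hashlib

CHUNK = 128 * 1024   # default chunk size (128 KB)

def put_object(data: bytes):
    """Split a blob into chunks (each would be one stream message) plus
    a metadata record used to verify reads."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    meta = {"size": len(data),
            "chunks": len(chunks),
            "sha256": hashlib.sha256(data).hexdigest()}
    return meta, chunks

def get_object(meta, chunks) -> bytes:
    data = b"".join(chunks)                 # fetch and reassemble in order
    if hashlib.sha256(data).hexdigest() != meta["sha256"]:
        raise ValueError("corrupt object")  # checksum mismatch on read
    return data

blob = b"x" * (300 * 1024)                  # a 300 KB payload
meta, chunks = put_object(blob)
print(meta["chunks"], get_object(meta, chunks) == blob)  # -> 3 True
```

Because each chunk is an ordinary stream message, replication of objects is the same Jetstream replication used for everything else.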
Scaling
Scaling Beyond One Server
Streams, key-value pairs, object storage—all from a single binary. But real systems don't run on a single server.
Production means multiple regions, cloud providers, and edge devices. It means tolerating failures without downtime. It means messages published in Tokyo reaching subscribers in Frankfurt without your code knowing the difference.
Nats was built for this from the start—servers form a mesh automatically, routing messages only where they're needed across clusters, superclusters, and leaf nodes.
Clustering
Connect anywhere, reach everywhere.
Nats servers form clusters so your code doesn't change whether subscribers are local, in another region, or on the edge—the network figures it out. Interest-based routing means messages only flow where subscribers exist—Nats doesn't copy data to regions with no listeners, so bandwidth is automatically optimized.
Leaf nodes — extend Nats to the edge via a 20MB binary that runs on a Raspberry Pi—factories, stores, vehicles, anywhere with intermittent connectivity. Messages queue locally during outages.
Clusters — a group of Nats servers that share clients and messages. They form a full mesh automatically—publish to any node, subscribers on any other node receive it.
Superclusters — connect multiple clusters via gateway connections for global reach. Each cluster operates independently, but messages route seamlessly across all of them.
Problem: Cross-region messaging usually requires separate replication tooling, connection routing, and careful configuration for each new region.
Nats: Servers discover each other and form a full mesh automatically. Publish to any node, subscribers on any other node receive it. Interest-based routing means messages only flow where listeners exist.
Why this matters: Nats clustering is declarative—list your servers and they form a mesh. Location independence isn't a feature, it's the architecture.
Leaf Node
Extend Nats to the edge—factories, stores, vehicles, anywhere.
Single upstream connection — a standalone Nats server (20MB binary) connects to a cluster. Runs on a Raspberry Pi, an industrial gateway, or a vehicle's onboard computer—anywhere a full cluster member would be overkill.
Subject filtering — control exactly which subjects bridge between the leaf and the upstream cluster. A factory floor node might only sync sensors.> upstream while keeping local traffic private. Only the data you choose leaves the edge.
Resilient at the edge — when the upstream connection drops, the leaf node keeps running. Local publishers and subscribers continue communicating. With Jetstream enabled, messages destined for the cluster queue until connectivity is restored—no data loss, no application changes.
Why this matters: Edge deployments usually mean a completely separate messaging stack with custom sync logic. Leaf nodes give you the same Nats subjects, the same client libraries, and the same publish/subscribe semantics—just closer to the data source.
Cluster
Full-mesh routes within a region—every server can reach every subscriber.
Full-mesh route connections — every server maintains a direct Tcp link to every other server in the cluster. Publish to any server, and it reaches all subscribers regardless of which server they're connected to.
Auto-discovery — point a new server at any existing member and it learns the full topology. No static config files listing every node. If one goes down, clients automatically reconnect to a surviving member.
Transparent failover — clients connect to multiple servers and failover automatically. No connection pooling libraries, no retry logic to write, no circuit breakers to configure—the client library handles it.
Why this matters: Nats core routing is symmetric—every server can route messages to every other. For Jetstream, Nats uses the Raft consensus algorithm to elect a leader per stream, ensuring consistency without a single external coordinator.
Raft Consensus
How Jetstream keeps streams consistent across a cluster.
One Raft group per stream — Jetstream runs a separate Raft consensus group for each stream and each consumer. No single leader bottleneck—each group elects its own leader independently.
Meta group for placement — a cluster-wide meta group (all Jetstream-enabled servers) decides where to place new streams and consumers. Each stream group then handles its own data replication; each consumer group tracks delivery state.
Quorum writes — a message is only acknowledged once a majority of replicas have written it. With R3, that means 2 of 3 servers must confirm before the publisher gets an ACK.
Why this matters: Raft gives Jetstream strong consistency without external dependencies like ZooKeeper or etcd. Every stream gets its own consensus group, so a busy stream can't block an unrelated one.
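The quorum rule is simple arithmetic: a majority of R replicas is R // 2 + 1. A sketch:

```python
def quorum(replicas: int) -> int:
    """Majority needed before a write is acknowledged."""
    return replicas // 2 + 1

def acked(replicas: int, confirmed: int) -> bool:
    # The publisher gets its ACK only once a majority has written the message.
    return confirmed >= quorum(replicas)

print(quorum(3))      # -> 2      (R3: 2 of 3 servers must confirm)
print(acked(3, 1))    # -> False  (one write is not durable yet)
print(acked(3, 2))    # -> True
```

The same arithmetic explains fault tolerance: an R3 group keeps accepting writes with one server down, because the surviving two still form a majority.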
Supercluster
Gateways between regions—global reach without full-mesh overhead.
Gateway connections — each cluster elects a gateway pair that maintains a single logical link to every other cluster. No full-mesh between regions—just targeted hops.
Interest-based routing — messages only traverse gateways when subscribers exist in the remote region. Publish orders.us.east in New York and nobody in Frankfurt is listening? The message never leaves the US cluster.
Accounts and security boundaries are shared across the supercluster, so a service in any region can reach any other service on the same account—same subjects, same permissions, no extra configuration per region.
Why this matters: Traditional multi-region messaging requires dedicated replication pipelines per topic, manual failover runbooks, and careful bandwidth budgeting. Superclusters make it declarative—list your clusters and gateways handle the rest.
Security
Locking It Down
Clusters spanning regions, leaf nodes at the edge, messages flowing everywhere—but who's allowed to connect? And what stops one tenant's data from leaking into another?
Most messaging systems bolt security on after the fact—an external auth service here, a firewall rule there. Nats takes the opposite approach: authentication, authorization, and multi-tenancy are built into the protocol itself.
No external databases to secure. No config file restarts to add users. Just cryptographic identity baked into every connection.
Security
Secure by default, decentralized by design.
TLS encryption is a flag away. Accounts are isolated by default—multi-tenant by design, messages in one account are invisible to others, shared only through explicit exports/imports. Auth scales from simple credentials to fully decentralized identity:
Tokens — a single shared secret string. The simplest option for development and internal services.
Username/Password — familiar credentials with optional bcrypt hashing. Easy to set up, easy to reason about.
NKeys — Ed25519 key pairs where the server stores only public keys—private keys never leave the client. Nothing valuable to steal.
Decentralized JWTs — account operators issue credentials without touching server config. Add users, revoke access, change permissions with no server restart and no config file edits.
Auth Callout — delegate authentication to your own external service. Plug in LDAP, OAuth2, or any custom identity provider without forking the server.
Why this matters: Most message brokers need external auth systems for production. Nats bakes multi-tenancy into the protocol. Onboard new customers without server changes. Revoke access instantly. Keep tenants isolated without running separate clusters.
Auth Callout
Your auth rules. Nats enforcement.
NKeys and JWTs handle most scenarios, but what if credentials already live in an LDAP directory, an OAuth provider, or a custom database?
Auth Callout — lets Nats delegate authentication to your own service—a regular Nats subscriber that receives connection requests, validates credentials against any backend, and returns a signed JWT with scoped permissions.
The callout service subscribes to $SYS.REQ.USER.AUTH. When a client connects, Nats publishes the connection details to that subject. Your service validates, builds a user JWT, signs it, and replies. Nats enforces the permissions—your service owns the decision.
Why this matters: Most brokers force a choice: use their auth system or build a custom plugin in their language. Auth Callout lets you write validation logic in any language, against any backend, deployed as a regular Nats client. Swap auth providers without touching server config.
Why Not Just Use ___?
But What About ___?
Messaging, persistence, data stores, clustering—all from one system. So why does anyone reach for something else?
Kafka has been the default for event streaming. RabbitMQ owns the traditional message broker space. Redis is everyone's first cache. Each is battle-tested, well-documented, and already in your stack. The question isn't whether they work—it's what you pay in complexity when your needs span more than one of them.
Let's see how Nats stacks up—and where the trade-offs actually are.
At a Glance
Before diving into the details, here's what each system gives you out of the box.
Protocol & Fundamentals
How each system talks on the wire, names its destinations, and layers on persistence.
The protocol shapes everything else—how easy it is to debug, how fast you can onboard, and how much ceremony stands between you and your first message. Compare the foundations.
Nats — Plain text over Tcp. You can debug it with telnet. Write a client in any language in an afternoon.
Kafka — Custom binary protocol. Requires a client library — no telnet debugging.
RabbitMQ — AMQP 0-9-1 — complex binary framing with channels, exchanges, and bindings baked into the wire format.
Redis — RESP protocol. Simple, but no built-in routing, wildcards behave differently, and no queue groups.
ZeroMQ — ZMTP binary protocol. Powerful socket types, but no broker — you wire topology by hand. No subject routing, no discovery, no clustering.
Why this matters: A simple protocol means fewer dependencies, faster debugging, and clients in every language. Nats gives you that simplicity at the wire level, and Jetstream layers persistence on top without changing the protocol.
Messaging Patterns
Every messaging system claims pub/sub. Few give you the rest without extra infrastructure.
Pub/sub, request/reply, queue groups, and back-pressure are the building blocks of real-time messaging. See how Nats handles each one natively, while alternatives require workarounds, extra components, or manual plumbing.
Nats — Built into the core protocol — fire-and-forget by default. No offsets, no replay, no pile-up. Messages that nobody is listening for simply disappear.
Kafka — Every message hits disk and gets an offset. Consumers replay from last committed offset — sometimes thousands of stale messages after a crash.
RabbitMQ — Requires declaring exchanges, queues, and bindings before a single message flows. Slow consumers cause memory pressure on the broker.
Redis — Simple fire-and-forget pub/sub with glob wildcards via PSUBSCRIBE, but no dot-delimited subject hierarchy and no fan-out control.
ZeroMQ — PUB/SUB socket types exist, but you manage connections and topic filtering yourself. No broker means no fan-out guarantees — if a subscriber isn't connected, the message is gone.
Why this matters: Every alternative solves one or two patterns—pub/sub here, queuing there, request/reply somewhere else. Nats Core handles all of them natively, in a single binary, with zero external dependencies.
Persistence & Streaming
Jetstream isn't the first persistent streaming system. But it's the only one that isn't a separate system.
Kafka, Pulsar, and RabbitMQ Streams were built for durable streaming too. The difference is in what else they bring along: separate protocols, separate clusters, separate operational burdens. Jetstream is a capability you turn on inside the Nats server you already run.
Nats — Choose per-subject: ephemeral telemetry over Core, durable orders through Jetstream. One system, right guarantee for each message.
Kafka — Every message hits disk, even ephemeral telemetry you'll never read again. You can set very short retention per-topic, but messages always touch disk first.
Pulsar — Tiered storage helps offload old data. Non-persistent topics exist for ephemeral data, but persistent topics always flow through BookKeeper first.
RabbitMQ — Quorum queues add durability, but at the cost of throughput. Classic queues are faster but can lose messages on node failure.
Why this matters: Kafka and Pulsar are powerful distributed logs. But they are separate systems with separate protocols, separate operational burdens, and separate failure modes. Jetstream gives you durable streams, key-value storage, and object storage—all inside the same binary that handles your real-time messaging.
Operations & Security
Running at scale means clustering, security, and multi-tenancy. Nats builds all three into the server.
Clustering shouldn't require a coordinator service. Auth shouldn't require an external system. Multi-tenancy shouldn't be a naming convention. See how Nats handles operations and security compared to the alternatives.
Nats — Full-mesh clustering with zero-config gossip protocol. Add a node, point it at any existing node, and the cluster self-organizes. Superclusters span regions with gateway connections.
Kafka — Requires KRaft for metadata consensus. Broker addition needs partition reassignment. Cross-region replication is a separate product (MirrorMaker).
RabbitMQ — Clustering works within a LAN but breaks across regions. Federation and shovels exist but add operational complexity.
Redis — Redis Cluster shards data by hash slots. Adding nodes means resharding. Sharded Pub/Sub (Redis 7.0+) participates in the cluster protocol, but classic Pub/Sub does not.
ZeroMQ — No clustering — it's a library, not a server. You build your own topology with broker patterns (ROUTER/DEALER), but there's no automatic failover or discovery.
Why this matters: Clustering that self-organizes and security that's built into the protocol mean fewer moving parts in production. No separate metadata service, no external auth backends, no separate federation plugins.
The Full Picture
We've seen how Nats compares to the alternatives. Now let's step back and see the full picture.
Four layers in a single binary—core messaging, persistent streams, data stores, and clustering. Most apps only need one or two. The question is which layers fit your problem, and when to reach for each.
One System, Not Eight
The others make you stitch together separate services. Nats is one unified system—not a bundle of features, but layers built on a single foundation.
Most microservice architectures require a service mesh for discovery, a load balancer for routing, a message broker for async communication, a cache for shared state, object storage for large payloads, multi-region replication, and a separate auth layer. That's eight systems with eight protocols, eight auth configurations, and eight failure modes. Nats replaces them not by bundling eight tools together, but because everything—pub/sub, streaming, Key Value, object storage—is built on the same subjects, connections, and security model.
Service mesh — Sidecars, mTLS, traffic routing, retries
Service discovery — Registry, health checks, DNS
Load balancer — Routing rules, sticky sessions, health probes
Message broker — Topics, partitions, consumer groups, schemas
Cache — Session state, distributed locks, pub/sub
Object storage — Blobs, file artifacts, large payloads
Multi-region infrastructure — Cross-region replication, edge compute, data locality
Auth system — Identity, token management, tenant isolation
Why this matters: Every additional system is another protocol, another cluster, another 3am failure mode. Nats isn't a Swiss Army knife of bolted-on features—it's a layered architecture where Key Value is built on Jetstream, Jetstream is built on Core, and everything shares one subject namespace, one connection, and one auth model. One binary to deploy, one system to monitor, one set of skills to learn.
Putting It Together
Start simple. Layer on capabilities as your needs grow.
Nats gives you a toolkit of capabilities that share one subject namespace, one connection, and one security model. Most applications start with Core and add capabilities only when needed. Security wraps everything—enable it at any point, and it applies across all layers.
Start simple. Add complexity only when needed.
That's the Nats philosophy—and why it scales from IoT sensors to global financial systems.
