SQL vs NoSQL for Low Latency: Chasing the 'Why'

Everyone repeats the same checklist for picking NoSQL:

Your application requires super-low latency.
Your data is unstructured, or you have no relational data.
You only need to serialize/deserialize data (JSON, XML, YAML).
You need to store a massive amount of data.

I latched onto the first line. NoSQL also has higher write throughput, so low latency makes sense. But why? This post follows the questions that popped into my head, in order, until the "why" actually landed.

Why is a relational write slower in the first place?

Short answer: relational databases do more work per write to guarantee correctness. Many NoSQL databases relax some guarantees to get lower latency and higher throughput.

Insert one order into PostgreSQL or MySQL. The database may need to:

Validate constraints — primary key uniqueness, foreign key existence, check constraints.
Update multiple indexes — B-tree indexes must stay sorted.
Maintain ACID — atomicity, consistency, isolation, durability.
Write to the transaction log (WAL/redo) and potentially wait for fsync before acknowledging.
Handle locks or MVCC bookkeeping.

One logical write triggers several physical operations.

Now a typical NoSQL write (Cassandra, Redis, DynamoDB):

key -> value

{ "userId": 123, "name": "Farid", "orders": [...] }

No joins. No foreign keys. Fewer constraints. Often append-only. The database appends to a log, updates in-memory structures, replicates asynchronously, and returns success immediately.

Less work = lower write latency.

So is SQL just slow?

No. That's the wrong conclusion. The per-write difference is real but modest. The real reason low-latency systems reach for NoSQL is horizontal scaling.

One relational server handling 10,000 writes/sec eventually hits a wall — CPU, memory, disk IOPS, network. Scaling writes across many machines is hard when:

Joins may span servers.
Transactions may span servers.
Strong consistency must hold across all of them.

Distributed transactions are expensive.

Many NoSQL systems are designed around sharding from day one:

User 1-1M    -> Server A
User 1M-2M   -> Server B
User 2M-3M   -> Server C

A write touches one partition. PUT user:123 goes straight to the one machine that owns that key. Add servers, get 100k → 1M → 10M writes/sec.

And often you care about P99 < 10ms more than perfect consistency. A metrics event arriving 100ms late: nobody notices. A bank transfer losing consistency: disaster. Different requirements, different tools.

Wait — can't I just shard PostgreSQL the same way?

Yes. PostgreSQL can absolutely be sharded. Instagram, Shopify, and Discord have all run sharded Postgres.

The real question isn't "can it?" — it's "how much complexity appears once data spans shards?"

If every request touches exactly one user:

SELECT * FROM users WHERE user_id = 123;

Sharded Postgres scales great.

It hurts when you cross shards:

SELECT * FROM users u JOIN orders o ON u.id = o.user_id;

If users lives on Shard A and orders on Shard B, the database must query multiple machines, ship data over the network, and do a distributed join. Network latency dwarfs local memory/disk access.

Transactions get worse. Move $100 from Account A (Shard A) to Account B (Shard B):

Deduct from A.
Network fails.
Step for B never runs.

Money lost. The fixes — two-phase commit, consensus, distributed transaction coordinators — all add latency.

NoSQL systems lean on one principle: a request should ideally touch one partition. That's the whole trick. It's not magic in the engine — it's a constraint in the data model.

But MongoDB has `$lookup` — isn't that basically a join?

This is where people conclude "Mongo supports joins, so it scales like a relational DB." It doesn't follow.

$lookup does perform join-like operations:

db.orders.aggregate([
  { $lookup: {
      from: "users",
      localField: "userId",
      foreignField: "_id",
      as: "user"
  }}
])

Conceptually the same as:

SELECT * FROM orders JOIN users ON orders.user_id = users.id;

The difference is architectural expectation, not capability.

PostgreSQL is built assuming you join constantly. A normalized schema (Users, Orders, Payments, Subscriptions) is encouraged.

MongoDB nudges you to embed instead:

{
  "_id": 123,
  "name": "Farid",
  "subscriptions": [...],
  "orders": [...],
  "payments": [...]
}

One read returns everything. db.users.findOne({_id: 123}) touches one shard. No join.

And in a sharded cluster, $lookup across shards does exactly what distributed SQL does: contact multiple shards, gather, merge. NoSQL does not eliminate cross-node joins. If your Mongo app is $lookup stacked on $lookup, you're treating it like a relational database — and inheriting the same scaling and latency pain.

The actual "why"

The expensive thing was never SQL. The expensive thing is combining data that lives on different machines.

Call it:

JOIN in PostgreSQL
$lookup in MongoDB
distributed query in Cassandra
graph traversal in Neo4j

Same underlying cost: network communication and coordination.

So the honest rule isn't "SQL slow, NoSQL fast." It's:

Relational databases optimize for correctness and rich querying. NoSQL databases often optimize for predictable low-latency writes and horizontal scaling — by restricting features (joins, foreign keys, distributed transactions) so the data for a request stays colocated on one shard.

High-scale, low-latency systems spend more effort on data modeling than on database choice. They make sure each request can be answered by a single machine.

The modern reality

The line is blurring. PostgreSQL, CockroachDB, and TiDB hit very high throughput. Many NoSQL systems now support transactions and stronger consistency. Distributed SQL — CockroachDB, TiDB, Spanner, YugabyteDB — proves SQL can scale horizontally.

The catch: distributed + strong consistency + low latency is fundamentally hard. Every system trades something somewhere.

A common production architecture just uses each for what it does best:

PostgreSQL  -> source of truth
Redis       -> low-latency cache
Kafka       -> event stream
Cassandra   -> high-volume analytics/events

The takeaway I started with — "NoSQL for low latency" — turned out to be a shortcut for: keep the data a request needs on one machine, and pick whichever engine makes that natural.