Redpanda claims 10x lower latency than Kafka. We tested both on identical VMs to see if the numbers hold up. Spoiler: reality is more nuanced than the marketing suggests, and the winner depends heavily on your workload pattern.
This benchmark covers Apache Kafka 4.2.0 (running in KRaft mode, no ZooKeeper) against Redpanda 26.1.2 community edition. We measured producer throughput, consumer throughput, end-to-end latency, sustained load performance, and resource consumption. Same hardware, same test tools, same message sizes. The raw numbers tell a story that neither vendor’s marketing team wants you to hear. If you’re evaluating Kafka or Redpanda for a new project, these results should save you a few weeks of tire-kicking.
Tested April 2026 on Ubuntu 24.04.4 LTS with Kafka 4.2.0 (KRaft) and Redpanda 26.1.2, 4 vCPU, 8GB RAM
Test Environment
Two identical Proxmox VMs, each running Ubuntu 24.04.4 LTS. The specs: 4 vCPU (host-model passthrough), 8 GB RAM, 50 GB virtio disk backed by NVMe. Network is a bridged 1 Gbps link. One VM ran Kafka 4.2.0, the other ran Redpanda 26.1.2 community edition.
Kafka ran in KRaft mode with the combined controller+broker configuration. ZooKeeper was removed entirely in Kafka 4.0, so KRaft is the only option now. The JVM heap was pinned at 2 GB (KAFKA_HEAP_OPTS="-Xmx2G -Xms2G"); note that the upstream start script defaults to 1 GB. Redpanda ran as a single-node deployment with developer mode off (no --mode dev-container), meaning production defaults for I/O and memory.
All benchmarks used Kafka’s own performance test tools (kafka-producer-perf-test and kafka-consumer-perf-test), which ship with both Kafka and Redpanda’s rpk toolchain. This keeps the comparison fair because the same client code drives both brokers.
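One setup detail the commands in this post assume: each test topic already exists with the intended partition count. A typical creation command looked like this (topic name and partition count varied per test; replication factor 1 because each broker is a single node):

```shell
# Create the test topic up front; --partitions was set to 1 or 6 depending on the test
kafka-topics.sh --bootstrap-server 10.0.1.50:9092 \
  --create --topic test-topic \
  --partitions 6 --replication-factor 1
```

The same command works unchanged against Redpanda, since rpk-managed brokers accept the Kafka admin API.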
Verify the Kafka version:
kafka-broker-api-versions.sh --bootstrap-server 10.0.1.50:9092
The output (truncated here) confirms the broker is up and speaking the Kafka 4.x protocol:
10.0.1.50:9092 (id: 1 rack: null) -> (
ApiVersion(apiKey=PRODUCE, minVersion=0, maxVersion=11)
...
)
Check Redpanda’s version on the other VM:
rpk version
You should see something like:
v26.1.2 (rev abcdef12)
rpk v26.1.2
Architecture Differences
Before the numbers, a quick look at what makes these two systems fundamentally different under the hood.
| Feature | Apache Kafka 4.2.0 | Redpanda 26.1.2 |
|---|---|---|
| Language | Java (JVM) | C++ (Seastar framework) |
| Consensus | KRaft (ZooKeeper fully removed in 4.0) | Raft |
| Schema Registry | Separate (Confluent Schema Registry) | Built-in |
| HTTP Proxy | Separate (Confluent REST Proxy) | Built-in (Pandaproxy) |
| Deployment | JDK + Kafka tarball or container | Single binary or container |
| Memory model | JVM heap (configurable, default 2 GB) | Allocates available RAM for cache |
| Thread model | Thread pool | Thread-per-core (Seastar) |
| Ecosystem | Kafka Connect, Kafka Streams, ksqlDB | Kafka API compatible, no native equivalent |
The architectural bet is clear. Kafka leans on the JVM ecosystem and a massive community. Redpanda bets that C++ with a thread-per-core design can deliver better latency without the JVM overhead. Both approaches have trade-offs that show up in the benchmarks below.
Producer Throughput Results
Each test sent 1 million messages. We varied message size (100 bytes and 1 KB), partition count (1 and 6), and acknowledgment mode (acks=1 and acks=all). The producer perf test command looked like this for Kafka:
kafka-producer-perf-test.sh \
--topic test-topic \
--num-records 1000000 \
--record-size 100 \
--throughput -1 \
--producer-props bootstrap.servers=10.0.1.50:9092 acks=1
The same command ran against Redpanda on 10.0.1.51 with identical parameters. Here are the results:
| Test | Kafka rec/s | Kafka MB/s | Kafka avg lat | Redpanda rec/s | Redpanda MB/s | Redpanda avg lat |
|---|---|---|---|---|---|---|
| 100B, 1 partition, acks=1 | 203,915 | 19.45 | 989 ms | 296,559 | 28.28 | 533 ms |
| 1KB, 1 partition, acks=1 | 81,426 | 79.52 | 350 ms | 100,593 | 98.24 | 191 ms |
| 100B, 6 partitions, acks=1 | 479,386 | 45.72 | 17 ms | 291,205 | 27.77 | 34 ms |
| 1KB, 6 partitions, acks=1 | 165,782 | 161.90 | 57 ms | 61,839 | 60.39 | 460 ms |
| 100B, 1 partition, acks=all | 326,797 | 31.17 | 573 ms | 39,577 | 3.77 | 6,275 ms |
The pattern is striking. Redpanda wins on single-partition acks=1 workloads, pulling ahead by roughly 45% on small messages and 23% on 1 KB messages. That tracks with the C++ thread-per-core architecture handling individual partition writes efficiently.
But scale to 6 partitions and the story flips. Kafka pushed 479K records/second versus Redpanda’s 291K. At 1 KB message size with 6 partitions, the gap widens dramatically: Kafka delivered 161 MB/s versus Redpanda’s 60 MB/s. The JVM’s thread pool model parallelizes well across partitions.
The acks=all result deserves its own callout. Kafka managed 326K rec/s with 573 ms average latency. Redpanda crawled to 39K rec/s with 6,275 ms average latency. That is an 8x throughput gap and an 11x latency gap. This happens because Redpanda's single-broker community edition must write to the Raft log synchronously before acknowledging, and with acks=all on a single node, every write hits the full fsync path. Kafka, by contrast, does not fsync every batch by default: it acknowledges once the write reaches the page cache and leans on replication for durability, which makes acks=all on a single node far cheaper.
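The percentages and multiples quoted in this section fall straight out of the producer table; a quick sanity check on the arithmetic:

```python
# Values copied from the producer throughput table above
kafka    = {"rps_100b": 203_915, "rps_1kb": 81_426,  "aa_rps": 326_797, "aa_lat_ms": 573}
redpanda = {"rps_100b": 296_559, "rps_1kb": 100_593, "aa_rps": 39_577,  "aa_lat_ms": 6_275}

# Single-partition acks=1 gains for Redpanda
print(f"100B gain: {redpanda['rps_100b'] / kafka['rps_100b'] - 1:.0%}")  # ~45%
print(f"1KB gain:  {redpanda['rps_1kb'] / kafka['rps_1kb'] - 1:.0%}")    # ~24%

# acks=all gap in Kafka's favor
print(f"throughput gap: {kafka['aa_rps'] / redpanda['aa_rps']:.1f}x")       # ~8.3x
print(f"latency gap:    {redpanda['aa_lat_ms'] / kafka['aa_lat_ms']:.1f}x") # ~11.0x
```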
Sustained Throughput (5 Million Messages)
Short bursts tell one story. Sustained load tells another. We pushed 5 million 1 KB messages through each broker with 6 partitions and acks=1.
kafka-producer-perf-test.sh \
--topic sustained-test \
--num-records 5000000 \
--record-size 1024 \
--throughput -1 \
--producer-props bootstrap.servers=10.0.1.50:9092 acks=1
Kafka’s sustained throughput numbers:
5000000 records sent, 211282.7 records/sec (206.33 MB/sec),
2.53 ms avg latency, 41.00 ms max latency,
1 ms 50th, 5 ms 95th, 41 ms 99th, 41 ms 99.9th.
Redpanda’s sustained throughput on the same test:
5000000 records sent, 89455.3 records/sec (87.36 MB/sec),
315.42 ms avg latency, 514.00 ms max latency,
298 ms 50th, 487 ms 95th, 514 ms 99th, 514 ms 99.9th.
Kafka sustained 2.4x higher throughput under continuous load: 211K rec/s versus 89K rec/s. Average latency tells the same story at 2.53 ms versus 315 ms. The p99 gap is even wider at 41 ms versus 514 ms. Under sustained pressure, Kafka’s batching and page cache utilization outperform Redpanda on this hardware configuration.
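The sustained-load multiples come directly from the two perf-test printouts above; the arithmetic checks out:

```python
# Records/sec and p99 latency from the two sustained-test printouts above
kafka_rps, redpanda_rps = 211_282.7, 89_455.3
kafka_p99_ms, redpanda_p99_ms = 41, 514

print(f"throughput gap: {kafka_rps / redpanda_rps:.1f}x")      # ~2.4x in Kafka's favor
print(f"p99 latency gap: {redpanda_p99_ms / kafka_p99_ms:.1f}x")  # ~12.5x in Kafka's favor
```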
End-to-End Latency
Throughput is half the picture. For real-time event processing, latency per message matters more. We measured end-to-end latency (producer send to consumer receive) using the kafka-e2e-latency tool with 10,000 messages.
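The wrapper script ships in Kafka's bin/ directory; its positional arguments are broker list, topic, message count, producer acks, and message size in bytes (the topic name latency-test is illustrative, and the acks and size arguments were varied per table row — verify the argument order against the usage output of your install):

```shell
# broker_list      topic        num_messages  producer_acks  message_size_bytes
kafka-e2e-latency.sh 10.0.1.50:9092 latency-test 10000 1 100
```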
| Test | Kafka avg | Kafka p99 | Redpanda avg | Redpanda p99 |
|---|---|---|---|---|
| 100B, acks=1 | 1.05 ms | 4 ms | 0.79 ms | 3 ms |
| 1KB, acks=1 | 0.88 ms | 3 ms | 0.83 ms | 3 ms |
| 100B, acks=all | 0.92 ms | 3 ms | 2.09 ms | 6 ms |
Both systems deliver sub-millisecond average latency for acks=1. Redpanda is marginally faster at 0.79 ms versus 1.05 ms on 100-byte messages. At 1 KB, the difference shrinks to noise (0.83 ms versus 0.88 ms).
Switch to acks=all and Kafka pulls ahead clearly: 0.92 ms versus 2.09 ms average, 3 ms versus 6 ms at p99. The Raft fsync penalty on Redpanda shows up again. For workloads that require durability guarantees (financial transactions, audit logs), Kafka’s acks=all latency advantage is significant.
Consumer Throughput
Consuming is the less glamorous side of the benchmark, but it matters. We consumed the messages produced in the earlier tests using kafka-consumer-perf-test.
kafka-consumer-perf-test.sh \
--bootstrap-server 10.0.1.50:9092 \
--topic test-topic \
--messages 1000000 \
--threads 1
The consumer throughput results:
| Test | Kafka fetch MB/s | Redpanda fetch MB/s |
|---|---|---|
| 1 partition | 127.19 | 116.33 |
| 6 partitions | 156.17 | 164.24 |
Near-identical performance. Redpanda edges ahead on multi-partition reads (164 MB/s versus 156 MB/s), while Kafka is slightly faster on single-partition consumption. For most workloads, the consumer side is not going to be your deciding factor between these two systems.
Resource Usage
This is where the architectural differences become impossible to ignore.
| Metric | Kafka | Redpanda |
|---|---|---|
| Idle RSS | 1,215 MB | 3,420 MB |
| Under load RSS | 1,215 MB | 3,954 MB |
| Peak CPU (load) | 69% | 79% |
| Startup time | 5.1 s | 2.7 s |
| Install size | 137 MB + JDK (~250 MB total) | 239 MB |
| Data dir (after benchmarks) | 7.1 GB | 8.7 GB |
Kafka uses roughly 3x less memory. The JVM heap is capped at 2 GB and the process stayed at 1.2 GB RSS throughout testing, both idle and under load. Predictable. Easy to capacity-plan.
Redpanda allocated 3.4 GB at idle and grew to nearly 4 GB under load. This is by design, not a leak. The Seastar framework's memory allocator grabs available RAM upfront and uses it as an in-memory data cache. On a dedicated broker with 64 GB of RAM, this aggressive caching is an advantage. On an 8 GB VM sharing resources with other services, it can be a problem. You can cap it with the --memory flag, but out of the box, Redpanda is hungry.
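For reference, a sketch of capping the allocation when launching directly with rpk (flag names per the rpk documentation, so verify against your version; for packaged installs the same flags can go in redpanda.yaml under rpk.additional_start_flags):

```shell
# Cap Seastar's allocator at 2 GB and skip the extra OS memory reservation
rpk redpanda start --memory 2G --reserve-memory 0M
```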
Startup time favors Redpanda at 2.7 seconds versus Kafka’s 5.1 seconds. No JVM warmup penalty. This matters for containerized deployments where brokers may restart frequently, and for edge computing scenarios where fast cold starts are important.
Check memory usage on either system:
ps aux --sort=-%mem | grep -E "[k]afka|[r]edpanda" | awk '{print $6/1024 " MB", $11}'
Kafka API Compatibility
One of Redpanda’s key selling points is drop-in Kafka API compatibility. We tested this claim with a simple Python producer/consumer using kafka-python-ng (install with pip install kafka-python-ng). The same code ran against both brokers; only the bootstrap server address changed.
The producer script:
import json
import time
from kafka import KafkaProducer
producer = KafkaProducer(
bootstrap_servers=['10.0.1.50:9092'],
value_serializer=lambda v: json.dumps(v).encode('utf-8')
)
start = time.time()
for i in range(10000):
producer.send('compat-test', {'id': i, 'ts': time.time(), 'data': 'x' * 100})
producer.flush()
elapsed = time.time() - start
print(f"Produced 10,000 JSON messages in {elapsed:.2f}s ({10000/elapsed:.0f} msg/s)")
The consumer script:
import json
import time
from kafka import KafkaConsumer
consumer = KafkaConsumer(
'compat-test',
bootstrap_servers=['10.0.1.50:9092'],
auto_offset_reset='earliest',
consumer_timeout_ms=5000,
value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)
count = 0
start = time.time()
for msg in consumer:
count += 1
if count >= 5000:
break
elapsed = time.time() - start
print(f"Consumed {count} messages in {elapsed:.2f}s ({count/elapsed:.0f} msg/s)")
Results against Kafka (10.0.1.50):
Produced 10,000 JSON messages in 0.77s (12,954 msg/s)
Consumed 5,000 messages in 3.37s (1,484 msg/s)
Same code, pointed at Redpanda (10.0.1.51):
Produced 10,000 JSON messages in 0.94s (10,688 msg/s)
Consumed 5,000 messages in 3.18s (1,573 msg/s)
Both worked without any code changes. The API compatibility claim holds. Topic creation, producer sends, consumer group management, offset commits, and JSON serialization all functioned identically. Kafka was marginally faster on producing (12,954 versus 10,688 msg/s) while Redpanda was slightly faster on consuming (1,573 versus 1,484 msg/s). These Python-level differences are negligible compared to the native perf test results.
When to Choose Which
Choose Kafka when:
- Multi-partition throughput matters. Kafka delivered 2.6x higher throughput on 6-partition 1 KB workloads
- Durability is non-negotiable. The acks=all performance gap is massive (8x throughput, 10x lower latency)
- You need the ecosystem. Kafka Connect has 200+ connectors. Kafka Streams and ksqlDB have no Redpanda equivalent
- Memory is constrained. Kafka runs comfortably in 1.2 GB RSS. Redpanda wants 3x that or more
- Your team knows Java. Debugging JVM issues is a known skill. Debugging Seastar C++ is not
Choose Redpanda when:
- Single-partition low latency is the priority. Redpanda consistently beat Kafka on single-partition acks=1 by 25-45%
- Operational simplicity matters more than raw performance. No JVM tuning, no garbage collection pauses, no separate Schema Registry to manage
- Fast startup is critical. Containers, Kubernetes pods, edge deployments where 2.7 s versus 5.1 s matters
- Your team wants to avoid the JVM. Redpanda ships as a single binary with built-in management via rpk
- You need a built-in Schema Registry and HTTP proxy. Fewer moving parts in production
About that “10x lower latency” claim: our tests show roughly 25% lower average end-to-end latency at best (0.79 ms versus 1.05 ms) for single-partition acks=1 workloads, and Kafka actually wins on acks=all. The 10x number likely comes from comparing Redpanda against older Kafka versions running ZooKeeper with untuned JVM settings. Kafka 4.2.0 with KRaft is a very different animal than Kafka 2.x with ZooKeeper.
The Verdict
On identical hardware, Kafka 4.2.0 with KRaft delivered higher sustained throughput, used 3x less memory, and won every acks=all test. Redpanda’s edge is operational simplicity and slightly lower latency on single-partition acks=1 workloads. The “10x faster” marketing does not survive a controlled benchmark.
Both are production-grade systems. Both handle the Kafka API correctly. To get started, see our guides for deploying Kafka with Docker Compose or Redpanda with Docker Compose. The choice comes down to your workload profile and what your team is comfortable operating. If you run multi-partition topics with durability requirements (and most production deployments do), Kafka 4.2.0 with KRaft is the stronger performer on equivalent hardware. If you run lightweight, single-partition event streams and value operational simplicity over raw throughput, Redpanda earns its keep.