We have run d2 instances with Kafka. They're currently unstable: Amazon confirmed yesterday that there is a host issue with d2 instances that gets tickled by a Kafka workload. Otherwise, the d2 instance type seems ideal: it gets an enormous amount of disk throughput, so you'll more likely be bottlenecked on the network than on disk.


Steven Wu
June 2, 2015 at 1:07 PM
EBS (network-attached storage) has gotten a lot better over the last few
years, but we still don't quite trust it for Kafka workloads.

At Netflix, we are going with the new d2 instance type (HDD). Our
perf/load testing shows it satisfies our workload. SSD has a better latency
curve but is pretty comparable in terms of throughput. We can use the extra
space on HDD for a longer retention period.
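
To make the retention trade-off concrete, here is a back-of-the-envelope
sketch in Python. The function name, the 80% headroom factor, and the
example numbers are all assumptions for illustration, not figures from our
testing; the ingest rate should include replica traffic landing on the
broker.

```python
def retention_hours(usable_disk_bytes, ingest_bytes_per_sec, headroom=0.8):
    """Estimate how many hours of data a broker can keep on disk.

    usable_disk_bytes:    total log-directory capacity on the broker
    ingest_bytes_per_sec: sustained write rate to this broker, including
                          replicated partitions (assumed constant)
    headroom:             fraction of the disk we are willing to fill
                          (0.8 is an arbitrary safety margin)
    """
    if ingest_bytes_per_sec <= 0:
        raise ValueError("ingest_bytes_per_sec must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    seconds = usable_disk_bytes * headroom / ingest_bytes_per_sec
    return seconds / 3600


# Hypothetical broker: 12 TB of local disk, 50 MB/s sustained ingest.
hours = retention_hours(12e12, 50e6)
```

With those made-up inputs the estimate comes out a little over 53 hours;
doubling the disk at the same ingest rate doubles the retention window.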

On Tue, Jun 2, 2015 at 9:37 AM, Henry Cai <hcai@pinterest.com.invalid> wrote:

Henry Cai
June 2, 2015 at 12:37 PM
We have been hosting Kafka brokers on Amazon EC2 using EBS disks, but
periodically we are hit by long I/O wait times on EBS in some
Availability Zones.

We are thinking of changing the instance type to one with local HDD or
local SSD. HDD is cheaper and bigger and seems a good fit for the Kafka use
case, which is mostly sequential read/write. However, some early experiments
show the HDD cannot keep up with the message production rate: there are many
topics/partitions on the broker, so writes are spread across many log files
and the disk I/O becomes closer to random access.

What are people's experiences with choosing disk types on Amazon?
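
For anyone who wants to reproduce the many-partitions effect on a candidate
instance, a rough Python sketch follows. It appends fixed-size chunks
round-robin across N files, as a stand-in for N partition logs on one disk,
and reports the aggregate write rate. This is only an approximation and not
how Kafka itself writes: the function name, file names, and chunk size are
made up, and it ignores consumer reads, replication, and the fact that Kafka
normally relies on the OS page cache rather than an explicit fsync.

```python
import os
import time


def append_throughput(directory, num_files, total_bytes, chunk_size=64 * 1024):
    """Append ~total_bytes round-robin across num_files; return MB/s."""
    if num_files < 1:
        raise ValueError("num_files must be at least 1")
    chunk = b"\0" * chunk_size
    paths = [os.path.join(directory, f"partition-{i}.log")
             for i in range(num_files)]
    # Unbuffered binary files, so each write() goes straight to the OS.
    files = [open(p, "wb", buffering=0) for p in paths]
    written = 0
    start = time.perf_counter()
    try:
        i = 0
        while written < total_bytes:
            files[i % num_files].write(chunk)  # interleave across "partitions"
            written += chunk_size
            i += 1
        # Force the data out of the page cache so the disk is actually measured.
        for f in files:
            os.fsync(f.fileno())
    finally:
        for f in files:
            f.close()
    elapsed = max(time.perf_counter() - start, 1e-9)
    for p in paths:
        os.remove(p)
    return written / elapsed / 1e6
```

Run it against a directory on the disk under test with the same total_bytes
and num_files set to 1 and then to a few hundred (staying below the
process's open-file limit). To see the disk rather than the page cache,
total_bytes should be well above the machine's RAM. A large drop in MB/s as
num_files grows would be consistent with the seek-bound behaviour described
above.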