Kafka on Dell PowerEdge Servers – a Winning Combination
Mon, 06 Feb 2023 19:07:45 -0000
By far the most popular pub/sub messaging software is Kafka. Producers send data and messages to a broker for later use by consumers. Data is published to one or more topics, which act as queues. Consumers read messages from a topic and record how far they have read. A topic may have multiple consumers. Topics may be partitioned to enable parallel processing across brokers. Messages are not removed the moment they are read; Kafka deletes them once the topic's retention limit (time or size) is reached. Replicas create additional copies of your data to help prevent data loss.
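To make the flow above concrete, here is a toy in-memory model of a partitioned topic with two independent consumers. It is only a sketch for intuition: `ToyTopic`, `publish` and `poll` are hypothetical names invented for this example and are not part of any Kafka client API.

```python
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a partitioned topic (illustrative only, not the Kafka API)."""

    def __init__(self, num_partitions):
        # Each partition is an append-only list of messages.
        self.partitions = [[] for _ in range(num_partitions)]
        # offsets[consumer][partition] is the index of the next message to read.
        self.offsets = defaultdict(lambda: [0] * num_partitions)

    def publish(self, partition, message):
        """Producer side: append a message to one partition."""
        self.partitions[partition].append(message)

    def poll(self, consumer, partition):
        """Consumer side: return unread messages and advance this consumer's position."""
        start = self.offsets[consumer][partition]
        unread = self.partitions[partition][start:]
        self.offsets[consumer][partition] = len(self.partitions[partition])
        return unread


topic = ToyTopic(num_partitions=2)
topic.publish(0, "order-1")
topic.publish(1, "order-2")
topic.publish(0, "order-3")

billing_first = topic.poll("billing", 0)   # ['order-1', 'order-3']
billing_second = topic.poll("billing", 0)  # [] because billing has caught up
audit_first = topic.poll("audit", 0)       # ['order-1', 'order-3'], audit reads independently
print(billing_first, billing_second, audit_first)
```

Each consumer keeps its own position, which is why a second consumer can read the same messages without affecting the first.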
Regarding your platform choice, there are many options, including:
- Bare metal servers with DAS
- Virtualized
- HCI
- Kubernetes (K8s)
Some tips:
- Keep your cluster clean. Don’t use Kafka to retain or replay data beyond a few days or a week. Once data has been consumed, let the topic’s retention policy delete it.
- Use an odd number of nodes, with a minimum of three or five depending on your tolerance for failures. Most environments will have many more nodes.
- The storage should be local and SSDs are highly recommended.
- RAID should not be needed if replicas are in effect.
- Use random partitioning so that messages spread evenly across partitions.
- One replica (two copies of the data) is likely the minimum viable configuration, with two replicas (three copies) being most common in production.
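A quick sketch of two of the tips above: how replicas map to copies of the data, and why random partitioning tends to balance load. The helper names are hypothetical and the random assignment is a plain Python simulation, not a Kafka partitioner. Note that Kafka's own replication factor setting counts total copies, so "two replicas" in the sense used here corresponds to a replication factor of 3.

```python
import random
from collections import Counter


def copies_of_data(replicas):
    """Total copies = the original plus each replica (the counting used in the tips)."""
    return replicas + 1


def tolerated_broker_failures(replicas):
    """Brokers holding a partition that can fail before the last copy is gone."""
    return replicas


def random_partition(num_partitions, rng):
    """Pick a partition uniformly at random (simulation only)."""
    return rng.randrange(num_partitions)


for replicas in (1, 2):
    print(replicas, "replica(s):", copies_of_data(replicas), "copies, survives",
          tolerated_broker_failures(replicas), "broker failure(s)")

# Simulate 60,000 messages spread at random over 6 partitions.
rng = random.Random(42)  # fixed seed so the run is repeatable
counts = Counter(random_partition(6, rng) for _ in range(60_000))
spread = max(counts.values()) / min(counts.values())
print("busiest/quietest partition ratio:", round(spread, 3))
```

The ratio should land close to 1, meaning no partition (and therefore no broker or drive) carries much more than its share.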
What might this look like on PowerEdge servers? Among the 15G Ice Lake servers, the most attractive choice would be the R650. It’s a 1U server with 10 drive bays, decent memory capacity and a wide selection of processors. A middle-of-the-road configuration might look something like the following:
- Seven R650 servers
- 256GB of RAM with 16 x 16GB DIMMs in a fully balanced configuration
- Dual 16-core processors with a somewhat faster clock speed; the Xeon Gold 6346 at 3.1GHz would fit the bill
- Dual 25GbE NICs
- HBA355E – This assumes no RAID for your data drives
  - If you plan on using RAID for your Kafka data, select the H755 PERC instead, which has 8GB of cache.
- 6 x 1.92TB vSAS RI SSDs
  - 99% of the time, read-intensive (RI) drives will suffice.
  - If your retention is one day or less, then mixed-use drives would be in order, but I’ve not seen that.
- M.2 BOSS 480GB RI SSD pair – fully hot-swappable RAID1 pair
  - Here’s where your OS and possibly the Confluent Kafka software would go.
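For a rough feel of what the configuration above holds, here is a back-of-the-envelope capacity sketch. The node, drive and size figures come from the list; the three-copy assumption follows the "two replicas" tip, while the 70% fill target and the 2 TB/day ingest rate are purely illustrative assumptions, not recommendations.

```python
NODES = 7              # R650 servers in the example configuration
DRIVES_PER_NODE = 6    # data SSDs per server (the BOSS pair is excluded)
DRIVE_TB = 1.92        # capacity per data SSD, in TB
COPIES = 3             # assumption: two replicas plus the original

raw_tb = NODES * DRIVES_PER_NODE * DRIVE_TB   # about 80.64 TB raw
unique_tb = raw_tb / COPIES                   # about 26.88 TB of unique data


def retention_days(ingest_tb_per_day, fill_target=0.7):
    """Days of data the cluster can hold at a given ingest rate.

    fill_target is an assumed safety margin (fraction of capacity you are
    willing to fill), not a figure from the configuration above.
    """
    return unique_tb * fill_target / ingest_tb_per_day


print("raw capacity (TB):", round(raw_tb, 2))
print("unique data (TB):", round(unique_tb, 2))
print("days of retention at 2 TB/day:", round(retention_days(2.0), 1))
```

Plug in your own ingest rate to see whether the "few days to a week" retention guideline fits on the drives you have chosen.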
For your Kafka needs, feel free to contact me at Mike.King2@dell.com to discuss your challenge further.