SSL/TLS imposes a very high throughput penalty

Hi,
We’ve just deployed a new Kafka cluster with Confluent Platform 5.5.1 on VMs with the following configuration:

  • 3 ZooKeeper nodes
  • 8 broker nodes with 12 CPUs / 32 GB RAM
  • 2 different listeners configured: one SASL_PLAINTEXT and one SASL_SSL
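
For reference, the two listeners are declared along these lines in server.properties (hostnames, ports, keystore paths, passwords and the SASL mechanism below are placeholders, not our exact values):

```properties
# Sketch of the dual-listener setup -- hostnames, ports, paths, passwords
# and the SASL mechanism are placeholders.
listeners=SASL_PLAINTEXT://broker1.example.com:9092,SASL_SSL://broker1.example.com:9093
advertised.listeners=SASL_PLAINTEXT://broker1.example.com:9092,SASL_SSL://broker1.example.com:9093
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.enabled.mechanisms=PLAIN

# TLS material used by the SASL_SSL listener
ssl.keystore.location=/var/ssl/private/broker1.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
ssl.truststore.location=/var/ssl/private/broker1.truststore.jks
ssl.truststore.password=changeit
```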

I’m running performance tests using kafka-producer-perf-test.sh with different values for the record size and batch.size (an example invocation is sketched below, after the topic configuration).

Topic configuration:

  • 4 partitions
  • 3 replicas
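
Each run looks roughly like this (topic name, bootstrap servers and the client properties path are placeholders; only the record size, batch.size and the target listener change between runs):

```bash
# Sketch of one test run -- topic name, bootstrap servers and the path to the
# client properties file are placeholders; --record-size and batch.size are
# the values varied across runs.
kafka-producer-perf-test.sh \
  --topic perf-test \
  --num-records 10000 \
  --record-size 100000 \
  --throughput -1 \
  --producer-props bootstrap.servers=broker1.example.com:9093 batch.size=16384 acks=1 \
  --producer.config /etc/kafka/client-sasl-ssl.properties
```

where the client properties file sets security.protocol=SASL_SSL, the SASL credentials and the truststore location (and security.protocol=SASL_PLAINTEXT against port 9092 for the other listener).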

I’m very surprised by the results: the penalty on throughput and latency is around 50% when using the SASL_SSL listener, compared to the 25-30% I expected.

SASL_PLAINTEXT listener

| Record Size | Batch Size | Compression | Acks | Records sent | Throughput | Avg latency (ms) | Max latency (ms) |
|---|---|---|---|---|---|---|---|
| 100000 | 16384 | none | 1 | 10000 | 222.2 | 117.44 | 681 |
| 100000 | 32768 | none | 1 | 10000 | 303.81 | 85.35 | 506 |
| 100000 | 65536 | none | 1 | 10000 | 239.98 | 112.34 | 576 |
| 1000000 | 16384 | none | 1 | 10000 | 360.54 | 87.75 | 586 |
| 1000000 | 32768 | none | 1 | 10000 | 348.58 | 90.76 | 691 |
| 1000000 | 65536 | none | 1 | 10000 | 339.57 | 92.82 | 652 |
| 4000000 | 16384 | none | 1 | 10000 | 253.82 | 132.33 | 1493 |
| 4000000 | 32768 | none | 1 | 10000 | 251.57 | 133.48 | 1631 |
| 4000000 | 65536 | none | 1 | 10000 | 261.14 | 128.47 | 721 |

SASL_SSL listener

| Record Size | Batch Size | Compression | Acks | Records sent | Throughput | Avg latency (ms) | Max latency (ms) |
|---|---|---|---|---|---|---|---|
| 100000 | 16384 | none | 1 | 10000 | 95.99 | 308.18 | 858 |
| 100000 | 32768 | none | 1 | 10000 | 90.34 | 323.95 | 913 |
| 100000 | 65536 | none | 1 | 10000 | 88.57 | 328 | 966 |
| 1000000 | 16384 | none | 1 | 10000 | 146.61 | 217.54 | 1128 |
| 1000000 | 32768 | none | 1 | 10000 | 144.3 | 220.47 | 945 |
| 1000000 | 65536 | none | 1 | 10000 | 139.54 | 228.3 | 1265 |
| 4000000 | 16384 | none | 1 | 10000 | 144.84 | 233.43 | 1129 |
| 4000000 | 32768 | none | 1 | 10000 | 146.59 | 230.89 | 1202 |
| 4000000 | 65536 | none | 1 | 10000 | 146.59 | 230.99 | 1050 |

Do you have any idea which configuration setting or parameter could explain this poor performance through the SSL listener?
Thanks for your help,
Lionel

The broker’s network threads have to do extra work for ProduceRequests sent over the SSL protocol, because they perform the TLS encryption and decryption. The broker may have enough network threads allocated to handle the requests received on the SASL_PLAINTEXT listener, but not enough to efficiently process the same load on the SASL_SSL listener, which may be contributing to the large throughput and latency penalty you are seeing. Increasing the broker num.network.threads setting from its default value of 3 may help.
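
As a rough starting point (the right values depend on the brokers’ CPU count and should be validated against the broker’s idle-percentage metrics rather than taken as-is):

```properties
# Sketch of a server.properties tuning for 12-CPU brokers -- treat the numbers
# as a starting point and check the broker metrics
#   kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent
#   kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent
# before and after the change.

# Network threads handle the TLS encryption/decryption; the default is 3.
num.network.threads=8

# Request handler (I/O) threads; the default is 8, shown here for completeness.
num.io.threads=8
```

If NetworkProcessorAvgIdlePercent stays close to 0 under load on the SASL_SSL listener, the network thread pool is the bottleneck and increasing it should help.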
