I use Confluent 5.5.0 and run all components on the same box. Everything works fine except Kafka, which writes messages like this to its log:
ERROR Error while accepting connection (kafka.network.Acceptor)
java.net.SocketException: Invalid argument
at sun.nio.ch.Net.setIntOption0(Native Method)
at sun.nio.ch.Net.setSocketOption(Net.java:341)
at sun.nio.ch.SocketChannelImpl.setOption(SocketChannelImpl.java:190)
at sun.nio.ch.SocketAdaptor.setBooleanOption(SocketAdaptor.java:271)
at sun.nio.ch.SocketAdaptor.setTcpNoDelay(SocketAdaptor.java:306)
at kafka.network.Acceptor.accept(SocketServer.scala:654)
at kafka.network.Acceptor.run(SocketServer.scala:579)
at java.lang.Thread.run(Thread.java:748)
Kafka also tries to create TCP sockets with port number “0”. In netstat it looks like this:
Over time the number of these fake TCP sockets only grows. I have a limit on the number of open files per process, and when Kafka reaches that limit it is killed by the operating system. Raising the limit is not a solution, because Kafka reaches it anyway, just somewhat later.
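To watch how close a JVM is to its descriptor limit, the counts can be read from the operating system MXBean. This is only a minimal sketch: the class name `FdUsage` is made up for illustration, and it reports on the JVM it runs in, not on the Kafka broker. For the broker itself, the same two attributes (`OpenFileDescriptorCount` and `MaxFileDescriptorCount`) should be readable over JMX from the `java.lang:type=OperatingSystem` bean, assuming JMX is enabled there.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper, not part of Kafka or Confluent.
final class FdUsage {
    /** Returns {open, max} descriptor counts for this JVM, or null if the platform bean is unavailable. */
    static long[] snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null; // e.g. a non-Unix JVM
    }

    public static void main(String[] args) {
        long[] counts = snapshot();
        if (counts == null) {
            System.out.println("descriptor counts are not available on this JVM");
        } else {
            System.out.printf("open=%d max=%d%n", counts[0], counts[1]);
        }
    }
}
```

Sampling this periodically would show whether the open count climbs steadily toward the maximum, which is the pattern described above.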
Yes, the same errors.
I think it’s related to ZooKeeper, but I can’t understand how. Some time ago I tried setting the JMX_PORT variable in bin/kafka-run-class: ZooKeeper started, but Kafka didn’t. The error was slightly different, something like “can’t use port”. When I removed JMX_PORT, Kafka started. So ZooKeeper and Kafka use the same environment and the same predefined variables, such as JMX_PORT.
ZooKeeper uses only these ports (according to netstat): 2181 and 49146. The first is the standard one, the second is assigned randomly. It seems to me that Kafka wants to run ZooKeeper itself rather than as a separate process. I also tried not starting ZooKeeper, but then Kafka couldn’t start because it couldn’t connect to ZooKeeper.
Not exactly.
JMX_PORT is just an example (they may share some other variable as well) showing that ZooKeeper and Kafka try to use the same port:
with JMX_PORT set, Kafka can’t start
without JMX_PORT (with the default options in bin/kafka-run-class), Kafka starts but writes the errors I described in my first message to the log.
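The JMX_PORT behaviour is what one would expect if two processes started from the same script both try to listen on one fixed port: only the first bind succeeds. The sketch below reproduces that inside a single JVM with plain sockets. `PortConflictDemo` is a hypothetical name, and it uses an ephemeral port rather than a real JMX port.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical demo class; it does not touch Kafka, ZooKeeper, or JMX.
final class PortConflictDemo {
    /** Binds an ephemeral port, then tries to bind it again; returns true if the second bind is rejected. */
    static boolean secondBindFails() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket first = new ServerSocket(0, 50, loopback)) {
            int port = first.getLocalPort(); // port chosen by the OS
            try (ServerSocket second = new ServerSocket(port, 50, loopback)) {
                return false; // not expected while "first" is still listening
            } catch (BindException e) {
                return true;  // the port is already taken by the first listener
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```

If that is the cause, giving each process its own port (or setting the variable only for one of them) would avoid the clash. It would not explain the socket errors in the first message, which appear even without JMX_PORT.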
Because it’s Oracle Solaris, there is no lsof command. Instead I use pfiles, which shows all files in use by a process.
Number of regular files (it’s not growing):
Sorry for the late reply. I found the solution. It’s a bug related to setsockopt with TCP_NODELAY (setTcpNoDelay in Java terms): Java can’t set that option.
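For illustration, here is one way an accept loop can avoid leaking a descriptor when setTcpNoDelay fails: close the accepted channel before giving up on it. This is a sketch under assumptions, not Kafka's code and not the fix applied here. `SafeAccept` and its methods are hypothetical names. The stack trace only shows that the exception is thrown after the connection has already been accepted, which would fit the growing socket count if the channel is never closed.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketException;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical helper, not taken from Kafka.
final class SafeAccept {
    /** Accepts one connection; closes it and returns null if TCP_NODELAY cannot be set. */
    static SocketChannel acceptConfigured(ServerSocketChannel server) throws IOException {
        SocketChannel channel = server.accept();
        if (channel == null) {
            return null; // non-blocking server with no pending connection
        }
        try {
            channel.socket().setTcpNoDelay(true);
            return channel;
        } catch (SocketException e) {
            channel.close(); // release the descriptor instead of leaking it
            return null;
        }
    }

    /** Connects to a local listener and reports whether the accepted side has TCP_NODELAY set. */
    static boolean demo() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = acceptConfigured(server)) {
                return accepted != null && accepted.socket().getTcpNoDelay();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("TCP_NODELAY set on accepted socket: " + demo());
    }
}
```

On a system where the option cannot be set, `acceptConfigured` would return null and the descriptor would be released instead of accumulating.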