Standalone Kafka Connect on Raspberry Pi

I am trying to architect my first Kafka deployment. For our use case, we need to stream log files from a Raspberry Pi to a Kafka cluster deployed on AWS. Since the Raspberry Pi (RPi) is behind our institute firewall, we cannot access it from the cloud, but I was hoping to install standalone Kafka Connect on the RPi itself. I was then planning to use the FileStream connector (or the FilePulse source connector) to stream the data to the cloud.
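
Roughly, this is the kind of setup I have in mind for the standalone worker and the FileStream source connector (just a sketch; the broker address, file path, and topic name below are placeholders for our environment):

# connect-standalone.properties (worker config)
bootstrap.servers=my-cluster.example.aws:9092
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
offset.storage.file.filename=/tmp/connect.offsets

# filestream-source.properties (connector config)
name=sensor-log-source
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=/var/log/sensors/readings.log
topic=sensor-readings

I would then start the worker on the RPi with bin/connect-standalone.sh connect-standalone.properties filestream-source.properties.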

Does this design make sense? Any feedback or areas for improvement?

And finally, how would I go about creating a Dockerfile or Docker image for Kafka Connect on the RPi? No ARM64-compatible Kafka Connect image exists.

hey @chintanp

could you describe a little more in detail what you plan to do?
I guess reading some logfiles (webserver etc.) on the Raspberry Pi and sending them to Kafka?

afaik you could use docker buildx to build some multi-architecture images
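
something along these lines could work (completely untested sketch; the Kafka version, paths and image name are just examples):

# Dockerfile - download Apache Kafka onto a multi-arch Java base image
FROM eclipse-temurin:17-jre
ADD https://archive.apache.org/dist/kafka/3.6.1/kafka_2.13-3.6.1.tgz /tmp/kafka.tgz
RUN tar -xzf /tmp/kafka.tgz -C /opt && mv /opt/kafka_2.13-3.6.1 /opt/kafka && rm /tmp/kafka.tgz
# bake the worker and connector configs into the image
COPY connect-standalone.properties filestream-source.properties /opt/kafka/config/
CMD ["/opt/kafka/bin/connect-standalone.sh", "/opt/kafka/config/connect-standalone.properties", "/opt/kafka/config/filestream-source.properties"]

and then build and push for both architectures (the repo name is a placeholder):

docker buildx build --platform linux/arm64,linux/amd64 -t yourrepo/kafka-connect-rpi:latest --push .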

hth,
michael

@mmuehlbeyer: Thank you for your reply.

The Raspberry Pi is collecting sensor data; we write it to a log file first and then stream this data to other systems, such as a database. We plan to use Kafka as the broker. This allows us to decouple the sensor-data-collection development from the data processing and analysis piece. The data is mostly floats and timestamps.


ts1, 10.0, 11.4, 12, 12
ts2, 11, 14.4, 13, 14 
....

In the above sample dataset, ts2 is always greater than ts1, and the period (ts2 - ts1) is usually on the order of 1 s but can be shorter, down to about 10 ms.

I am considering installing Kafka Connect on the RPi because this device is behind a firewall that only allows outgoing connections, so a local Connect instance can push the data out rather than a cloud Connect instance pulling the data in. Does this approach make sense?

I will explore docker buildx and update my progress here.

Thanks

ok I see.
one thing:
what about writing the data directly to Kafka instead of writing it to a logfile first?

basically yes, though as mentioned above: is it possible to push the data directly to Kafka instead of using Kafka Connect to read it from a logfile?
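
just as an illustration, a minimal python sketch with the confluent-kafka client (broker address and topic name are placeholders):

# minimal producer sketch; broker address and topic are placeholders
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "my-cluster.example.aws:9092"})

def send_reading(ts, values):
    # one CSV line per reading, same shape as the logfile sample above
    line = f"{ts}, {', '.join(str(v) for v in values)}"
    producer.produce("sensor-readings", value=line.encode("utf-8"))

send_reading("ts1", [10.0, 11.4, 12, 12])
producer.flush()  # block until all messages are delivered

that way the sensor process writes straight to the topic and you skip the logfile and the Connect worker on the Pi entirely.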

best
michael


@mmuehlbeyer Thank you for your reply.

Sending directly to Kafka seems like the easiest and least complicated approach, and we plan to implement that.

The main issue is that this creates strong coupling between Kafka and our client application, and the client developer now needs to “know” about Kafka.

Thanks