FileStreamSourceConnector

"connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
"file": "/data/connect/5_source.txt",
"tasks.max": "1",
"name": "file-source-a-05",
"topic": "file-topic-a-05"

When I add data to the 5_source.txt file with the cat command or echo command, the source connector fetches the data and puts it into the topic.

$ cat >> 5_source.txt
yeah
hello
$ echo "hello world" >> 5_source.txt

But when I add data directly to the file with the vi editor, the source connector does not seem to pick it up. Is this expected?

This is likely a chicken-and-egg problem. By the time the vi editor closes, the file has not yet been picked up by Kafka Connect, because Connect tries to create a file handle for it after its bootstrap… as you can see here.
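One related mechanism, not confirmed anywhere in this thread, can be sketched in plain Python: a reader that keeps a file handle open sees data appended in place (as `cat >>` and `echo >>` do), but sees nothing new if the file is replaced by a fresh one under the same name, which some editors do on save depending on their settings. The sketch assumes a POSIX file system; `demo_handle_vs_replace` is a hypothetical name, and the reader only stands in for a long-lived handle, it is not the connector's actual code.

```python
import os
import tempfile


def demo_handle_vs_replace():
    """Show that an open handle sees appends but not a file swapped in by rename."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "5_source.txt")
    with open(path, "w") as f:
        f.write("first\n")

    # Stand-in for a long-lived reader (assumption: not the connector's real code).
    reader = open(path, "r")
    seen = [reader.readline()]

    # Case 1: append in place, like `echo "hello world" >> 5_source.txt`.
    with open(path, "a") as f:
        f.write("hello world\n")
    seen.append(reader.readline())  # same underlying file grew, so the reader sees it

    # Case 2: write a new file and rename it over the old name.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write("first\nhello world\nadded in editor\n")
    os.replace(tmp, path)
    seen.append(reader.readline())  # the handle still points at the old file: no new data

    reader.close()
    return seen


if __name__ == "__main__":
    print(demo_handle_vs_replace())
```

If the editor rewrites the file in place instead of renaming, the behaviour will differ, so treat this only as one thing worth checking.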

IMHO, using Kafka Connect to ingest files into Kafka is a waste of computing power. You don't need yet another distributed system just for this. It would be better to use something smaller and more lightweight, such as Elastic Filebeat. If you need an example of how to tail files and store their lines in a Kafka topic, here is an example:
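Separately from the linked example, the general idea of tailing a file can be sketched in plain Python. This is a rough illustration only, not how Filebeat or Kafka Connect are implemented, and it leaves out the Kafka producer entirely. `read_new_lines` and the `state` dict are hypothetical names; the sketch assumes UTF-8 text and a file system that reports inode numbers.

```python
import os


def read_new_lines(path, state):
    """Return complete lines added to `path` since the previous call.

    `state` is a dict that remembers the inode and the byte offset already consumed.
    """
    st = os.stat(path)
    # A different inode or a shrunken file means it was replaced or truncated: start over.
    if state.get("inode") != st.st_ino or st.st_size < state.get("offset", 0):
        state["inode"] = st.st_ino
        state["offset"] = 0

    with open(path, "rb") as f:
        f.seek(state["offset"])
        data = f.read()

    # Consume only up to the last newline so a half-written line waits for the next poll.
    end = data.rfind(b"\n") + 1
    state["offset"] += end
    return data[:end].decode("utf-8").splitlines()
```

Calling `read_new_lines(path, state)` in a polling loop with the same `state` dict returns only what is new each time. Note that after a replacement this sketch re-reads the file from the start, so lines that were already delivered come back again; a real shipper has to decide how to handle that.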

@riferrei

I'll echo what @riferrei stated. If you have files on a file system (block-level access), use Filebeat. It's going to be easier and more reliable in the long run.

Object stores, like S3 or HDFS, are a bit different as they have different qualities and notification systems.
