My source data lives in a Snowflake database (i.e. Snowflake is the producer side). I want to fetch that data via Confluent and push it to a Kafka topic, without having to write a separate producer application that reads from Snowflake.
Note: I have read about the Snowflake Sink connector in Confluent, but it only supports pushing data into Snowflake tables, not reading data out of them.
You have a couple of options here, and I'll try to hit the trade-offs:
Use the JDBC source connector. This requires a timestamp and/or incrementing column in your Snowflake schema, and it will put a relatively constant query load on your Snowflake database. The JDBC source connector doesn't seem to be heavily tested against Snowflake, so you'll have to try it out yourself. The advantage is fairly low latency (think minutes).
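For illustration, here's a minimal sketch of registering the JDBC source connector against Snowflake through the Kafka Connect REST API. The connector name, Snowflake account, credentials, table, and column names are all placeholders, and it assumes the Snowflake JDBC driver jar is on the Connect worker's classpath:

```python
# Hedged sketch: register a Confluent JDBC Source connector that reads from
# Snowflake via the Kafka Connect REST API. All names/credentials below are
# placeholders -- adjust to your environment.
import json
import requests

connector = {
    "name": "snowflake-jdbc-source",  # hypothetical connector name
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        # Snowflake JDBC URL; account, db, schema and warehouse are placeholders
        "connection.url": "jdbc:snowflake://myaccount.snowflakecomputing.com/"
                          "?db=MYDB&schema=PUBLIC&warehouse=MY_WH",
        "connection.user": "MY_USER",
        "connection.password": "MY_PASSWORD",
        # timestamp+incrementing mode needs both columns in the source table
        "mode": "timestamp+incrementing",
        "timestamp.column.name": "UPDATED_AT",
        "incrementing.column.name": "ID",
        "table.whitelist": "ORDERS",
        "topic.prefix": "snowflake_",   # topic becomes snowflake_ORDERS
        "poll.interval.ms": "60000",    # how often Connect polls Snowflake
        "tasks.max": "1",
    },
}

# Kafka Connect REST API, default port 8083 on the Connect worker
resp = requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()
print(resp.json())
```

The poll interval is the main knob here: shorter intervals mean lower latency but more frequent queries against Snowflake.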
Use an ETL/orchestration tool such as Apache Camel, Airbyte, or Airflow. This is more of a batch-style approach: the latency depends on how often you run the job, and how often you run the job depends on how much load you're willing to put on Snowflake.
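As a rough illustration of the batch approach, each run might look something like the sketch below. The library choices (snowflake-connector-python, confluent-kafka), the ORDERS table, the UPDATED_AT watermark column, and the topic name are all assumptions; an Airflow DAG (or an Airbyte/Camel equivalent) would just invoke something like this on a schedule:

```python
# Hedged sketch of one batch run: pull rows newer than a watermark from
# Snowflake and produce them to a Kafka topic. Connection details, table,
# column and topic names are placeholders.
import json

import snowflake.connector
from confluent_kafka import Producer


def run_batch(last_watermark: str) -> None:
    conn = snowflake.connector.connect(
        account="myaccount",      # placeholder Snowflake account
        user="MY_USER",
        password="MY_PASSWORD",
        warehouse="MY_WH",
        database="MYDB",
        schema="PUBLIC",
    )
    producer = Producer({"bootstrap.servers": "localhost:9092"})

    try:
        cur = conn.cursor(snowflake.connector.DictCursor)
        # Only pull rows newer than the last successful run; UPDATED_AT is an
        # assumption about the source table's schema.
        cur.execute(
            "SELECT * FROM ORDERS WHERE UPDATED_AT > %s ORDER BY UPDATED_AT",
            (last_watermark,),
        )
        for row in cur:
            producer.produce("snowflake_orders", value=json.dumps(row, default=str))
        producer.flush()
    finally:
        conn.close()


if __name__ == "__main__":
    run_batch("2024-01-01 00:00:00")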