We are creating a Data Lake POC using Apache Kafka as the streaming event platform. I need suggestions on which database we should use as the source and the target, with Apache Kafka sitting in the middle streaming the data.
Which one do you think would be more feasible: Oracle Database Cloud or PostgreSQL? If Oracle Database Cloud is used as both the source data feed and the target database, which connector should we use? The JDBC source/sink connector?
Any suggestions on which database would be more feasible to use with Kafka would be appreciated.
There’s no way you’re going to get one straightforward answer on this. Toss a coin. Throw a dart at a dashboard. Ask random people on the internet. All will give you the same level of confidence in the answer.
What are your requirements and constraining factors? Do you already have skills in one or the other? Are you looking for fully managed or self-managed? Bare metal or containers? Cloud or local? Free to use or commercially licensed? Community support or vendor support?
Both have huge adoption, both are battle-tested and proven. Both support CDC out into Kafka. Both support being loaded from Kafka with the JDBC connector.
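As a rough illustration of the sink side, here is what a JDBC sink connector configuration for loading a Kafka topic into a target database could look like. This is a sketch, not a drop-in config: the connector name, topic, connection URL, credentials, and key field are all placeholders you would replace for your environment.

```json
{
  "name": "jdbc-sink-orders",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "connection.url": "jdbc:postgresql://postgres:5432/target_db",
    "connection.user": "db_user",
    "connection.password": "db_password",
    "topics": "orders",
    "insert.mode": "upsert",
    "pk.mode": "record_key",
    "pk.fields": "order_id",
    "auto.create": "true"
  }
}
```

For an Oracle target you would swap in an Oracle JDBC `connection.url` (e.g. the `jdbc:oracle:thin:@...` form) and make sure the Oracle JDBC driver is on the Connect worker’s classpath. `upsert` mode with a key from the record avoids duplicate rows when the connector reprocesses messages.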
For Oracle as a source you can use connectors including the JDBC source connector, Debezium, or Confluent’s Oracle CDC Source connector.
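If you go the CDC route with Debezium, a source connector configuration might look roughly like the sketch below. Again, every hostname, credential, database name, and table name here is a placeholder, and the Oracle side needs CDC prerequisites set up (archive log mode and a LogMiner user, per the Debezium Oracle connector documentation) before this will work.

```json
{
  "name": "oracle-source-inventory",
  "config": {
    "connector.class": "io.debezium.connector.oracle.OracleConnector",
    "database.hostname": "oracle-host",
    "database.port": "1521",
    "database.user": "c##dbzuser",
    "database.password": "dbz",
    "database.dbname": "ORCLCDB",
    "topic.prefix": "oracle",
    "table.include.list": "INVENTORY.ORDERS",
    "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
    "schema.history.internal.kafka.topic": "schema-changes.inventory"
  }
}
```

Change events for the included tables then land in Kafka topics named after the `topic.prefix` and table, from which any downstream sink (an application, another database, or a data lake) can consume.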
Hi, thanks for replying. If it’s Oracle Database, then I’m choosing cloud. Yes, free to use with community support.
So, for instance, what do we need to do if we take Oracle DB Cloud as the source and target system? Can you share some links or resources? I am a total beginner here and pretty confused about what should be done in this setup using Kafka with Oracle DB as the source/target system.
This is what we want to achieve: for this POC, I want to place Oracle DB as the source in the external data feed stream. The target might vary. Kafka might load the data straight into an application, or it might be loaded into an external database. Is there anything you would suggest?