Hello everyone,
I want to migrate a database to a newer Postgres database using Kafka Connect with a JDBC source connector and a JDBC sink connector.
Instead of syncing every table from the legacy database, I only want to pick some of them, and I also want to limit the columns, since I don't need every column from those tables.
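For reference, the source side would be configured roughly like this (only a minimal sketch; the connection details, table names and topic prefix are placeholders):

{
  "name": "legacy-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://legacy-host:5432/legacydb",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "table.whitelist": "table1,table2",
    "topic.prefix": "legacy-"
  }
}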
It’s also worth mentioning that these tables share some columns with the same name (e.g. table1.title, table2.title, …).
Now the problem:
For some tables I need the column named e.g. "title" and for others I don't. But I haven't found a way to specify the target column more precisely than just adding the column name to the "fields.whitelist" property of my sink connector, and that whitelist applies to every topic the connector consumes, so I can't express "title from table1, but not from table2".
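To illustrate, the sink connector currently looks roughly like this (again only a sketch with placeholder names):

{
  "name": "postgres-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "connection.url": "jdbc:postgresql://new-host:5432/newdb",
    "topics": "legacy-table1,legacy-table2",
    "fields.whitelist": "id,title,created_at",
    "insert.mode": "insert",
    "auto.create": "true"
  }
}

Here "title" ends up whitelisted for both topics, even though I only want it for one of the two tables.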
I think one possible solution would be to split my single sink connector, which currently covers multiple tables via the "topics" property, into one sink connector per table (or per group of tables that need the same column selection). That way each connector's "fields.whitelist" would be unambiguous in its own context and wouldn't run into the problem described above.
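Roughly what I have in mind (a sketch, with placeholder names and columns):

{
  "name": "postgres-sink-table1",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "connection.url": "jdbc:postgresql://new-host:5432/newdb",
    "topics": "legacy-table1",
    "fields.whitelist": "id,title,created_at"
  }
}

{
  "name": "postgres-sink-table2",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "connection.url": "jdbc:postgresql://new-host:5432/newdb",
    "topics": "legacy-table2",
    "fields.whitelist": "id,name,updated_at"
  }
}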
But would this be efficient? Does Kafka Connect optimize cases like this? What would be the best practice here? Is running multiple sink connectors against the same database a common pattern, or should it be avoided?
Thank you very much for your answers.
Best Regards