Day 0 scripts to populate Kafka topics

How should the existing GBs (or more likely TBs) of data in a large enterprise be made immediately available in Kafka topics?

I know we can use CDC to build an event log or a DB replica in Kafka, but with CDC the topics are only populated when something changes in the backend DB.

What is the best approach to building a Kafka layer that replaces the need to hit the real backend databases? How will the millions of records already sitting in the existing database tables become available in Kafka topics as soon as we deploy our application with a Kafka CDC connector?

You’ll need to hit the database one way or the other: either read its transaction log (the oplog) or use a driver to query it periodically in batches. Neither will give you the full database instantly, and only the log-based option (Debezium or Oracle GoldenGate, as popular choices) will give you full information about the operations run against the database, rather than just a point-in-time snapshot of every row.
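
If you go the batch-query route, the Day 0 job is essentially a one-time bulk producer. Here's a minimal sketch of that idea, assuming Postgres via psycopg2 and the confluent-kafka client; the table name, topic name, and connection details are placeholders for whatever your schema actually looks like:

```python
# Hypothetical Day-0 bulk load: page through an existing table and
# produce every row to a Kafka topic before the CDC stream takes over.
import json
import psycopg2
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})
conn = psycopg2.connect("dbname=orders user=app")  # placeholder DSN

BATCH_SIZE = 10_000
last_id = 0

with conn.cursor() as cur:
    while True:
        # Keyset pagination: much cheaper than OFFSET on large tables.
        cur.execute(
            "SELECT id, customer_id, total FROM orders "
            "WHERE id > %s ORDER BY id LIMIT %s",
            (last_id, BATCH_SIZE),
        )
        rows = cur.fetchall()
        if not rows:
            break
        for row_id, customer_id, total in rows:
            # Key by primary key so later CDC events for the same row
            # land in the same partition and compact correctly.
            producer.produce(
                "orders-topic",
                key=str(row_id),
                value=json.dumps({"id": row_id,
                                  "customer_id": customer_id,
                                  "total": str(total)}),
            )
        producer.flush()
        last_id = rows[-1][0]

producer.flush()
```

If the topic is log-compacted and keyed by primary key, this snapshot plus the subsequent CDC stream converge to the current state of the table.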
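If you go with Debezium instead, it can handle the Day 0 problem itself: its snapshot mode reads all existing rows once before switching to streaming from the log (still not instant for millions of rows, but no separate script). A sketch of registering such a connector against the Kafka Connect REST API, with placeholder host, credentials, and table list:

```python
# Hypothetical registration of a Debezium Postgres connector.
# "snapshot.mode": "initial" tells Debezium to read every existing row
# once, then stream subsequent changes from the transaction log.
import json
import requests

connector = {
    "name": "orders-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "db.internal",
        "database.port": "5432",
        "database.user": "debezium",
        "database.password": "secret",
        "database.dbname": "orders",
        "topic.prefix": "enterprise",
        "table.include.list": "public.orders",
        "snapshot.mode": "initial",
    },
}

resp = requests.post(
    "http://connect.internal:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()
```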