JDBC Source Custom Query

  • Custom Query: The source connector supports using custom queries instead of copying whole tables. With a custom query, one of the other automatic update modes can be used as long as the necessary WHERE clause can be correctly appended to the query. Alternatively, the specified query may handle filtering to new updates itself; however, note that no offset tracking will be performed (unlike the automatic modes where incrementing and/or timestamp column values are recorded for each record), so the query must track offsets itself.

Due to the way our system is designed around databases, a custom query is our only option. We want to move data to Kafka, but I understand that a custom query doesn’t track offsets.

What does this sentence mean?

With a custom query, one of the other automatic update modes can be used as long as the necessary WHERE clause can be correctly appended to the query.

Does this also mean that offsets won’t be tracked? Should we monitor for any missed records ourselves by some other means?

Thanks

Hi @mohanr . Where did you see that sentence? Can you share the link to it?

Nevermind. I found it in the docs here: https://docs.confluent.io/kafka-connect-jdbc/current/source-connector/index.html#jdbc-source-incremental-query-modes.

In order to filter for new rows in a custom query, the connector needs to be able to append a WHERE clause saying something like WHERE ts > x AND ts < y. Basically, what the above is saying is that you cannot have your own WHERE clause at the end of your query, because it would conflict with the connector-appended WHERE clause – there can’t be two! Note that as a workaround you can have your own WHERE clause nested in the query, just not at the very end.
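
As a concrete illustration, a custom-query source config in timestamp mode might look like the sketch below. The property names (connector.class, mode, timestamp.column.name, query, topic.prefix) are real JDBC source connector settings; the connection URL, table, and column names are placeholders:

```properties
name=jdbc-custom-query-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:postgresql://localhost:5432/mydb
mode=timestamp
timestamp.column.name=ts
topic.prefix=my-topic
# No top-level WHERE clause and no trailing semicolon in the query:
# the connector appends WHERE ts > ? AND ts < ? itself.
query=SELECT * FROM (SELECT A.ts, A.id, B.field1 FROM A JOIN B ON A.id = B.id WHERE A.field0 IN ('ONE','TWO')) AS filtered
```

With mode=timestamp set, the connector records the last-seen ts value as its offset, so you get incremental loading even with a custom query.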


I understand. In our context this could mean that the query has to be simpler than I thought. We have customer-preference tables that can be joined with other tables to determine when a notification need not be sent, which means those messages need not reach Kafka.
Now I have to filter in the consumer that sends notifications.
Since we have many topics, filtering with the Streams API and writing each filtered stream to another topic would create duplicate topics.
Please point out if I have misunderstood this limitation.

Thanks

Not necessarily. I had a custom query in place that joined three tables and filtered for a few things and still worked with the WHERE clause constraints.

You’d likely need to do something like the following:

SELECT
    *
FROM (
    SELECT
        A.timestamp,
        A.id,
        A.field0,
        B.field1,
        B.field2
    FROM A
    JOIN B
        ON A.id = B.id
    WHERE A.field0 IN ('ONE', 'TWO', 'THREE')
) AS filtered
-- the derived table needs an alias in most databases; leave off the trailing
-- semicolon so the connector can append its own WHERE clause after it

The above query is nested, allowing you to do all of the joins and filtering you need while also allowing the connector to append the WHERE clause that filters on the timestamp and/or incrementing columns.
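
You can see the mechanic for yourself without Kafka. The sketch below (table and column names are made up, and sqlite3 stands in for your real database) builds a nested query like the one above and then appends a WHERE clause the way the connector roughly does in timestamp mode:

```python
import sqlite3

# Hypothetical nested query: the filtering WHERE lives inside the subquery,
# there is no top-level WHERE, and no trailing semicolon.
base_query = """
SELECT *
FROM (
    SELECT A.ts, A.id, A.field0, B.field1
    FROM A
    JOIN B ON A.id = B.id
    WHERE A.field0 IN ('ONE', 'TWO')
) AS filtered
"""

# Roughly what the connector does: append its own timestamp-window clause.
appended = base_query + " WHERE ts > ? AND ts < ? ORDER BY ts ASC"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER, ts INTEGER, field0 TEXT)")
conn.execute("CREATE TABLE B (id INTEGER, field1 TEXT)")
conn.executemany("INSERT INTO A VALUES (?, ?, ?)",
                 [(1, 100, 'ONE'), (2, 200, 'TWO'), (3, 300, 'THREE')])
conn.executemany("INSERT INTO B VALUES (?, ?)",
                 [(1, 'x'), (2, 'y'), (3, 'z')])

# Only rows inside the timestamp window AND matching the nested filter survive.
rows = conn.execute(appended, (100, 300)).fetchall()
print(rows)  # -> [(200, 2, 'TWO', 'y')]
```

If base_query instead ended with its own top-level WHERE (or a semicolon), the appended text would produce invalid SQL – which is exactly the constraint the docs are describing.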
