Hello Kafkateers,
I’ve noticed an issue with JDBC Source connectors and long-running transactions. It affects all operating modes, including “incrementing” and “timestamp+incrementing”, which are claimed to be stable.
Suppose there is a tracked outbox table, Table1, defined as follows:
CREATE TABLE Table1 (
i INTEGER NOT NULL,
t TIMESTAMP NOT NULL,
v VARCHAR2(2000)
);
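For context, the connector configuration for such a table might look roughly like this (a sketch assuming the Confluent JDBC source connector; the connector name, connection URL, and topic prefix are made up):

```json
{
  "name": "table1-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:oracle:thin:@//db-host:1521/ORCL",
    "mode": "timestamp+incrementing",
    "incrementing.column.name": "i",
    "timestamp.column.name": "t",
    "table.whitelist": "Table1",
    "topic.prefix": "outbox-"
  }
}
```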
Let’s imagine there are two sessions.
Session 1 inserts a row to the table, but doesn’t commit the transaction yet:
INSERT INTO Table1 VALUES (1, SYSTIMESTAMP, 'row1');
Session 2 inserts another row and commits it immediately:
INSERT INTO Table1 VALUES (2, SYSTIMESTAMP, 'row2');
COMMIT;
The connector sees “row2” and syncs it to our Kafka topic.
Now Session 1 commits its transaction:
COMMIT;
But “row1” is never seen by the connector and never synced to Kafka: although it was inserted into the table before “row2”, it only became visible after the connector had already advanced its stored offset past it.
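The failure mode can be reduced to a few lines. Here is a minimal simulation of offset-based polling, assuming the connector simply remembers the maximum value of the incrementing column it has seen (this is an illustration of the principle, not the connector’s actual code):

```python
# Simulate offset-based polling over the incrementing column "i".
# "committed" holds only rows visible to the connector, in commit order.
committed = []
offset = 0  # connector's stored maximum of column "i"

def poll():
    """Return committed rows with i > offset, then advance the offset."""
    global offset
    rows = sorted(r for r in committed if r[0] > offset)
    if rows:
        offset = max(r[0] for r in rows)
    return rows

# Session 2 commits (2, 'row2') while Session 1's (1, 'row1') is still open.
committed.append((2, "row2"))
print(poll())  # [(2, 'row2')] -- offset advances to 2

# Session 1 finally commits; its key 1 is already behind the offset.
committed.append((1, "row1"))
print(poll())  # [] -- 'row1' is silently skipped forever
```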
The issue may seem artificial, but it’s very real. In a concurrent environment where transactions can run long enough, it happens regularly with columns populated from a sequence, whenever a transaction that started earlier commits after one that started later.
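Concretely, with a sequence the interleaving looks like this (sequence name is hypothetical):

```sql
-- Session A: INSERT INTO Table1 VALUES (seq_table1.NEXTVAL, SYSTIMESTAMP, 'a');  -- draws i = 100
-- Session B: INSERT INTO Table1 VALUES (seq_table1.NEXTVAL, SYSTIMESTAMP, 'b');  -- draws i = 101
-- Session B: COMMIT;   -- connector polls, stores offset 101
-- Session A: COMMIT;   -- i = 100 is now behind the offset and is skipped
```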
Is there any “good” workaround for this issue?
The only one I’ve found uses materialized views, and I don’t like it. Any other ideas?