We are having some issues with performance in the query, and i wanted to tune it. We are using mode “timestamp”. So reading through the documentation it states:
Different query modes may still be used for incremental updates, but in order to properly construct the incremental query, it must be possible to append a WHERE clause to this query (i.e. no WHERE clauses may be used). If you use a WHERE clause, it must handle incremental queries itself.
I really don’t get what does it exactly mean with “If you use a WHERE clause, it must handle incremental queries itself” so shall i write the full query? Is there some placeholder i shall use for timestamp?
Just for sake of sample (our query is way more complex), if I would like so set a query like:
select * from myTable where date_1 between %1 and %2 or date_2 between %1 and %2
Would something like that be a possibility? So i would like the connector to just provide the timestamps, nothing else. Is something like that supported?
IMHO that would also help avoiding things like: select * from ( another query ) where COALESCE(field1, field2)
You are right that the docs are currently not totally clear on this, and we have a ticket underway to try and improve them. You will probably find this issue useful in exploring how to use the WHERE clause.
Hi @rmoff ,
You mean attempting to do that pseudo sql injection by using “–” at the end of the query?
We are already using a nested query as mentioned in that thread. Am I missing something?
IMO that thread doesn’t still clarify for me the usage of WHERE clause, using “–” sounds more like a hack, I could give it a try, but before that I will ask a refactor on the tables the connector is consuming, so we can avoid thing like COALESCE and simplify the query in order to not need nested queries at all.
Anyway I think it would be great if connectors would allow WHERE clause and placeholders for the params. I have been reading through the connectors code, but so far haven’t realized if there is a better alternative already supported.
Thanks for your help!
My understanding of it is that as a general pattern you write a query to be used in incremental/timestamp mode like this:
SELECT TIMESTAMP_COL, COL1, COL2 FROM
(SELECT TIMESTAMP_COL,
A.COL1 AS COL1,
B.COL2 AS COL2
FROM TABLE_A
INNER JOIN TABLE_B
ON PK=FK
WHERE A.COL1='FOO');
This exposes the TIMESTAMP_COL that you can then use in your config, thus
query=SELECT TIMESTAMP_COL, COL1, COL2 FROM (SELECT TIMESTAMP_COL, A.COL1 AS COL1, B.COL2 AS COL2 FROM TABLE_A INNER JOIN TABLE_B ON PK=FK WHERE A.COL1='FOO')
mode=timestamp
timestamp.column.name=TIMESTAMP_COL
In your example I don’t think you can have an upper-bound applied by the connector to a date predicate, only lower. So you could do something like this with a hardcoded upper bound (per your BETWEEN example):
query=select * from (select date1,* from myTable where date_1 < "2021-03-25 13:50:47")
mode=timestamp
timestamp.column.name=date1
I’d expect the connector to then execute the query as
select * from (select date1,* from myTable where date_1 < "2021-03-25 13:50:47")
where date_1 > {last_poll_timestamp}