Hey KSQLDB community:
I have several streams of events from Oracle Golden Gate that I
wish to join in ksqldb (v0.18.0).
There are “Plan” events that we get via CDC from Oracle Golden Gate:
As you might be aware, CDC events from OGG have a field called
op_ts to denote the timestamp which the change was made in Oracle. This “Plan” data is manifested into a ksqldb stream
The other is Plan-Agency “reference data” - this table cross references a “Plan” to an “Agency” for specific date range. This is also from OGG, and I have captured it into a ksqldb table for use
in a join:
Ideally, I would want my join to allow me to get the value of
agency_id where the
plan values are equal and the
is between the
end of the join table:
select plan.plan, plan.name, plan_agency.agency from plan left join plan_agency on plan.plan_id = plan_agency.plan_id and plan.op_ts between plan_agency.start and plan_agency.end;
However, it appears that ksqldb does not (yet) support multiple conditions in the
on clause of a join.
I have attempted to rekey the data in
end. I have used the scalar function
PARSE_TIMESTAMP to convert the
plan into a
TIMESTAMP, as well as on
end, in the hopes I could use the
between clause on those.
We are also exploring upstream how the data might be simplified.
In the meantime, I thought it might be useful to reach out to you good people. I’m open to any good ideas.