Handling of Windowing function in KSQL like ROW_NUMMBER() , RANK() and DENSE_RANK()

tchopra501 · 9 February 2021 12:53

Is there any way to handle windowing function like ROW_NUMBER() , RANK() and DENSE_RANK() in KSQL , tried below example its not working.

SELECT ROW_NUMBER() OVER(ORDER BY MBDAT_TS) as rnum FROM JSON_LIMITED_2_LIPS_S1 EMIT CHANGES LIMIT 1;

rmoff · 9 February 2021 13:12

Hi @tchopra501, ksqlDB doesn’t support those functions yet. You could raise an enhancement request at Sign in to GitHub · GitHub to suggest them though. I found this one which is also related.

Charlla · 9 February 2021 21:18

@tchopra501 Do you need row num and rank in ksqldb or in the app consuming the resulting topic? Might sound silly, but if you do a normal select ordered by mbdat_ts, you get events in row number order, so in your consumer you could simply add a counter. For rank and dense rank, same applies, however you will get aggregate results and you would probably want your “over” fields in the header and increment your counter when the header combination changes. You will have the aggregate returned as the event so for dense rank, simply do a diff with the previous aggregate to see if the counter must remain the same for ties.

tchopra501 · 10 February 2021 06:21

Thanks Charlie , I would like to see how diffrent windowing functions will be used in ksql as we need to simulate the queries in KSQL, could you please provide more ellobratively how we can use this with an example , In case if i need to write UDF , do we have classes in confluent kafka packages to handle the same.

rmoff · 10 February 2021 07:59

You can find docs on UDF/UDAF/UDTF here: User-defined functions - ksqlDB Documentation

From the sounds of it you’re looking to port an existing workload like-for-like from an RDBMS to ksqlDB, is that right?

tchopra501 · 10 February 2021 09:09

Yes Robin , your understanding is correct,

Regards,
Tarun

rmoff · 10 February 2021 09:26

What’s your current workload - is it analytics, or transactional? Something to bear in mind is that you’ll probably find that batch workloads from RDBMS don’t always translate 1:1 to streaming workloads. Even if the SQL support were the same the semantics of batch vs stream can change how one thinks about things.

tchopra501 · 10 February 2021 09:46

Its analytical workloads , totally understand that 1:1 it can’t be translated but there should be some workaround to implement the same.

tchopra501 · 16 July 2021 07:35

Hi Charlla,
Thanks for your reply , in my usecase wanted to implement handling of row_number with partition_clause in ksqlDB , something similar in the link given below
ROW_NUMBER (Transact-SQL) - SQL Server | Microsoft Docs.

example

SELECT 
  ROW_NUMBER() OVER(PARTITION BY recovery_model_desc ORDER BY name ASC) 
    AS Row#,
  name, recovery_model_desc
FROM sys.databases WHERE database_id < 5;

Here is the result set.

TABLE 3

Row#	name	recovery_model_desc
1	model	FULL
1	master	SIMPLE
2	msdb	SIMPLE
3	tempdb	SIMPLE

Will appreciate if you suggest approach to handle at ksqlDB.

Regards,
Tarun

Topic		Replies	Views
UDF Implementation of Rank Function ksqlDB	1	3356	11 February 2021
Window of message batch ksqlDB	6	3264	20 May 2022
A question about event ordering ksqlDB	1	37	17 December 2024
Batch processing in kstreams Kafka Streams	9	5838	19 September 2021
Aggregations on Windowed KTables Kafka Streams	2	3361	7 March 2022

Handling of Windowing function in KSQL like ROW_NUMMBER() , RANK() and DENSE_RANK()

Related topics