Handling of Windowing function in KSQL like ROW_NUMMBER() , RANK() and DENSE_RANK()

Is there any way to handle windowing function like ROW_NUMBER() , RANK() and DENSE_RANK() in KSQL , tried below example its not working.

SELECT ROW_NUMBER() OVER(ORDER BY MBDAT_TS) as rnum FROM JSON_LIMITED_2_LIPS_S1 EMIT CHANGES LIMIT 1;

Hi @tchopra501, ksqlDB doesn’t support those functions yet. You could raise an enhancement request at Sign in to GitHub · GitHub to suggest them though. I found this one which is also related.

@tchopra501 Do you need row num and rank in ksqldb or in the app consuming the resulting topic? Might sound silly, but if you do a normal select ordered by mbdat_ts, you get events in row number order, so in your consumer you could simply add a counter. For rank and dense rank, same applies, however you will get aggregate results and you would probably want your “over” fields in the header and increment your counter when the header combination changes. You will have the aggregate returned as the event so for dense rank, simply do a diff with the previous aggregate to see if the counter must remain the same for ties.

1 Like

Thanks Charlie , I would like to see how diffrent windowing functions will be used in ksql as we need to simulate the queries in KSQL, could you please provide more ellobratively how we can use this with an example , In case if i need to write UDF , do we have classes in confluent kafka packages to handle the same.

You can find docs on UDF/UDAF/UDTF here: User-defined functions - ksqlDB Documentation

From the sounds of it you’re looking to port an existing workload like-for-like from an RDBMS to ksqlDB, is that right?

Yes Robin , your understanding is correct,

Regards,
Tarun

What’s your current workload - is it analytics, or transactional? Something to bear in mind is that you’ll probably find that batch workloads from RDBMS don’t always translate 1:1 to streaming workloads. Even if the SQL support were the same the semantics of batch vs stream can change how one thinks about things.

Its analytical workloads , totally understand that 1:1 it can’t be translated but there should be some workaround to implement the same.

Hi Charlla,
Thanks for your reply , in my usecase wanted to implement handling of row_number with partition_clause in ksqlDB , something similar in the link given below
ROW_NUMBER (Transact-SQL) - SQL Server | Microsoft Docs.

example

SELECT 
  ROW_NUMBER() OVER(PARTITION BY recovery_model_desc ORDER BY name ASC) 
    AS Row#,
  name, recovery_model_desc
FROM sys.databases WHERE database_id < 5;

Here is the result set.

TABLE 3

Row# name recovery_model_desc
1 model FULL
1 master SIMPLE
2 msdb SIMPLE
3 tempdb SIMPLE

Will appreciate if you suggest approach to handle at ksqlDB.

Regards,
Tarun