I have a custom global state store in my application. At times, the store is updated frequently with new records. My use case involves querying the store by key prefix (when these updates arrive) to get the matching records and forward them downstream.
With this context, I see strange behavior when using store.prefixScan("prefix", new StringSerializer()). After an update is received and the store is updated, prefixScan in some cases does not return all the records expected for the prefix. However, if I call store.all() and then manually filter the records by prefix, I can see that the correct records are present.
I am trying to understand what causes store.prefixScan() to behave differently and not return all the records even though they are present in the store (they must be present, because otherwise store.all() would not be able to return them).
Hard to say, but prefixScan is based on serialized byte[] lexicographical order… So there could be a difference in how rows are ordered compared to the “String” type, and in that case prefixScan, which does a range scan internally, might search an incorrect range?
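To make the ordering point concrete, here is a small sketch of one case where Java's String ordering and the byte[] lexicographical ordering of the UTF-8 form disagree. The class and method names are made up for illustration, and it assumes Java 9+ for Arrays.compareUnsigned. For plain ASCII keys the two orderings agree, so this may well not be the cause in this particular case; it only shows that the two orderings are not the same thing.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Kafka Streams.
final class OrderingSketch {

    // Order of the UTF-8 serialized forms, compared as unsigned bytes.
    static int byteOrder(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        return Integer.signum(Arrays.compareUnsigned(x, y));
    }

    // Order used by String.compareTo, which compares UTF-16 code units.
    static int stringOrder(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        String tilde = "\uFF5E";        // U+FF5E, UTF-8 bytes EF BD 9E
        String emoji = "\uD83D\uDE00";  // U+1F600, UTF-8 bytes F0 9F 98 80

        System.out.println("String order: " + stringOrder(tilde, emoji));
        System.out.println("byte[] order: " + byteOrder(tilde, emoji));
    }
}
```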
The given prefix (serialized as byte[]) is used as the lower bound for the range scan, and an additional upper bound is computed from it.
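As a rough illustration of that idea (this is a sketch, not the Kafka Streams source), the upper bound can be derived by incrementing the last byte of the prefix that is not 0xFF, and the scan then covers the half-open range [prefix, upperBound). The class name, the method names, and the TreeMap standing in for the store are all assumptions made for the example; it needs Java 9+.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a byte-ordered key-value store.
final class PrefixBoundsSketch {

    // Unsigned lexicographical order over serialized keys.
    static final Comparator<byte[]> UNSIGNED_BYTES = (a, b) -> Arrays.compareUnsigned(a, b);

    // Smallest byte[] that sorts after every key starting with `prefix`:
    // drop trailing 0xFF bytes, then add one to the last remaining byte.
    // Returns null if no finite bound exists (empty or all-0xFF prefix).
    static byte[] upperBound(byte[] prefix) {
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return null;
        }
        byte[] bound = Arrays.copyOf(prefix, i + 1);
        bound[i]++;
        return bound;
    }

    // Range scan over [prefix, upperBound(prefix)).
    static NavigableMap<byte[], String> prefixScan(NavigableMap<byte[], String> store, byte[] prefix) {
        byte[] to = upperBound(prefix);
        return to == null
                ? store.tailMap(prefix, true)
                : store.subMap(prefix, true, to, false);
    }

    public static void main(String[] args) {
        NavigableMap<byte[], String> store = new TreeMap<>(UNSIGNED_BYTES);
        for (String key : new String[] {"use", "user-1", "user-2", "uses", "usez"}) {
            store.put(key.getBytes(StandardCharsets.UTF_8), key);
        }
        byte[] prefix = "user".getBytes(StandardCharsets.UTF_8);
        System.out.println("upper bound: " + new String(upperBound(prefix), StandardCharsets.UTF_8));
        System.out.println("matches: " + prefixScan(store, prefix).values());
    }
}
```

The scan only finds a key if its serialized bytes actually begin with the serialized prefix, which is why the serializer passed to prefixScan has to produce bytes compatible with the store's key serializer.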
You could check what all() returns, and for debugging inspect all rows – it also returns data in byte[] lexicographical order. Does prefixScan start at the right row? Is the computed upper bound correct? Is the correct result returned by all() really a single range (or is the data scattered around)?
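A minimal sketch of that debugging exercise follows, assuming you have collected the serialized keys in the order all() returns them (for example by serializing each key with the store's key serializer). It hex-dumps each key, flags whether its bytes start with the serialized prefix, and reports whether the matches form one contiguous range. All names here are hypothetical, and the code deliberately uses only the JDK (Java 9+).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical debugging helper, independent of Kafka Streams.
final class PrefixScanDebug {

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // True if the first prefix.length bytes of key equal prefix.
    static boolean startsWith(byte[] key, byte[] prefix) {
        return key.length >= prefix.length
                && Arrays.equals(key, 0, prefix.length, prefix, 0, prefix.length);
    }

    // Positions (in store order) of the keys whose bytes start with prefix.
    static List<Integer> matchingIndexes(List<byte[]> keysInStoreOrder, byte[] prefix) {
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < keysInStoreOrder.size(); i++) {
            if (startsWith(keysInStoreOrder.get(i), prefix)) {
                indexes.add(i);
            }
        }
        return indexes;
    }

    // True if the matching positions are consecutive, i.e. one single range.
    static boolean isSingleRange(List<Integer> indexes) {
        for (int i = 1; i < indexes.size(); i++) {
            if (indexes.get(i) != indexes.get(i - 1) + 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] prefix = "user".getBytes(StandardCharsets.UTF_8);
        List<byte[]> keys = new ArrayList<>();
        for (String k : new String[] {"use", "user-1", "user-2", "uses"}) {
            keys.add(k.getBytes(StandardCharsets.UTF_8));
        }
        for (byte[] key : keys) {
            System.out.println(hex(key) + "  matches=" + startsWith(key, prefix));
        }
        List<Integer> matches = matchingIndexes(keys, prefix);
        System.out.println("matching rows: " + matches + ", single range: " + isSingleRange(matches));
    }
}
```

If a key that matches as a String does not match at the byte level, or the matching rows are not contiguous, that would point to a mismatch between how the keys and the prefix are serialized.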
Thank you for your reply.
I understand that prefixScan uses a range scan internally to fetch the results. Let me try the exercise you suggested and get more details on the behavior of both all() and prefixScan().
Just thinking a bit ahead: let us say they do give different results. Is there any other approach to fetch data from the store based on a prefix? Currently, as a workaround, I am using all() and then filtering the records manually. But I understand this is not the right approach, as it uses a lot of memory and CPU every time we query the store and filter records. Hence, I am looking for a more efficient approach.
Hard to say… You would first need to understand why prefixScan currently does not give you the expected result. Once you understand why, you might be able to use a different strategy (e.g., passing a different prefix into the method, or doing multiple prefix scans and stitching the correct result together) to avoid using all().