Scaling a Kafka Streams application

Hi, I would like to get your input/advice on how to approach scalability for an application using Kafka Streams.

We are developing a distributed Java application that uses Kafka as an event broker.
We have a fixed number of topics (20), each with a configurable number of partitions (6 by default). Each topic in our case is linked to a step to execute in our business workflow.
The initial trigger events are produced by an external system.
The application is subscribed to all 20 topics. When the initial trigger event is produced, the application consumes it, executes some business logic, then produces another event in a different topic.
The execution of the business logic takes different amounts of time for different topics (for some topics, processing events is quick; for others it may take longer, ranging from milliseconds to minutes).
Depending on the workflow route, full processing (from the initial trigger event to the end) usually takes about 5 execution-step cycles (consume, handle, produce).

We use Kafka Streams in our workflow. We have defined one sub-topology for every input topic, as shown below:

for (String topic : topics) {
	builder.stream(topic, Consumed.with(keySerde, valueSerde))
		.peek((key, value) -> consumer.accept(value))
		.filter((key, value) -> workflow.hasNextStep(value))
		.to((key, value, record) -> workflow.getNextTopic(value), Produced.with(keySerde, valueSerde));
}

The number of stream tasks per sub-topology will equal the number of partitions of its input topic.
We have enabled transactional (exactly-once) processing.
We use Kubernetes to configure elastic scaling of the app.
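For context, the relevant Streams configuration looks roughly like this (the application id and broker address are placeholders; the property keys are the standard Kafka Streams ones):

```java
import java.util.Properties;

public class StreamsProps {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("application.id", "workflow-app");   // placeholder id
        props.put("bootstrap.servers", "kafka:9092");  // placeholder brokers
        // Transactional, exactly-once processing:
        props.put("processing.guarantee", "exactly_once_v2");
        // Starting thread count; could be raised toward one thread per task:
        props.put("num.stream.threads", "20");
        System.out.println(props.getProperty("num.stream.threads"));
    }
}
```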

What is the best/proper way to scale the application?

The maximum number of (non-idle) stream threads is 120 (20 topics x 6 partitions). At that count, Kafka assigns each thread exactly one task.
We could start with a lower number (say, 20 threads) and, depending on consumption rate, CPU, memory, etc., scale up/down (increase/decrease the number of threads per app node) or scale out/in (create/delete pods with app nodes).

Ideally, we would like to configure Kafka Streams to start with 20 threads and configure the internal consumers so that each subscribes to only one specific topic. In other words, we would have 20 internal consumers, each consuming from all 6 partitions of a single topic, instead of 6 partitions across 6 different topics. In this scenario, we would monitor each topic individually and, if needed, increase the number of consumers for a specific topic, so that more consumers share the workload of that topic's 6 partitions (this would mean a cap of 6 consumers per topic).
We would prefer this option because it gives us more control: scaling is done per topic. However, as far as I understand from the docs, Kafka Streams does not support changing the partition-assignment algorithm, which means I cannot control which partitions a consumer is assigned. Is there any workaround that we could apply to get this desired behavior?

Are there better ways to tackle scalability in our case?

If you want the per-topic approach, you could create one application per topic. The code you deploy could be the same for each app, but you would configure each with a different application.id to isolate the apps from each other. This way, you can scale each app independently.
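A minimal sketch of what that could look like, using only the standard Kafka Streams property keys (the "workflow-" naming scheme and the topic parameter are placeholders of my own; the topic would come from e.g. an environment variable set per Kubernetes Deployment):

```java
import java.util.Properties;

public class PerTopicApp {
    // Builds the Streams config for one per-topic app instance.
    static Properties configFor(String topic) {
        Properties props = new Properties();
        // A distinct application.id per topic means a separate consumer
        // group, internal topics, and state, so each app scales on its own.
        props.put("application.id", "workflow-" + topic);
        props.put("bootstrap.servers", "kafka:9092");  // placeholder
        props.put("processing.guarantee", "exactly_once_v2");
        // At most 6 threads in total per topic are useful
        // (one per partition); anything beyond that sits idle.
        props.put("num.stream.threads", "1");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(configFor("step-1").getProperty("application.id"));
    }
}
```

Each app would then build only the single sub-topology for its own topic, so scaling one step of the workflow never rebalances the others.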