🎧 Streaming Analytics on 50M Events Per Day with Confluent Cloud at Picnic

alice.richardson · 5 May 2022 07:15

There’s a new Streaming Audio episode - check it out!

What are useful practices for migrating a system to Apache Kafka® and Confluent Cloud, and why use Confluent to modernize your architecture?

Dima Kalashnikov (Technical Lead, Picnic Technologies) is part of a small analytics platform team at Picnic, an online-only European grocery store that processes around 45 million customer events and five million internal events daily. An underlying goal at Picnic is to make decisions as data-driven as possible, so Dima's team collects events on all aspects of the company: new stock arriving at the warehouse, customer behavior on their websites, and statistics related to delivery trucks. Data is sent to internal systems and to a data warehouse.

Picnic recently migrated from their existing solution to Confluent Cloud for several reasons:

Ecosystem and community: Picnic liked the tooling in the Kafka ecosystem. As a small team, they can't devote extra time to building boilerplate code such as connectors for their data sources or extensive monitoring functionality. Picnic also has analysts who use SQL, so they appreciated the processing capabilities of ksqlDB. Finally, they found that help is easy to find when one gets stuck.
Monitoring: They wanted better monitoring. Specifically, they found it challenging to measure against SLAs with their former system because they couldn't easily detect the positions of consumers in their streams.
Scaling and data retention times: Picnic is growing, so they needed to scale horizontally without having to worry about manual reassignment. They also hit a wall with their previous streaming solution on how long they could retain data, which is a serious issue for a company that makes data-first decisions.
Cloud: As a small team, they also don't have the resources for extensive maintenance of their tooling.
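
To make the monitoring point a little more concrete: in Kafka, a consumer's position is usually tracked as lag, meaning the gap between a partition's latest offset and the offset the consumer group has committed. The snippet below is only a minimal sketch of that arithmetic, not Picnic's setup. The function name `consumer_lag` and all offset numbers are hypothetical, and treating a partition without a committed offset as unread from offset 0 is a simplifying assumption.

```python
from typing import Dict


def consumer_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: log end offset minus the group's committed offset."""
    lag = {}
    for partition, end in end_offsets.items():
        # Assumption: no committed offset means nothing has been read yet,
        # so the position is taken as 0 for this illustration.
        position = committed.get(partition, 0)
        lag[partition] = max(end - position, 0)
    return lag


# Hypothetical offsets for a three-partition topic.
end_offsets = {0: 1_200, 1: 950, 2: 1_010}
committed = {0: 1_150, 1: 950}

lag = consumer_lag(end_offsets, committed)
total_lag = sum(lag.values())
```

A total lag that keeps growing over time is the kind of signal that would indicate consumers falling behind an SLA.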

Dima's team was extremely careful and took their time with the migration. They ran a pilot system in parallel with the old system to make sure it could achieve their fundamental performance goals: complete stability, zero data loss, and no performance degradation. They also wanted to evaluate its costs.

The pilot was successful, and they have a second, IoT pilot in the works that uses Confluent Cloud and Debezium to track the robotics data emanating from their automatic fulfillment center. That is a lot of data: Dima mentions that the robots in the center generate data sets as large as their customer event streams.

EPISODE LINKS

Listen to the episode
