🎧 Next-Gen Data Modeling, Integrity, and Governance with YODA

alice.richardson · 7 March 2023 08:11

There’s a new Streaming Audio episode - check it out!

In this episode, Kris interviews Doron Porat, Director of Infrastructure at Yotpo, and Liran Yogev, Director of Engineering at ZipRecruiter (formerly at Yotpo), about their experiences and strategies in dealing with data modeling at scale.

Yotpo has a vast and active data lake, comprising thousands of datasets that are processed by different engines, primarily Apache Spark™. They wanted to provide users with self-service tools for generating and utilizing data with maximum flexibility, but encountered difficulties, including poor standardization, low data reusability, limited data lineage, and unreliable datasets.

The team realized that Yotpo's modeling layer, which defines the structure and relationships of the data, needed to be separated from the execution layer, which defines and processes operations on the data.

This separation would give programmers better visibility into data pipelines across all execution engines, storage methods, and formats, as well as more governance control for exploration and automation.

To address these issues, they developed YODA, an internal tool that combines an excellent developer experience, dbt, Databricks, Airflow, Looker, and more with a strong CI/CD and orchestration layer.

Yotpo is a B2B SaaS e-commerce marketing platform that provides businesses with the tools they need for accurate customer analytics, remarketing, support messaging, and more.

ZipRecruiter is a job site that utilizes AI matching to help businesses find the right candidates for their open roles.

EPISODE LINKS

Listen to the episode
