
How we deal with Data Quality using Circuit Breakers

Sandeep Uttamchandani
7 min read · Oct 8, 2018


Imagine a business metric showing a sudden spike: is the spike real, or is it a data quality problem? Analysts and data engineers today spend hours, days, and even weeks analyzing whether a given metric is correct! In other words, Time-to-Reliable-Insights is unbounded today and is a widespread pain point across the industry. At Intuit, we are working on addressing the data quality problem at scale and presented our platform (called QuickData SuperGlue) at a conference in New York in 2018.

Analogous to the circuit breaker pattern used in microservices architectures, we are designing circuit breakers for data pipelines. In the presence of data quality issues, the circuit opens, preventing low-quality data from propagating to downstream processes. The result is that data will be missing from reports for time periods of low quality, but if present, it is guaranteed to be correct. This proactive approach bounds Time-to-Reliable-Insights to minutes by automatically making data availability directly proportional to data quality. It also eliminates the effort required to verify and fix metrics/reports on a case-by-case basis. The rest of the blog describes the details of implementing and deploying circuit breakers and is divided into three sections (a minimal code sketch of the idea follows the list):

  • Data Pipelines: Ground Realities
  • Circuit Breaker Pattern for Data Pipelines
  • Implementing Circuit Breakers in Production
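
To make the idea concrete, here is a minimal Python sketch of the pattern as described above: a set of quality checks gates each data partition, and when any check fails the circuit opens and the partition is withheld from downstream reports. The names and thresholds (DataCircuitBreaker, QualityCheck, row-count and null-rate checks) are illustrative assumptions, not part of the SuperGlue platform itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical sketch: these classes and checks are illustrative only.

@dataclass
class QualityCheck:
    name: str
    passed: Callable[[Dict[str, float]], bool]  # takes partition stats, returns True if healthy

class DataCircuitBreaker:
    """Gate a pipeline stage: open the circuit (block publishing) when any check fails."""

    def __init__(self, checks: List[QualityCheck]):
        self.checks = checks
        self.is_open = False  # open circuit == data blocked from downstream consumers

    def evaluate(self, partition_stats: Dict[str, float]) -> bool:
        failed = [c.name for c in self.checks if not c.passed(partition_stats)]
        self.is_open = bool(failed)
        if self.is_open:
            print(f"Circuit OPEN: withholding partition; failed checks: {failed}")
        return not self.is_open

# Example: only publish the day's partition if row count and null rate look sane.
checks = [
    QualityCheck("row_count_above_floor", lambda s: s["row_count"] > 1_000),
    QualityCheck("null_rate_below_ceiling", lambda s: s["null_rate"] < 0.05),
]
breaker = DataCircuitBreaker(checks)

stats = {"row_count": 250, "null_rate": 0.01}  # e.g. computed by the pipeline run
if breaker.evaluate(stats):
    print("Circuit closed: partition published to downstream reports")
    # publish_partition(stats)  # hypothetical publish step
else:
    print("Partition withheld until the data quality issue is resolved")
```

In this sketch the reports simply never see a low-quality time period, which is the trade-off described above: data may be missing, but whatever is present is trustworthy.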
