Sixteen years building production analytics in regulated healthcare — where measurement errors have regulatory and financial consequences.

In healthcare, a wrong number has external consequences — regulatory, contractual, financial. I build systems designed around that reality.

me@zaherkarp.com  ·  linkedin.com/in/zkarp

resume_db=# \x
Expanded display is on.
resume_db=# SELECT * FROM zaher;
-[ RECORD 1 ]-----------------------------
name   | Zaher Karp
title  | Lead Data Engineer
focus  | Analytics Engineering · Data Platform
domain | Medicare Advantage · HEDIS · CMS Stars
stack  | SQL · Python · dbt · AWS · Azure · Databricks
status | building connections · open to collaboration
resume_db=#

I started in writing and editing, spent nearly a decade in mixed-methods health services research at UW–Madison beginning in 2009, and moved into data engineering when I realized the tools I needed to study complex healthcare systems didn't exist yet. The methodological thread runs from grounded theory in qualitative research through logistic and linear regression in population health, through time series analysis in Stars forecasting — different instruments, the same underlying question about how complex systems behave under measurement.

Right now that means Medicare Star Ratings, HEDIS pipelines, and data integration across claims and eligibility sources. I care most about analytics that ship as products — systems with audit trails and validated outputs, not reports that cannot be traced to their source. I am leading a data function and moving deliberately into organizational leadership, because the problems I want to work on require change at a scale that individual technical contribution cannot reach.

The return to public writing in late 2025 was deliberate. Five years of building production systems for other organizations left a lot of thinking that was not mine to publish. It is now.

career arc
2007 – 2014  Writing & Editing
2009  Research, UW–Madison
2017  healthfinch (acquired 2020)
2020  Health Catalyst (AWS → Azure · Databricks · dbt)
2025 – now  Baltimore Health Analytics

by the numbers
n     Measure
16+   Years experience
50+   Clinical organizations served
4     EHR platforms
6     Peer-reviewed publications
17    Conference presentations

2026-03-14 · Building a Browser-Based Star Rating Predictor: Methodology & Evidence
Technical documentation supporting the ordinal logistic regression model used in the interactive Medicare Star Rating predictor.

2026-02-11 · The ECDS Shock Index: Modeling Distribution Risk in Medicare Advantage Stars
ECDS adoption is not just a data modernization effort. It introduces measurable distribution risk into the Stars ecosystem.

2025-12-29 · Concurrency: The Story We Tell vs. the System We Run
In real data systems concurrency is mostly coordination under constraint.

2025-12-26 · Why Pull Requests Changed Everything in Healthcare Data Science
How version control transformed healthcare analytics from chaotic file management into transparent, auditable systems.

2025-12-24 · Why You Probably Shouldn't Version Your Fact Table
Most teams version their fact tables because they've modeled their dimensions incorrectly.

2025-12-02 · The Gravity of Star Ratings
How outcomes shape programs they never appear in. Stars operates less like a measurement program and more like a financial force.


Lead Data Engineer
Baltimore Health Analytics Nov 2025 – Present

I lead the data engineering and analytics methodology function for a Medicare Advantage quality platform. The role spans direct management, code review across the engineering team, and coordination of data science and QA across two geographies. The product roadmap, and the translation of CMS requirements into organizational priorities, sit at the center of that work.

CMS Star Rating methodology is published in Technical Notes that run to hundreds of pages: regulatory language describing measure construction, denominator logic, significance testing, and improvement score calculation. The work is translating that regulatory text into executable Python and SQL. I use pandas for data transformation, scipy and numpy for the statistical components (including robust exponential smoothing for time series forecasting), and Selenium for web automation. The outputs are auditable: every threshold and weighted average is traceable to its source.
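
A minimal sketch of what "traceable to its source" can look like for a weighted average. The measure identifiers, weights, and source labels below are hypothetical placeholders, not values from the CMS Technical Notes, and this is an illustration rather than the production code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasureScore:
    measure_id: str  # hypothetical identifier, not a real CMS measure code
    stars: int       # measure-level star value, 1 to 5
    weight: float    # hypothetical weight
    source: str      # where the weight was taken from (placeholder text here)


def weighted_summary(scores):
    """Return the weighted average plus one audit row per input measure."""
    total_weight = sum(s.weight for s in scores)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    average = sum(s.stars * s.weight for s in scores) / total_weight
    audit = [
        {
            "measure_id": s.measure_id,
            "stars": s.stars,
            "weight": s.weight,
            "contribution": s.stars * s.weight / total_weight,
            "source": s.source,
        }
        for s in scores
    ]
    return average, audit


scores = [
    MeasureScore("M01", 4, 1.0, "hypothetical note, section A"),
    MeasureScore("M02", 3, 3.0, "hypothetical note, section B"),
    MeasureScore("M03", 5, 1.0, "hypothetical note, section C"),
]
average, audit = weighted_summary(scores)
print(round(average, 2))
```

The audit rows sum back to the reported average, so a reviewer can check any single contribution against the cited source without rerunning the pipeline.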

HEDIS hybrid measures add a separate class of problem. The denominator is constructed from claims and eligibility data arriving from multiple health plans in inconsistent formats. Normalizing that into a unified analytic layer while preserving the audit trail requires treating every source format as a suspected deviation from the specification until proven otherwise.
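
One way to sketch the "suspected deviation until proven otherwise" stance: every row is checked against the specification before it enters the analytic layer, and accepted rows keep a pointer to their file and line. The field names, the file name, and the three-field spec are assumptions made for illustration only.

```python
from datetime import date

REQUIRED_FIELDS = ("member_id", "plan_id", "service_date")  # hypothetical spec


def deviations(row, source_file, line_no):
    """Return every way this row departs from the spec; empty means it conforms."""
    where = f"{source_file}:{line_no}"
    problems = [f"{where} missing {f}" for f in REQUIRED_FIELDS if not row.get(f)]
    service_date = row.get("service_date")
    if service_date:
        try:
            date.fromisoformat(service_date)
        except (TypeError, ValueError):
            problems.append(f"{where} service_date is not an ISO date: {service_date!r}")
    return problems


def normalize(rows, source_file):
    """Split rows into accepted (with lineage columns) and rejected (with reasons)."""
    accepted, rejected = [], []
    for line_no, row in enumerate(rows, start=1):
        problems = deviations(row, source_file, line_no)
        if problems:
            rejected.append({"row": row, "problems": problems})
        else:
            accepted.append({**row, "_source_file": source_file, "_source_line": line_no})
    return accepted, rejected


rows = [
    {"member_id": "A1", "plan_id": "P9", "service_date": "2025-01-15"},
    {"member_id": "A2", "plan_id": "P9", "service_date": "01/15/2025"},
    {"member_id": "", "plan_id": "P9", "service_date": "2025-02-01"},
]
accepted, rejected = normalize(rows, "plan_a_claims.csv")
print(len(accepted), len(rejected))
```

Rejected rows are kept with their reasons rather than dropped, which preserves the audit trail for the rows that never reach the denominator.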

pandas · scipy · numpy · dbt · SQL · Selenium · Python

Healthcare Analytics Manager, Embedded Refills and Care Gaps
Health Catalyst (acquired healthfinch, 2020) Aug 2020 – Aug 2025

In 2020 Health Catalyst acquired healthfinch, and I migrated with the product — moving from a three-person analytics function to a clinical quality platform serving health systems across Epic, Cerner, athenahealth, and Veradigm, four Electronic Health Record environments with different data models, different extract behaviors, and different interpretations of the same clinical concept. The acquisition also meant migrating the underlying infrastructure from AWS to Azure and Databricks, an architectural shift that required rebuilding the analytics layer while maintaining continuity of service for existing clients.

Data governance came down to one question: does a medication adherence rate mean the same thing across Epic, Cerner, athenahealth, and Veradigm? The ELT pipeline I inherited ran for multiple days and produced outputs that were difficult to trace to their source. I redesigned it around a medallion architecture: bronze for raw ingestion, silver for documented business logic, gold for analytic output. Every transformation has a name, a location, and a test, and every metric can be traced back through silver to its raw source. Same-day runtime was a side effect; the design goal was auditability. With RxNorm validation across a platform serving fifty-plus health systems, the question "why does this adherence rate look wrong for this plan" has a traceable answer.
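
A toy sketch of the bronze, silver, gold idea with lineage carried through each layer. The rows, the source labels, the rule name, and the 0.80 threshold are all assumptions for illustration; the real platform ran on dbt and Databricks, not on in-memory lists.

```python
# Bronze: raw rows exactly as ingested (ids and values are made up).
bronze = [
    {"bronze_id": 1, "source": "ehr_a", "member": "m1", "pdc": "0.91"},
    {"bronze_id": 2, "source": "ehr_b", "member": "m2", "pdc": "0.62"},
    {"bronze_id": 3, "source": "ehr_b", "member": "m3", "pdc": "0.85"},
]

ADHERENCE_THRESHOLD = 0.80  # assumed business rule for this sketch


def to_silver(rows):
    """Silver: typed values plus the named rule that produced each flag."""
    return [
        {
            "bronze_id": r["bronze_id"],
            "member": r["member"],
            "adherent": float(r["pdc"]) >= ADHERENCE_THRESHOLD,
            "rule": "pdc_at_or_above_threshold",
        }
        for r in rows
    ]


def to_gold(silver_rows):
    """Gold: one metric, carrying the bronze ids behind numerator and denominator."""
    numerator = [r["bronze_id"] for r in silver_rows if r["adherent"]]
    denominator = [r["bronze_id"] for r in silver_rows]
    return {
        "adherence_rate": len(numerator) / len(denominator),
        "numerator_ids": numerator,
        "denominator_ids": denominator,
    }


gold = to_gold(to_silver(bronze))
print(gold["numerator_ids"], gold["denominator_ids"])
```

Because the gold record keeps the bronze ids, a rate that looks wrong can be walked back to the specific raw rows and the specific rule that counted them.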

HIPAA compliance was an architectural constraint throughout — not a certification but a set of requirements that shaped every decision about data storage, access control, and transmission across a platform handling protected health information at scale.

dbt · Redshift · AWS → Azure · Databricks · Python · Tableau · Power BI

Healthcare Analytics Manager · Specialist
healthfinch (acquired by Health Catalyst, 2020) Dec 2017 – Jul 2020

First analytics hire meant the infrastructure did not exist. I built it under HIPAA and HITRUST compliance requirements on AWS — which meant that every data governance decision, from access control to audit logging to retention policy, was mine to make without a prior framework to inherit. Promoted to manager after one year to lead cross-functional work across product, engineering, and customer success.

ROI modeling was the commercially critical work — linear regression on clinical workflow data, translated into client-facing reports that supported over a million dollars in recurring revenue. Built dashboards that drove sevenfold growth in internal user adoption and eliminated four hundred hours of annual manual reporting preparation.

SQL · Python · Sisense · AWS · HIPAA · HITRUST

Researcher — Five Roles Over Nine Years
University of Wisconsin–Madison, Department of Family Medicine and Community Health Sep 2009 – Jun 2018

Nine years embedded in federally-funded primary care research, advancing through five positions from research specialist to assistant researcher — building the technical infrastructure for studies funded by the National Institute on Aging, the Wisconsin Partnership Program, the Josiah Macy Jr. Foundation, and multiple UW Institute for Clinical and Translational Research awards.

The Wisconsin Longitudinal Study — a fifty-year cohort of ten thousand adults integrating survey, health, and administrative records — taught me what longitudinal data quality problems actually look like: cohort attrition, measurement drift, linkage failure across administrative sources that were never designed to be linked. Methodological training included Contemporary Qualitative Interviewing Methods at the University of Oxford (2014).

A sustained research thread on primary care redesign used grounded theory for qualitative analysis of field notes and focus group transcripts, and interrupted time series analysis to measure the effect of care delivery change initiatives on clinic panel data.

The ACO cost research integrated EMR, claims, and patient satisfaction data in Stata and SAS to produce cohort analyses showing that higher-baseline-cost organizations were more likely to achieve shared savings — published in the International Journal of Healthcare Management in 2018.

Stata · SAS · NVivo · SPSS · REDCap

Principal
Sustainable Clarity 2007 – 2014

Editorial services practice specializing in environmental, health, and policy content. Managed up to eight copy editors, graphic designers, and photographers. Wrote articles syndicated through Thomson Reuters, LexisNexis, and the New York Times wire. Edited and indexed client manuscripts published as books, peer-reviewed journals, grants, and dissertations.


01 Client-Side Stars Analytics Dashboard

The design constraint, that no data leaves the user's machine, was a compliance requirement, not an aesthetic one. Implemented as a single HTML file: PapaParse reads the uploaded CSV, Chart.js renders the charts, and jsPDF generates the report without a server round-trip. There is no server to deploy, update, or maintain.

Chart.js · PapaParse · jsPDF · client-side

02 Healthcare Workforce Transition Platform

The original question was about healthcare workforce shortages — which adjacent roles can clinical and administrative staff reskill into. O*NET occupation and skill data provided the structure; logistic regression calibrated the transition probability estimates. Produces Ready Now, Trainable, and Long-Term Reskill recommendations with gap analysis by skill domain.
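
A rough sketch of how a logistic model's output could be mapped onto the three recommendation tiers. The two features, the coefficients, and the tier cutoffs are hypothetical; a fitted scikit-learn model on O*NET-derived features would supply the real ones.

```python
import math

# Hypothetical coefficients; a fitted model would supply these.
INTERCEPT = -2.0
COEF_SKILL_OVERLAP = 5.0     # share of target-role skills already held, 0 to 1
COEF_TRAINING_MONTHS = -0.1  # estimated months of training still needed


def transition_probability(skill_overlap, training_months):
    """Logistic function applied to a linear combination of the two features."""
    z = (INTERCEPT
         + COEF_SKILL_OVERLAP * skill_overlap
         + COEF_TRAINING_MONTHS * training_months)
    return 1.0 / (1.0 + math.exp(-z))


def tier(probability, ready=0.7, trainable=0.4):
    """Map a probability onto a recommendation tier (cutoffs are assumptions)."""
    if probability >= ready:
        return "Ready Now"
    if probability >= trainable:
        return "Trainable"
    return "Long-Term Reskill"


candidates = {"role_a": (0.9, 2), "role_b": (0.5, 6), "role_c": (0.2, 18)}
tiers = {name: tier(transition_probability(*x)) for name, x in candidates.items()}
print(tiers)
```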

FastAPI · PostgreSQL · scikit-learn · logistic regression · O*NET

03 Medicare Advantage Insight Engine

A local news monitor that fetches public Medicare Advantage sources, scores items for analytic relevance using keyword and domain heuristics, and posts structured alerts to a Teams-compatible webhook. Distinguishing a CMS rulemaking notice from a press release that mentions Medicare Advantage requires knowing what questions a Stars analyst is actually trying to answer.
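
A small sketch of keyword and domain scoring of the kind described above. The keyword list, the weights, and the example URLs are assumptions chosen for illustration, and the webhook step is left out.

```python
from urllib.parse import urlparse

# Hypothetical weights: terms a Stars analyst would care about score higher.
KEYWORD_WEIGHTS = {
    "star ratings": 3,
    "cut points": 3,
    "final rule": 2,
    "hedis": 2,
    "medicare advantage": 1,
}
DOMAIN_WEIGHTS = {"cms.gov": 3, "federalregister.gov": 3}


def relevance_score(title, url):
    """Sum the weights of matched keywords, plus a bonus for trusted domains."""
    text = title.lower()
    score = sum(w for kw, w in KEYWORD_WEIGHTS.items() if kw in text)
    host = urlparse(url).hostname or ""
    for domain, weight in DOMAIN_WEIGHTS.items():
        if host == domain or host.endswith("." + domain):
            score += weight
    return score


rule_notice = relevance_score(
    "CMS final rule updates Star Ratings cut points",
    "https://www.cms.gov/newsroom/example",
)
press_release = relevance_score(
    "Insurer expands Medicare Advantage plans",
    "https://example.com/press",
)
print(rule_notice, press_release)
```

A rulemaking notice outranks a press release here only because the weights encode what the analyst is looking for; the heuristics carry the domain knowledge, not the code.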

Python · automation · webhook

04 ECDS Shock Index

ECDS adoption introduces a distributional shift into the Stars ecosystem that most health plans are not yet modeling. This repository implements the shock index methodology: quantifying the expected change in measure rate distributions when ECDS reporting replaces legacy hybrid and administrative reporting, and estimating the downstream effect on Stars cutpoint crossings at the plan level.
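
To make "cutpoint crossings" concrete, here is a deliberately simplified sketch. It is not the repository's shock index: the cutpoints, the plan rates, and the uniform shift are hypothetical, and it holds cutpoints fixed even though in practice they move with the distribution.

```python
from bisect import bisect_right

# Hypothetical thresholds for reaching 2, 3, 4, and 5 stars.
CUTPOINTS = [0.60, 0.70, 0.80, 0.90]


def stars_for(rate):
    """A rate at or above a cutpoint earns the corresponding star level."""
    return 1 + bisect_right(CUTPOINTS, rate)


def crossings(plan_rates, shift):
    """Return plans whose star level changes when every rate moves by `shift`."""
    changed = {}
    for plan, rate in plan_rates.items():
        before, after = stars_for(rate), stars_for(rate + shift)
        if before != after:
            changed[plan] = (before, after)
    return changed


plan_rates = {"plan_a": 0.82, "plan_b": 0.75, "plan_c": 0.91}
changed = crossings(plan_rates, shift=-0.03)
print(changed)
```

Plans sitting just above a cutpoint are the exposed ones: the same three-point shift moves two of these hypothetical plans down a star and leaves the third untouched.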

Python · Medicare Advantage · Stars methodology

05 Care Delivery Workflow Changes

Analyzed organization-wide care delivery changes using interrupted time series analysis on clinic panel data. The methodological question was whether observed changes in care patterns were attributable to redesign initiatives or to secular trends — which requires a study design that can separate the two. The design combined segmentation with regression across multiple clinic sites to isolate the redesign effect from background drift.

Stata · SAS · interrupted time series · outpatient analytics

06 Practice Automation Analytics — healthfinch Charlie

I built the analytics for a case study of healthfinch's Charlie deployment across multiple community health centers at OCHIN. The case study asked two questions: how did clinician workflows change after Charlie deployment, and what was the ROI for the health centers that adopted it? Linear regression on the workflow data produced quantified outcome measures that the commercial team used in renewal conversations and new-customer pitches.

linear regression · Sisense · Epic Clarity · SQL

2019 · Influence of environmental design on team interactions across 3 family medicine clinics. Karp Z, Kamnetz S, Wietfeldt N, Sinsky C, Molfetter T, Pandhi N. Health Environments Research & Design Journal 12(4):159–173.
2019 · Broadening medical students' exposure to the range of illness experiences: a pilot experimental curriculum trial. Pandhi N, Gaines ME, Deci D, Schlesinger M, Culp C, Karp Z, et al. Academic Medicine.
2018 · Medicare Shared Savings Programs: higher cost accountable care organizations are more likely to achieve savings. Berkson S, Davis S, Karp Z, Jaffery J, Flood G, Pandhi N. International Journal of Healthcare Management.
2016 · An efficient process of gathering diverse community opinions to inform an intervention. Pandhi N, Jacobson N, Serrano N, Hernandez A, Zeidler-Schreiter E, Wietfeldt N, Karp Z. Implementation Science 11(Suppl 1):A13.
2014 · Approaches and challenges to optimizing primary care teams' electronic health record usage. Pandhi N, Yang WL, Karp Z, Young A, Beasley JW, Kraft S, Carayon P. Journal of Innovation in Health Informatics 21(3):142–51.
2012 · Approaches and challenges to optimizing the use of EHRs in primary care (preliminary findings). Yang W, Pandhi N, Karp Z, Young A, Beasley J, Kraft S, Carayon P. Proceedings of World Conference on E-Learning, Montréal.

Nine podium presentations and workshops at national venues between 2010 and 2017, including the Society for Implementation Research Collaboration Conference, the AAMC Integrating Quality Meeting, and the World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education. Four appearances at the National Collaborative for Improving Primary Care through Industrial and Systems Engineering. Eight additional posters at national and regional venues. A poster co-authored with Nancy Pandhi received the Patient Choice Award (2 of 45) at the North American Primary Care Research Group Conference, 2015.

Society for Implementation Research Collaboration Conference · AAMC Integrating Quality Meeting · World Conference on E-Learning (Montréal, 2012) · National Collaborative for Improving Primary Care through ISE (×4) · Access, Quality, and Outcomes Research Network

"It's rare to come across someone like Zaher — not just for his intelligence, but for the care, curiosity, and sense of responsibility he brings to everything he does. He consistently pushes himself to deliver thoughtful, high-quality work because he genuinely wants to make a difference — for the team, for the client, and for healthcare and patients. Zaher has a natural ability to think deeply about problems, often catching nuances others miss, and he balances that with a strong commitment to execution. He leads with integrity and consistently aims to do what's right — even when it takes more effort."

Joanna Laucirica, PMP — Director, Customer Operations, Health Catalyst

"Zaher was solely responsible for ensuring timely, accurate data delivery across multiple Electronic Health Record environments — Cerner, Epic, athenahealth, and Veradigm. He successfully led the migration of analytics from Sisense to Pop Insights, and implemented automated weekly data refreshes for Cerner, significantly improving efficiency and reliability. Despite being the only engineer on the team, Zaher consistently delivered high-quality work. He is intelligent, thorough, and deeply committed to understanding customer needs. He often joined client calls to clarify requests and wasn't afraid to push back when necessary to protect data integrity and long-term scalability."

Jessica McCay — Director of Customer Success, Health Catalyst

2013–2015 · Master of Public Health (MPH), Biostatistics · University of Wisconsin–Madison
Health Innovation Program Research Trainee. Committee: Nancy Pandhi MD MPH PhD, Sandra Kamnetz MD, Todd Molfetter PhD. Designed and wrote a grant proposal awarded $18,000; produced peer-reviewed research on clinic environments.

2014–2015 · Graduate Certificate, Patient Safety, Industrial & Systems Engineering · University of Wisconsin–Madison
Advisor: Pascale Carayon PhD. Trained through the AHRQ-funded Systems Engineering Initiative for Patient Safety.

2014 · Contemporary Qualitative Interviewing Methods · University of Oxford
Intensive methods training in advanced qualitative interviewing techniques for health services research.

2015 · Wisconsin Entrepreneurial Boot Camp Fellow · University of Wisconsin School of Business
Competitive fellowship in technology entrepreneurship for graduate students in sciences, engineering, and mathematics.

2003–2007 · Bachelor of Arts, English Literature · University of Wisconsin–Madison

Board of Directors · The Road Home of Dane County · 2016–2020
Board service at a nonprofit providing homelessness prevention and housing stability services in Dane County, Wisconsin.

Community Advisory Board Chair · WORT 89.9 FM · 2013–2015
Elected board chair for two terms at an independent community radio station in Madison, Wisconsin.

Peer Reviewer · IISE Transactions on Healthcare Systems Engineering; Health Environments Research & Design Journal · 2019–present

Undergraduate Research Mentor · UW School of Medicine and Public Health · 2014–2017
Mentored three undergraduate research scholars in data cleaning, programming, data visualization, statistical analysis, and literature review.

Alumni Hall Service Award · Department of Family Medicine & Community Health, UW SMPH · 2017

Certified Six Sigma Yellow Belt · American Society for Quality · 2015