Sixteen years building production analytics in regulated healthcare, where a wrong number has external consequences: regulatory, contractual, financial. I build systems designed around that reality.
me@zaherkarp.com · linkedin.com/in/zkarp
I started in writing and editing, spent nearly a decade in mixed-methods health services research at UW–Madison beginning in 2009, and moved into data engineering when I realized the tools I needed to study complex healthcare systems didn't exist yet. The methodological thread runs from grounded theory in qualitative research through logistic and linear regression in population health, through time series analysis in Stars forecasting — different instruments, the same underlying question about how complex systems behave under measurement.
Right now that means Medicare Star Ratings, HEDIS pipelines, and data integration across claims and eligibility sources. I care most about analytics that ship as products — systems with audit trails and validated outputs, not reports that cannot be traced to their source. I am leading a data function and moving deliberately into organizational leadership, because the problems I want to work on require change at a scale that individual technical contribution cannot reach.
The return to public writing in late 2025 was deliberate. Five years of building production systems for other organizations left a lot of thinking that was not mine to publish. It is now.
| Count | Metric |
|---|---|
| 16+ | Years experience |
| 50+ | Clinical organizations served |
| 4 | EHR platforms |
| 6 | Peer-reviewed publications |
| 17 | Conference presentations |
I lead the data engineering and analytics methodology function for a Medicare Advantage quality platform. The role spans direct management, code review across the engineering team, and coordination of data science and QA across two geographies — with the product roadmap and the CMS-to-organizational-priorities translation sitting in the middle of all of it.
CMS Star Rating methodology is published in Technical Notes that run to hundreds of pages — regulatory language describing measure construction, denominator logic, significance testing, and improvement score calculation. The work is translating that regulatory text into executable Python and SQL — using pandas for data transformation, scipy and numpy for statistical components including robust exponential smoothing for time series forecasting, and Selenium for web automation — producing auditable outputs where every threshold and weighted average is traceable to its source.
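The forecasting component can be illustrated with a stripped-down sketch. Clamping each residual is one simple way to make exponential smoothing robust to an outlier month; the production implementation and its parameters may differ, and the rates below are made up.

```python
def robust_exp_smooth(series, alpha=0.3, k=0.05):
    """One-step-ahead exponential smoothing with clamped residuals.

    Capping each residual at +/- k limits the influence a single outlier
    observation has on the smoothed level (a simple robustification).
    """
    level = series[0]
    for y in series[1:]:
        resid = max(-k, min(k, y - level))  # cap outlier influence
        level += alpha * resid              # otherwise the standard SES update
    return level

# Hypothetical monthly measure rates; the return value is next period's forecast.
forecast = robust_exp_smooth([0.78, 0.80, 0.79, 0.82, 0.84])
```

With no residual exceeding the clamp, this reduces to plain simple exponential smoothing; the clamp only engages when a month deviates sharply from the running level.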
HEDIS hybrid measures add a separate class of problem. The denominator is constructed from claims and eligibility data arriving from multiple health plans in inconsistent formats. Normalizing that into a unified analytic layer while preserving the audit trail requires treating every source format as a suspected deviation from the specification until proven otherwise.
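In practice that means an explicit audit step at the boundary before anything enters the unified layer. This is a toy version with a hypothetical three-column eligibility spec; real specifications are far larger and vary by feed.

```python
import pandas as pd

# Hypothetical spec for one eligibility feed (illustrative column names).
REQUIRED_COLUMNS = {"member_id", "plan_id", "eff_date"}

def audit_extract(df, source_name):
    """Treat the extract as a suspected deviation: return every gap found."""
    findings = []
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        findings.append(f"{source_name}: missing columns {sorted(missing)}")
    if "member_id" in df.columns and df["member_id"].isna().any():
        findings.append(f"{source_name}: null member_id rows")
    return findings

# A feed missing eff_date and carrying a null member_id yields two findings.
bad = pd.DataFrame({"member_id": ["M1", None], "plan_id": ["P1", "P1"]})
findings = audit_extract(bad, "plan_feed_07")
```

The point of returning findings rather than raising is that the audit trail records every deviation per source, not just the first failure.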
In 2020 Health Catalyst acquired healthfinch, and I migrated with the product — moving from a three-person analytics function to a clinical quality platform serving health systems across Epic, Cerner, athenahealth, and Veradigm, four Electronic Health Record environments with different data models, different extract behaviors, and different interpretations of the same clinical concept. The acquisition also meant migrating the underlying infrastructure from AWS to Azure and Databricks, an architectural shift that required rebuilding the analytics layer while maintaining continuity of service for existing clients.
Data governance came down to one question: does a medication adherence rate mean the same thing across Epic, Cerner, athenahealth, and Veradigm? The ELT pipeline I inherited ran for multiple days and produced outputs that were difficult to trace to their source. The redesign was an auditability and data governance decision. Medallion architecture (bronze for raw ingestion, silver for documented business logic, gold for analytic output) gives every transformation a name, a location, and a test, so every metric can be traced back through silver to its raw source; same-day runtime was a side effect, not the design goal. RxNorm validation across a platform serving fifty-plus health systems means the question "why does this adherence rate look wrong for this plan" has a traceable answer.
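The layering can be sketched in miniature with pandas; the production system runs on Databricks, and the adherence logic below (proportion of days covered against the standard 80% threshold) is a simplified stand-in with made-up numbers.

```python
import pandas as pd

# bronze: raw ingestion, stored exactly as received from the source
bronze = pd.DataFrame({
    "member": ["A", "A", "B"],
    "days_covered": [110, 70, 200],
    "days_in_period": [120, 80, 365],
})

# silver: documented business logic -- proportion of days covered (PDC)
silver = bronze.assign(pdc=bronze.days_covered / bronze.days_in_period)

# gold: analytic output -- adherence flag at the 80% PDC threshold,
# rolled up to a plan-level rate
gold = silver.assign(adherent=silver.pdc >= 0.8)
rate = gold.groupby("member")["adherent"].all().mean()
```

Because each layer is a named, materialized step, "why does this rate look wrong" decomposes into checking gold's rollup, silver's PDC logic, and bronze's raw rows in turn.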
HIPAA compliance was an architectural constraint throughout — not a certification but a set of requirements that shaped every decision about data storage, access control, and transmission across a platform handling protected health information at scale.
First analytics hire meant the infrastructure did not exist. I built it under HIPAA and HITRUST compliance requirements on AWS — which meant that every data governance decision, from access control to audit logging to retention policy, was mine to make without a prior framework to inherit. Promoted to manager after one year to lead cross-functional work across product, engineering, and customer success.
ROI modeling was the commercially critical work — linear regression on clinical workflow data, translated into client-facing reports that supported over a million dollars in recurring revenue. Built dashboards that drove sevenfold growth in internal user adoption and eliminated four hundred hours of annual manual reporting preparation.
Nine years embedded in federally funded primary care research, advancing through five positions from research specialist to assistant researcher — building the technical infrastructure for studies funded by the National Institute on Aging, the Wisconsin Partnership Program, the Josiah Macy Jr. Foundation, and multiple UW Institute for Clinical and Translational Research awards.
The Wisconsin Longitudinal Study — a fifty-year cohort of ten thousand adults integrating survey, health, and administrative records — taught me what longitudinal data quality problems actually look like: cohort attrition, measurement drift, linkage failure across administrative sources that were never designed to be linked. Methodological training included Contemporary Qualitative Interviewing Methods at the University of Oxford (2014).
A sustained research thread on primary care redesign used grounded theory for qualitative analysis of field notes and focus group transcripts, and interrupted time series analysis to measure the effect of care delivery change initiatives on clinic panel data.
The ACO cost research integrated EMR, claims, and patient satisfaction data in Stata and SAS to produce cohort analyses showing that higher-baseline-cost organizations were more likely to achieve shared savings — published in the International Journal of Healthcare Management in 2018.
Editorial services practice specializing in environmental, health, and policy content. Managed up to eight copy editors, graphic designers, and photographers. Wrote articles syndicated through Thomson Reuters, LexisNexis, and the New York Times wire. Edited and indexed client manuscripts published as books, peer-reviewed journals, grants, and dissertations.
The design constraint — no data leaves the user's machine — was a compliance requirement, not an aesthetic one. Implemented as a single HTML file using Chart.js against uploaded CSV data; jsPDF generates the report without a server round-trip. Nothing to deploy, update, or maintain.
The original question was about healthcare workforce shortages — which adjacent roles can clinical and administrative staff reskill into. O*NET occupation and skill data provided the structure; logistic regression calibrated the transition probability estimates. Produces Ready Now, Trainable, and Long-Term Reskill recommendations with gap analysis by skill domain.
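The scoring step reduces to a logistic model over skill overlap and training gap, bucketed into the three recommendation tiers. The coefficients and thresholds below are illustrative placeholders, not fitted O*NET values.

```python
import math

def transition_probability(skill_overlap, training_gap_years):
    """Logistic model of role-transition feasibility (illustrative weights)."""
    z = -1.0 + 4.0 * skill_overlap - 0.8 * training_gap_years
    return 1 / (1 + math.exp(-z))

def bucket(p):
    """Map a transition probability to a recommendation tier."""
    if p >= 0.7:
        return "Ready Now"
    if p >= 0.4:
        return "Trainable"
    return "Long-Term Reskill"
```

A medical assistant with high skill overlap and a short training gap lands in Ready Now; a role requiring years of retraining falls into Long-Term Reskill, with the per-domain gap analysis explaining why.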
A local news monitor that fetches public Medicare Advantage sources, scores items for analytic relevance using keyword and domain heuristics, and posts structured alerts to a Teams-compatible webhook. Distinguishing a CMS rulemaking notice from a press release that mentions Medicare Advantage requires knowing what questions a Stars analyst is actually trying to answer.
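The heuristic layer is deliberately simple: weighted keyword hits plus a bonus for authoritative source domains. The weights, keywords, and domains below are illustrative, not the monitor's actual configuration.

```python
# Illustrative keyword weights and trusted domains.
KEYWORDS = {"star ratings": 3, "cutpoint": 3, "hedis": 2, "medicare advantage": 1}
TRUSTED_DOMAINS = {"cms.gov": 2, "federalregister.gov": 2}

def score_item(title, url):
    """Score an item's analytic relevance from its title and source domain."""
    text = title.lower()
    score = sum(w for kw, w in KEYWORDS.items() if kw in text)
    score += sum(w for d, w in TRUSTED_DOMAINS.items() if d in url)
    return score

rulemaking = score_item("CMS finalizes Star Ratings cutpoint methodology",
                        "https://www.cms.gov/newsroom/item")
press = score_item("Acme announces Medicare Advantage plan",
                   "https://www.prnewswire.com/item")
```

A CMS rulemaking notice outscores a press release by a wide margin, which is the whole point: the threshold for posting an alert encodes what a Stars analyst actually needs to see.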
ECDS adoption introduces a distributional shift into the Stars ecosystem that most health plans are not yet modeling. This repository implements the shock index methodology — quantifying the expected change in measure rate distributions when ECDS replaces legacy ED visit coding, and estimating the downstream effect on Stars cutpoint crossings at the plan level.
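The cutpoint-crossing piece of that estimate can be sketched in a few lines: given star thresholds and each plan's measure rate, count how many plans change star assignment under a rate shift. The thresholds and the uniform shift below are placeholders; the repository models per-measure ECDS deltas, not a single constant.

```python
import numpy as np

# Hypothetical 2/3/4/5-star thresholds for one measure.
cutpoints = np.array([0.70, 0.78, 0.85, 0.91])

def star(rate):
    """Assign a star level by locating the rate among the cutpoints."""
    return 1 + int(np.searchsorted(cutpoints, rate, side="right"))

def crossings(rates, delta):
    """Count plans whose star assignment changes under a rate shift."""
    return sum(star(r) != star(r + delta) for r in rates)

# Three plans sitting just below thresholds all cross under a +2pt shift.
n = crossings([0.69, 0.77, 0.90], 0.02)
```

The interesting output is not the shifted rates themselves but the crossings, because a plan's revenue exposure is concentrated at the thresholds.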
Analyzed organization-wide care delivery changes using interrupted time series analysis on clinic panel data. The methodological question was whether observed changes in care patterns were attributable to redesign initiatives or to secular trends — which requires a study design that can separate the two. The design combined segmentation with regression across multiple clinic sites to isolate the redesign effect from background drift.
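The standard segmented-regression design behind that kind of analysis looks like this in miniature, fit by ordinary least squares on synthetic monthly data with a known level jump at the intervention:

```python
import numpy as np

t = np.arange(24, dtype=float)          # 24 months of panel data
post = (t >= 12).astype(float)          # intervention at month 12
# Design matrix: intercept, secular trend, level change, slope change.
X = np.column_stack([np.ones_like(t), t, post, post * (t - 12)])
# Synthetic outcome: baseline 50, drift of 0.2/month, +5 level jump post-intervention.
y = 50 + 0.2 * t + 5 * post
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta[2] estimates the immediate level change attributable to the intervention;
# beta[1] captures the background secular trend it is separated from.
```

The separation the paragraph describes is exactly the difference between beta[1] (drift that would have happened anyway) and beta[2] and beta[3] (the step and slope change at the redesign).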
I built the analytics for a case study of healthfinch's Charlie deployment across multiple community health centers at OCHIN. The case study asked two questions: how did clinician workflows change after Charlie deployment, and what was the ROI for the health centers that adopted it? Linear regression on the workflow data produced quantified outcome measures that the commercial team used in renewal conversations and new-customer pitches.
Nine podium presentations and workshops at national venues between 2010 and 2017, including the Society for Implementation Research Collaboration Conference, the AAMC Integrating Quality Meeting, and the World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education. Four appearances at the National Collaborative for Improving Primary Care through Industrial and Systems Engineering. Eight additional posters at national and regional venues. A poster co-authored with Nancy Pandhi received the Patient Choice Award (2 of 45) at the North American Primary Care Research Group Conference, 2015.
"It's rare to come across someone like Zaher — not just for his intelligence, but for the care, curiosity, and sense of responsibility he brings to everything he does. He consistently pushes himself to deliver thoughtful, high-quality work because he genuinely wants to make a difference — for the team, for the client, and for healthcare and patients. Zaher has a natural ability to think deeply about problems, often catching nuances others miss, and he balances that with a strong commitment to execution. He leads with integrity and consistently aims to do what's right — even when it takes more effort."

Joanna Laucirica, PMP — Director, Customer Operations, Health Catalyst

"Zaher was solely responsible for ensuring timely, accurate data delivery across multiple Electronic Health Record environments — Cerner, Epic, athenahealth, and Veradigm. He successfully led the migration of analytics from Sisense to Pop Insights, and implemented automated weekly data refreshes for Cerner, significantly improving efficiency and reliability. Despite being the only engineer on the team, Zaher consistently delivered high-quality work. He is intelligent, thorough, and deeply committed to understanding customer needs. He often joined client calls to clarify requests and wasn't afraid to push back when necessary to protect data integrity and long-term scalability."

Jessica McCay — Director of Customer Success, Health Catalyst