Zaher Karp

Healthcare data engineering and Medicare Advantage analytics.

Career arc, 2007 to present.

About

I started in writing and editing, spent nearly a decade in mixed-methods health services research at UW–Madison beginning in 2009, and moved into data engineering when I realized the tools I needed to study complex healthcare systems didn't exist yet. The methodological thread runs from grounded theory in qualitative research, through logistic and linear regression in population health, to time series analysis in Stars forecasting. Different instruments, the same underlying question about how complex systems behave under measurement. ⊕ Grounded theory, developed by Glaser and Strauss in 1967, builds theory inductively from coded qualitative data rather than testing hypotheses against it. In practice this means reading transcripts, labeling passages, comparing labels across transcripts, and letting structure emerge.

Right now that means Medicare Star Ratings, HEDIS pipelines, and data integration across claims and eligibility sources. I am leading a data function and moving deliberately into organizational leadership, because the decisions I want to influence happen above the individual contributor level. ⊕ Current role: Lead Data Engineer at Baltimore Health Analytics, Madison WI. Managing US-based engineering with work spanning analytics methodology, code review, and CMS-to-roadmap translation.

Peer-reviewed publications and conference presentations, 2010 to 2019. Publications sized by citation count.

Writing

24 weeks of activity · 23 posts ⊕ Writing resumed in late 2025 after a multi-year pause. The early weeks of the window are sparse by design, not by neglect; recent weeks show a steady run.

2026-04-22 Two States, One Pathogen: A Browser-Side Stochastic SEIRV Simulator ⊕ Tau-leaping is a stochastic simulation method for chemical and compartmental systems in which each reaction is assumed to fire a Poisson-distributed number of times over a short interval. It is faster than Gillespie's exact algorithm, at the cost of an approximation error that is most troublesome for stiff systems.

Methodology and intuition behind the browser-based SEIRV epidemic simulator on this site. Poisson tau-leaping, state-level coverage, and why the ribbon matters more than any single run.
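As a rough illustration of Poisson tau-leaping on an SEIRV model, the sketch below advances five compartments in fixed steps of length `tau` and summarizes the spread of peak infections across seeded runs. It is not the simulator's actual code: the function names, the rate parameters (`beta`, `sigma`, `gamma`, `nu`), and the capping rule that keeps counts non-negative are all assumptions made for the example.

```python
import math
import random

def poisson(rng, lam):
    """Draw a Poisson variate with mean lam (Knuth's method; suitable for small lam only)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def tau_leap_seirv(s, e, i, r, v, beta, sigma, gamma, nu, tau, steps, seed=0):
    """Simulate one SEIRV trajectory; returns a list of (S, E, I, R, V) tuples."""
    rng = random.Random(seed)
    n = s + e + i + r + v
    path = [(s, e, i, r, v)]
    for _ in range(steps):
        # Each transition fires a Poisson number of times with mean rate * tau.
        infections = poisson(rng, beta * s * i / n * tau)
        vaccinations = poisson(rng, nu * s * tau)
        onsets = poisson(rng, sigma * e * tau)
        recoveries = poisson(rng, gamma * i * tau)
        # Assumed safeguard: cap draws so no compartment goes negative.
        infections = min(infections, s)
        vaccinations = min(vaccinations, s - infections)
        onsets = min(onsets, e)
        recoveries = min(recoveries, i)
        s -= infections + vaccinations
        e += infections - onsets
        i += onsets - recoveries
        r += recoveries
        v += vaccinations
        path.append((s, e, i, r, v))
    return path

def peak_ribbon(n_runs, **params):
    """Min, median, and max of peak infectious counts across seeded runs."""
    peaks = sorted(
        max(state[2] for state in tau_leap_seirv(seed=k, **params))
        for k in range(n_runs)
    )
    return peaks[0], peaks[len(peaks) // 2], peaks[-1]

# Hypothetical parameters, chosen only so the Poisson means stay small.
PARAMS = dict(s=990, e=0, i=10, r=0, v=0, beta=0.4, sigma=0.2,
              gamma=0.1, nu=0.01, tau=0.1, steps=200)
```

Calling `peak_ribbon(20, **PARAMS)` gives the low, middle, and high peak across twenty runs, which is the sense in which a ribbon says more than any single trajectory.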

2026-04-19 The Slowest Way to Learn Web Design Is the Only Way That Worked for Me

A personal history of building and rebuilding a website across three eras, and what the increasing friction of each step actually taught me.

2026-04-12 How the Stars Cliff Simulator Works

A short explainer for the Stars Cliff Simulator: why ordinal logistic regression is the right tool for a 1 to 5 star outcome, and how P(clearing 4.0 stars) falls out of the model for free.
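To make the "for free" point concrete, here is a minimal sketch of a proportional-odds (cumulative logit) model for a 1 to 5 star outcome. The cutpoints and the linear predictor `eta` are hypothetical values, not fitted estimates, and the function names are illustrative.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def star_probabilities(eta, cutpoints):
    """P(Y = 1), ..., P(Y = K) where P(Y <= j) = logistic(cutpoints[j-1] - eta)."""
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

def prob_at_least(eta, cutpoints, star):
    """P(Y >= star) = 1 - P(Y <= star - 1); no extra model is needed."""
    if star <= 1:
        return 1.0
    return 1.0 - logistic(cutpoints[star - 2] - eta)

# Hypothetical cutpoints separating 1|2, 2|3, 3|4, and 4|5 stars.
CUTPOINTS = [-2.0, -0.5, 1.0, 2.5]
p_clear_4 = prob_at_least(1.0, CUTPOINTS, 4)
```

Because the model is defined through cumulative probabilities, the chance of clearing 4 stars is one minus a single logistic term; a contract whose linear predictor sits exactly on the 3|4 cutpoint has an even chance.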

2026-04-05 The Stars Cliff Simulator: Methodology and Evidence ⊕ The 4.0 star Quality Bonus Payment threshold creates a sharp non-linearity in Medicare Advantage plan economics. Crossing it is worth roughly $50M to a mid-size contract.

The statistical methodology and published literature behind the Stars Cliff Simulator, a public teaching-oriented tool focused on the 4.0 star Quality Bonus Payment threshold.

2026-03-29 HEDIS Measure-Level ETL Patterns: Building the Pipeline from Claims to Measure Rates

A practitioner's walkthrough of how HEDIS measure pipelines actually work: eligible population, numerator matching, exclusions, supplemental data, and rate calculation.
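The stages named above can be sketched with plain set operations. This is a simplified illustration, not a HEDIS-compliant implementation: the `Member` fields, the age window, and the idea that claims and supplemental hits arrive as lists of member IDs are all assumptions for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Member:
    member_id: str
    age: int
    continuously_enrolled: bool

def measure_rate(members, claims_hits, supplemental_hits, excluded_ids, min_age, max_age):
    """Eligible population -> exclusions -> numerator matching -> rate."""
    eligible = {
        m.member_id
        for m in members
        if m.continuously_enrolled and min_age <= m.age <= max_age
    }
    denominator = eligible - set(excluded_ids)
    # Numerator evidence may come from claims or from supplemental data,
    # but only counts for members who are in the denominator.
    numerator = denominator & (set(claims_hits) | set(supplemental_hits))
    rate = len(numerator) / len(denominator) if denominator else None
    return {
        "eligible": len(eligible),
        "denominator": len(denominator),
        "numerator": len(numerator),
        "rate": rate,
    }

# Hypothetical toy population.
members = [
    Member("A", 55, True), Member("B", 60, True), Member("C", 45, True),
    Member("D", 70, False), Member("E", 65, True), Member("F", 72, True),
]
result = measure_rate(members, claims_hits=["A", "C"], supplemental_hits=["E"],
                      excluded_ids=["F"], min_age=50, max_age=74)
```

Returning the intermediate counts alongside the rate is a deliberate choice in the sketch: each stage stays visible, which is what makes a rate auditable.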

2026-03-22 Two Ways to Build a Web App (and What You're Actually Choosing)

Most applications boil down to server-rendered or API-driven designs. The difference isn't just technical. It shapes how your system evolves.


Experience

Lead Data Engineer

Baltimore Health Analytics · Nov 2025 to Present

I lead the data engineering and analytics methodology function for a Medicare Advantage quality platform. The role spans direct management, code review across the engineering team, and coordination of data science and QA across two geographies, plus the product roadmap and translation between CMS requirements and organizational priorities.


CMS Star Rating methodology is published in Technical Notes that run to hundreds of pages: regulatory language describing measure construction, denominator logic, significance testing, and improvement score calculation. The work is translating that regulatory text into executable Python and SQL, using pandas for data transformation, scipy and numpy for statistical components including robust exponential smoothing for time series forecasting, and Selenium for web automation, producing auditable outputs where every threshold and weighted average is traceable to its source. ⊕ The CMS Technical Notes are revised each rating year. A single measure rewrite can propagate into denominator changes, hold-harmless logic changes, and cutpoint regeneration, all of which need to round-trip through the analytics layer without drift.

\[
\psi_k(r) =
\begin{cases}
r & \text{if } |r| \le k \\
k \cdot \operatorname{sign}(r) & \text{if } |r| > k
\end{cases}
\]

Huber's ψ-function. Residuals with magnitude at most the threshold k contribute linearly to the gradient; residuals larger in magnitude are clipped to ±k, so a single COVID-era shock cannot dominate the smoothing parameter estimates.
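A small sketch shows the clipping in action. The ψ-function follows the definition above; the smoothing loop is a deliberately simplified level-only update with residuals assumed to be on a unit scale, not the production forecasting method, and the series and parameter values are made up.

```python
import math

def huber_psi(r, k):
    """Pass small residuals through unchanged; clip large ones to +/- k."""
    if abs(r) <= k:
        return r
    return math.copysign(k, r)

def smooth_level(series, alpha, k):
    """Level-only exponential smoothing driven by Huber-clipped residuals.

    With k = math.inf this reduces to ordinary simple exponential smoothing.
    """
    level = series[0]
    for y in series[1:]:
        level += alpha * huber_psi(y - level, k)
    return level

# Hypothetical series with one shock in the third period.
series = [10, 10, 100, 10]
robust = smooth_level(series, alpha=0.5, k=2)        # 10.5
plain = smooth_level(series, alpha=0.5, k=math.inf)  # 32.5
```

The clipped version ends one period after the shock at 10.5, while the unclipped version is still at 32.5, which is the behavior the threshold is meant to buy.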

HEDIS hybrid measures add a separate class of problem. The denominator is constructed from claims and eligibility data arriving from multiple health plans in inconsistent formats. Normalizing that into a unified analytic layer while preserving the audit trail requires treating every source format as a suspected deviation from the specification until proven otherwise.

pandas · scipy · numpy · dbt · SQL · Selenium · Python

Healthcare Analytics Manager, Embedded Refills and Care Gaps

Health Catalyst (acquired healthfinch, 2020) · Aug 2020 to Aug 2025

In 2020 Health Catalyst acquired healthfinch, and I migrated with the product. The three-person analytics function became a clinical quality platform serving health systems across Epic, Cerner, athenahealth, and Veradigm. RxNorm validation across the platform cut client-audit discrepancies from roughly 30% to under 5%. The acquisition also meant migrating infrastructure from AWS to Azure and Databricks, an architectural shift that required rebuilding the analytics layer while maintaining continuity of service for existing clients. ⊕ Four EHRs with different data models, different extract behaviours, and different interpretations of the same clinical concept. A medication adherence rate is not portable across these systems until someone sits down and defines it in a source-specific way.


Data governance came down to one question: does a medication adherence rate mean the same thing across Epic, Cerner, athenahealth, and Veradigm? The ELT pipeline I inherited ran for multiple days and produced outputs that were difficult to trace to their source. The redesign was an auditability and governance decision, not a performance one. ⊕ Medallion architecture: bronze for raw ingestion, silver for documented business logic, gold for analytic output. Every transformation has a name, a location, and a test. Same-day runtime was a side effect of the design, not the goal.

HIPAA compliance was an architectural constraint throughout. Not a certification, but a set of requirements that shaped every decision about data storage, access control, and transmission across a platform handling protected health information at scale.

dbt · Redshift · AWS to Azure · Databricks · Python · Tableau · Power BI

Healthcare Analytics Manager, Specialist

healthfinch (acquired by Health Catalyst, 2020) · Dec 2017 to Jul 2020

Being the first analytics hire meant the infrastructure did not exist. I built it under HIPAA and HITRUST compliance requirements on AWS, which meant that every data governance decision, from access control to audit logging to retention policy, was mine to make without a prior framework to inherit. I was promoted to manager after one year to lead cross-functional work across product, engineering, and customer success.


ROI modeling was the commercially critical work: linear regression on clinical workflow data, translated into client-facing reports that supported over a million dollars in recurring revenue. Built dashboards that drove sevenfold growth in internal user adoption and eliminated four hundred hours of annual manual reporting preparation.

SQL · Python · Sisense · AWS · HIPAA · HITRUST

Researcher, Five Roles Over Nine Years

University of Wisconsin-Madison, Department of Family Medicine and Community Health · Sep 2009 to Jun 2018

Nine years embedded in federally-funded primary care research, advancing through five positions from research specialist to assistant researcher. I built the technical infrastructure for studies funded by the National Institute on Aging, the Wisconsin Partnership Program, the Josiah Macy Jr. Foundation, and multiple UW Institute for Clinical and Translational Research awards.


The Wisconsin Longitudinal Study, a fifty-year cohort of ten thousand adults integrating survey, health, and administrative records, taught me what longitudinal data quality problems actually look like: cohort attrition, measurement drift, linkage failure across administrative sources that were never designed to be linked. ⊕ Methodological training included Contemporary Qualitative Interviewing Methods at the University of Oxford (2014).

A sustained research thread on primary care redesign used grounded theory for qualitative analysis of field notes and focus group transcripts, and interrupted time series analysis to measure the effect of care delivery change initiatives on clinic panel data.

The ACO cost research integrated EMR, claims, and patient satisfaction data in Stata and SAS to produce cohort analyses showing that higher-baseline-cost organizations were more likely to achieve shared savings, published in the International Journal of Healthcare Management in 2018.

Stata · SAS · NVivo · SPSS · REDCap

Principal

Sustainable Clarity · 2007 to 2014

Editorial services practice specializing in environmental, health, and policy content. Managed up to eight copy editors, graphic designers, and photographers. Wrote articles syndicated through Thomson Reuters, LexisNexis, and the New York Times wire. Edited and indexed client manuscripts published as books, peer-reviewed journals, grants, and dissertations.

Projects

Client-Side Stars Rating Predictor

A cut-point dashboard built at Baltimore Health Analytics for internal Stars forecasting across our client contracts. The design constraint, no member-level data leaves the analyst's machine, was a compliance requirement, not an aesthetic one. It runs ordinal logistic regression on live measure feeds entirely in the browser, projects cut-point crossings at the contract level, and surfaces which measures are closest to their next tier for remediation planning. Source is private. ⊕ Running the model client-side in the browser keeps PHI on the analyst's machine and sidesteps an entire class of data-transit and data-residency concerns that would otherwise need a server-side review.

Ordinal logistic regression · cut-point projection · client-side · internal tool

Stars Cliff Simulator

A public, teaching-oriented companion to the internal predictor. Single-page interactive demo built around one number, the 4.0 star QBP cliff that separates Medicare Advantage plans that qualify for Quality Bonus Payments from the 3.5 to 3.99 star "dead zone" that does not. An ordinal logistic regression calibrated to CMS 2025 weights runs in the browser; four sliders collapse the 42-measure surface down to its highest-leverage inputs, and a cut-point visualization exposes the mechanism that makes a tenth of a star worth $50 million. No data leaves the user's machine. ⊕ For a mid-size MA contract, clearing 4.0 is worth roughly $50M relative to a 3.5 star rating. A tenth of a star literally changes the plan's financial structure.

Approximate distribution of Medicare Advantage contract Star Ratings, showing the 4.0 QBP threshold.

Live demo · Methodology post

Ordinal logistic regression · vanilla JS · no dependencies · client-side

Healthcare Workforce Transition Platform

The original question was about healthcare workforce shortages: which adjacent roles can clinical and administrative staff reskill into? O*NET occupation and skill data provided the structure; logistic regression calibrated the transition probability estimates. The platform produces Ready Now, Trainable, and Long-Term Reskill recommendations with gap analysis by skill domain.

Example transition pathway for a Medical Assistant. Bar widths reflect the share of candidates routed to each target occupation; stroke widths echo the same magnitude.

Live demo · GitHub

FastAPI · PostgreSQL · scikit-learn · logistic regression · O*NET

Medicare Advantage Insight Engine

A local news monitor that fetches public Medicare Advantage sources, scores items for analytic relevance using keyword and domain heuristics, and posts structured alerts to a Teams-compatible webhook. Distinguishing a CMS rulemaking notice from a press release that mentions Medicare Advantage requires knowing what questions a Stars analyst is actually trying to answer.
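A keyword-and-domain heuristic of this kind might look like the sketch below. The weights, keyword list, domain list, and alert threshold are invented for illustration and are not taken from the repository.

```python
from urllib.parse import urlparse

# Hypothetical weights: phrases a Stars analyst would care about score higher.
KEYWORD_WEIGHTS = {
    "star ratings": 3,
    "cut point": 3,
    "hedis": 2,
    "final rule": 2,
    "medicare advantage": 1,
}
# Hypothetical weights: primary regulatory sources outrank everything else.
DOMAIN_WEIGHTS = {"cms.gov": 3, "federalregister.gov": 3}

def relevance_score(title, url):
    """Sum keyword weights found in the title plus a weight for the source domain."""
    text = title.lower()
    score = sum(w for phrase, w in KEYWORD_WEIGHTS.items() if phrase in text)
    host = urlparse(url).hostname or ""
    score += sum(
        w for domain, w in DOMAIN_WEIGHTS.items()
        if host == domain or host.endswith("." + domain)
    )
    return score

def select_alerts(items, threshold=5):
    """Keep (title, url) pairs scoring at or above the threshold, highest first."""
    scored = [(relevance_score(t, u), t, u) for t, u in items]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)

items = [
    ("CMS issues final rule on Medicare Advantage Star Ratings",
     "https://www.cms.gov/newsroom/example"),
    ("Acme Health celebrates Medicare Advantage growth",
     "https://example.com/press"),
]
alerts = select_alerts(items)
```

With these made-up weights the rulemaking notice scores 9 and the press release scores 1, so only the first would be forwarded to the webhook.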

GitHub

Python · automation · webhook

ECDS Shock Index

ECDS adoption introduces a distributional shift into the Stars ecosystem that most health plans are not yet modeling. This repository implements the shock index methodology, quantifying the expected change in measure rate distributions when ECDS replaces legacy ED visit coding, and estimating the downstream effect on Stars cutpoint crossings at the plan level. ⊕ ECDS, Electronic Clinical Data Systems, is the HEDIS reporting method that allows structured clinical data to supplement or replace claims-based measure reporting.

GitHub

Python · Medicare Advantage · Stars methodology

Care Delivery Workflow Changes

Analyzed organization-wide care delivery changes using interrupted time series analysis on clinic panel data. The methodological question was whether observed changes in care patterns were attributable to redesign initiatives or to secular trends, which requires a study design that can separate the two. The design combined segmentation with regression across multiple clinic sites to isolate the redesign effect from background drift.
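The core of a segmented-regression interrupted time series can be sketched for a single series as y_t = b0 + b1·t + b2·post_t + b3·(t − t0)·post_t, where b1 captures the secular trend, b2 the level change at the intervention, and b3 the change in slope. The code below is a single-site illustration on synthetic, noise-free data; the original analysis was done in Stata and SAS across multiple sites, and this sketch ignores autocorrelation and site effects.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[idx]] for idx, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[idx][n] / m[idx][idx] for idx in range(n)]

def fit_its(y, start):
    """Ordinary least squares fit of the segmented model; `start` is the first post period."""
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), post, post * (t - start)])
    k = 4
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    b0, b1, b2, b3 = solve(xtx, xty)
    return {"baseline": b0, "secular_trend": b1, "level_change": b2, "slope_change": b3}

# Synthetic series: trend of +0.5 per period, then a +4 jump and a -0.3 slope change at t = 12.
START = 12
y = [50 + 0.5 * t + (4.0 - 0.3 * (t - START) if t >= START else 0.0) for t in range(24)]
fit = fit_its(y, START)
```

On this synthetic series the fit recovers the built-in trend, jump, and slope change, which is the separation of redesign effect from background drift that the design is meant to provide.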

Stata · SAS · interrupted time series · outpatient analytics

Practice Automation Analytics, healthfinch Charlie

I built the analytics for a case study of healthfinch's Charlie deployment across multiple community health centers at OCHIN. The case study asked two questions: how did clinician workflows change after Charlie deployment, and what was the ROI for the health centers that adopted it? Linear regression on the workflow data produced quantified outcome measures that the commercial team used in renewal conversations and new-customer pitches.

linear regression · Sisense · Epic Clarity · SQL

Publications

2019 Influence of environmental design on team interactions across 3 family medicine clinics ⊕ PubMed · ResearchGate
11 citations

Karp Z, Kamnetz S, Wietfeldt N, Sinsky C, Molfetter T, Pandhi N.
Health Environments Research & Design Journal 12(4):159-173.

2019 Broadening medical students' exposure to the range of illness experiences: a pilot experimental curriculum trial ⊕ PubMed
4 citations

Pandhi N, Gaines ME, Deci D, Schlesinger M, Culp C, Karp Z et al.
Academic Medicine.

2018 Medicare Shared Savings Programs: higher cost accountable care organizations are more likely to achieve savings ⊕ Published in the International Journal of Healthcare Management, 2018. First large-N analysis of the cost-to-savings relationship in early Medicare Shared Savings Program cohorts.

Berkson S, Davis S, Karp Z, Jaffery J, Flood G, Pandhi N.
International Journal of Healthcare Management.

2016 An efficient process of gathering diverse community opinions to inform an intervention ⊕ Full text

Pandhi N, Jacobson N, Serrano N, Hernandez A, Zeidler-Schreiter E, Wietfeldt N, Karp Z.
Implementation Science 11(Suppl 1):A13.

2014 Approaches and challenges to optimizing primary care teams' electronic health record usage ⊕ PubMed
23 citations

Pandhi N, Yang WL, Karp Z, Young A, Beasley JW, Kraft S, Carayon P.
Journal of Innovation in Health Informatics 21(3):142-51.

2012 Approaches and challenges to optimizing the use of EHRs in primary care (preliminary findings) ⊕ Full text

Yang W, Pandhi N, Karp Z, Young A, Beasley J, Kraft S, Carayon P.
Proceedings of World Conference on E-Learning, Montréal.

Speaking

Seventeen podium presentations, workshops, and posters at national and regional venues between 2010 and 2017, in healthcare services research and primary care systems engineering.⊕A co-authored poster received the Patient Choice Award (2 of 45) at the North American Primary Care Research Group Conference, 2015.

Full list (17 presentations, 2010 to 2017)

2017 · National Collaborative for Improving Primary Care Through Industrial and Systems Engineering · Madison, WI
2017 · UW Institute for Clinical and Translational Research, Dissemination and Implementation Short Course · Madison, WI
2016 · AAMC Integrating Quality Meeting · Chicago, IL
2016 · National Collaborative for Improving Primary Care Through Industrial and Systems Engineering · Madison, WI
2015 · Field Innovation Team's Bootcamp · Provo, UT
2015 · Access, Quality, and Outcomes Research Network · Appleton, WI
2015 · Society for Implementation Research Collaboration Conference · Seattle, WA
2015 · National Collaborative for Improving Primary Care Through Industrial and Systems Engineering · Madison, WI
2015 · UW Health Quality Week · Madison, WI
2015 · North American Primary Care Research Group Conference · Cancún, Mexico
2015 · Wisconsin Research and Education Network Convocation of Practices · Oshkosh, WI
2014 · National Collaborative for Improving Primary Care Through Industrial and Systems Engineering · Madison, WI
2014 · AAMC Integrating Quality Meeting · Chicago, IL
2014 · Pharmacy Society of Wisconsin Annual Meeting · Madison, WI
2012 · World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education · Montréal, QC
2012 · UW Health organizational leadership · Middleton, WI
2010 · Wisconsin Primary Care Research and Quality Improvement Forum · Middleton, WI

Testimonials

He consistently pushes himself to deliver thoughtful, high-quality work because he genuinely wants to make a difference, for the team, for the client, and for healthcare and patients.

Joanna Laucirica, PMP, Director, Customer Operations, Health Catalyst

It's rare to come across someone like Zaher, not just for his intelligence, but for the care, curiosity, and sense of responsibility he brings to everything he does. Zaher has a natural ability to think deeply about problems, often catching nuances others miss, and he balances that with a strong commitment to execution. He leads with integrity and consistently aims to do what's right, even when it takes more effort.

Despite being the only engineer on the team, Zaher consistently delivered high-quality work. He is intelligent, thorough, and deeply committed to understanding customer needs.

Jessica McCay, Director of Customer Success, Health Catalyst

Zaher was solely responsible for ensuring timely, accurate data delivery across multiple Electronic Health Record environments: Cerner, Epic, athenahealth, and Veradigm. He successfully led the migration of analytics from Sisense to Pop Insights, and implemented automated weekly data refreshes for Cerner, significantly improving efficiency and reliability. He often joined client calls to clarify requests and wasn't afraid to push back when necessary to protect data integrity and long-term scalability.

Education credentials and service commitments, 2003 to present. Squares mark single-year items; bars mark date ranges.

Education

2013 to 2015

Master of Public Health (MPH), Biostatistics · University of Wisconsin-Madison · Health Innovation Program Research Trainee. Committee: Nancy Pandhi MD MPH PhD, Sandra Kamnetz MD, Todd Molfetter PhD. Designed and wrote a grant proposal awarded $18,000; produced peer-reviewed research on clinic environments.

2014 to 2015

Graduate Certificate, Patient Safety, Industrial and Systems Engineering · University of Wisconsin-Madison · Advisor: Pascale Carayon PhD. Trained through the AHRQ-funded Systems Engineering Initiative for Patient Safety.

2014

Contemporary Qualitative Interviewing Methods · University of Oxford · Intensive methods training in advanced qualitative interviewing techniques for health services research.

2015

Wisconsin Entrepreneurial Boot Camp Fellow · University of Wisconsin School of Business · Competitive fellowship in technology entrepreneurship for graduate students in sciences, engineering, and mathematics.

2003 to 2007

Bachelor of Arts, English Literature · University of Wisconsin-Madison

Service and Recognition

2016 to 2020

Board of Directors · The Road Home of Dane County · Board service at a nonprofit providing homelessness prevention and housing stability services in Dane County, Wisconsin.

2013 to 2015

Community Advisory Board Chair · WORT 89.9 FM · Elected board chair for two terms at an independent community radio station in Madison, Wisconsin.

2019 to present

Peer Reviewer · IISE Transactions on Healthcare Systems Engineering · Health Environments Research and Design Journal

2014 to 2017

Undergraduate Research Mentor · UW School of Medicine and Public Health · Mentored three undergraduate research scholars in data cleaning, programming, data visualization, statistical analysis, and literature review.

2017

Alumni Hall Service Award · Department of Family Medicine and Community Health, UW SMPH

2015

Certified Six Sigma Yellow Belt · American Society for Quality

Contact

email · me@zaherkarp.com

linkedin · linkedin.com/in/zkarp

github · github.com/zaherkarp

scholar · Google Scholar

tableau · public.tableau.com/zaher.karp

resume · PDF download