Value-Based Primary Care · 9 min read · 2025

Hypertension Risk Prediction: identifying the quiet 15% before they become the expensive 15%

A value-based primary care group wanted to find patients whose hypertension was silently progressing toward a cardiac or renal event. We built a clinical ML model on existing chart data that flags rising-risk patients six to nine months before a typical physician would catch them by chart review alone.

0.87 · AUROC on 12-month event prediction
+38% · earlier intervention rate
6 to 9 mo · lead time vs. standard chart review
01 · The problem

What the practice was dealing with.

In a value-based care contract, the practice is financially responsible for avoidable cardiac and renal events in its hypertensive population. Every stroke, every MI, every CKD progression hits the P&L.

The challenge: hypertension is common, and most patients look "fine" on paper, until they do not. The patients who cause 80% of the avoidable cost are usually quiet clinically until something acute happens, because they are the ones skipping follow-up visits or self-titrating their medication.

Traditional chart review caught these patients, but usually only after a BP reading or a lab result had already crossed a threshold. By then the practice had already lost the intervention window. The care team wanted to move detection upstream, before the labs looked bad.

02 · Our approach

How we scoped and built it.

We aggregated four years of de-identified chart data across roughly 14,000 adult hypertensive patients: demographics, BP readings, labs (eGFR, A1c, lipids), medications, visit patterns, social determinants, and documented comorbidities. All of it already existed in the EHR, but nobody had ever pulled it into one frame.

The outcome variable was any hypertension-related event within a 12-month window: stroke, MI, CKD stage progression, hypertensive urgency, or a new CHF diagnosis. We held out the most recent 12 months for validation, so the model was scored on data it had never seen rather than grading its own homework.
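As a rough illustration of the labeling and holdout described above, here is a minimal sketch in plain Python. The record layout (`index_date`, `event_dates`), the helper names, and the example cutoff date are assumptions made for illustration, not the engagement's actual schema or pipeline.

```python
from datetime import date, timedelta

HORIZON = timedelta(days=365)  # the 12-month outcome window

def label_event(index_date, event_dates):
    """Return 1 if any qualifying event falls within 12 months after index_date."""
    return int(any(index_date < d <= index_date + HORIZON for d in event_dates))

def temporal_split(rows, cutoff):
    """Train on rows indexed before the cutoff, hold out rows on or after it."""
    train = [r for r in rows if r["index_date"] < cutoff]
    holdout = [r for r in rows if r["index_date"] >= cutoff]
    return train, holdout

# Hypothetical records: one prediction point per patient.
rows = [
    {"patient_id": "A", "index_date": date(2021, 3, 1), "event_dates": [date(2021, 9, 15)]},
    {"patient_id": "B", "index_date": date(2022, 6, 1), "event_dates": []},
    {"patient_id": "C", "index_date": date(2023, 7, 1), "event_dates": [date(2025, 1, 10)]},
]
for r in rows:
    r["label"] = label_event(r["index_date"], r["event_dates"])

# The cutoff is an illustrative date marking the start of the held-out period.
train, holdout = temporal_split(rows, cutoff=date(2023, 1, 1))
```

Splitting by time rather than at random keeps later records out of training, which is closer to how the model is used in production: it is trained on the past and scored on patients it has not seen yet.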

We tested logistic regression, gradient-boosted trees (XGBoost), and a small tabular transformer. XGBoost won on AUROC and calibration, and gave us clean SHAP attribution, which turned out to be the deciding factor. The clinicians had to trust it, and "the model said so" was not going to be enough.
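For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen patient who had an event is scored higher than a randomly chosen patient who did not. The sketch below computes it directly from that definition on toy data. The scores and labels are invented for illustration, and a real comparison would more likely use a library routine on the full held-out set.

```python
def auroc(y_true, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is O(n_pos * n_neg), which is
    fine for a demonstration but slow on large panels.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy held-out labels and scores from two hypothetical candidate models.
y_holdout = [0, 0, 1, 1]
scores_model_a = [0.10, 0.40, 0.35, 0.80]
scores_model_b = [0.10, 0.20, 0.70, 0.90]

auc_a = auroc(y_holdout, scores_model_a)
auc_b = auroc(y_holdout, scores_model_b)
```

AUROC only measures ranking. It says nothing about whether a predicted 30% risk corresponds to a 30% event rate, which is why calibration was judged separately.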

The model runs nightly on the patient panel. Every morning the care team opens a dashboard showing the 30 highest-risk patients who have not been seen in the past 90 days, and the top three reasons each one is flagged. No black box. Every score is explainable down to the feature contributions.
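The worklist logic described above can be sketched in a few lines. Everything named here is hypothetical: the field names, the feature names, and the choice to rank reasons by the largest positive per-feature contribution (for example, SHAP values precomputed for each patient) are assumptions for illustration, not the production code.

```python
from datetime import date, timedelta

def build_worklist(patients, today, top_n=30, unseen_days=90, n_reasons=3):
    """Rank patients not seen in the last `unseen_days` days by risk score.

    Each patient dict is assumed to hold: patient_id, risk (float),
    last_visit (date), and contributions (feature name -> contribution
    to the risk score, where positive values push the score up).
    """
    cutoff = today - timedelta(days=unseen_days)
    unseen = [p for p in patients if p["last_visit"] < cutoff]
    ranked = sorted(unseen, key=lambda p: p["risk"], reverse=True)[:top_n]

    worklist = []
    for p in ranked:
        # Keep the features that push this patient's score up the most.
        top = sorted(p["contributions"].items(), key=lambda kv: kv[1], reverse=True)
        worklist.append({
            "patient_id": p["patient_id"],
            "risk": p["risk"],
            "reasons": [name for name, _ in top[:n_reasons]],
        })
    return worklist

# Hypothetical panel of three patients.
patients = [
    {"patient_id": "P1", "risk": 0.82, "last_visit": date(2025, 1, 5),
     "contributions": {"missed_followups": 0.21, "egfr_decline": 0.15,
                       "sbp_trend": 0.09, "age": 0.02}},
    {"patient_id": "P2", "risk": 0.91, "last_visit": date(2025, 5, 20),
     "contributions": {"sbp_trend": 0.30, "age": 0.05, "a1c": 0.04}},
    {"patient_id": "P3", "risk": 0.64, "last_visit": date(2024, 11, 12),
     "contributions": {"refill_gaps": 0.18, "sbp_trend": 0.07, "ldl": 0.03}},
]

worklist = build_worklist(patients, today=date(2025, 6, 1))
```

In this toy panel P2 has the highest score but was seen recently, so it drops off the list. The filter is what keeps the dashboard focused on patients who are both high-risk and out of contact.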

Risk score distribution · hypertensive panel, n≈14k
[Figure: bar chart by risk decile, D1 to D10. Bar values as extracted: 32%, 28%, 23%, 18%, 15%, 11%, 7%, 4%, 2%, 1%.]
Deciles 9–10 captured 72% of 12-month events on held-out data. Care team workflow targets these patients first.
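A capture figure like the 72% above is the share of all events that fall among the highest-scored patients. The sketch below shows one way to compute it, assuming deciles 9 and 10 are the two highest-risk deciles. The function name and toy data are illustrative only.

```python
def top_decile_capture(y_true, scores, top_deciles=2):
    """Share of all events found in the highest-scoring `top_deciles` deciles."""
    ranked = sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)
    n_top = len(ranked) * top_deciles // 10
    total_events = sum(y for _, y in ranked)
    if total_events == 0:
        raise ValueError("no events in the data")
    top_events = sum(y for _, y in ranked[:n_top])
    return top_events / total_events

# Toy data: 10 patients, 3 events, 2 of them among the top 2 scores.
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]
events = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

capture = top_decile_capture(events, scores)  # 2 of 3 events captured
```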

"The model does not replace our judgment. It tells us which twenty charts to review this week instead of which four hundred."

Medical Director, Value-Based Primary Care (name withheld by agreement)
03 · The outcome

What changed in the practice.

On held-out data the model achieved AUROC 0.87 with strong calibration in the top two deciles, exactly where the clinical decisions happen. High specificity at the top of the list was the point, not a good-looking ROC curve overall.
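Calibration in the top deciles can be checked by comparing the mean predicted risk in each decile with the event rate actually observed there. The sketch below does that on invented data. It is a simplified illustration, not the validation code used in the engagement.

```python
def calibration_by_decile(y_true, scores):
    """Return (decile, mean predicted risk, observed event rate) per decile.

    Decile 1 holds the lowest scores and decile 10 the highest.
    """
    ranked = sorted(zip(scores, y_true), key=lambda t: t[0])
    n = len(ranked)
    table = []
    for d in range(10):
        chunk = ranked[d * n // 10:(d + 1) * n // 10]
        if not chunk:
            continue  # fewer than 10 patients: some deciles are empty
        mean_pred = sum(s for s, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        table.append((d + 1, mean_pred, observed))
    return table

# Toy data: 20 patients with evenly spaced scores, events in the top deciles.
scores = [(i + 0.5) / 20 for i in range(20)]
events = [0] * 16 + [1, 0, 1, 1]

table = calibration_by_decile(events, scores)
top_two = table[-2:]  # deciles 9 and 10, where the worklist is drawn from
```

A well-calibrated model shows predicted and observed values close together in the rows that matter. Here decile 10 is close (0.95 predicted against 1.0 observed), while decile 9 is not (0.85 against 0.5), the kind of gap this check is meant to surface.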

In the first six months of production use, the care team intervened with 38% more patients in the pre-event window compared to the prior year. Each "intervention" is a real thing: an appointment, a medication change, or a care-management touch. Something happened before an event, not after.

Importantly, the care team reported that the model changed which patients they worried about. Several of the top-flagged patients had normal-looking recent BP readings, but their trajectories showed a pattern that the model picked up and a human chart reviewer would likely have missed. The model became a second set of eyes, not a replacement pair.
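One simple way to turn a trajectory into a number is a least-squares slope over a patient's readings. The case study does not say which trajectory features the model used, so the sketch below is only an assumed example of the idea, with invented readings.

```python
def bp_slope_per_month(readings):
    """Least-squares slope of systolic BP over time, in mmHg per month.

    `readings` is a list of (months_since_first_reading, systolic_mmHg).
    """
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_t = sum(t for t, _ in readings) / n
    mean_bp = sum(bp for _, bp in readings) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in readings)
    if sxx == 0:
        raise ValueError("readings must span more than one time point")
    sxy = sum((t - mean_t) * (bp - mean_bp) for t, bp in readings)
    return sxy / sxx

# Hypothetical patient: each reading may look unremarkable on its own,
# but the series climbs steadily by 1 mmHg per month.
readings = [(0, 124), (3, 127), (6, 130), (9, 133)]
slope = bp_slope_per_month(readings)
```

A reviewer glancing at the latest value sees a single number. A slope feature lets a model weigh the direction of travel as well.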

Model comparison · AUROC on held-out set
Baseline 0.62 · LogReg 0.74 · XGBoost 0.87 · Tab-TF 0.85
XGBoost won on AUROC and calibration. Clean SHAP attribution was the deciding factor for clinical trust.
04 · Stack

What we used.

Python · XGBoost · SHAP · PostgreSQL · FastAPI · HIPAA-compliant analytics environment