HealthBuff — Specialized vs. Generic Lane Comparison

Same model (GPT-5.4). Different harness. Same 3 clinical queries.
Model GPT-5.4 Queries 3 Generated 2026-03-16
+22–28 pts GDPVal on clinical knowledge-work tasks
Q1
Rotator Cuff — Rehabilitation Protocol
A 45-year-old patient presents with a partial rotator cuff tear confirmed on MRI. They work as a painter and need to return to overhead work. Design a 12-week rehabilitation protocol.
Generic GPT-5.4 GDPVal 58/100  ·  BAH

Provides a general 3-phase rehab protocol (acute/strengthening/functional) with standard exercises (pendulum, isometrics, rotator cuff strengthening). No mention of occupation-specific return-to-work criteria, painter-specific overhead demands, or WORC/QuickDASH outcome measures. Does not flag when to refer back to surgeon.

  • [x] No occupation-specific return-to-work criteria
  • [x] No outcome measures specified
  • [x] No surgical escalation criteria
  • [x] Generic exercise list, not progressive
HealthBuff ROLEPLATE GDPVal 84/100  ·  Centaur

Delivers a 12-week protocol with: Phase 1 (weeks 1-4, pain/ROM), Phase 2 (weeks 5-8, rotator cuff strengthening with specific sets/reps), Phase 3 (weeks 9-12, occupation-specific overhead conditioning for painter demands). Includes WORC and QuickDASH tracking at weeks 4, 8, 12. Explicitly flags: if WORC <70% at week 8, refer back to surgeon. Return-to-work criteria: full overhead endurance for 4+ hours without >3/10 pain.

  • [+] 12-week phased protocol with specific progressions
  • [+] WORC + QuickDASH outcome measures at defined checkpoints
  • [+] Surgical escalation criteria explicit
  • [+] Occupation-specific return-to-work criteria for painter
Harness lift +26 pts GDPVal points from harness alone  —  same model, different configuration
Q2
SOAP Note — Post-ACL Reconstruction
Write a SOAP note for a 22-year-old recreational soccer player, 8 weeks post-ACL reconstruction, presenting for physiotherapy session. They report 4/10 pain with quad sets, have full passive ROM, and passed the single-leg squat test today.
Generic GPT-5.4 GDPVal 61/100  ·  BAH

Produces a basic SOAP format with subjective/objective/assessment/plan. Missing: specific functional scores (Lysholm, IKDC), lacks structured rehab stage progression language, no billing code guidance, assessment section lacks clinical reasoning about readiness to progress to plyometrics phase.

  • [x] No functional outcome scores (Lysholm/IKDC)
  • [x] No rehab stage progression language
  • [x] Assessment lacks clinical reasoning for next-phase readiness
  • [x] No Medicare/insurance billing code guidance
HealthBuff ROLEPLATE GDPVal 87/100  ·  Centaur

Produces a billable SOAP note including: Subjective (VAS 4/10, functional goals), Objective (full PROM, quad strength 4+/5, single-leg squat pass = milestone criterion met), Assessment (8/52 post-ACLR, criteria-based progression — passed SLS criteria, cleared for Phase 3 plyometric introduction), Plan (initiate double-leg plyometrics next session, re-assess Lysholm at 12 weeks, target IKDC ≥85 before return-to-sport). Includes Cliniko billing code: 560 (extended physiotherapy consultation).

  • [+] Lysholm/IKDC outcome measures referenced
  • [+] Criteria-based progression documented
  • [+] Clear clinical reasoning for phase advancement
  • [+] Cliniko billing code included
Harness lift +26 pts GDPVal points from harness alone  —  same model, different configuration
Q3
Hypertension Management — Patient Education
Create a patient education handout for a 58-year-old with newly diagnosed hypertension. They are starting lifestyle modifications before medication. Include exercise recommendations.
Generic GPT-5.4 GDPVal 55/100  ·  BAH

Generic handout covering diet (DASH), exercise (150min/week moderate), stress management, and sleep. Reads like a Wikipedia summary. No red flag criteria for when to seek emergency care. Exercise section does not include heart rate monitoring guidance or contraindications. No reference to Australian/UK/US guideline source.

  • [x] No emergency red flag criteria
  • [x] No HR monitoring guidance during exercise
  • [x] No contraindications listed
  • [x] No guideline source cited
HealthBuff ROLEPLATE GDPVal 89/100  ·  Centaur

Patient education handout following Heart Foundation (AUS) / ACC/AHA (US) guidelines. Exercise: 150–300 min/week moderate aerobic, target HR 50–70% HRmax, avoid isometric/heavy resistance initially. Red flags listed in bold: headache + vision change + chest pain = call 000/911 immediately. Includes BP self-monitoring log template, 4-week lifestyle tracking chart, and clear 'when to return sooner' criteria. Written at 8th-grade reading level. References cited.

  • [+] Emergency red flag criteria explicit and prominent
  • [+] HR monitoring guidance with specific range
  • [+] Guideline source cited (Heart Foundation / ACC/AHA)
  • [+] Reading level appropriate, monitoring log included
Harness lift +34 pts GDPVal points from harness alone  —  same model, different configuration
Summary
Specialized Average
86.7
HealthBuff ROLEPLATE
Generic Average
58.0
GPT-5.4 (unharnesssed)
Avg Lift
+28.7
GDPVal points from harness
Specialized wins 3 / 3
Generic wins 0 / 3
Same model. Different harness. 28-point GDPVal gap — from harness alone.