{"id":25066,"date":"2026-05-18T00:10:37","date_gmt":"2026-05-18T00:10:37","guid":{"rendered":"https:\/\/thisbiginfluence.com\/?p=25066"},"modified":"2026-05-18T00:10:38","modified_gmt":"2026-05-18T00:10:38","slug":"moving-beyond-accuracy-to-safety-and-fairness","status":"publish","type":"post","link":"https:\/\/thisbiginfluence.com\/?p=25066","title":{"rendered":"Moving Beyond Accuracy to Safety and Fairness"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<figure class=\"wp-block-image size-full is-resized is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/hitconsultant.net\/wp-content\/uploads\/2026\/05\/Vikram_Venkat_25B1362_Web.jpg\" alt=\"\" class=\"wp-image-96349\" width=\"502\" height=\"669\" srcset=\"https:\/\/hitconsultant.net\/wp-content\/uploads\/2026\/05\/Vikram_Venkat_25B1362_Web.jpg 900w, https:\/\/hitconsultant.net\/wp-content\/uploads\/2026\/05\/Vikram_Venkat_25B1362_Web-225x300.jpg 225w, https:\/\/hitconsultant.net\/wp-content\/uploads\/2026\/05\/Vikram_Venkat_25B1362_Web-218x290.jpg 218w, https:\/\/hitconsultant.net\/wp-content\/uploads\/2026\/05\/Vikram_Venkat_25B1362_Web-768x1024.jpg 768w\" sizes=\"auto, (max-width: 502px) 100vw, 502px\"\/><figcaption><strong>Vikram Venkat, Principal at Cota Capital<\/strong><\/figcaption><\/figure>\n<p>AI adoption is rising quickly in healthcare, across everything from clinical documentation to diagnostic imaging, revenue cycle management, and patient engagement. 
According to the <a href=\"https:\/\/healthit.gov\/data\/data-briefs\/hospital-trends-use-evaluation-and-governance-predictive-ai-2023-2024\/\">2023-24 American Hospital Association Information Technology Supplement<\/a>, predictive AI integrated with EHR systems was already in use at 71% of hospitals; this has increased rapidly with the advent of generative AI.<\/p>\n<p>However, many AI deployments tend to fail in the real world and do not deliver the expected improvements in clinical value and operational efficiency. This is due to a growing disconnect between how these AI systems are evaluated and how they perform in the real world. Most evaluations rely on basic machine learning metrics (AUROC, F1 scores, AUPRC) that measure accuracy, precision, and recall. However, accuracy measured retrospectively is necessary but not sufficient for real-world deployments; evaluations should also ensure that the AI models are safe, fair, properly calibrated, workflow-compatible, and operationally reliable when humans interact with them.\u00a0<\/p>\n<p>Numerous studies have highlighted this gap. <a href=\"https:\/\/jamanetwork.com\/journals\/jama\/fullarticle\/2825147\">Bedi, Liu, Orr-Ewing et al<\/a> found that most studies (95.4%) primarily focused on accuracy, while fairness, bias, and toxicity (15.8%), deployment considerations (4.6%), and calibration and uncertainty (1.2%) were infrequently measured. Further, only 5% of the studies reviewed used real patient care data for evaluation. A <a href=\"https:\/\/www.sph.umn.edu\/news\/new-study-analyzes-hospitals-use-of-ai-assisted-predictive-tools-for-accuracy-and-biases\/\">recent study by the University of Minnesota<\/a> also found that less than half of the US hospitals using AI-assisted predictive tools evaluated them for bias. 
The risks associated with such models are large \u2013 <a href=\"https:\/\/jamanetwork.com\/journals\/jama\/fullarticle\/2812908\">Jabbour, Fohey, Sheppard et al<\/a> found that diagnostic accuracy worsened by 11.3% when clinicians were shown biased AI model predictions.<\/p>\n<p><strong>Why accuracy is not enough<\/strong><\/p>\n<p>There are several reasons accuracy-focused evaluation fails in the real world.<\/p>\n<p>First, accuracy can hide poor calibration and uncertainty. Most accuracy measures test relative ranking \u2013 for example, ranking which of a pair of patients is at higher risk, or which of two claims is more likely to be denied. However, most healthcare decisions depend on thresholds and absolute values \u2013 for example, determining whether a patient\u2019s risk is sufficient to trigger intervention. As a result, calibration and uncertainty are more important measures that determine the usability of a model\u2019s prediction for clinical or operational use cases.<\/p>\n<p>Second, healthcare environments differ by case mix, EHR configuration, workflows, patient demographics, and several other characteristics. As a result, basic external validation is insufficient and can only represent a snapshot-in-time measure; continuous evaluation across the AI lifecycle is needed instead.<\/p>\n<p>Third, average performance or accuracy measurements can hide variances across subgroups. A model can perform well overall, but still fail for rare diseases or presentations, minority subgroups, or any categories that are under-represented in the training dataset underlying the model. 
Any evaluation should report both average and subgroup-specific performance to prevent unfairness, bias, or toxicity; further, the list of subgroups analyzed should be as comprehensive as possible.<\/p>\n<p>Fourth, there are several operational failures where the implementation layer breaks even when the models are statistically accurate. This could be due to stale or incorrect data, incorrect context mapping, lagging data feeds, incorrect routing, or even downtime; all these issues reduce model reliability and have clinical and operational consequences.\u00a0<\/p>\n<p>Finally, most measures only evaluate the performance of AI, but not of the entire system that includes humans interacting with AI. Users may over- or under-trust AI outputs, and behavioral changes creep in once AI solutions are deployed. To truly measure efficacy, safety, and reliability, the human-plus-AI team should be evaluated rather than just the model. <a href=\"https:\/\/www.nature.com\/articles\/s41746-025-01784-y\">Morey, Rayo, and Woods<\/a> demonstrated that measuring AI capabilities alone does not guarantee the safety and effectiveness of joint human-AI deployments.<\/p>\n<p><strong>A playbook for true evaluation<\/strong><\/p>\n<p>The backbone of a comprehensive healthcare AI evaluation framework remains measures of technical and statistical validity. However, these should be comprehensive and measure ranking (such as AUROC, F1), calibration, uncertainty, as well as sensitivity and specificity.<\/p>\n<p>The framework should also ensure measurement of subgroup-level performance. To clearly test real-world performance, temporal validation, where models are tested on previously unseen data (distinct from the dataset the model was trained on), should be conducted. 
Similarly, the model should be tested on local datasets specific to the institution and use case it is being deployed for.<\/p>\n<p>Another critical step before full deployment is silent trial evaluation. Models should be run in live or near-live environments without affecting care or operations; the predictions are then compared against observed outcomes to measure reliability across the entire human-plus-AI unit in real-world usage. This helps identify statistical, operational, and behavioral risks and failure modes before the model is deployed. Recent research from <a href=\"https:\/\/www.nature.com\/articles\/s44360-025-00048-z\">Tikhomirov, Semmler, Prizant, et al<\/a> highlighted the importance of silent trials, but also pointed out their low usage in actual deployments.<\/p>\n<p>In such evaluations, human factors should also be measured \u2013 response latency, AI-suggestion acceptance and override rates, workload effects, and trust in the system. These measurements should test impact on outcomes, and not just on the specific tasks being performed; doing so requires separating model efficacy from implementation efficacy. However, this is a necessary step to ensure AI drives improvements in standard of care, claims processing accuracy, and other key healthcare measures.<\/p>\n<p>Finally, it is important to ensure continuous post-deployment monitoring. Healthcare data shift is constant \u2013 seasonal disease patterns, staffing turnover, coding changes, and new devices, systems, or workflows all cause changes. 
Continuous monitoring should test for feature, performance, and calibration drift, both for the entire population and for specific subgroups; any variations should be carefully investigated.<\/p>\n<p><strong>Conclusion<\/strong><\/p>\n<p>The healthcare industry currently talks about AI as if models fail primarily because they are inaccurate. In practice, many models are already fairly accurate; real-world failures are caused by ineffective calibration, poor localization, weak monitoring, poor integrations, and other related factors. Until evaluation frameworks reflect the realities of the environments they are deployed in \u2013 workflow complexity, human behavior, data instability, and system risks \u2013 healthcare AI deployments will lack the reliability needed to truly deliver consistent clinical value and outcomes.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<p><strong>About Vikram Venkat<\/strong><\/p>\n<p><a href=\"https:\/\/www.cotacapital.com\/team\/vikram-venkat\/\">Vikram Venkat<\/a> is a Principal at <a href=\"https:\/\/www.cotacapital.com\/\">Cota Capital<\/a>, an early-stage venture capital firm where he invests across healthcare and AI. Vikram has previously worked in healthcare and AI as a consultant at the Boston Consulting Group, and across three other venture capital firms.<\/p>\n<\/div>\n<p><br \/>\n<br \/><a href=\"https:\/\/hitconsultant.net\/2026\/05\/15\/healthcare-ai-evaluation-playbook-beyond-accuracy\/\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Vikram Venkat, Principal at Cota Capital AI adoption is rising quickly in healthcare, across everything from clinical documentation to diagnostic imaging, revenue cycle management, and patient engagement. 
According to the 2023-24 American Hospital Association Information Technology Supplement, predictive AI integrated with EHR systems was already in use at 71% of hospitals; this [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":25068,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[5096,15711,6408,3372],"class_list":["post-25066","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-health","tag-accuracy","tag-fairness","tag-moving","tag-safety"],"_links":{"self":[{"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=\/wp\/v2\/posts\/25066","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=25066"}],"version-history":[{"count":1,"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=\/wp\/v2\/posts\/25066\/revisions"}],"predecessor-version":[{"id":25067,"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=\/wp\/v2\/posts\/25066\/revisions\/25067"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=\/wp\/v2\/media\/25068"}],"wp:attachment":[{"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=25066"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=25066"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/thisbiginfluence.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=25066"}],"curies":[{"name":"wp","href":"https:\
/\/api.w.org\/{rel}","templated":true}]}}