For some time, I have been hearing discourse about the ills AI will wreak on patients and consumers if it is used in healthcare. Most of the fear or scepticism comes from a good place, with the intent of ensuring that AI is safe and efficient when used in healthcare. However, sometimes you come across fearmongering, gatekeeping, and reality-denying statements that make you shake your head in dismay and question the motives behind them. Please don't get me wrong: my work over the past many years has been devoted to safe and effective, aka translatable, AI in healthcare, as evidenced in my publications and projects. I am all for constructive discourse and work that brings about the appropriate and safe adoption of AI in healthcare. However, in a field (i.e. healthcare) that has fallen way behind other sectors in the adoption of AI, irrational fearmongering (if not doomsaying) in the garb of ethics, patient safety, and regulation is of no benefit to anyone, least of all the most critical population in healthcare, i.e. the patients.
There are two famous cases that AI (in healthcare) critics cite to indicate the harm AI will wreak on patients and consumers of healthcare (yes, just two notable ones in over 40 years of the use of AI in medicine). One is from some years back and the other is a recent case. Let us review both to understand if it was AI intrinsically at fault or if it was the way the AI technology was constructed (by humans) and used (read "human oversight" and "policy") that was at fault.
In this famous and widely cited study from 2019, the authors identified that an algorithm widely used in US hospitals was discriminating against African Americans, keeping them away from much-needed personalised care. I will allow you to read the study and its findings, which I am not disputing. What I am critiquing is how the findings are being used by a minority to demonise AI. Here are the main reasons why that interpretation is shallow and reflects a poor understanding of the context.
1. The algorithm was trained to predict future healthcare costs, not directly predict health needs. Using cost as a proxy for health needs is what introduced racial bias because less money is spent on Black patients with the same level of illness. The algorithm was accurately predicting costs, just not illness.
2. When the algorithms were trained to directly predict measures of health instead of costs, such as the number of active chronic conditions, the racial disparities in predictions were greatly reduced. This suggests the algorithm methodology itself was sound, but the choice to predict costs rather than health was problematic.
3. The algorithm excluded race from its features and predictions. Racial bias arose due to differences in costs by race conditional on health needs. So, race was not directly encoded, but indirect effects introduced bias.
4. The study also notes that doctors, when using the algorithm's output, do redress some of the bias, but not to the extent that would eliminate the disparities. This suggests that the way healthcare professionals interpret and act on the algorithm's predictions also contributes to the outcome.
5. The authors note this approach of predicting future costs as a proxy for health needs is very common and used by non-profit hospitals, academics, and government agencies. So, it seems to be an industry-wide issue, not something unique to this manufacturer's methodology.
6. When the authors collaborated with the manufacturer to change the algorithm's label to an index of both predicted costs and predicted health needs, it greatly reduced bias. This suggests the algorithm can predict health accurately when properly configured.
As an overview, key limitations seem to have been in the problem formulation and real-world application of the algorithm to guide decisions, not in fundamental deficiencies in the algorithm itself or the manufacturer's approach. The issues introduced racial disparities, but an algorithm tuned to directly predict health could avoid such disparities.
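To make the label-choice point concrete, here is a minimal toy simulation. It is not the algorithm from the study: the two groups, the 0.85 spending ratio, the uniform "need" score, and the function names are all illustrative assumptions. It also skips model training entirely and assumes a predictor that recovers its chosen label perfectly, so that the only thing being varied is which label patients are ranked on.

```python
import random


def simulate_patients(n=10_000, spend_ratio_b=0.85, seed=0):
    """Build a toy population (all numbers here are made-up assumptions).

    Both groups have the same distribution of health need, but group B
    incurs lower cost at the same level of need.
    """
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        group = rng.choice(["A", "B"])
        need = rng.uniform(0, 8)  # hypothetical health-need score
        ratio = spend_ratio_b if group == "B" else 1.0
        cost = 1000.0 * need * ratio + rng.gauss(0, 200)  # noisy spending
        patients.append({"group": group, "need": need, "cost": cost})
    return patients


def share_of_group_b(patients, label, top_fraction=0.1):
    """Rank patients by the chosen label and return group B's share of
    the top slice, i.e. of those who would be flagged for extra care."""
    ranked = sorted(patients, key=lambda p: p[label], reverse=True)
    selected = ranked[: int(len(ranked) * top_fraction)]
    return sum(p["group"] == "B" for p in selected) / len(selected)


if __name__ == "__main__":
    population = simulate_patients()
    print("Group B share when ranking on cost:", share_of_group_b(population, "cost"))
    print("Group B share when ranking on need:", share_of_group_b(population, "need"))
```

In this sketch the ranking step is identical in both runs and group membership is never used as an input. Under these assumptions, ranking on cost under-selects group B even though need is equally distributed, while ranking on need selects the two groups at roughly equal rates.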
Now, let us review a more recent case, which is generating headlines and a lawsuit. In this instance, UnitedHealthcare, the largest health insurance company in the US, is alleged to have used a faulty algorithm to deny critical care to elderly patients. The lawsuit against the group alleges that elderly patients were prematurely removed from rehabilitation and care centres based on the algorithm's recommendation, overriding physician advice. I will allow you to read the details in this well-written piece. If the information being presented is indeed confirmed, it is an unfortunate and shocking episode. However, returning to the crux of my argument: is AI as a tool the key driver behind this unfortunate episode, or is the real issue how it was constructed and how it is used by the organisation? On reviewing the details available from the media, the following aspects become obvious.
1. The AI tool is claimed to have a very high error rate of around 90%, incorrectly deeming patients as not needing further care. However, an AI with proper training and validation should not have such an egregiously high error rate if accurately predicting patient needs.
2. UnitedHealth Group set utilization management policies that seem to provide incentives to deny claims and limit care. The AI tool may simply be enforcing those harsh policies rather than independently deciding care needs.
3. Based on the available information, there are not enough details provided on how the AI was developed, trained, and validated before being deployed in real healthcare decisions (the organisation has not provided access to the algorithm to the media). Proper processes may not have been followed, leading to poor performance.
4. The federal lawsuits note UnitedHealth Group did not ensure accuracy or remove bias before deploying the tool. This lack of due diligence introduces preventable errors.
No AI exists in a vacuum - it requires ongoing human monitoring and course correction. If the high denial rates were noticed, why was it not fixed or discontinued? The lack of accountability enables problems to persist. While further technical details on the AI system itself would be needed to thoroughly assess it, the high denial rates, perverse financial incentives, lack of validation procedures, and lack of human accountability strongly indicate issues with the overall organization, policies, and deployment - not necessarily flaws with AI technology itself.
When discussing AI safety in healthcare in my presentations, I often state "AI is a tool, don't award it sentience and autonomy when it is not technically capable of so. Even if it were to assume these capabilities, it still should be under a proper governance framework". Now let us look at both of the cases above and consider where the key issue lies. As with most new technologies, the humans who implement them bear responsibility for doing so properly and ethically. In these cases, it is important to assess whether the AI was designed with a specific objective that it is accurately meeting, while that objective, or the data it is based on, leads to unintended consequences. For example, if an AI is programmed to optimize for cost savings, it might do so effectively, but this could inadvertently lead to denying necessary care. If the AI is trained on historical data that contains biases or reflects systemic inequalities, the algorithm might perpetuate or amplify these issues. The problem, in this case, would be the data and the historical context it represents, not the algorithm itself. Also, the way an AI is used within an organizational structure can significantly affect its impact. If policies or management decisions lead to an over-reliance on AI without sufficient human oversight, or if the AI's recommendations are used without considering individual patient needs, the issue would be with the organizational practices rather than the AI.
While the AI might be functioning as designed, the broader context in which it is deployed, including the data it is trained on, the policies guiding its use, and the human decisions made based on its output, could be contributing to any negative outcomes. These factors should be carefully examined and addressed to mitigate any harm and ensure that AI tools are used responsibly and ethically in healthcare settings. Anticipating such issues, I have over the years, along with colleagues, laid out guidance as indicated here, here, here and here. Like many well-intentioned critics of AI, I do acknowledge the limitations of AI and the harm AI can wreak when it is used in an unsafe manner. However, shunning AI for faults that are obviously not its doing, and doomsaying, is like blaming the vehicle for a road traffic accident even when the human driver is clearly at fault. Speaking of vehicles, one more thing: please don't continue to use the Tesla autopilot mishap that occurred in 2019 as an example of the ills AI will wreak in healthcare. It is lazy and has no relevance to the healthcare context. In any case, a jury has ruled the Tesla autopilot was not at fault for the fatal crash.
"For a better, more accessible and safer healthcare system with active input from AI"-Sandeep Reddy
Health System Academic