How to kick off and establish a career in AI in healthcare (if you are not a computer engineer/scientist)
I am often approached by clinicians, healthcare managers, and many others with a non-engineering/computer science background as to how to get started with an applied AI role in healthcare. To the disappointment of many, my advice is not very palatable as it presents many intricate steps to get one established in this area. Unfortunately, there is no quick and easy path to achieving your goals in this field. It involves hard work, but importantly, focusing on key areas to ensure you are on the right track. However, if one was clearly determined and willing to put in the necessary efforts, the benefits would be multifold.
Let me relate my journey, as it may help add perspective to the advice I provide. Since the start of my professional career, I have interacted with and used various datasets and analytical techniques to inform my work and that of my employer organizations. Data included medical, public health, census, hospital performance, and financial information. I have used SPSS, SAS, Power BI, Excel, GIS, Tableau, etc. for analyzing and presenting the data in my various roles, especially when I was working for governmental agencies across New Zealand, Denmark, and Australia. However, it was during my PhD study that I came across machine learning as a powerful analytical tool. I was developing a statistical correlational model linking emergency department performance to hospital funding. Although I ended up using a simple linear equation and trend analysis for my statistical model, my interest in machine learning persisted.
A few years later, having completed my PhD, I approached some established AI researchers and machine learning engineers, to seek advice on establishing oneself in this area. However, none seemed interested in offering time or advice. I wouldn’t entirely blame them, as my profile (a medical doctor with a public health medicine/healthcare management career) did not seem to align with their idea of an AI practitioner. However, what was not obvious to them was my experience and interest in data analysis and my determination to establish myself in this area.
The first thing I did was spend considerable time reading machine learning and AI books (I must have spent over a thousand dollars in that period stocking up my library with such books), followed by enrollment in machine learning/data analysis courses (online and on-site). I also separately started training in Python programming (I never felt inclined to learn R) and practised what I learned by building simple machine learning models (the classic image classification exercises :-)). It was then time to consider how I could apply what I had learned in medicine/healthcare. I quickly realized that AI as a technology does not operate in isolation in healthcare; it must be part of a digital ecosystem. So, this meant I had to be across the details of health IT. Fortunately, in Australia, I had a pathway to be certified in health IT, i.e., Certified Health Informatician Australasia (CHIA). Preparing for the certification meant reading up on all aspects of health IT theory and implementation, followed by a rigorous exam. When I did get certified, it gave me a significant morale boost to progress to the next steps to establish myself in this field.
It is often said that the best way to educate oneself is to write. In my case, this is very accurate. I committed myself to writing not only to educate myself but also others. I was very fortunate to be working in an environment/role that allowed me to commit to this endeavour. I started publishing in this field with a book chapter about AI and its application in healthcare, which has since been followed by three books, four book chapters, thirty or so articles (peer-reviewed and non-peer-reviewed), and obviously, many more to come. Writing gets you to be critical of not just your topic, but also your understanding of the topic, especially when it goes through a peer review process. This is why I prioritize publication in journals, as the peer review process adds rigour and credibility to your writing outputs.
During the early period of my AI career, I encountered resistance and unwillingness from various others to engage in AI projects. I had noted, through my research, that there were many areas in healthcare that would benefit from AI applications. As they say, if you can’t get the mountain to come to you, you must go to the mountain. In this instance, to fulfil my AI aspirations and obtain experience implementing AI projects in medicine, I established my own AI focused company. Since then, I have had the fortune of other like-minded individuals joining the company. Involvement in the company and several AI projects meant I obtained invaluable hands-on experience developing and deploying AI models. Now, I only play a governance role with the company, but involvement in an industry role brings you expertise and experience that pure research roles may not.
As time passed, the momentum built through my earlier steps positively impacted my AI research/academic side. I now engage in the entire continuum of the AI pathway, from development to deployment of models to policy and governance. This is nicely captured in my current academic projects (as of August 2023).
By no means have I attained my peak as an AI in healthcare practitioner, and as Robert Frost put it, I “have promises to keep and miles to go before I sleep”. However, I feel that enough momentum and foundation have been established, based on my two decades of data analysis and approximately a decade of dedicated AI work, that I see a clearer path to my involvement in AI in healthcare. I am certainly optimistic about the prospects of AI in medicine/healthcare and feel that we will lose a significant opportunity to address healthcare’s intractable problems if we ignore or do not adopt AI technology. As I look back on my journey and small accomplishments so far in this field (below illustration), I feel that I have a role to play in the coming years to enable this adoption, and I hope to find collaborators to join me in this journey.
In the past many years, having sat on several advisory boards, research planning and funding review meetings, I have been concerned by how many policymakers, researchers and organisational leaders know so little of evaluation science and conflate it with health economic assessment and implementation science. Far more concerning, some of them think a health economic assessment supersedes an evaluation and negates the commissioning of an evaluation process. Obviously, I speak here of a healthcare services research context, but this issue may extend beyond this discipline. Therefore, I make a case here for stakeholders, including funding decision-makers and researchers alike, to distinguish between the disciplines of implementation science and evaluation science and the processes of evaluation and health economic assessment.
The healthcare sector is vast and complex, comprising multiple dimensions of study and implementation. It is driven by a constant need to maximize patient outcomes while ensuring the efficient use of resources. Implementation science and the evaluation process are both critical components within the realm of health and social sciences. They both aim to contribute to improved practices, interventions, and policies within various disciplines, particularly healthcare. While they may appear similar, they encompass different objectives, methodologies, and stages in implementing and assessing evidence-based interventions.
Implementation science, also known as knowledge translation, is a multidisciplinary field that systematically integrates research findings and evidence-based practices into routine and everyday use. The goal is to improve the quality and effectiveness of health services, social programs, and policies. The fundamental underpinning of implementation science is to bridge the gap between research and practice, often referred to as the "know-do" gap.
Implementation science utilizes various theories, models, and frameworks to identify, understand, and address barriers to evidence-based interventions' adoption, adaptation, integration, scale-up, and sustainability. It takes into account numerous factors, including the complexities of health systems, the variability of human behaviours, and the diversity of social and political contexts.
On the other hand, the evaluation process is a systematic method used to assess the design, implementation, and utility of programs, interventions, or policies. The primary purpose of the evaluation is to judge the value or worth of something to guide decision-making and improve effectiveness.
There are different types of evaluation, such as formative, summative, process, and outcome. Each type focuses on various stages of a program's life cycle. Evaluations may consider the fidelity of implementation, the outcomes of the intervention, and the cost-effectiveness, among other aspects. While implementation science and evaluation processes utilize mixed methods, their methodological emphases differ. Implementation science prioritizes process-oriented investigations, employing qualitative research to understand human behaviours and system complexities. In contrast, the evaluation process often emphasizes outcome measures, using quantitative methods to assess the degree to which program goals and objectives have been met.
Health economic assessment, often referred to as health economic evaluation, is a tool used to compare the cost-effectiveness of different health interventions. This method evaluates different health programs' benefits relative to their costs, aiming to maximize health outcomes given a particular budget constraint. The health economic assessment primarily adopts a macroeconomic perspective, focusing on the healthcare system's overall cost-effectiveness. This assessment typically employs methods such as cost-effectiveness analysis (CEA), cost-utility analysis (CUA), or cost-benefit analysis (CBA).
The value of health economic assessment lies in its ability to provide a comparative analysis of the efficiency of different healthcare interventions. Quantifying costs and outcomes (often in quality-adjusted life years or QALYs) provides a comprehensive view of the economic value of different healthcare choices. This is particularly valuable when resources are scarce and there is a need to allocate them in a way that yields the greatest health benefit for the largest number of people.
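To make the comparison of costs and QALYs concrete, here is a minimal Python sketch of the incremental cost-effectiveness ratio (ICER), the summary measure commonly reported in a cost-effectiveness or cost-utility analysis. The function name and all figures are hypothetical and chosen purely for illustration; a real assessment would also deal with discounting, uncertainty, and willingness-to-pay thresholds.

```python
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    delta_cost = cost_new - cost_old
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        # With no difference in health outcome the ratio is undefined.
        raise ValueError("No difference in QALYs; the ratio is undefined.")
    return delta_cost / delta_qaly


# Hypothetical figures: a new intervention versus current care, per patient.
ratio = icer(cost_new=12_000, cost_old=8_000, qaly_new=6.5, qaly_old=6.0)
print(f"ICER: {ratio:,.0f} per QALY gained")
```

A decision-maker would then compare this ratio against whatever threshold their health system considers acceptable value for money.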
On the other hand, health service evaluation focuses on assessing the quality of care delivered in a specific healthcare setting or by a particular healthcare service. It's a process that takes a micro view, examining individual services, care pathways, or providers. It aims to identify areas of improvement and highlight best practices, focusing on effectiveness, efficiency, and equity in care delivery. The measures in health service evaluation might include patient satisfaction, accessibility of care, timeliness, and adherence to clinical guidelines, among others. Health service evaluation aids in pinpointing gaps or deficiencies in care delivery that might not be evident from a macroeconomic view. This detailed scrutiny of specific services can lead to improvements in patient care, satisfaction, and overall health outcomes. It can also highlight systemic issues that might need addressing at a policy level.
The principal differences between health economic assessment and health service evaluation stem from their varying focuses. The former adopts a broader perspective, taking into account the entire health system's economic balance. It often involves policy-level decisions concerning resource allocation, seeking to achieve the maximum health benefits per unit of cost across different healthcare interventions. Conversely, health service evaluation narrows its lens to the individual service, provider, or care pathway level. Its main goal is to improve the quality of care and patient satisfaction within specific healthcare services, not necessarily accounting for the broader economic implications of these improvements. It may, however, indirectly influence economic evaluations by identifying more effective or efficient practices that can then be implemented more broadly, thus improving cost-effectiveness.
Thus, while each field (implementation science, health economic assessment, and the evaluation process) has a critical role in improving the quality and outcomes of health service delivery, they are very distinct from each other. One can conflate them only at the expense of sound research and assessment. It is important for evaluators to be clear and loud in making the difference known while standing up for appropriate evaluation processes to be considered in health service projects and programs.
At last week's Google I/O developer conference, there was an announcement that PaLM 2 (Google's latest-generation large language model) will have multimodal capabilities. This means PaLM 2 can also interpret images and videos in addition to text interpretation and generation. Previously, OpenAI announced that GPT-4 would have these capabilities too. In other words, the new generation of large language models will have multi-modal capabilities as a standard offering. How is this significant to the healthcare domain in which I operate?
Medical practice, by default, operates on multi-modal functionality. A clinician must interpret the patient or laboratory records, take an oral history, undertake visual examination, and interpret waveform and radiological investigations. Collectively, these inform the clinician's diagnosis or management of the patient. The previous generation of AI models could only contribute to a narrow set of medical tasks, say electronic record analysis or medical image interpretation, not in combination. This was mainly due to how the machine learning models were trained (supervised learning/annotated/labelled process) and the intrinsic limitation of the algorithms (even advanced ones) to perform accurately on multi-modal datasets. While regulatory authorities and vendors had a relatively easy task of having the application certified for its task boundaries and safety, these applications fulfilled only a narrow set of the customers' (health services, medical doctors, etc.) requirements. Considering the need to integrate these applications into existing information systems, the economies of scale and ROI were minimal, if not non-existent.
The availability of multi-modal (and potentially multi-outcome) functionality may considerably change the AI in healthcare landscape. The ability of a single AI application to not only analyse a radiological investigation but also link it back to the patient's history, derived from analysis of the electronic health record and pathology investigations, will be revolutionary. This will negate the need for stakeholders/purchasers to source multiple AI applications and make it easier for the health service to set up a governance mechanism to monitor the deployment and delivery of AI-enabled services. At a clinical level, by utilizing multi-modal AI, physicians will have a more comprehensive view when making a diagnosis.
Now such applications are not far from entering the commercial space. In the research domain, I was last year fascinated by this study from South Korea, where the authors demonstrated a multi-modal algorithm which adopts a BERT-based architecture to maximize generalization performance for both vision-language understanding tasks (diagnosis classification, medical image-report retrieval, medical visual question answering) and vision-language generation task (radiology report generation). As you may know, BERT is a masked language model based on the Transformer architecture. Since this study, I have seen a wave of studies showcasing the efficacy of multimodal AI, such as this and this.
Back to Google's announcement last week, as part of the customised offering of PaLM 2 in various domains, Med-PaLM 2 developed to generate medical analysis was demonstrated too. As per this blog, Med-PaLM 2 will interpret and generate text (answer questions and summarise insights) and have multi-modal functionalities to analyse/interpret medical image modalities. Considering GPT-4 can analyse images and offers access to their API to external developers, it is not hard to foresee multi-modal medical AI applications in the market. Of course, as I see it, multi-modal AI is not going to be restricted to LLM architecture, and there are different ways to develop such applications. Also, it is not enough to have multi-modal functionality; you also need to have multi-outcome features.
I write this article not only to signal to healthcare stakeholders (policymakers, funders, health services, etc.) about the future of medical AI software but also to forewarn narrow use case medical AI developers to pivot their development strategy to multi-modal AI functionality or be swept away as the floodgates of multi-modal AI are opened.
This week, the Future of Life Institute (FLI) issued an open letter calling for technology businesses and Artificial Intelligence (AI) research laboratories to halt any work on any AI more advanced than GPT-4. The letter, signed by some famous names, warns about the dangers advanced AI can pose without appropriate governance and quotes from the Asilomar AI principles issued in 2017. While well-intentioned in warning about the dangers posed by advanced AI, the letter is a bit premature as its target, Large Language Models (LLMs), is no closer to Artificial General Intelligence (AGI) than we are to humans settling on Mars. Let me explain why.
If we consider human intelligence the benchmark for how AI systems are modelled, we must first understand how humans learn. This process is succinctly captured in the below illustration (Greenleaf & Wells-Papanek, 2005).
Here we can observe that we utilise our senses to draw upon inputs from the environment and then utilise our cognitive process to relate the information to previous memories or learning and then apply it to the current situation and act accordingly. While the actual process in the human brain, incorporating short-term and long-term memories and the versatile cognitive abilities of different parts of the brain, is more complex, it is essential to note that the key steps are the 'relation' and 'connection' to 'memories' or 'existing knowledge', leading to insights as illustrated in the below figure (Albers et al., 2012).
Now let's look at how LLMs operate. These models process data by breaking it into smaller, more manageable tokens, a step known as tokenization. These tokens are then converted into numerical representations that the model can work with. Once the data has been tokenized, the model uses complex mathematical functions and algorithms to analyze and understand the relationships between the tokens. During training, the model is fed large amounts of data and its internal parameters are adjusted until it can accurately predict the next token in a sequence, given a certain input. When the model is presented with new data, it uses its trained parameters to generate outputs by predicting the most likely sequence of tokens following the input. This output can take many forms, depending on the application - for example, it could be a text response to a user's query or a summary of a longer text. Overall, large language models use a combination of statistical analysis, machine learning, and natural language processing techniques to process data and generate outputs that mimic human language. This process is illustrated in this representation of GPT-4 architecture, where in addition to text, images are utilised as input (source: TheAIEdge.io).
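The tokenize, train, and predict loop described above can be sketched in a few lines of Python. This is a toy illustration only: the corpus is made up, tokens are split on whitespace rather than with a subword tokenizer, and simple bigram counts stand in for the neural network and its billions of parameters. The names corpus, predict_next, and generate are my own and do not come from any real LLM.

```python
from collections import Counter, defaultdict

# Made-up toy corpus; a real LLM is trained on vastly more text.
corpus = "the patient has a fever . the patient has a cough . the doctor has a plan ."

# 1. Tokenize: break the text into tokens (here, simply on whitespace).
tokens = corpus.split()

# 2. Convert tokens to numbers: give each distinct token an integer ID.
vocab = {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}
id_to_token = {idx: tok for tok, idx in vocab.items()}
ids = [vocab[tok] for tok in tokens]

# 3. "Train": count which token ID follows which. These counts play the role
#    of the model's parameters in this toy stand-in.
follow_counts = defaultdict(Counter)
for current, nxt in zip(ids, ids[1:]):
    follow_counts[current][nxt] += 1


def predict_next(token):
    """Return the token seen most often after `token` in the corpus."""
    best_id, _ = follow_counts[vocab[token]].most_common(1)[0]
    return id_to_token[best_id]


def generate(start, length):
    """Autoregressive generation: each prediction becomes the next input."""
    out = [start]
    for _ in range(length):
        out.append(predict_next(out[-1]))
    return " ".join(out)


print(predict_next("patient"))
print(generate("the", 3))
```

Even this crude model produces plausible-looking continuations of text it has seen, which is the sense in which the output mimics language without any notion of what a patient or a fever is.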
AGI refers to the ability of an AI system to perform any intellectual task that a human can. While language is an essential aspect of human intelligence, it is only one component of the broader spectrum of capabilities that defines AGI. In other words, language models may be proficient at language tasks but lack the versatility and flexibility to perform tasks outside their training data.
One of the primary limitations of large language models is their lack of generalization. These models are trained on large amounts of data and can generate impressive results within their trained domain. However, they struggle to apply this knowledge to new and unseen tasks. This limitation is because language models are trained through supervised learning, giving them a specific task and corresponding data to learn from. As a result, these models cannot reason or make decisions based on broader contexts. Another limitation of language models is their lack of common sense. While these models can generate coherent text and answer some basic factual questions, they cannot understand the world as humans do. For instance, they may be able to generate a recipe for a cake, but they cannot understand the implications of adding too much salt or sugar to the recipe. Furthermore, language models cannot interact with the physical world. AGI systems must be able to interact with the world as humans do. They must be able to perceive their surroundings, reason about the objects and people around them, and take appropriate actions. Language models are limited to processing text and cannot interact with the world meaningfully.
Importantly, language models cannot retain memories (whether short-term or long-term), which are so essential to human learning and intelligence. So an autoregressive approach that language models adopt by analysing their training data is not a substitute for human learning.
The road to AGI for large language model developers is to create larger models supported by significant computational resources. These models are not just complex in their parameters but are also environmentally unfriendly. Critically, they are black-box models, which even currently available explainable AI frameworks cannot scrutinise. With some LLM developers indicating they will not make the architecture and training process available to the public, it amounts to a selfish move and a scary development for the general public and the AI community. LLMs can be used to generate text that is designed to mislead or deceive people. This could spread false information, manipulate public opinion, or incite violence. LLMs can be used to create deep fakes that are very realistic, which could be used to damage someone's reputation or spread misinformation. The use of LLMs to automate tasks could lead to job losses and economic disruption. It could also lead to a concentration of power in the hands of a few companies that control the LLMs. LLMs are trained on data collected from the real world, which can contain biases. If these biases are not identified and addressed, they could be embedded in the LLMs and lead to systems that are biased against certain groups of people. LLMs are complex systems that are difficult to understand and secure. This makes them vulnerable to attacks by malicious actors. These issues may have led to the aforementioned letter, but to assume that LLMs are the next step to AGI is incomprehensible.
First, LLMs cannot understand the meaning of language in the same way humans do. They can generate text that is grammatically correct and factually accurate, but they do not have the same level of understanding of the world as humans. Second, LLMs are not able to generalize their knowledge to new situations. They are trained on a specific set of data and can only perform tasks they have been trained on. Third, LLMs cannot learn and adapt to new information in the same way humans do. They are trained on a fixed set of data and cannot learn new things without being explicitly programmed to do so.
Does intelligence have to be modelled on how humans learn? Couldn't alternative models of intelligence be just as useful? I have argued for this in the past, but is this something we want? If we can't comprehend how an intelligence model works, it is a recipe for disaster, as we may not be able to control it anymore (read AI singularity). The most practical and human-friendly approach is developing intelligence models that align with human learning. While daunting and perhaps not linear, this path presents a more benign approach with respect to explainability, transparency, and humane and climate-friendly principles.
On a Sunday morning in our Aussie summer, as I mull over the week and year ahead, I thought I would stretch my mind to consider how AI would be used in various industries in the future. I generally focus on AI applications in healthcare in my practice and will continue to do so, but for once I wanted to hazard some predictions about the impact of AI generally ten years from now.
Consider it is 29th January 2033 and as you cast your analytical mind across the business, healthcare, automotive, finance, judicial, and arts sectors, you note the following.
Robots: The presence of robots has expanded beyond the industrial sector. Domestic robots are ubiquitous with their use in various domestic tasks. Also, robots are being used for security patrols, home deliveries and providing companionship and care to relevant people.
Virtual Actors: Human actors are now competing with AI (virtual actors) copyrighted to studios or companies, reminiscent of the early twentieth century when studios owned actors. AI-powered animation, NLP and special effects have advanced to the point where one cannot distinguish between human and AI-generated actors on the screen, diminishing the need to rely on fickle celebrities for screen productions.
Art: AI-generated art has become a phenomenon of its own, with 'augmented art' becoming sought after. With it becoming practically impossible to distinguish between human and AI-generated art, it is accepted that any painting produced after 2030 is either entirely AI-generated or a hybrid of human and AI talent. To enable the generation of high-quality art, competitions are offered in which human artists equipped with AI software are required to deliver art as per themes.
Judiciary: Most of the non-serious or civil litigations are analysed and adjudicated by 'AI Judges'. With the backlog of cases in many civil and family courts, authorities have introduced AI-driven applications to screen and make recommendations or rulings. These applications draw upon jurisprudence and best practice to suggest recommendations or make rulings. To make these 'AI Judges' acceptable to the community, human judiciary panels provide oversight.
Robo-Taxis: In most developed countries' urban centres, taxis are now self-driven, with a centralised command centre directing the vehicles to and from customers' destinations upon request. These robot taxis, in addition to being electric-powered, also have hydrogen fuel options and can cover a large range of distances.
AI Clinics: These multi-modal and multi-outcome AI-driven health centres offer screening/triaging and low-risk clinical care to registered patients and have become the default clinics in many geographical areas across the world. Continued healthcare workforce shortages and rising healthcare expenditure led authorities in the UK and China to pilot these centres in their cities in 2029-2030. Independent evaluation and peer-reviewed studies published in the Lancet and NEJM in 2031 indicated efficacious, safe and high-quality care for certain medical conditions delivered at low cost. Drawing upon these, several entrepreneurs and companies have developed portable, environmentally friendly facilities with integrated multi-modal AI and telefacilities. Governments have negotiated with these suppliers to trial these facilities in their urban and regional centres.
Change to the Name: Well, AI is still called AI in 2033, but when expanded it is described as 'Augmented Intelligence'. Experiments to integrate AI into the human brain as brain implants and offered as augmented tools through mixed reality devices have led to an international consensus for AI to be described as 'Augmented Intelligence'.
Universal Income: With much of the blue-collar and significant white-collar jobs being delivered by AI/Robots, some governments have introduced legislation to protect the earning capacity of their citizens through the 'Universal Income' framework, where all of their citizens (unless they opt out) draw upon a legislated income. Revenue for this spend is generated from a mix of taxes, royalties, and trade income. The availability of universal income has led many of their citizens to pursue their real interests and has spawned an era of innovations and inventions.
Post-Note: A year ago, I would have read this article and placed it in the basket of 'Science Fiction'. While not purporting to have absolute certainty of the future, especially as a follower of Quantum Physics/Mechanics, I do consider the current progress with AI and Robotics will lead us to these outcomes in some form or another. In any case, if you and I are around in 2033 let us revisit this article :-)
As medical care evolves, clinicians and researchers are exploring the use of technology to improve the quality and effectiveness of medical care. In this regard, technology is being used to deliver precision medicine. This form of medicine is a new approach that focuses on using genomic, environmental and personal data to customize and deliver a precise form of medical treatment. Hence the name ‘precision medicine’. One of the most influential factors in delivering precision medicine in recent years has been Artificial Intelligence (AI), specifically one of its forms, Machine Learning (ML). ML, which uses computation to analyze and interpret various forms of medical data to identify patterns and predict outcomes, has shown increasing success in various areas of healthcare delivery. In this article, I discuss how computer vision and natural language processing, which use ML, can be used to deliver precision medicine. I also discuss the technical and ethical challenges associated with the approaches and what the future holds if the challenges are addressed.
Various forms of medical imaging techniques like X-rays, CT, MRI and nuclear imaging are being used by clinicians to assist their diagnosis and treatment of various conditions ranging from cancers to simple fractures. The importance of these techniques in devising specific treatments has become critical in recent years. However, the dependency on a limited subset of trained medical specialists (radiologists) to interpret and confirm the images has in many instances meant longer diagnosis and treatment times. The task of classifying and segmenting medical images can be not only tedious but also time-consuming. Computer Vision (CV), a form of AI that enables computers to interpret images and identify what they depict, has in recent years shown a lot of promise and success. CV is now being applied in medicine to interpret radiological, fundoscopic and histopathological images. The most publicized success of recent years has been the interpretation of retinal images to diagnose diabetic and hypertensive retinopathy. The use of CV, powered by neural networks (an advanced form of ML), is said to take over the tedious task of segmenting and classifying medical images and enable preliminary or differential diagnosis. This approach is stated not only to accelerate the process of diagnosis and treatment but also to provide more time for radiologists to focus on complex imaging interpretations.
Natural Language Processing
As with CV, Natural Language Processing (NLP) has had a great impact on society in the form of voice assistants, spam filters, and chat-bots. NLP applications are also being used in healthcare in the form of virtual health assistants and in recent years have been identified to have potential in analyzing clinical notes and spoken instructions from clinicians. This ability of NLP can lessen the burden for busy clinicians who are encumbered by a need to document all their patient care in electronic health records (EHRs). By freeing up the time spent writing copious notes, NLP applications can enable clinicians to focus more of their time on patients. Recently, NLP techniques have been used to analyze even unstructured (free form and written notes) data, which makes it useful in instances where written data is not available in the digital form or there are non-textual data. By integrating NLP applications in EHRs, the workflow and delivery of healthcare can be accelerated.
Combination of Approaches
Precision medicine is premised on the customization of medical care based on the individual profile of each patient. By combining NLP and CV techniques, the ability to deliver precision medicine is greatly increased. For example, NLP techniques can scan through past medical notes to identify previously diagnosed conditions and medical treatment, and present the information in a summary to doctors even as the patient presents to the clinic or to the emergency department. Once in the clinic or emergency department, NLP voice recognition applications can analyze the conversation between the patient and clinicians and document it in the form of patient notes for the doctor to review and confirm. This process can free up time for the doctor and ensure accuracy of notes. As the doctor identifies the condition affecting the patient and relies on confirmation through relevant medical imaging, automated or semi-automated CV techniques can accelerate the confirmation process. The result is a cohesive process that shortens the time in which the patient receives the necessary medical treatment.
Let us see how this works in a fictitious example. Mr Carlyle, an avid cyclist, meets with an accident on his way to work when an automobile swerves into the bike lane and flings him from his bicycle. The automobile driver calls an ambulance when he notices Mr Carlyle seated and grimacing with pain. On arrival, the ambulance crew enter his unique patient identifier number, which is accessed from his smartwatch, and rush him to the nearest emergency department. The AI agent embedded in the hospital’s patient information system identifies Mr Carlyle through his patient identifier number and pulls out his medical details, including his drug allergies. This information is available for the clinicians in the emergency department to review even as Mr Carlyle arrives. Once Mr Carlyle has been placed in an emergency department bay, the treating doctor uses an NLP application to record, analyze and document the conversation between her and Mr Carlyle. This option allows the doctor to focus most of her time on Mr Carlyle. The doctor suspects a fracture of the clavicle and has Mr Carlyle undergo an X-ray. The CV application embedded in the imaging information system detects a mid-shaft clavicular fracture and relays the diagnosis back to the doctor. The doctor, prompted by an AI clinical decision support application embedded in the patient information system, recommends immobilization and a sling for Mr Carlyle, along with painkillers. His painkillers exclude NSAIDs, as the AI agent has identified that he is allergic to aspirin.
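The last step of the scenario, where the decision support application rules out NSAIDs because of a recorded aspirin allergy, can be sketched as a simple rule. This is only an illustrative toy in Python, not a description of any real clinical system: the `PatientRecord` class, the `allowed_analgesics` function, the patient identifier and the small `DRUG_CLASS` table are all hypothetical names invented for this example, and the table is far too small to be used as a clinical reference.

```python
from dataclasses import dataclass, field

# Hypothetical drug-class lookup, for illustration only (not a clinical reference).
DRUG_CLASS = {
    "aspirin": "NSAID",
    "ibuprofen": "NSAID",
    "naproxen": "NSAID",
    "paracetamol": "non-NSAID analgesic",
}


@dataclass
class PatientRecord:
    """Minimal stand-in for the record pulled via the patient identifier."""
    patient_id: str
    allergies: set = field(default_factory=set)


def allowed_analgesics(record, candidates):
    """Return candidates whose drug class is not shared with a recorded allergy.

    Drugs missing from the lookup table are excluded rather than assumed safe.
    """
    blocked = {DRUG_CLASS[a] for a in record.allergies if a in DRUG_CLASS}
    return [d for d in candidates
            if d in DRUG_CLASS and DRUG_CLASS[d] not in blocked]


# Assumed example data mirroring the fictitious scenario.
record = PatientRecord("hypothetical-id-001", {"aspirin"})
options = allowed_analgesics(record, ["ibuprofen", "paracetamol", "naproxen"])
print(options)  # ['paracetamol']
```

In this sketch the rule only narrows the list of options; the choice of treatment remains with the doctor, as in the scenario.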
The above scenario, while presenting a clear example of how AI, specifically CV and NLP applications, can be harnessed to deliver prompt and personalized medical care, is still contingent on the technologies being able to deliver such outcomes. Currently, CV techniques have not gained the confidence of regulatory authorities or clinicians to allow automated medical imaging diagnosis (except in minor instances such as diabetic retinopathy interpretation), and neither are NLP applications embedded in EHRs to allow automatic recording, analysis and documentation of patient conversations. While some applications have been released in the market to analyze unstructured data, external validation and wide acceptance of these types of applications are some years away. Coupled with these technical and regulatory challenges are the ethical challenges of enabling the autonomy of non-human agents to guide and deliver clinical care. Further issues may arise from the use of patient identifiers to extract historical details, even for medical treatment, if the patient has not consented to this. Yet, the challenges can be overcome as AI technology improves and governance structures to protect patient privacy, confidentiality and safety are established. As the focus on the ethics of applying AI in healthcare increases and the technological limitations of AI applications are resolved, the fictitious scenario may become a reality not too far into the future.
There is a natural alignment between AI and precision medicine, as the power of AI methods such as NLP and CV can be leveraged to analyze biometric data and deliver personalized medical treatment for patients. With appropriate safeguards, the use of AI in delivering precision medicine can only benefit both the patient and clinician communities. Based on the rapidly evolving AI technology, one can predict that the coming years will see wider adoption of precision care models in medicine, and thus of AI techniques.
With recent developments in regard to AI in Healthcare, one could mistakenly assume that the entry of AI in healthcare is inevitable. Recent developments include two major studies: one where machine learning classifiers used for hypothetico-deductive reasoning were found to be as accurate as paediatricians, and another where a deep-learning based automated algorithm outperformed thoracic radiologists in accuracy and was externally validated at multiple sites. The first study is significant in that machine learning classifiers are now shown to be useful not only for medical imaging interpretation but also for extracting clinically relevant information from electronic patient records. The second study is significant in that the algorithm could detect multiple abnormalities in chest x-rays (useful in real world settings) and was validated multiple times using external data-sets. Coupled with these developments, we now have the FDA gearing up for the inevitable use of AI software in clinical practice by developing a draft framework anticipating modifications to AI medical software. We also now have medical professional bodies across the world welcoming the entry of AI in medicine, albeit cautiously and by issuing guidelines, like this one and this one. Compared to even a year ago, it seems AI has definitely had a resounding impact on healthcare. Even the venerable IEEE is keeping track of where AI is exceeding the performance of clinicians. However, I most certainly think we have yet to see the proper entry of AI in healthcare. Let me explain why, and what needs to be done to enable this.
While there is strong evidence emerging about the usefulness of machine learning, especially neural networks, in interpreting multiple medical modalities, the generalization of such successes is relatively uncommon. While there has been progress in minimizing generalization error (through avoidance of over-fitting) and in understanding how generalization and optimization of neural networks work, it remains the case that correct prediction of class labels outside the training data sets is not guaranteed. In medicine, this means deep learning algorithms that have shown success in certain contexts are not guaranteed to deliver the same success in a different context, even with similar data. There is also the causal probabilistic approach of current machine learning algorithms, which does not necessarily align with the causal deterministic model of diagnostic medicine. I have covered this issue previously here. Even if we accept that machine learning/deep learning models with their current limitations are useful in healthcare, hospitals and health services have limited readiness to deploy these models in clinical practice. The lack of readiness spans infrastructure, policies/guidelines and education. Also, governments and regulatory bodies in many countries don't have specific policies and regulatory frameworks to guide the application of AI in healthcare. So, what has to be done?
As illustrated below, the following steps have to be adopted for us to see AI bloom in the healthcare context.
The first step is the development and use of appropriate AI Technology in Medicine. This means ensuring the validity and relevance of the algorithms being used to address healthcare issues. For example, if a convolutional neural network model has shown success in screening for pulmonary tuberculosis through chest x-ray interpretation, it doesn't necessarily mean it is equipped to identify other chest x-ray abnormalities, say atelectasis or pneumothorax. So the model should be used only for the exact purpose for which it was trained. Also, a model trained with a labelled x-ray data-set from a particular region has to be validated with a data-set from another region and context. Another issue that pertains to technology is the type of machine learning model being used. While deep learning seems to be in vogue, it is not necessarily appropriate in all medical contexts. Because of the limitations it poses with explainability, other machine learning models that lend themselves better to interpretation, such as Support Vector Machines (particularly with linear kernels), should be considered.
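To make the point about validating a model on data from another region more concrete, here is a minimal sketch in Python using only synthetic numbers. Everything in it is an assumption made for illustration: the single "image score" feature, the two made-up sites, the size of the shift between them and the very simple threshold classifier standing in for a real model. It shows only the general pattern of comparing internal test accuracy with accuracy on an external data-set.

```python
import random
from statistics import mean


def make_site(n_per_class, shift, rng):
    """Synthetic (score, label) pairs; `shift` mimics a site-specific offset,
    for example a different scanner or patient population (assumed)."""
    data = []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            data.append((rng.gauss(centre + shift, 1.0), label))
    rng.shuffle(data)
    return data


def fit_threshold(train):
    """Toy 'model': the midpoint between the two class means."""
    neg = mean(x for x, y in train if y == 0)
    pos = mean(x for x, y in train if y == 1)
    return (neg + pos) / 2


def accuracy(data, threshold):
    """Fraction of cases where 'score above threshold' matches the label."""
    return mean(1.0 if (x > threshold) == (y == 1) else 0.0 for x, y in data)


rng = random.Random(0)                    # fixed seed for repeatability
site_a = make_site(500, 0.0, rng)         # region where the model is developed
site_b = make_site(500, 2.0, rng)         # another region, shifted scores

train, internal_test = site_a[:700], site_a[700:]
threshold = fit_threshold(train)

internal_acc = accuracy(internal_test, threshold)   # held-out data, same site
external_acc = accuracy(site_b, threshold)          # external validation

print(f"internal accuracy: {internal_acc:.2f}")
print(f"external accuracy: {external_acc:.2f}")
```

With these assumed settings the external accuracy comes out clearly lower than the internal one, which is the kind of gap that external validation is meant to expose before a model is used in a new context.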
The second step in facilitating the entry and establishment of AI in healthcare is Infrastructure. What do I mean by infrastructure? At this stage, even in developed countries, hospitals do not necessarily have the digital platforms and data warehouse structures for machine learning models to operate successfully. Many hospitals are still grappling with the roll-out of electronic health records. This platform will be essential for machine learning algorithms to mine and query patient data. Also, structured data is necessary to train many machine learning models (some models can work on unstructured data, such as this application). This data structuring process includes data labelling and creating data warehouses. Not all hospitals, facing budget crunches, have this infrastructure or capability. Further, the clinical and administrative workforce and the patient community need to be educated about AI if AI applications are to be used in clinical practice and healthcare delivery. How many healthcare organizations have this infrastructure readiness? I doubt many. So infrastructural issues most certainly need to be addressed before one can think of using AI in the healthcare context.
The next step, Policy, is also critical. Policy covers both governmental and institutional strategies to guide the deployment of AI for healthcare delivery, and regulatory frameworks to facilitate the entry of AI medical software into the market and to regulate it. There is definitely progress here, with many governments, national regulatory bodies, medical professional bodies and think tanks issuing guidance on this matter. Yet, there are gaps in that many of these guidance documents are theoretical or cursory in nature, or not linked to existing infrastructure. Worse yet is the situation in countries where such policies and guidance don't exist at all. Another issue is the limited funding mechanisms to support AI research and commercialization, which have significantly hampered innovation and the indigenous development of AI medical applications.
The final step that needs to be considered is Governance. This step covers not only the regulatory frameworks at the national level (necessary to scrutinize and validate AI applications) but also monitoring and evaluation frameworks at the institutional level. It also covers the requirement to mitigate the risk involved in the application of AI in clinical care and the need to create patient-centric AI models. These latter two requirements are vital to clinical governance and continuous quality improvement. Many institutions have issued ethical guidelines for the application of AI in Healthcare, but I am yet to see clinical governance models for the use of AI in clinical care. It is critical that clinical governance models for the application of AI in healthcare delivery are developed.
Appropriately addressing the steps I list above (Technology, Infrastructure, Policy and Governance) will most certainly facilitate the entry and establishment of AI in Healthcare. Also, with the accelerated developments in AI technology and the increasing interest in AI among policy makers, clinical bodies and healthcare institutions, maybe we are not that far away from seeing this occur.
In recent years, there has been a great deal of coverage about the dearth of PhD qualified AI data-scientists and the level of salaries qualified candidates can command. One such piece can be found here: NYTimes article. Then you have universities complaining about how their PhD qualified AI scientists are being poached by industry, thus demonstrating the demand for PhD qualified AI scientists: Guardian Article. Also, you have many universities opening numerous funded AI PhD positions, such as this university: Leeds University. Isn't it then obvious that a PhD in AI technology should be on every data scientist's to-do list? Well, as one who briefly contemplated doing a second PhD (focusing on swarm intelligence and multi-agent systems in Healthcare) and who spent some time researching the necessity of completing a PhD to be across AI, I found it would be detrimental to undertake a PhD focusing on a specific AI algorithmic approach. Let me explain why.
On 11th February, the US administration formalized the proposals made during last year's White House summit on AI through an executive order, but there was no mention of the amount that will be set aside for investment in AI (except for statements about prioritization of investment in AI). This compares with official commitments by other countries/regions:
Health System Academic