Algorithmic bias and opacity in the healthcare sector
This piece was originally submitted for the CCIR Re:think essay competition.
The recent adoption of artificial intelligence (AI) has transformed almost every aspect of life; remarkably versatile models now exist that can handle tasks ranging from offering artistic insight to solving complex mathematical problems. While AI has improved many industries, such as operations and communication, one of its most ethically challenging applications is in the healthcare sector. These algorithms are having, and will continue to have, measurable benefits for health outcomes. However, there are serious concerns around their potential bias and opacity, which can lead to worse health outcomes for specific populations and deepen mistrust in medical institutions – a historically significant issue. Algorithmic solutions are also deployed predominantly in high-income countries, raising ethical concerns about equitable access to AI. Despite these issues, AI remains a powerful and useful tool in healthcare, making it critical to refine regulatory systems and broaden access in order to mitigate these harms. In the following analysis, algorithmic bias and opacity will be evaluated in the context of healthcare, as a case study both of the problems which arise and of the solutions which are necessary. This case study will be analyzed using the common morality theory presented in the Principles of Biomedical Ethics (Beauchamp and Childress) as a framework through which to appraise both the existing issues and prospective solutions.
To counteract the biases seen in algorithms used for medical applications, it must first be understood how they have emerged. The primary issue is the availability of data: for several reasons, there is more health data available for white populations than for ethnic minorities, resulting in poorer health outcomes for the latter. Because these algorithms can only infer decisions from existing medical information, when the majority of a dataset comes from one demographic they may fail to account for medically relevant differences between populations, reliably reaching diagnoses only for the majority group. One factor behind this gap is the historical distrust many ethnic minorities hold toward the medical system. Dismissive, exploitative and highly unethical behavior toward minorities by medical professionals, seen clearly in cases such as the Slavery Hypertension Hypothesis, the treatment of Henrietta Lacks and the Tuskegee Syphilis Experiment, has been a key factor in the widespread lack of trust which certain populations have in the healthcare sector. In a collaborative research study between Clearview Research and Understanding Patient Data investigating issues within the UK’s National Health Service (NHS), a collection of focus groups and community research sessions found a “consistent trend towards not trusting the various NHS services with their data” (Wilson et al. 18) among Black and South Asian participants, and that even those who gave their health data to the NHS felt it was not being used to measurably “improve health outcomes in their communities” (20). The participants in this report also raised another concern which may factor into algorithmic bias: the quality of the data obtained. The study states that the participants “often found providing data on their ethnicity challenging” due to the limited nature of the questions being asked (29). This is likely due in part to the widespread use of the acronym “BAME” to refer to those with Black, Asian or minority ethnic backgrounds, as it falsely suggests a homogeneity within these groups (Khunti et al.). This suggests that the data collection processes used to train AI models may contain systematic flaws or even confounding information. A similar example can be seen in the call to disaggregate the health data of Asian-Americans, Native Hawaiians and Pacific Islanders, three often-grouped ethnic backgrounds, owing to the complex, heterogeneous health factors facing each group (Ro and Yee). Collectively, these factors have built up over decades to create a healthcare dataset which is largely unrepresentative and inaccurate, forming the basis of the current issues around algorithmic bias in AI.
Beauchamp and Childress argue that, according to the egalitarian principle of fair opportunity, individuals “should not be denied social benefits on the basis of undeserved disadvantageous properties” such as “gender, race, IQ, linguistic accent, ethnicity, national origin, and social status” (Beauchamp and Childress 262). They further argue that these individuals should “receive benefits that will ameliorate the unfortunate effects of life’s lottery” (263). This argument is essentially analogous to the definition of equity within healthcare: individuals should not all receive the same level of care, but rather different levels of care calibrated to ameliorate the effects of social determinants of ill health, so as to produce equal outcomes. Through this ethical framework, it is clear that the current situation of AI in healthcare – in which those with minority ethnic backgrounds often receive worse algorithmic healthcare solutions than white patients – directly contravenes the principle of justice and fair opportunity. The best ways to combat bias in this field, then, are those which increase both access to and the utility of algorithmic healthcare solutions for disadvantaged groups in order to create an equitable environment.
A literature review by Palaniappan et al. analyzed existing regulatory frameworks from seven major countries on algorithms in healthcare, examining policies established in legislation as well as those supported by “professional guidelines” to investigate the current state of regulation (Palaniappan et al. 3). While many of the frameworks do mention bias in their recommendations, the advice is more theoretical than actionable: the Australian guidelines, for instance, state that “bias in algorithmic design should be minimized . . . by considering how to avoid bias” (Kenny et al.). This statement is not incorrect; however, the framework fails to analyze the causes of the bias or propose concrete solutions. In addition, these guidelines exist mostly as extensions to existing, broad regulation around software as a medical device (SaMD) (Palaniappan et al. 4), and therefore cannot fully address the unique challenges raised by AI in particular, such as the biases previously discussed and the ability of algorithms to adapt and potentially improve themselves. It is clear that regulatory systems must be adapted to directly remedy the growing issue of bias in healthcare. As a first step, we must strive to create health metrics which are truly reflective of the populations they measure: BMI, for instance, is a widely used metric which was never intended for use in medical fields (Pray and Riskin). Beyond being a crude measure that ignores body composition, BMI also fails to account for race-based differences in health risk. Asian populations have been found to include a significant proportion of individuals “with risk factors for type 2 diabetes and cardiovascular disease . . . even below the existing WHO BMI cut-off point of 25 kg/m²” (WHO expert consultation). To combat this issue, Pray and Riskin propose a new index which “considers height, sex and race differences, accounts for abdominal adiposity, and more accurately predicts the relationship between obesity, mortality and diseases” (Pray and Riskin). Indices and metrics such as these should be actively developed for algorithmic use, so that models are trained on more representative inputs. By improving the quality of the data which algorithms access, biases that stem from “bad data” can be effectively reduced.
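To make this concrete, the sketch below shows how a screening rule might apply a population-specific BMI action point rather than a single universal cut-off. The 25 kg/m² cut-off and the lower 23 kg/m² public-health action point for Asian populations come from the WHO expert consultation cited above; the function names, data structure and screening logic are purely illustrative assumptions, not taken from any deployed system.

```python
# Illustrative sketch: a screening rule using population-specific BMI action
# points instead of one universal cut-off. The 25 kg/m^2 cut-off and the lower
# 23 kg/m^2 action point for Asian populations follow the WHO expert
# consultation (2004); everything else here is a hypothetical illustration.

def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / (height_m ** 2)

# Hypothetical mapping of population group -> BMI value that triggers
# further metabolic screening (type 2 diabetes, cardiovascular risk).
SCREENING_ACTION_POINT = {
    "default": 25.0,  # conventional WHO overweight cut-off
    "asian": 23.0,    # lower action point suggested by the WHO consultation
}

def flag_for_screening(weight_kg: float, height_m: float, group: str = "default") -> bool:
    """Return True if this patient's BMI meets or exceeds their group's action point."""
    threshold = SCREENING_ACTION_POINT.get(group, SCREENING_ACTION_POINT["default"])
    return bmi(weight_kg, height_m) >= threshold

# A patient with BMI ~23.9 is flagged under the Asian action point
# but missed entirely by the universal 25 kg/m^2 rule.
print(flag_for_screening(65, 1.65, "asian"))    # True
print(flag_for_screening(65, 1.65, "default"))  # False
```

The point of the sketch is not the specific numbers but the structure: a metric that encodes population-relevant thresholds gives an algorithm something meaningful to learn from, whereas a single cut-off silently erases the differences the WHO consultation documents.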
In addition to improving data quality by developing medical indices which factor in ethnicity, there must be systems in place to remedy the lack of data for ethnic minority groups. One potential solution would be to establish nationwide, or even worldwide, data banks for electronic health records (EHRs). Making such a large database available to SaMD firms and healthcare algorithms would be an effective way of improving the availability of patient data, reducing the biases created when algorithms draw conclusions from a largely incomplete dataset. A unified EHR system was attempted in the UK in 2005 with the National Programme for IT (NPfIT); however, it was a widespread failure, and the NHS now relies on 40 different EHR suppliers (Morris). These failures, though, appear relatively easy to remedy with the current state of technology: Morris cites the primary points of failure as the unintuitive nature of the systems at the time and a lack of training. After 20 years of technological improvement, systems have become far more user-friendly and technological literacy has increased dramatically, making this a much more viable option that deserves reevaluation. Such a solution would not only improve health outcomes but also make it easier for firms to build AI models, as the data needed to train them would be more readily available. Through these measures, regulatory bodies can improve both the quality and quantity of data available to algorithms, while avoiding the suppression of new algorithms and innovation within this industry.
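A data bank of this kind would also need collection schemas that record ethnicity at a disaggregated level, as the earlier discussion of “BAME” and of Asian-American, Native Hawaiian and Pacific Islander data suggests. The sketch below is a minimal, hypothetical illustration of what such a record and a coverage check might look like; the field names and categories are assumptions, not drawn from any NHS or vendor schema.

```python
# Minimal, hypothetical sketch of an EHR record whose ethnicity field is
# disaggregated rather than collapsed into a broad label such as "BAME".
# Nothing here reflects an actual NHS or vendor schema.
from dataclasses import dataclass
from collections import Counter

@dataclass
class PatientRecord:
    patient_id: str
    ethnicity: str          # fine-grained, self-described category
    diagnosis_codes: list   # simplified here to a list of code strings

records = [
    PatientRecord("p1", "Bangladeshi", ["E11"]),
    PatientRecord("p2", "Native Hawaiian", ["I10"]),
    PatientRecord("p3", "Samoan", ["E11"]),
]

# Disaggregated counts show model builders which specific groups are
# under-represented in the training pool, instead of one aggregate bucket.
coverage = Counter(r.ethnicity for r in records)
print(coverage)
```

The design choice matters because a model trained on an aggregate “Asian” bucket cannot recover the distinctions that the Ro and Yee call for disaggregation identifies as clinically relevant.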
Healthcare algorithms also face issues of opacity, whereby those being served by AI are unable to fully understand its inner workings. A study by Fehr et al. investigating transparency in thirteen European radiology AI solutions found that while nearly every model had a 100% transparency rating for information about its intended use, information availability about the models’ ethics averaged a score of 16.6%, with only one model mentioning the potential harm of misdiagnosis; and only seven of the thirteen models “reported caveats for deployment”, even then providing only minimal information (Fehr et al. 3.2.1-3.2.5). In an exploration of explainable AI (XAI), Sadeghi et al. provide several methods of investigating both the internal and external transparency of a given AI model; however, these methods are largely inaccessible to the average patient receiving AI-assisted care: in presenting “feature-oriented methods”, for example, the paper relies on game theory and an array of complex mathematical equations to produce the “explainable” AI outcomes (Sadeghi et al. 2.1). These two papers show that not only are the workings of available AI models largely hidden from the public, but even the analysis of their transparency requires quite specialized skills. This inaccessibility of information to the public presents an ethical issue under the “respect for autonomy” outlined by Beauchamp and Childress (101), in particular the requirement of explicit and informed consent. Beauchamp and Childress state that informed consent relies on both “the physician’s or researcher’s obligation to disclose information” and “the quality of a patient’s or subject’s understanding” (121). From Fehr et al.’s study, it is clear that physicians, or in this case AI firms, are not disclosing the necessary information, and this directly limits the quality of patients’ understanding of the algorithms being used to assist in their treatment.
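To illustrate why such game-theoretic explanation methods are out of reach for most patients, the toy sketch below computes exact Shapley-value attributions for a made-up three-feature risk model. Shapley values are the standard game-theoretic attribution technique in this family; the specific model, feature names and code are assumptions for illustration only and are not taken from Sadeghi et al.

```python
# Toy illustration of the game-theoretic machinery behind "feature-oriented"
# explanation methods: exact Shapley-value attribution over a made-up model.
from itertools import combinations
from math import factorial

FEATURES = ["age", "blood_pressure", "bmi"]

def model_output(present: frozenset) -> float:
    """A stand-in 'risk score' that depends on which features are available."""
    score = 0.0
    if "age" in present:            score += 0.2
    if "blood_pressure" in present: score += 0.5
    if "bmi" in present:            score += 0.1
    if {"age", "bmi"} <= present:   score += 0.1  # interaction term
    return score

def shapley_value(feature: str) -> float:
    """Average marginal contribution of `feature` over all subsets of the others."""
    others = [f for f in FEATURES if f != feature]
    n = len(FEATURES)
    total = 0.0
    for k in range(len(others) + 1):
        for subset in combinations(others, k):
            s = frozenset(subset)
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += weight * (model_output(s | {feature}) - model_output(s))
    return total

for f in FEATURES:
    print(f, round(shapley_value(f), 3))
# Even with three features this requires weighting every subset of the
# remaining features -- exactly the kind of calculation a patient cannot be
# expected to perform when judging how "transparent" a model really is.
```

The attributions sum to the full model output (0.5 + 0.25 + 0.15 = 0.9 here), which is what makes the method mathematically attractive; the same property is what makes it opaque to anyone without the relevant background.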
In rectifying the issue of opacity in these algorithms, we must establish frameworks through which transparency can be easily analyzed. A relatively simple set of basic informational requirements can be derived from Fehr et al.’s investigative criteria: intended use, development, ethics, clinical validation, and caveats (Fehr et al. 3.2). By requiring AI firms to provide information in all five of these categories, patients can give informed consent to the use of these AI models in an ethical manner, as they will be able to fully understand the scope of the solution being used to assist in their care.
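A disclosure requirement of this kind lends itself to a simple mechanical check. The sketch below is one hypothetical way such a baseline might be enforced; the data format, field names and example submission are illustrative assumptions rather than any existing regulatory tooling.

```python
# Hypothetical disclosure check built on the five categories derived above
# from Fehr et al.: intended use, development, ethics, clinical validation,
# and caveats. The data format and scoring rule are illustrative assumptions.
REQUIRED_CATEGORIES = ["intended_use", "development", "ethics",
                       "clinical_validation", "caveats"]

def transparency_report(disclosures: dict) -> dict:
    """Return which of the required categories a vendor's documentation covers."""
    return {cat: bool(disclosures.get(cat)) for cat in REQUIRED_CATEGORIES}

def meets_baseline(disclosures: dict) -> bool:
    """The baseline is met only if every one of the five categories is documented."""
    return all(transparency_report(disclosures).values())

# A made-up submission missing the two categories Fehr et al. found weakest.
example_submission = {
    "intended_use": "Flag suspected lung nodules on chest CT for radiologist review.",
    "development": "Description of training data and model architecture.",
    "clinical_validation": "Summary of prospective reader study.",
    # "ethics" and "caveats" left undisclosed
}

print(transparency_report(example_submission))
print(meets_baseline(example_submission))  # False: ethics and caveats missing
```

The point is that an all-or-nothing baseline, rather than an averaged score, prevents a vendor from compensating for silence on ethics and caveats with thorough documentation elsewhere.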
Overall, it must be acknowledged that AI is a rapidly developing field, and it is therefore difficult to establish regulations and frameworks which will apply to algorithm-based solutions in the future as well as today. However, there are several ways to minimize the potential harm of opacity and bias in healthcare AI which do not require significant intervention in the field, making it easy for firms and providers to adhere to such guidelines. By expanding and collating a wide range of health data, and by providing a baseline of transparency, AI solutions in healthcare can be developed in a more ethical and beneficial manner, ensuring their continued use and usability.
References:
Beauchamp, Tom, and James Childress. Principles of Biomedical Ethics. 7th ed., Oxford University Press, 2012.
Fehr, Jana, et al. “A Trustworthy AI Reality-Check: The Lack of Transparency of Artificial Intelligence Products in Healthcare.” Frontiers in Digital Health, vol. 6, Feb. 2024, https://doi.org/10.3389/fdgth.2024.1267290.
Kenny, Lizbeth M., et al. “Ethics and Standards in the Use of Artificial Intelligence in Medicine on Behalf of the Royal Australian and New Zealand College of Radiologists.” Journal of Medical Imaging and Radiation Oncology, vol. 65, no. 5, Aug. 2021, pp. 486–94, https://doi.org/10.1111/1754-9485.13289.
Khunti, Kamlesh, et al. “The Language of Ethnicity.” The BMJ, no. 8270, Nov. 2020, https://doi.org/10.1136/bmj.m4493.
Morris, James. “A Call to Reconsider a Nationwide Electronic Health Record System: Correcting the Failures of the National Program for IT.” JMIR Medical Informatics, vol. 11, Sept. 2023, https://doi.org/10.2196/53112.
Palaniappan, Kavitha, et al. “Global Regulatory Frameworks for the Use of Artificial Intelligence (AI) in the Healthcare Services Sector.” Healthcare, vol. 12, no. 5, Feb. 2024, https://doi.org/10.3390/healthcare12050562.
Pray, Rachel, and Suzanne Riskin. “The History and Faults of the Body Mass Index and Where to Look Next: A Literature Review.” Cureus, vol. 15, no. 11, Nov. 2023, https://doi.org/10.7759/cureus.48230.
Ro, Marguerite, and Albert Yee. “Out of the Shadows: Asian Americans, Native Hawaiians, and Pacific Islanders.” American Journal of Public Health, vol. 100, May 2010, pp. 776–78, https://ajph.aphapublications.org/doi/abs/10.2105/AJPH.2010.192229.
Sadeghi, Zahra, et al. “A Review of Explainable Artificial Intelligence in Healthcare.” Computers and Electrical Engineering, vol. 118, June 2024, https://doi.org/10.1016/j.compeleceng.2024.109370.
WHO expert consultation. “Appropriate Body-Mass Index for Asian Populations and Its Implications for Policy and Intervention Strategies.” The Lancet, vol. 363, no. 9403, Jan. 2004, pp. 157–63, https://doi.org/10.1016/S0140-6736(03)15268-3.
Wilson, Bailey, et al. Diverse Voices on Data. Clearview Research, Apr. 2022, p. 49, https://understandingpatientdata.org.uk/sites/default/files/2022-04/Diverse%20voices%20on%20Data%20-%20Main%20report_0.pdf.