Understanding and Addressing AI Hallucinations in Healthcare and Life Sciences
DOI: https://doi.org/10.47941/ijhs.1862
Keywords: Hallucinations, Large Language Models, Artificial Intelligence, Healthcare, Life Sciences
Abstract
Purpose: This paper investigates the phenomenon of "AI hallucinations" in healthcare and life sciences, where large language models (LLMs) produce outputs that, while coherent, are factually incorrect, irrelevant, or misleading. Understanding and mitigating such errors is critical given the high stakes of accurate, reliable information in these domains. We classify hallucinations into three types (input-conflicting, context-conflicting, and fact-conflicting) and examine their implications through real-world cases.
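For illustration only, the sketch below mirrors this three-way taxonomy as labels that an evaluation harness or reviewer might attach to model outputs; the HallucinationType and ReviewedOutput names are hypothetical and are not drawn from the paper's implementation.

from dataclasses import dataclass
from enum import Enum
from typing import Optional

class HallucinationType(Enum):
    INPUT_CONFLICTING = "input-conflicting"      # contradicts the user's prompt or supplied source data
    CONTEXT_CONFLICTING = "context-conflicting"  # contradicts the model's own earlier output
    FACT_CONFLICTING = "fact-conflicting"        # contradicts established, verifiable facts

@dataclass
class ReviewedOutput:
    text: str
    label: Optional[HallucinationType]  # None when no hallucination is detected

# A fabricated dosage claim would be flagged as fact-conflicting.
example = ReviewedOutput(
    text="The recommended adult dose is 5000 mg daily.",
    label=HallucinationType.FACT_CONFLICTING,
)
print(example.label.value)  # prints: fact-conflicting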
Methodology: We combine the Fact Score metric, the Med-HALT benchmark, and adversarial testing to evaluate the factual fidelity of AI outputs. We propose several mitigation strategies, including Retrieval-Augmented Generation (RAG), Chain-of-Verification (CoVe), and Human-in-the-Loop (HITL) systems, to enhance model reliability.
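The paper itself does not include code; as a minimal sketch under stated assumptions, the fragment below shows how a RAG step and a CoVe pass could be chained. The retrieve() and generate() functions are hypothetical placeholders for a trusted document index and an LLM client, and in a HITL deployment the returned answer would still be routed to a clinician or domain expert before reaching an end user.

from typing import List

def retrieve(query: str, k: int = 3) -> List[str]:
    """Hypothetical retriever: returns the top-k passages from a curated, trusted corpus."""
    raise NotImplementedError("plug in a document index here")

def generate(prompt: str) -> str:
    """Hypothetical wrapper around a large language model completion API."""
    raise NotImplementedError("plug in a model client here")

def answer_with_rag_and_cove(question: str) -> str:
    # Retrieval-Augmented Generation: ground the draft answer in retrieved evidence.
    evidence = "\n".join(retrieve(question))
    draft = generate(f"Evidence:\n{evidence}\n\nQuestion: {question}\nAnswer:")

    # Chain-of-Verification: derive verification questions from the draft,
    # answer them against the evidence alone, then revise the draft accordingly.
    checks = generate(f"List the factual claims in this answer as verification questions:\n{draft}")
    verified = generate(f"Answer each question using only this evidence:\n{evidence}\n\nQuestions:\n{checks}")
    return generate(
        f"Revise the draft so it agrees with the verified answers.\n"
        f"Draft:\n{draft}\n\nVerified answers:\n{verified}\n\nRevised answer:"
    )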
Findings: As artificial intelligence continues to permeate various sectors of society, hallucinations in AI-generated text pose significant challenges, especially in contexts where precision and reliability are paramount. This paper delineates the types of hallucinations commonly observed in AI systems (input-conflicting, context-conflicting, and fact-conflicting) and highlights their potential to undermine trust and efficacy in critical domains such as healthcare and legal proceedings.
Unique contribution to theory, policy and practice: This study's unique contribution lies in its comprehensive analysis of the types and impacts of AI hallucinations and in the development of robust controls that advance theoretical understanding, practical application, and policy formulation in AI deployment. These efforts aim to foster safer, more effective AI integration across the healthcare and life sciences sectors.
References
Yu, P., Xu, H., Hu, X., & Deng, C. (2023). Leveraging Generative AI and Large Language Models: A Comprehensive Roadmap for Healthcare Integration. Healthcare (Basel, Switzerland), 11(20), 2776. https://doi.org/10.3390/healthcare11202776
Maleki, N., Padmanabhan, B., & Dutta, K. (2024). AI Hallucinations: A Misnomer Worth Clarifying. arXiv preprint arXiv:2401.06796. https://doi.org/10.48550/arXiv.2401.06796
Zhang, Y., Li, Y., Cui, L., Cai, D., Liu, L., Fu, T., Huang, X., Zhao, E., Zhang, Y., Chen, Y., Wang, L., Luu, A., Bi, W., Shi, F., & Shi, S. (2023). Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models.
Cox, J. A Lesson From Seinfeld: How Generative AI Issues Remind Us to Be True to Our Oaths.
Pal, A., Umapathi, L., & Sankarasubbu, M. (2023). Med-HALT: Medical Domain Hallucination Test for Large Language Models. Proceedings of the Conference on Computational Natural Language Learning (CoNLL), 314-334. https://doi.org/10.18653/v1/2023.conll-1.21
Mishra, A. (2024). Fine-grained Hallucination Detection and Editing for Language Models. arXiv preprint arXiv:2401.06855. https://doi.org/10.48550/arXiv.2401.06855
Ayala, O. M. (2024). Reducing Hallucination in Structured Outputs via Retrieval-Augmented Generation.
Dhuliawala, S., Komeili, M., Xu, J., Raileanu, R., Li, X., Celikyilmaz, A., & Weston, J. (2023). Chain-of-verification reduces hallucination in large language models. arXiv preprint arXiv:2309.11495.
Wu, X., Xiao, L., Sun, Y., Zhang, J., Ma, T., & He, L. (2022). A survey of human-in-the-loop for machine learning. Future Generation Computer Systems, 135, 364-381.
Arshad, H. B., Butt, S. A., Khan, S. U., Javed, Z., & Nasir, K. (2023). ChatGPT and Artificial Intelligence in Hospital Level Research: Potential, Precautions, and Prospects. Methodist DeBakey Cardiovascular Journal, 19(5), 77.
License
Copyright (c) 2024 Aditya Gadiko
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution (CC-BY) 4.0 License that allows others to share the work with an acknowledgment of the work's authorship and its initial publication in this journal.