Reliable AI for Healthcare: Enhancing Large Language Model Transparency with Uncertainty Quantification

Summary

The rise of wearable technology has enabled unprecedented continuous data collection, opening new frontiers in health monitoring, activity recognition, and personalised medicine. This abundant sensor data holds immense potential for improving healthcare but also presents significant modelling challenges. To address these, Large Language Models (LLMs) like GPT-4 and Llama are now being harnessed to interpret complex data patterns and analyse human behaviour. However, the opaque “black box” nature of LLMs makes it challenging to quantify uncertainty in their predictions—a critical limitation in healthcare, where transparency and reliability are paramount.

This project aims to develop a comprehensive framework for predictive uncertainty in LLMs, focusing on applications in medical NLP. By developing advanced uncertainty quantification (UQ) methods—including probability calibration, conformal prediction, and Bayesian techniques—this research seeks to reliably assess the accuracy of LLM predictions in healthcare. Key areas of investigation include understanding aleatoric uncertainty (randomness in sensor data) and epistemic uncertainty (model knowledge limitations), as well as examining how domain-specific fine-tuning affects uncertainty estimation.
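To give a flavour of one of these techniques, the minimal sketch below shows split conformal prediction applied to a generic classifier. It is an illustrative assumption, not project code: the calibration data, predicted probabilities, and coverage level are hypothetical placeholders.

```python
import numpy as np

# Minimal sketch of split conformal prediction for classification.
# `cal_probs` (n_cal x n_classes) are a model's predicted probabilities
# on a held-out calibration set with true labels `cal_labels`;
# both are hypothetical placeholders, not project data.

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Compute the nonconformity threshold for ~90% coverage (alpha=0.1)."""
    n = len(cal_labels)
    # Nonconformity score: 1 minus the probability of the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile of the calibration scores.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, q_level, method="higher")

def prediction_set(test_probs, threshold):
    """Return every label whose nonconformity score clears the threshold."""
    return np.where(1.0 - test_probs <= threshold)[0]

# Synthetic example: larger prediction sets signal higher uncertainty.
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(3), size=500)
cal_labels = rng.integers(0, 3, size=500)
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
print(prediction_set(np.array([0.7, 0.2, 0.1]), threshold))
```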

The project’s methodologies will enable LLM-based healthcare applications to identify potentially erroneous outputs, interpret model reliability, and defer decisions in high-uncertainty scenarios, all while generating comprehensive responses to medical inquiries. Additionally, by integrating UQ, this research envisions healthcare chatbots and AI systems capable of effectively communicating their confidence levels, improving decision-making and trust in medical AI applications.
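As a rough illustration of such deferral, the sketch below withholds an answer whenever a calibrated confidence score falls below a tunable threshold. The confidence source, the 0.8 threshold, and the example answer are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

# Minimal sketch of confidence-based deferral. The confidence value is
# assumed to come from a calibrated UQ method (e.g. a conformal or
# Bayesian estimate); the threshold and example answer are hypothetical.

@dataclass
class Decision:
    answer: Optional[str]  # the model's answer, or None when deferred
    deferred: bool         # True if the query should go to a clinician

def answer_or_defer(answer: str, confidence: float,
                    threshold: float = 0.8) -> Decision:
    """Release the answer only if confidence clears the threshold;
    otherwise withhold it and flag the query for human review."""
    if confidence >= threshold:
        return Decision(answer=answer, deferred=False)
    return Decision(answer=None, deferred=True)

# A low-confidence medical answer is withheld and routed to a clinician.
print(answer_or_defer("Take 500 mg twice daily", confidence=0.62))
# -> Decision(answer=None, deferred=True)
```
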
Through real-world testing, the project will validate the utility of these UQ methods, addressing the important need for robust uncertainty assessment in healthcare LLMs. Ultimately, this research will promote the responsible adoption of LLMs in sensor-driven healthcare, enhancing transparency, safety, and accountability in AI-assisted medical decision-making.

The School of Computing at Ulster University has held an Athena Swan Bronze Award since 2016 and is committed to promoting and advancing gender equality in Higher Education. We particularly welcome female applicants, as they are under-represented within the School.

Essential criteria

Applicants should hold, or expect to obtain, a First or Upper Second Class Honours Degree in a subject relevant to the proposed area of study.

We may also consider applications from those who hold equivalent qualifications, for example, a Lower Second Class Honours Degree plus a Master’s Degree with Distinction.

In exceptional circumstances, the University may consider a portfolio of evidence from applicants who have appropriate professional experience which is equivalent to the learning outcomes of an Honours degree in lieu of academic qualifications.

  • Sound understanding of subject area as evidenced by a comprehensive research proposal
  • A demonstrable interest in the research area associated with the studentship

Desirable Criteria

If the University receives a large number of applications for the project, the following desirable criteria may be applied to shortlist applicants for interview.

  • First Class Honours (1st) Degree
  • Master's Degree at 70%
  • For VCRS Awards, Master's Degree at 75%
  • Experience using research methods or other approaches relevant to the subject domain
  • Work experience relevant to the proposed project
  • Peer-reviewed publications
  • Experience of presenting research findings

Equal Opportunities

The University is an equal opportunities employer and welcomes applicants from all sections of the community, particularly from those with disabilities.

Appointment will be made on merit.

Funding and eligibility

This project is funded by:

  • Department for the Economy (DfE)
  • Vice Chancellor's Research Scholarship (VCRS)

Our fully funded PhD scholarships will cover tuition fees and provide a maintenance allowance of £19,237 (tbc) per annum for three years (subject to satisfactory academic performance). A Research Training Support Grant (RTSG) of £900 per annum is also available.

These scholarships, funded via the Department for the Economy (DfE) and the Vice Chancellor’s Research Scholarships (VCRS), are open to applicants worldwide, regardless of residency or domicile.

Applicants who already hold a doctoral degree or who have been registered on a programme of research leading to the award of a doctoral degree on a full-time basis for more than one year (or part-time equivalent) are NOT eligible to apply for an award.

Due consideration should be given to financing your studies.

Recommended reading

Löfström, T., Yapicioglu, F. R., Stramiglio, A., Löfström, H., & Vitali, F. (2024). Fast Calibrated Explanations: Efficient and Uncertainty-Aware Explanations for Machine Learning Models. arXiv preprint arXiv:2410.21129.

Angelopoulos, A., Bates, S., Malik, J., & Jordan, M. I. (2020). Uncertainty Sets for Image Classifiers using Conformal Prediction. http://arxiv.org/abs/2009.14193

Chen, Y., Yuan, L., Cui, G., Liu, Z., & Ji, H. (2022). A Close Look into the Calibration of Pre-trained Language Models. http://arxiv.org/abs/2211.00151

Dusenberry, M. W., Tran, D., Choi, E., Kemp, J., Nixon, J., Jerfel, G., Heller, K., & Dai, A. M. (2020). Analyzing the Role of Model Uncertainty for Electronic Health Records. Proceedings of the ACM Conference on Health, Inference, and Learning, 204–213. https://doi.org/10.1145/3368555.3384457

Hussain, T., Nugent, C., Moore, A., Liu, J., & Beard, A. (2021). A Risk-Based IoT Decision-Making Framework Based on Literature Review with Human Activity Recognition Case Studies. Sensors, 21(13). https://doi.org/10.3390/s21134504

Kormilitzin, A., Vaci, N., Liu, Q., & Nevado-Holgado, A. (2021). Med7: A transferable clinical natural language processing model for electronic health records. Artificial Intelligence in Medicine, 118, 102086. https://doi.org/10.1016/j.artmed.2021.102086

Lin, Z., Trivedi, S., & Sun, J. (2023). Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models. arXiv preprint arXiv:2305.19187. https://doi.org/10.48550/arXiv.2305.19187

Wang, G., Liu, X., Ying, Z., Yang, G., Chen, Z., Liu, Z., Zhang, M., Yan, H., Lu, Y., Gao, Y., Xue, K., Li, X., & Chen, Y. (2023). Optimized glycemic control of type 2 diabetes with reinforcement learning: a proof-of-concept trial. Nature Medicine, 29(10), 2633–2642. https://doi.org/10.1038/s41591-023-02552-9

Xiao, Y., Liang, P. P., Bhatt, U., Neiswanger, W., Salakhutdinov, R., & Morency, L.-P. (2022). Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis.

Xiao, Y., & Wang, W. Y. (2018). Quantifying Uncertainties in Natural Language Processing Tasks. http://arxiv.org/abs/1811.07253

Key dates

Submission deadline
Monday 24 February 2025
4:00 pm

Interview Date
April 2025

Preferred student start date
15 September 2025

Applying

Apply Online  

Contact supervisor

Dr Tazar Hussain

Other supervisors