📄 Research.

In machine learning and NLP.

PUBLICATIONS
2025
Enhancing Sentence-level Privacy Risk Classification in Electronic Health Records using Clinical BERT: A Comparative Machine Learning Approach

Computing Conference 2025 (Accepted) ∙ This study extends an earlier approach to privacy risk classification in Electronic Health Records (EHRs) by applying sentence-level analysis to clinical notes. Traditional document-level classification is conservative by nature and often hinders necessary data sharing; by moving to the sentence level, this research aims to balance data utility with privacy protection.

2023
Exploring Text Representations for Online Misinformation

Queen's University Belfast ∙ This thesis tackles the growing issue of misinformation, especially in politics and healthcare. It develops new methods to extract features from news articles for detecting fake news, focusing on the differences in thematic coherence between true and false stories. The research shows that topic-based features are effective for detection, using both classification and clustering techniques. Clustering is particularly useful as it reduces the need for large labeled datasets. Overall, the work advances the use of Machine Learning and Natural Language Processing to better detect and understand misinformation.

2022
Classifying Cyber-Risky Clinical Notes by Employing Natural Language Processing

HICSS 2022 ∙ Clinical notes, which can be embedded into electronic medical records, document patient care delivery and summarize interactions between healthcare providers and patients. Recently, some states in the United States of America have begun requiring that patients have open access to their clinical notes, to improve the exchange of patient information for patient care. Developing methods to assess the cyber risks of clinical notes before sharing and exchanging data is therefore critical. To bridge this gap, this research investigates methods for identifying security and privacy risks within clinical notes.

2020
Exploring Thematic Coherence in Fake News

INRA 2020 @ ECML PKDD 2020 ∙ The spread of fake news remains a serious global issue; understanding and curtailing it is paramount. One way of differentiating between deceptive and truthful stories is by analyzing their coherence. This study explores the use of topic models to analyze the coherence of cross-domain news shared online. Experimental results on seven cross-domain datasets demonstrate that fake news shows a greater thematic deviation between its opening sentences and its remainder.

TALKS
2020
"Exploring Thematic Coherence in Fake News"

Web ∙ Paper presentation, at the 8th International Workshop on News Recommendation and Analytics (INRA 2020)

2020
"Computational Thinking with Wolfram Language"

Web ∙ Webinar on Computational Thinking, with Hamoye

2019
"Fake News Detection using Machine Learning"

Belfast, UK ∙ Poster presentation, at AI-CON 2019

2018
"Is it real? — Detecting fake news II"

Belfast, UK ∙ Workshop on misinformation detection for Year 9 pupils from various schools in NI, held at Queen's

2018
"Is it real? — Detecting fake news I"

Ballymena, UK ∙ Workshop on misinformation detection for Year 9 pupils, at Cambridge House Grammar School