Intelligent multi-document summarisation for extracting insights on racial inequalities from maternity incident investigation reports
Cosma, Georgina, Singh, Mohit Kumar ORCID: 0000-0001-7736-5583, Waterson, Patrick, Jun, Gyuchan Thomas and Back, Jonathan (2024) Intelligent multi-document summarisation for extracting insights on racial inequalities from maternity incident investigation reports. In: Artificial Intelligence in Healthcare First International Conference, AIiH 2024, Swansea, UK, September 4th – 6th, 2024, Proceedings, Part II. Lecture Notes in Computer Science (LNCS), 14976. Springer, Cham, Switzerland, pp. 316-329. ISBN 978-3031672842; 978-3031672859 ISSN 0302-9743 (Print), 1611-3349 (Online) (doi:https://doi.org/10.1007/978-3-031-67285-9_23)
PDF (AAM)
48171_SINGH_Intelligent_multi-document_summarisation_for_extracting_insights_on_racial_inequalities_from_maternity_incident_investigation_reports.pdf - Accepted Version. Restricted to Repository staff only (689kB).
Abstract
In healthcare, thousands of safety incidents occur every year, but learning from these incidents is not effectively aggregated. Analysing incident reports using AI could uncover critical insights to prevent harm by identifying recurring patterns and contributing factors. To aggregate and extract valuable information, natural language processing (NLP) and machine learning techniques can be employed to summarise and mine unstructured data, potentially surfacing systemic issues and priority areas for improvement. This paper presents I-SIRch:CS, a framework designed to facilitate the aggregation and analysis of safety incident reports while ensuring traceability throughout the process. The framework integrates concept annotation using the Safety Intelligence Research (SIRch) taxonomy with clustering, summarisation, and analysis capabilities. Using a dataset of 188 anonymised maternity investigation reports annotated with 27 SIRch human factors concepts, I-SIRch:CS groups the annotated sentences into clusters using sentence embeddings and k-means clustering, maintaining traceability via file and sentence IDs. Summaries are generated for each cluster using offline state-of-the-art abstractive summarisation models (BART, DistilBART, T5), which are evaluated and compared using metrics that assess summary quality attributes. The generated summaries are linked back to the original file and sentence IDs, so that the summarised information can be traced and verified. Results demonstrate BART’s strengths in creating informative and concise summaries.
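The clustering-with-traceability step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sample sentences, the file IDs, and the function names `embed`, `kmeans` and `cluster_with_trace` are all hypothetical, a toy bag-of-words vector stands in for a real sentence-embedding model, and the k-means initialisation is a deterministic farthest-first choice made only to keep the example reproducible. The abstractive summarisation step (BART, DistilBART, T5) is not shown.

```python
import math
from collections import defaultdict

# Hypothetical annotated sentences: (file_id, sentence_id, text).
SENTENCES = [
    ("report_001", 4, "staff shortage delayed the review"),
    ("report_002", 9, "staff shortage delayed the scan"),
    ("report_001", 7, "interpreter was not offered to the mother"),
    ("report_003", 2, "interpreter was not offered at triage"),
]


def embed(text, vocab):
    """Toy bag-of-words vector; a stand-in for a real sentence embedding."""
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means; returns one cluster label per vector."""
    # Assumption: deterministic farthest-first initialisation, chosen only
    # so that the sketch gives the same result on every run.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: mean of the members; an empty cluster keeps its centroid.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_with_trace(sentences, k):
    """Cluster sentences while keeping file and sentence IDs on every member."""
    vocab = sorted({w for _, _, text in sentences for w in text.lower().split()})
    vectors = [embed(text, vocab) for _, _, text in sentences]
    labels = kmeans(vectors, k)
    clusters = defaultdict(list)
    for (file_id, sentence_id, text), label in zip(sentences, labels):
        clusters[label].append(
            {"file_id": file_id, "sentence_id": sentence_id, "text": text}
        )
    return dict(clusters)


clusters = cluster_with_trace(SENTENCES, k=2)
for label, members in sorted(clusters.items()):
    print(label, [(m["file_id"], m["sentence_id"]) for m in members])
```

Because every cluster member carries its `file_id` and `sentence_id`, a summary generated for a cluster can be linked back to the exact source sentences, which is the traceability property the abstract emphasises.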
Item Type: | Conference Proceedings |
---|---|
Title of Proceedings: | Artificial Intelligence in Healthcare First International Conference, AIiH 2024, Swansea, UK, September 4th – 6th, 2024, Proceedings, Part II |
Additional Information: | Included in the following conference series: International Conference on AI in Healthcare. |
Uncontrolled Keywords: | dynamic clustering; abstractive summarisation; healthcare |
Subjects: | H Social Sciences > H Social Sciences (General) |
Faculty / School / Research Centre / Research Group: | Greenwich Business School; Greenwich Business School > Networks and Urban Systems Centre (NUSC); Greenwich Business School > School of Business, Operations and Strategy |
Last Modified: | 26 Sep 2024 11:37 |
URI: | http://gala.gre.ac.uk/id/eprint/48171 |