Abstract:
Automatic Anomaly Detection in Health Registry Data by Dynamic, Unsupervised Time Series Clustering
Denmark has established a wealth of health registries used to monitor the quality of health care. Although this resource has enormous potential, the data have become so complex and high-dimensional that important insights into quality of care and patient safety may go unnoticed. There is thus a need for a dynamic, automated algorithm capable of flagging growing anomalies in registry data, helping health care personnel to rapidly discover important deviations.
In this project we will develop and test a new algorithm based on dynamic, unsupervised time series clustering with anomaly detection for health care data. At each time point, the algorithm will cluster patients (using, e.g., hierarchical clustering, or clustering on t-SNE or autoencoder embeddings) based on a patient trajectory metric (e.g., Hamming distance or optimal matching), and the development of anomaly clusters will be monitored through significant changes in a cluster dissimilarity measure (e.g., Jaccard distance or MONIC). The algorithm's output will consist of summaries of detected anomalies in a form that allows for quick assessment by relevant health care professionals. These summaries will be evaluated by a team of experts, and the algorithm will be tuned based on their input. The algorithm will thus learn, through supervision, to predict expert interests. The algorithm will be developed and tested on the Danish Diabetes Database (DDiD).
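As an illustration only, the clustering and monitoring steps described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's algorithm: the patient identifiers, the state codes ("A" and "B"), the two thresholds, and the function names are all hypothetical, single-linkage clustering stands in for the hierarchical option, and a fixed Jaccard cut-off stands in for a proper test of significant change.

```python
from itertools import combinations


def hamming(a, b):
    """Fraction of positions at which two equal-length state sequences differ."""
    if len(a) != len(b):
        raise ValueError("trajectories must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster(trajectories, threshold):
    """Single-linkage clustering cut at `threshold` (assumed value).

    Two patients are linked when their Hamming distance is <= threshold;
    the clusters are the connected components of the resulting graph.
    """
    parent = {pid: pid for pid in trajectories}

    def find(pid):
        # Follow parent links to the component root, compressing the path.
        while parent[pid] != pid:
            parent[pid] = parent[parent[pid]]
            pid = parent[pid]
        return pid

    for p, q in combinations(trajectories, 2):
        if hamming(trajectories[p], trajectories[q]) <= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for pid in trajectories:
        groups.setdefault(find(pid), set()).add(pid)
    return [frozenset(g) for g in groups.values()]


def jaccard_distance(a, b):
    """Jaccard distance between two sets of patient identifiers."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def flag_changed_clusters(previous, current, max_distance):
    """Return current clusters whose closest previous cluster is too far away."""
    flagged = []
    for c in current:
        best = min((jaccard_distance(c, p) for p in previous), default=1.0)
        if best > max_distance:
            flagged.append((c, best))
    return flagged


# Hypothetical trajectories at two time points (A = on target, B = off target).
t1 = {"p1": "AAAA", "p2": "AAAB", "p3": "AAAA", "p4": "BBBB", "p5": "BBBA"}
t2 = {"p1": "AAAA", "p2": "ABBA", "p3": "ABBA", "p4": "BBBB", "p5": "BBBB"}

clusters_t1 = cluster(t1, threshold=0.25)
clusters_t2 = cluster(t2, threshold=0.25)
flagged = flag_changed_clusters(clusters_t1, clusters_t2, max_distance=0.3)

for members, distance in flagged:
    print(sorted(members), round(distance, 2))
```

In this toy data, the cluster containing p1, p2, and p3 at the first time point splits at the second, so the two resulting clusters are flagged while the stable cluster of p4 and p5 is not. In the proposed setting, flagged clusters of this kind would be the raw material for the summaries shown to health care professionals.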
Such an algorithm would greatly improve the health care system's ability to react in a timely manner to both positive and negative trends in quality of care. Furthermore, the algorithm will be developed in a disease-independent fashion, so that it can be implemented more generally and potentially be used to monitor other areas in critical need of attention.