Abstract
In recent decades, biological research has shifted towards data-driven analyses in which entire biological systems are investigated. Systems-level analysis is applied in medical research through approaches such as personalized medicine and predictive medicine, where two of the goals are to predict disease predispositions and the outcomes of medical treatments for each patient. In traditional medical and epidemiological research, medical conditions are investigated one at a time. This is done to eliminate the predisposing effects of confounding factors such as other diseases and environmental components. In contrast, systems-level research is often performed in a data-driven manner, where the aim is to analyze the impact of the full diseaseome instead of analyzing a disease as an isolated entity. This thesis presents four studies of health registry data, all aiming to characterize how diseases correlate and develop across the entire population of Denmark. Three of the analyses have been carried out at a systems level, where groups of diagnoses have been examined to better understand disease relationships. The thesis shows how disease progression over time can be analyzed with data-driven methods. This was done by identifying pairs of diagnoses that show strong temporal correlation and analyzing how patients progress along different trajectories of these diagnoses. Rather than focusing on the trajectory of a single disease, patterns of disease development across the full spectrum of pathology were identified. In another study included in this thesis, the hypothesis that gut bacteria play a role in cardiovascular diseases is examined by comparing patients who have undergone full colectomy with patients who have their colon intact. Finally, a study shows how health registry data can be used to examine genetic diseases. The correlations between Mendelian and complex diseases have been analyzed to identify diseases that might share genetic etiology.
The disease trajectories presented here are a step towards predicting the outcome of medical treatment by unraveling temporal disease correlations. While the correlations and trajectories are descriptive, the results represent ideal input to predictive models. Additional data concerning medical treatment and surgery, drug prescriptions and genotype can also be incorporated into such models. Such models can thus aid in the further development of personalized and predictive medicine.