Abstract
Genomic variation is the basis of interindividual differences in observable traits and disease susceptibility. Genetic studies are the driving force of personalized medicine, as many of the differences in treatment efficacy can be attributed to our genomic background. The rapid development of next-generation sequencing technologies is accelerating the discovery of the complete landscape of human variation. The main limitation is no longer the available genotyping technology or its cost, but rather the lack of understanding of the functional consequences of individual variations. Single polymorphisms rarely explain a considerable share of phenotypic variability; hence the major difficulty of interpretation lies in the complexity of molecular interactions. This PhD thesis describes the state of the art in research on functional human variation (Chapter 1) and introduces childhood acute lymphoblastic leukaemia (ALL) as a model disease for studying pharmacogenomic effects (Chapters 2 and 3). Chapter 4 describes current interpretations of the effect and deleteriousness of variations, accompanied by an investigation of amino acid mutability in relation to deleteriousness, presented in Paper I. Chapter 5 describes a pipeline for calling variants from next-generation sequencing data and the common challenges encountered during analysis. Chapter 6 provides the motivation for hypothesis-driven SNP selection and describes the publicly available resources used for this task. Following a review of the available large-scale genotyping techniques, Paper II introduces a novel, cost-effective method for genotyping a large custom SNP panel by means of multiplexed targeted sequencing and includes recommendations for efficient capture bait design.
Chapter 7 presents various methods for the integrative analysis of genomic variations, including tests for overrepresentation of rare variants, the effects of multiple SNPs acting in the same biological pathway, the contribution of coding variation to an individual’s disease risk, and the identification of groups of patients who differ in disease mechanisms defined by aberrations in protein-protein complexes. Chapters 8, 9 and 10 contain three papers that apply the methods presented in Chapters 5 to 7 to investigate the heterogeneity of treatment response (Paper III), risk of infections (Paper IV) and disease aetiology (Paper V) in childhood ALL patients. Chapter 11 summarizes the thesis and includes final remarks on the perspectives of genomic variation research and personalized medicine. In summary, this thesis demonstrates the feasibility of integrative analyses of genomic variations and introduces large-scale hypothesis-driven SNP exploration studies as an emerging alternative to data-driven genome-wide association studies. Finally, the findings of the presented studies set new directions for future pharmacogenetic investigations and provide a framework for the future implementation of personalized medicine.