Novel variation and de novo mutation rates in population-wide de novo assembled Danish trios
Abstract
Building a population-specific catalogue of single nucleotide variants (SNVs), indels and structural variants (SVs) with frequencies, termed a national pan-genome, is critical for further advancing clinical and public health genetics in large cohorts. Here we report a Danish pan-genome obtained from sequencing 10 trios to high depth (50×). We report 536k novel SNVs and 283k novel short indels from mapping approaches and develop a population-wide de novo assembly approach to identify 132k novel indels larger than 10 nucleotides with low false discovery rates. We identify a higher proportion of indels and SVs than previous efforts, showing the merits of high coverage and de novo assembly approaches. In addition, we use trio information to identify de novo mutations and use a probabilistic method to provide direct estimates of 1.27e-8 and 1.5e-9 per nucleotide per generation for SNVs and indels, respectively.