Inherited and de novo variation in human genomes


Author: Francioli, L.C.
Document type: Dissertation
Publication date: 2015
Publisher: Utrecht University
Keywords: dutch genetic mutation / variation / insertion / deletion / population rate / haplotype / genome
Language: English
Permalink: https://search.fid-benelux.de/Record/base-27068239
Data source: BASE; original catalog
Link(s): https://dspace.library.uu.nl/handle/1874/310736

Most human traits, ranging from physical appearance to behavior and disease susceptibility, are in part inherited through genetic material. Whole-genome sequencing has enabled the complete characterization of human genetic variation. While most common DNA sequence variation has already been observed in genetic studies of worldwide populations, rare genetic variation is more geographically clustered, and studying it requires many more individuals from diverse populations. In this thesis I describe the genetic variation in 250 Dutch parent-offspring families from the Genome of the Netherlands (GoNL) Project, obtained through whole-genome sequencing. A total of 20.4 million single nucleotide variants (SNVs), 1.2 million short insertions and deletions (indels) and 27.5 thousand structural variants (SVs) were discovered in these families. While most of the SNVs were known, the majority of the indels and almost all SVs are novel, partly because mid-size deletions of 30-500 bp could be identified for the first time on a population scale. Taking advantage of the trio design, the SNVs and indels were phased into a highly accurate haplotype panel, which improves imputation accuracy, especially for low-frequency alleles. In addition to describing the inherited DNA sequence variation in the Dutch population, I was also able to characterize de novo mutations at an unprecedented scale. Indeed, 11,020 de novo SNVs, 291 de novo indels and 41 de novo SVs were identified, from which mutation rates of 1.15 × 10⁻⁸ SNVs/bp, 0.68 × 10⁻⁹ indels/bp and 0.16 SVs per generation can be estimated. Despite their much lower rate, de novo SVs affect 91 times more bases on average, including 52 times more protein-coding bases, than de novo SNVs. This contrasts with the relatively similar footprint of inherited SNVs and SVs, likely indicating much stronger selection pressure on SVs than on SNVs.
Looking at the distribution of mutations across offspring, I confirmed the previously reported increase of de novo SNVs with ...
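
The per-generation rates quoted in the abstract can be related to the raw de novo counts with a short calculation. The sketch below is only an illustration, not the method used in the thesis: it assumes the common convention that a per-base rate is the number of de novo events divided by twice the number of offspring times the callable haploid genome length, and that the SV rate is simply events per offspring. The function names and the back-calculated quantities are hypothetical; the abstract states neither the number of offspring nor the callable genome size.

```python
import math

def per_genome_rate(n_de_novo: float, n_offspring: float) -> float:
    """De novo events per offspring genome per generation (assumed convention)."""
    if n_offspring <= 0:
        raise ValueError("n_offspring must be positive")
    return n_de_novo / n_offspring

def per_bp_rate(n_de_novo: float, n_offspring: float, callable_haploid_bp: float) -> float:
    """De novo events per base pair per generation.

    Assumes each offspring carries two transmitted haploid genomes, so the
    denominator is 2 * offspring * callable haploid bases (an assumption,
    not a formula stated in the abstract).
    """
    if n_offspring <= 0 or callable_haploid_bp <= 0:
        raise ValueError("n_offspring and callable_haploid_bp must be positive")
    return n_de_novo / (2 * n_offspring * callable_haploid_bp)

# Figures reported in the abstract.
DE_NOVO_SNVS = 11_020
DE_NOVO_SVS = 41
REPORTED_SNV_RATE = 1.15e-8   # SNVs per bp per generation
REPORTED_SV_RATE = 0.16       # SVs per generation

# Back-calculated, hypothetical quantities implied by the assumed formulas.
implied_offspring = DE_NOVO_SVS / REPORTED_SV_RATE
implied_total_haploid_bp = DE_NOVO_SNVS / REPORTED_SNV_RATE
implied_callable_haploid_bp = implied_total_haploid_bp / (2 * implied_offspring)

print(f"implied offspring count: {implied_offspring:.1f}")
print(f"implied callable haploid bases per offspring: {implied_callable_haploid_bp:.3e}")
```

Because the reported rates are rounded, the back-calculated values are approximate and should be read as a consistency check on the abstract's numbers, not as estimates of the study's actual sample size or callable genome.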