Experimental design for genome-based evaluations accounting for the theoretical covariance between SNPs

Contact: Dr. Dörte Wittenburg

Duration: 2017-2020

Funding: Deutsche Forschungsgemeinschaft, DFG WI 4450/1-1

In animal breeding, molecular data such as single nucleotide polymorphisms (SNPs) are used as predictor variables in statistical models to improve the genomic evaluation of animals. This yields more precise breeding value estimates for animals that have not yet been phenotyped, which is important for breeding purposes, and helps to elucidate the genetic architecture of traits. Both the position of a variant on the genome and its effect size are relevant parameters. With the high-dimensional SNP data available today, a causative variant can be pinpointed to a specific base pair on the genome. Due to the curse of dimensionality, however, the precision of effect estimates can only be approximated with time-consuming computational methods; no analytical formula is available. Because the precision of estimates influences the outcome of significance tests, reliably identifying causative variants is difficult. The project aims to determine theoretically the standard error of SNP-effect estimates and the power of testing their significance, particularly when the number of SNPs exceeds the number of individuals. The theory developed in this project will be validated with simulated and empirical data that are publicly available. Furthermore, when planning new experiments, it is essential to aim for sufficient power to test the SNP effects. Thus, in a second stage, the project will investigate how the minimum required sample size can be determined prior to any genomic evaluation in a target population. The target population may consist of a mixture of half-sib and/or full-sib families. As genotypic data of target individuals are typically not available a priori, we will employ the theoretical distribution of genotypes, which can be inferred from parental genome information.
Criteria for an optimal breeding design will be established by taking into account trait-specific parameters, which have to be specified by the experimenter, and population-specific parameters, from which the theoretical distribution of SNPs is derived. The ultimate goal is to provide experimental designs useful for practical applications in which the genetic architecture of selected traits can be elucidated.
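
To make the link between standard error, power and sample size concrete, the following is a minimal sketch and not the method developed in the project. It assumes a deliberately simplified setting: a single SNP in a linear regression, genotypes coded 0/1/2 in Hardy-Weinberg equilibrium, a known residual standard deviation, and a two-sided test based on the normal approximation. It ignores the covariance between SNPs and the family structure, which are the central issues of the project. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def snp_effect_se(n, allele_freq, residual_sd):
    """Approximate standard error of a single SNP-effect estimate.

    Assumption: genotypes are coded 0/1/2 and follow Hardy-Weinberg
    proportions, so the genotype variance is 2 * p * (1 - p).
    """
    genotype_var = 2.0 * allele_freq * (1.0 - allele_freq)
    return residual_sd / math.sqrt(n * genotype_var)


def power_two_sided(n, effect, allele_freq, residual_sd, alpha=0.05):
    """Approximate power of a two-sided z-test of H0: effect = 0."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # Expected test statistic under the alternative (effect / standard error).
    shift = abs(effect) / snp_effect_se(n, allele_freq, residual_sd)
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def min_sample_size(effect, allele_freq, residual_sd,
                    alpha=0.05, target_power=0.8):
    """Smallest n >= 2 whose approximate power reaches target_power."""
    if effect == 0:
        raise ValueError("effect must be non-zero")
    if not 0.0 < allele_freq < 1.0:
        raise ValueError("allele_freq must lie strictly between 0 and 1")
    if not 0.0 < target_power < 1.0:
        raise ValueError("target_power must lie strictly between 0 and 1")

    def enough(n):
        return power_two_sided(n, effect, allele_freq,
                               residual_sd, alpha) >= target_power

    # Power increases with n, so double n until the target is reached ...
    hi = 2
    while not enough(hi):
        hi *= 2
    lo = hi // 2
    # ... then bisect between the last failing and the first passing value.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if enough(mid):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    # Illustrative values only: effect of 0.2 residual SDs per allele copy,
    # allele frequency 0.3.
    n_required = min_sample_size(effect=0.2, allele_freq=0.3, residual_sd=1.0)
    print(n_required, power_two_sided(n_required, 0.2, 0.3, 1.0))
```

In this simplified setting, the required sample size grows as the effect size shrinks or as the allele frequency approaches 0 or 1, because both reduce the ratio of effect to standard error. In a multi-SNP model with correlated SNPs, the standard error also depends on the covariance between SNPs, which this sketch does not capture.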