Kadarmideen, Haja N.
(Statistical Animal Genetics Group, Institute of Animal Science, Swiss Federal Institute of Technology ETH Zentrum)
Ilahi, H.
(Statistical Animal Genetics Group, Institute of Animal Science, Swiss Federal Institute of Technology ETH Zentrum)
The main objectives of this study were to investigate the accuracy, bias and power of linear and threshold model segregation analysis methods for the detection of major genes in categorical traits in farm animals. Maximum Likelihood Linear Model (MLLM), Bayesian Linear Model (BALM) and Bayesian Threshold Model (BATM) methods were applied to simulated data on normal, categorical and binary scales as well as to disease data in pigs. Simulated data on the underlying normally distributed liability (NDL) were used to create categorical and binary data. The MLLM method was applied to data on all scales (normal, categorical and binary), and the BATM method was developed and applied only to binary data. The MLLM analyses underestimated parameters for binary as well as categorical traits compared to normal traits, with the bias being very severe for binary traits. The accuracy of major gene and polygene parameter estimates was also very low for binary data compared with that for categorical data; the latter gave results similar to normal data. When disease incidence (on the binary scale) is close to 50%, segregation analysis is more accurate and less biased than for diseases with rare incidence. NDL data were always better than categorical data. Under the MLLM method, the test statistics for categorical and binary data were consistently and unusually high (while the opposite is expected because of the loss of information in categorical data), indicating high false discovery rates of major genes if linear models are applied to categorical traits. With Bayesian segregation analysis, the 95% highest probability density regions of major gene variances were checked for whether they included the value of zero (a boundary parameter); because of this difference between the likelihood and Bayesian approaches, the Bayesian methods are likely to be more reliable for categorical data. The BATM segregation analysis of binary data also showed a significant advantage over MLLM in terms of higher accuracy.
Based on the results, threshold models are recommended when the trait distributions are discontinuous. Further, segregation analysis could be used in an initial scan of the data for evidence of major genes before embarking on molecular genome mapping.
Proposed method
Ten replicate Gibbs chains of 50,000 cycles each were run with a spacing of 50 cycles, giving 1,000 Gibbs samples per chain and 10,000 samples in total for each trait. A burn-in period of 1,000 cycles was used to allow the Gibbs chains to reach equilibrium.
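The sample counts quoted above follow directly from the chain length and the spacing. The short sketch below reproduces that arithmetic; it assumes the 1,000 burn-in cycles are run before the 50,000 sampled cycles (the text does not say), and the function name `kept_cycles` is hypothetical.

```python
# Chain settings quoted in the text.
N_CHAINS = 10
CYCLES_PER_CHAIN = 50_000
SPACING = 50      # keep every 50th cycle
BURN_IN = 1_000   # assumption: discarded cycles precede the 50,000 sampled cycles


def kept_cycles(cycles, spacing, burn_in):
    """Return the cycle numbers that are stored, counted from the start of the chain."""
    return [burn_in + c for c in range(spacing, cycles + 1, spacing)]


samples_per_chain = len(kept_cycles(CYCLES_PER_CHAIN, SPACING, BURN_IN))
total_samples = N_CHAINS * samples_per_chain
print(samples_per_chain, total_samples)
```

If the burn-in were instead counted inside the 50,000 cycles, fewer than 1,000 samples per chain would remain, so the reported totals suggest the burn-in was additional.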
The main objectives of this study were to compare linear model (LM) and threshold model (TM) segregation analyses based on maximum likelihood and Bayesian methods, respectively.
Target data
However, this is not the case in the present study. The empirical means of the test statistic were 511.40, 729.55 and 1,160.83 for the categorical data set and the binary data sets with 40% and 15% incidence, respectively. The assumption of normality for discrete traits considerably increases the test statistic values and may therefore lead to false inference of a segregating major gene.
Data processing
These estimates of model parameters are based on 10,000 Gibbs samples from ten replicated chains. Tests for convergence of the Gibbs sampler were performed by comparing the output of multiple chains using ANOVA on the total samples. These tests showed that the Gibbs samples of the parameters (major gene effect, genotype frequencies and all variances) did not reach a good stationary phase.
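One way to read the ANOVA-based check described above is as a one-way analysis of variance with the replicated chains as groups: if the chains have reached the same stationary distribution, the between-chain mean square should be of the same order as the within-chain mean square. The sketch below is an illustration under that assumption, not the authors' procedure; the function `anova_f` and the simulated chains are hypothetical, and thinned samples are treated as independent.

```python
import random
from statistics import fmean


def anova_f(chains):
    """One-way ANOVA F statistic with each chain treated as a group."""
    k = len(chains)
    n_total = sum(len(c) for c in chains)
    chain_means = [fmean(c) for c in chains]
    grand_mean = fmean([x for c in chains for x in c])
    ss_between = sum(len(c) * (m - grand_mean) ** 2
                     for c, m in zip(chains, chain_means))
    ss_within = sum((x - m) ** 2
                    for c, m in zip(chains, chain_means) for x in c)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


rng = random.Random(1)  # arbitrary seed
# Ten hypothetical chains sampling the same distribution ...
agreeing = [[rng.gauss(0.5, 0.1) for _ in range(1000)] for _ in range(10)]
# ... and ten chains whose means differ, as when stationarity is not reached.
drifting = [[rng.gauss(0.5 + 0.05 * i, 0.1) for _ in range(1000)]
            for i in range(10)]
print(anova_f(agreeing), anova_f(drifting))
```

A large F value for a parameter indicates that the chains disagree about its mean, which is the kind of evidence against a common stationary phase described above.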
Theory/Model
1 Obtained by transforming true values on the NDL scale to the observed scale using the formula of Robertson and Lerner (1949). True values for 15% incidence could not be derived.
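The formula itself is not reproduced in this excerpt. As an illustration only, the sketch below assumes that the classical liability-to-observed-scale relation for heritability is meant, h2_obs = h2_liab * z^2 / (p * (1 - p)), where p is the incidence and z is the height of the standard normal density at the threshold. The function name and the choice of 0.41 as input are assumptions made for the example.

```python
from statistics import NormalDist


def observed_scale_h2(h2_liability, incidence):
    """Assumed classical transformation of heritability from the liability
    scale to the observed 0/1 scale."""
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - incidence)  # truncation point
    z = std_normal.pdf(threshold)                    # density at the threshold
    return h2_liability * z ** 2 / (incidence * (1.0 - incidence))


# Illustrative input: polygenic heritability of 0.41 at 40% incidence.
h2_obs_40 = observed_scale_h2(0.41, 0.40)
print(round(h2_obs_40, 3))
```

Under this assumed relation the observed-scale value is always smaller than the liability-scale value, and it shrinks further as the incidence moves away from 50%.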
Table 4. Estimated major gene and polygenic parameters for osteochondral disease in pigs by mixed inheritance models, using Bayesian Linear Models (BALM) and Bayesian Threshold Models (BATM). Results are based on 10,000 Gibbs samples from three replicated chains.
, 1991). In a Bayesian inference framework, the Gibbs sampler algorithm was adapted by Guo and Thompson (1994) to solve computing problems in complex pedigrees in animal genetics. Gibbs sampling algorithms have now found widespread use in the genetic analysis of quantitative traits recorded in pedigreed animal populations, owing to their flexibility in solving complex and demanding statistical models, especially for categorical traits (e.
i. Investigate the impact of the distribution of the trait (normal versus categorical or binary data) on the accuracy and power of detecting major genes in the population, using the maximum likelihood linear model (MLLM) method.
ii. Investigate the impact of different incidences of a binary trait on the accuracy and power of detection of major genes by segregation analysis under both the MLLM and Bayesian Threshold Model (BATM) methods.
The analyses were carried out on the same simulated binary data sets (with 15 and 40% incidences) using a Bayesian threshold model (BATM) with Gibbs sampling. The MAGGIC software package (Janss, 1998) was used to estimate the genetic parameters of the population.
The parameters maximising the likelihoods were estimated using the Gauss-Hermite quadrature (D01BAF) and optimization (E04JBF) subroutines of the NAG FORTRAN Library (1990), with a quasi-Newton algorithm in which the derivatives were estimated by finite differences.
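Gauss-Hermite quadrature approximates an integral against a normal density by a weighted sum over a few fixed nodes, which is what makes likelihoods that integrate over a normally distributed effect tractable. The sketch below uses the standard three-point rule and plain Python rather than the NAG routines; the function name `normal_expectation` and the test integrand are illustrative assumptions, not part of the original analysis.

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-x^2).
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (math.sqrt(math.pi) / 6.0,
           2.0 * math.sqrt(math.pi) / 3.0,
           math.sqrt(math.pi) / 6.0)


def normal_expectation(g, mean, sd):
    """Approximate E[g(U)] for U ~ N(mean, sd^2).

    The substitution u = mean + sqrt(2) * sd * x maps the normal density
    onto the exp(-x^2) weight of the Gauss-Hermite rule.
    """
    total = sum(w * g(mean + math.sqrt(2.0) * sd * x)
                for x, w in zip(NODES, WEIGHTS))
    return total / math.sqrt(math.pi)


# E[U^2] = mean^2 + sd^2 = 5 for U ~ N(1, 4); the rule is exact here.
print(normal_expectation(lambda u: u * u, 1.0, 2.0))
```

A three-point rule is exact for polynomials up to degree five; a likelihood integrand would generally need more nodes, which a library routine supplies.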
The MAGGIC software package (Janss, 1998) was used to estimate the genetic parameters of the population. This method constructs Monte Carlo chains of realizations of the model parameters through Gibbs sampling. These samples constitute the marginal posterior distributions of the model parameters, from which Bayesian inferences on these parameters can be drawn.
Performance/Effect
Results from Bayesian methods based on linear models (BALM) applied to CMF observed on the original scale (scores 1-5) and from those based on threshold models (BATM) applied to the transformed binary scale (0/1) are given in Table 4. Both methods indicated the presence of a major gene with a significant additive effect (0.587 for BALM and 4.358 for BATM) and a very high additive genetic variance at the major gene (0.0477 for BALM and 17.993 for BATM) compared to the polygenic variance for the disease. The heritability at the major gene was therefore much higher than the heritability at the polygenes.
, 2000b). In this study the genetic variability of the simulated trait was almost entirely explained by the major gene effect; this may have reduced the magnitude of the parameter estimates for the polygenic background as the incidence increases.
It should be noted that, under both the H0 and H1 hypotheses, the estimated variance components and genetic parameters obtained for the discrete trait were lower than those for the continuous trait (Table 1), as would be expected. These results are similar to those reported in a previous study (Le Roy and Elsen, 1991).
The liability data were simulated using heritability, h2, of 0.41 and repeatability, r, of 0.52 on the liability scale (the total heritability and repeatability taking into account the major gene effect were 0.78 and 0.82 respectively). The genotype of the offspring was determined according to the Mendelian transmission probabilities.
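The two steps described above, Mendelian transmission of the major gene and conversion of liability into binary records, can be sketched as follows. This is a simplified illustration, not the simulation design of the study: the additive effect `a`, the heterozygous parents, the absence of polygenic and permanent environmental effects, and the rule of placing the threshold at an empirical quantile are all assumptions, and the function names are hypothetical.

```python
import random


def offspring_genotype(sire, dam, rng):
    """Mendelian transmission: each parent passes on one of its two alleles
    with probability 1/2."""
    return (rng.choice(sire), rng.choice(dam))


def genotype_value(genotype, a=1.0):
    """Hypothetical additive major gene effect: AA = +a, Aa = 0, aa = -a."""
    return a * (genotype.count("A") - 1)


def dichotomise(liabilities, incidence):
    """Score the top `incidence` fraction of liabilities as affected (1)."""
    n_unaffected = round(len(liabilities) * (1.0 - incidence))
    threshold = sorted(liabilities)[n_unaffected - 1]
    return [1 if x > threshold else 0 for x in liabilities]


rng = random.Random(2005)            # arbitrary seed for reproducibility
sire, dam = ("A", "a"), ("A", "a")   # hypothetical heterozygous parents
genotypes = [offspring_genotype(sire, dam, rng) for _ in range(2000)]
# Liability = major gene effect + standard normal residual (simplified).
liabilities = [genotype_value(g) + rng.gauss(0.0, 1.0) for g in genotypes]

for incidence in (0.40, 0.15):
    binary = dichotomise(liabilities, incidence)
    print(incidence, sum(binary) / len(binary))
```

Moving the threshold changes only how the same liabilities are scored, which is why the 15% and 40% binary data sets can be derived from one set of simulated liabilities.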
In general, the probability of the HPD region (lowest interval with zero) at the 95% level was small for both methods. The posterior SDs of all parameters were high for both methods, confirming that genetic parameters for categorical (ordinal or binary) traits are difficult to estimate with good precision.
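A highest probability density (HPD) region can be approximated from Gibbs samples as the shortest interval that contains the required share of the sorted samples. The sketch below shows that generic construction for a unimodal posterior; it is not taken from the software used in the study, and the function name `hpd_interval` is hypothetical.

```python
import math


def hpd_interval(samples, prob=0.95):
    """Shortest interval containing at least `prob` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    m = max(1, math.ceil(prob * n))  # number of samples inside the interval
    # Slide a window of m consecutive sorted samples and keep the narrowest.
    best = min(range(n - m + 1), key=lambda i: xs[i + m - 1] - xs[i])
    return xs[best], xs[best + m - 1]


# Toy example: nine clustered values and one outlier, 90% interval.
lower, upper = hpd_interval([0, 1, 2, 3, 4, 5, 6, 7, 8, 100], prob=0.9)
print(lower, upper)
```

For a variance component, which cannot be negative, checking whether zero lies in the region amounts to checking whether the lower limit sits at the boundary.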
Tests for convergence of the Gibbs sampler were performed by comparison of multiple chain output using ANOVA on the total samples. These tests showed that Gibbs samples of parameters (for major gene effect, genotype frequencies and all variances) were not able to achieve a good stationary phase. The density estimates were higher than the true values.
References (42)
Bodin, L., M. San Cristobal-Gaudy, F. Lecerf, P. Mulsant, B. Bibe, D. Lajous, J. P. Belloc, F. Eychenne, Y. Amigues and J. M. Elsen. 2002. Segregation of major gene influencing ovulation in progeny of Lacaune meat sheep. Genet. Sel. Evol. 34:447-464.
Box, G. E. P. and G. C. Tiao. 1973. Bayesian Inference in Statistical Analysis. Reading-Mass. Addison-Wesley.
Falconer, D. S. and T. F. C. Mackay. 1996. Introduction to quantitative genetics. 4th edn, Longman, Harlow, London.
Elsen, J. M. and P. Le Roy. 1990. Detection of major genes and determination of genotypes application to discrete variables. In Proc, 4th World Congress. Genet. Appl. Livest. Prod. Edinburgh, 23-27 July, 15:37-49.
Gelfand, A. E. and A. F. M. Smith. 1990. Sampling-based approaches to calculating marginal densities. J. Am. Stat. Assoc. 85:398-409.
Geman, S. and D. Geman. 1984. Stochastic relaxation, Gibbs distributions and Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6:721-741.
Gianola, D. 1982. Theory and analysis of threshold characters. J. Anim. Sci. 56:1079-1096.
Gianola, D. and J. L. Foulley. 1983. Sire evaluation for ordered categorical trait with a threshold model. Genet. Sel. Evol. 17:359-368.
Guo, S. W. and E. A. Thompson. 1994. Monte Carlo estimation of mixed models for large complex pedigrees. Biometrics 50:417-432.
Hagger, C., L. L. G. Janss, H. N. Kadarmideen and G. Stranzinger. 2004. Bayesian Inference on Major Loci in Related Multi Generation Selection Lines of Laying Hens. Poult. Sci. 83:1932-1939.
Hanset, R. and C. Michaux. 1985. On the genetic determinism of muscular hypertrophy in the Belgian White and Blue cattle breed: I. Experimental data. Genet. Sel. Evol. 15:201-224.
Hill, W. G. and S. A. Knott. 1990. Identification of genes with large effects. In: Advances in Statistical Methods for Genetic Improvement of Livestock. Gianola, D., Hammond, K. Springer Verlag, New York. p. 477.
Ilahi, H. 1999. Variabilité génétique du débit de traite chez les caprins laitiers. Ph.D Thesis, INRA-ENSA de Rennes.
Ilahi, H., E. Manfredi, P. Chastin, F. Monod, J. M. Elsen and P. Le Roy. 2000. Genetic variability in milking speed of dairy goats. Genet. Res. 75:315-319.
Ilahi, H. and H. N. Kadarmideen. 2004. Bayesian Segregation Analysis of Milk Flow in Swiss Dairy Cattle using Gibbs Sampling. Genet. Sel. Evol. 36:563-576.
Janss, L. L. G., R. Thompson and J. A. M. Van Arendonk. 1995. Application of Gibbs sampling for inference in a mixed major gene-polygenic inheritance model in animal populations. Theor. Appl. Genet. 91:1137-1147.
Janss, L. L. G., J. A. M. Van Arendonk and E. W. Brascamp. 1997. Bayesian statistical analyses for presence of single major genes affecting meat quality traits in crossed pig population. Genetics 145:395-408.
Janss, L. L. G. 2004. MaGGic 4.1.: A package of subroutines for genetic analyses with Gibbs sampling.
Kadarmideen, H. N., R. Thompson and G. Simm. 2000a. Linear and Threshold Model Genetic Parameters for Disease, Fertility and Milk Production in dairy cattle. Anim. Sci. 71:411-419.
Kadarmideen, H. N., L. L. G. Janss and J. C. M. Dekkers. 2000b. Power of quantitative trait locus mapping for polygenic binary traits using generalized and regression interval mapping. Genet. Res. 76:305-317.
Kadarmideen, H. N., R. Rekaya, D. Gianola. 2001. Genetic parameters for clinical mastitis in Holstein-Friesians: a Bayesian analysis. Anim. Sci. 73:229-240.
Kadarmideen, H. N. and J. C. M. Dekkers. 2001. Generalized marker regression and interval QTL mapping methods for binary traits in half-sib family designs. J. Anim. Breed. Genet. 118:297-309.
Kadarmideen, H. N. and L. L. G. Janss. 2003. Liability interval mapping of quantitative trait loci for complex polygenic binary diseases under gene by environmental interactions. Proceedings of the XIX International Congress of Genetics, Melbourne, Australia. Section: Animal breeding and Cloning. 6.C.0967: p. 134.
Kadarmideen, H. N., D. Schworer, H. Ilahi, M. Malek and A. Hofer. 2004. Genetics of osteochondral disease and its relationship with meat quality and quantity, growth and feed conversion traits in pigs. J. Anim. Sci. 82:3118-3127.
Kim, J. W., S. I. Park and J. S. Yeo. 2003. Linkage Mapping and QTL on Chromosome 6 in Hanwoo (Korean cattle). Asian-Aust. J. Anim. Sci. 16:1402-1405.
Knott, S. A., C. S. Haley and R. Thompson. 1991. Methods of segregation analysis for animal breeding data: a comparison of power. Heredity 68:299-311.
Lee, D. H. 2002. Estimation of Genetic Parameters for Calving Ease by Heifers and Cows Using Multi-trait Threshold Animal Models with Bayesian Approach. Asian-Aust. J. Anim. Sci. 15:1085-1090.
Le Roy, P., J. M. Elsen and S. Knott. 1989. Comparison of four statistical methods for detection of a major gene in a progeny test design. Genet. Sel. Evol. 21:341-357.
Le Roy, P., J. Naveau, J. M. Elsen and P. Sellier. 1990. Evidence for a new major gene influencing meat quality in pigs. Genet. Res. 55:33-40.
Le Roy, P. and J. M. Elsen. 1991. First statistical approaches of major gene detection with special reference to discrete traits. In. (Ed. J. M. Elsen, L. Bodin and J. Thimonier), 2nd International Workshop on Major Genes for Reproduction in Sheep. Toulouse, France, INRA Ed. Paris. 47:431-440.
Miyake, T., T. Dogo, K. Moriya and Y. Sasaki. 1999. Bayesian analysis for existence of segregation of major genes affecting carcass traits in Japanese Black cattle population. J. Anim. Breed. Genet. 116:207-215.
Miyake, T., Y. Sasaki, G. Dolf and C. Gaillard. 2002. Application of constraints on parameters in segregation analysis for binary traits using Gibbs sampling. In: (Ed. I. Hoeschele). In Proc, 7th World Congr. Genet. Appl. Livest. Prod. Montpellier, France. 32:609-612.
Numerical Algorithms Group. 1990. The NAG Fortran Library Manual, NAG Ltd. Oxford.
Piper, L. R. and B. M. Bindon. 1982. The Booroola Merino and the performance of medium non-peppin crosses at Armidale. In: The Booroola Merino, (Ed. L. R. Piper and B. M. Bindon), CSIRO, Melbourne, 9-20.
Rebai, A. 1997. Comparison of methods for regression interval mapping in QTL analysis with non-normal traits. Genet. Res. 69:69-74.
Ricordeau, G., J. Bouillon, P. Le Roy and J. M. Elsen. 1990. Déterminisme génétique du débit de traite au cours de la traite des chèvres. INRA. Prod. Anim. 3:121-126.
Robertson, A. L. and I. M. Lerner. 1949. The heritability of all-or-none traits: viability of poultry. Genetics 34:395-411.
Thaller, G., L. Dempfle and I. Hoeschele. 1996. Maximum likelihood analysis of rare binary traits under different modes of inheritance. Genetics 143:1819-1829.
Wright, S. 1934. An analysis of variability in number of digits in an inbred strain of guinea pigs. Genetics 19:506-536.
Xu, S. and W. R. Atchley. 1996. Mapping quantitative trait loci for complex binary diseases using line crosses. Genetics 143:1417-1424.
Yan, X. M., J. Ren, H. S. Ai, N. S. Ding, J. Gao, Y. M. Guo, C. Y. Chen, J. W. Ma, Q. L. Shu and L. S. Huang. 2004. Genetic Variations Analysis and Characterization of the Fifth Intron of Porcine NRAMP1 Gene. Asian-Aust. J. Anim. Sci. 17:1183-1187.
Yi, N. and S. Xu. 2000. Bayesian mapping of quantitative trait loci for complex binary traits. Genetics 155:1391-1403.