
Efficient Strategy to Identify Gene-Gene Interactions and Its Application to Type 2 Diabetes

Genomics & Informatics, v.14 no.4, 2016, pp.160-165

Li, Donghe (Interdisciplinary Program of Bioinformatics, Seoul National University), Won, Sungho (Interdisciplinary Program of Bioinformatics, Seoul National University)

Abstract

Over the past decade, the detection of gene-gene interactions has become more and more popular in the field of genome-wide association studies (GWASs). The goal of the GWAS is to identify genetic susceptibility to complex diseases by assaying and analyzing hundreds of thousands of single-nucleotide ...

AI summary of the main text

* Some sentences may have been classified incorrectly by the automatic AI identification, so please use these results with care.

Proposed method

  • After filtering SNP pairs in the first stage with BOOST, we apply logistic regression. BOOST cannot adjust for the effects of covariates, so we applied logistic regression analysis, adjusted for sex, age, body mass index (BMI), and the top 10 principal component (PC) scores, to the SNP pairs selected by BOOST. The logistic regression analysis was performed with the glm function in R.
  • The generated PC scores are then utilized as covariates for genetic association analyses, and this approach guarantees robustness against a population substructure. Here, we calculated the first 10 PC scores, and they were included as covariates for the logistic method in R. Sex, age, and BMI were also included as covariates in the analysis.
  • A follow-up stage of logistic regression with covariates would improve the statistical power of the model. In this paper, we first review the BOOST method and apply the proposed two-stage approach to type 2 diabetes (T2D) in a Korean population. This analysis of gene-gene interactions on a genome-wide scale with BOOST was completed within 42 hours, and we also identified several pairs of SNPs associated with T2D.
  • On the basis of the equivalence between the log-linear model and its corresponding logistic regression model, BOOST constructs its test statistic from the homogeneous association model $M_H$ and the saturated model $M_S$, whose log-likelihoods are denoted $L_H$ and $L_S$, respectively. We denote the observed genotype count for disease status $k$ with $X_p = i$ and $X_q = j$ by $n_{ijk}$ and the expected genotype count by $\mu_{ijk}$, where $k = 1$ or $2$, $i = 1$, $2$, or $3$, and $j = 1$, $2$, or $3$.
  • For a reduced model, we considered three levels for each SNP, and thus, our likelihood ratio tests followed a chi-square distribution with 4 degrees of freedom. Therefore, the proposed method can detect biological interactions. It should be noted that biological interactions include statistical interactions.
  • In this study, we proposed an efficient strategy to identify interactions in genome-wide SNP data. We first utilized the screening stage of BOOST to filter out non-significant pairs and then used logistic regression with several covariates, such as age, sex, BMI, and PC scores.
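
The likelihood ratio test described above can be sketched for a single SNP pair as follows. This is a minimal illustration, not the BOOST implementation: it fits the homogeneous association model by iterative proportional fitting on a 3 x 3 x 2 genotype-by-status table and compares it with the saturated model. The function names and the 0-based index layout `n[i][j][k]` are assumptions made for this example, and the screening approximation that BOOST uses to filter pairs quickly is not reproduced here.

```python
import math
from itertools import product


def fit_homogeneous_association(n, tol=1e-9, max_iter=500):
    """Expected counts under the model with no three-way interaction.

    n[i][j][k] is the observed count for genotype i at the first SNP,
    genotype j at the second SNP, and disease status k (0-based indices).
    Returns a dict mapping (i, j, k) to the fitted count.
    """
    dims = (len(n), len(n[0]), len(n[0][0]))
    cells = list(product(*(range(d) for d in dims)))
    mu = {c: 1.0 for c in cells}
    # Each two-way margin is identified by the pair of axes it keeps.
    margins = [(0, 1), (0, 2), (1, 2)]
    for _ in range(max_iter):
        previous = dict(mu)
        for axes in margins:
            observed, fitted = {}, {}
            for c in cells:
                key = tuple(c[a] for a in axes)
                observed[key] = observed.get(key, 0.0) + n[c[0]][c[1]][c[2]]
                fitted[key] = fitted.get(key, 0.0) + mu[c]
            # Rescale the fitted counts so this margin matches the data.
            for c in cells:
                key = tuple(c[a] for a in axes)
                if fitted[key] > 0:
                    mu[c] *= observed[key] / fitted[key]
        if max(abs(mu[c] - previous[c]) for c in cells) < tol:
            break
    return mu


def interaction_lrt(n):
    """Return (statistic, p_value) for 2 * (L_S - L_H) with 4 degrees of freedom."""
    mu = fit_homogeneous_association(n)
    stat = 0.0
    for (i, j, k), expected in mu.items():
        observed = n[i][j][k]
        if observed > 0:
            stat += 2.0 * observed * math.log(observed / expected)
    stat = max(stat, 0.0)
    # Closed-form chi-square survival function, valid for 4 degrees of freedom only.
    p_value = math.exp(-stat / 2.0) * (1.0 + stat / 2.0)
    return stat, p_value
```

A table whose counts factor as a product of per-axis terms has no interaction, so the statistic is close to zero and the p-value close to one; a table whose genotype association reverses between cases and controls gives a large statistic.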

Data

  • 5%. A total of 1,169 subjects were diagnosed as cases, and the other individuals were considered controls.
  • Data for this study were provided with biospecimens from the National Biobank of Korea, the Centers for Disease Control and Prevention, Republic of Korea (4845-301, 4845-302 and 307), and this work was supported by the Research Resettlement Fund for the new faculty of Seoul National University.

Theory/Model

  • The concept of epistasis, generally defined as interactions among different genes, was first introduced in 1909 by William Bateson to describe the latent effect of one locus over another locus. A quantitative definition of interaction was proposed in 1918 by R.A. Fisher as a statistical deviation from the additive effects of two loci on a phenotype. This definition enabled interaction analyses by testing whether products of multiple genotypes are statistically associated with phenotypes.
  • BOOST cannot adjust for the effects of covariates, so we applied logistic regression analysis, adjusted for sex, age, body mass index (BMI), and the top 10 principal component (PC) scores, to the SNP pairs selected by BOOST. The logistic regression analysis was performed with the glm function in R. To calculate the p-values of the interaction term, we compared two fitted models with the anova function in R.
  • To perform adjustments for the population substructure between individuals, we used the EIGENSTRAT [18] method. EIGENSTRAT calculates the genetic similarities among subjects by using a genetic relationship matrix and applies PC analysis.
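
The covariate-adjusted second stage can be sketched as a comparison of two nested logistic models, with and without the genotype interaction terms. The summary above says the authors used glm and an ANOVA comparison in R; the code below is a pure Python stand-in written for illustration, not their code. The function names, the dummy coding of genotypes as 0, 1, or 2, the two covariates (sex and standardized age), and the simulated data are all assumptions; PC scores and BMI would be appended to the covariate tuple in the same way.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - tail) / aug[r][r]
    return x


def fit_logistic(X, y, iterations=15):
    """Newton-Raphson fit of a logistic model; returns the log-likelihood."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iterations):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, yi in zip(X, y):
            eta = sum(bv * v for bv, v in zip(beta, row))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for a in range(p):
                grad[a] += (yi - mu) * row[a]
                for c in range(p):
                    hess[a][c] += w * row[a] * row[c]
        beta = [bv + s for bv, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for row, yi in zip(X, y):
        eta = sum(bv * v for bv, v in zip(beta, row))
        loglik += yi * eta - math.log1p(math.exp(eta))
    return loglik


def design_row(g1, g2, covariates, with_interaction):
    """Intercept, genotype dummies, covariates, and optionally 4 product terms."""
    d1 = [float(g1 == 1), float(g1 == 2)]
    d2 = [float(g2 == 1), float(g2 == 2)]
    row = [1.0] + d1 + d2 + [float(v) for v in covariates]
    if with_interaction:
        row += [u * v for u in d1 for v in d2]
    return row


def interaction_p_value(g1, g2, covariates, y):
    """Likelihood ratio test of the 4 interaction terms (4 degrees of freedom)."""
    reduced = [design_row(a, b, c, False) for a, b, c in zip(g1, g2, covariates)]
    full = [design_row(a, b, c, True) for a, b, c in zip(g1, g2, covariates)]
    stat = max(2.0 * (fit_logistic(full, y) - fit_logistic(reduced, y)), 0.0)
    return stat, math.exp(-stat / 2.0) * (1.0 + stat / 2.0)


def simulate(n, seed=0):
    """Hypothetical data: an extra effect only when both genotypes equal 2."""
    rng = random.Random(seed)
    g1 = [rng.choice([0, 1, 2]) for _ in range(n)]
    g2 = [rng.choice([0, 1, 2]) for _ in range(n)]
    cov = [(rng.choice([0, 1]), rng.gauss(0.0, 1.0)) for _ in range(n)]
    y = []
    for a, b, (sex, age) in zip(g1, g2, cov):
        eta = -0.5 + 0.3 * sex + 0.2 * age + (1.5 if a == 2 and b == 2 else 0.0)
        y.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0)
    return g1, g2, cov, y
```

Calling `interaction_p_value(*simulate(600))` returns the test statistic and its p-value for one simulated SNP pair. Coding each genotype as a three-level factor gives four interaction terms, which matches the 4 degrees of freedom mentioned in the summary.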

References (20)

  1. Ueki M, Cordell HJ. Improved statistics for genome-wide interaction analysis. PLoS Genet 2012;8:e1002625. PMID: 22496670

  2. Won S, Kwon MS, Mattheisen M, Park S, Park C, Kihara D. Efficient strategy for detecting gene x gene joint action and its application in schizophrenia. Genet Epidemiol 2014;38:60-71. PMID: 24272960

  3. Wang X, Elston RC, Zhu X. Statistical interaction in human genetics: how should we model it if we are looking for biological interaction? Nat Rev Genet 2011;12:74. PMID: 21102529

  4. Hu JK, Wang X, Wang P. Testing gene-gene interactions in genome wide association studies. Genet Epidemiol 2014;38:123-134. PMID: 24431225

  5. Ritchie MD, Hahn LW, Roodi N, Bailey LR, Dupont WD, Parl FF. Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer. Am J Hum Genet 2001;69:138-147. PMID: 11404819

  6. Zhang Y, Liu JS. Bayesian inference of epistatic interactions in case-control studies. Nat Genet 2007;39:1167-1173. PMID: 17721534

  7. Schwarz DF, König IR, Ziegler A. On safari to Random Jungle: a fast implementation of Random Forests for high-dimensional data. Bioinformatics 2010;26:1752-1758. PMID: 20505004

  8. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559-575. PMID: 17701901

  9. Wan X, Yang C, Yang Q, Xue H, Fan X, Tang NL. BOOST: a fast approach to detecting gene-gene interactions in genome-wide case-control studies. Am J Hum Genet 2010;87:325-340. PMID: 20817139

  10. Yung LS, Yang C, Wan X, Yu W. GBOOST: a GPU-based tool for detecting gene-gene interactions in genome-wide case control studies. Bioinformatics 2011;27:1309-1310. PMID: 21372087

  11. Ko SH, Kim SR, Kim DJ, Oh SJ, Lee HJ, Shim KH. 2011 clinical practice guidelines for type 2 diabetes in Korea. Diabetes Metab J 2011;35:431-436. PMID: 22111032

  12. Park SE, Lee WY, Oh KW, Baek KH, Yoon KH, Kang MI. Impact of common type 2 diabetes risk gene variants on future type 2 diabetes in the non-diabetic population in Korea. J Hum Genet 2012;57:265-268. PMID: 22377714

  13. Park KS. The search for genetic risk factors of type 2 diabetes mellitus. Diabetes Metab J 2011;35:12-22. PMID: 21537408

  14. Cho YS, Go MJ, Kim YJ, Heo JY, Oh JH, Ban HJ. A large-scale genome-wide association study of Asian populations uncovers genetic factors influencing eight quantitative traits. Nat Genet 2009;41:527-534. PMID: 19396169

  15. Agresti A. Categorical Data Analysis. 2nd ed. New York: Wiley-Interscience; 2002.

  16. Matsuda H. Physical nature of higher-order mutual information: intrinsic correlations and frustration. Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics 2000;62(3 Pt A):3096-3102. PMID: 11088803

  17. Howie B, Marchini J, Stephens M. Genotype imputation with thousands of genomes. G3 (Bethesda) 2011;1:457-470. PMID: 22384356

  18. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006;38:904-909. PMID: 16862161

  19. Ross KA. Evidence for somatic gene conversion and deletion in bipolar disorder, Crohn's disease, coronary artery disease, hypertension, rheumatoid arthritis, type-1 diabetes, and type-2 diabetes. BMC Med 2011;9:12. PMID: 21291537

  20. Giri A, Sanders M, Velez Edwards D, Ikizler T, Roden D, Birdwell K. A genome wide association study of new onset diabetes after transplant in kidney transplantation. Am J Transplant 2016;16 Suppl 3:B235.

Open access (OA) type

GOLD

Article published in an open-access journal
