[Article] Next Generation Sequencing and Bioinformatics

김기봉

doi:10.5352/jls.2015.25.3.357

Next Generation Sequencing and Bioinformatics

Journal of Life Science (생명과학회지), v.25 no.3 (no.179), 2015, pp.357-367

Abstract (translated from Korean)

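Thanks to rapidly advancing next-generation sequencing (NGS) platforms and the latest bioinformatics analysis tools, the ultimate goal of sequencing a human genome for less than $1,000 appears likely to be realized soon. Rapid technical progress in NGS has steadily increased the demand for statistical methods and bioinformatics tools for analyzing and managing NGS data. Since the early days when NGS platforms were first commercialized, many applications and tools have been developed and used to analyze, interpret, and visualize NGS data. However, the flood of NGS data has brought to the fore many problems that remain to be solved in data storage, analysis, and management. NGS data analysis essentially comprises alignment of reads to a reference sequence, base calling, polymorphism discovery, assembly from paired or unpaired reads, structural variant discovery, and genome browsing. This paper gives an overview of the major NGS technologies and the bioinformatics tools for NGS data analysis.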

Abstract

With next-generation sequencing (NGS) platforms and bioinformatics tools developing at an unprecedented pace, the ultimate goal of sequencing a human genome for less than $1,000 may become feasible in the near future. Rapid technological advances in NGS have increased the demand for statistical methods and bioinformatics tools for analyzing and managing NGS data. Even in the early stages of the commercial availability of NGS platforms, a large number of applications and tools already existed for analyzing, interpreting, and visualizing NGS data. However, the sheer volume of NGS data presents a significant challenge for storage, analysis, and data management. Intrinsically, the analysis of NGS data includes alignment of sequence reads to a reference, base calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection, and genome browsing. While NGS technologies have allowed a massive increase in available raw sequence data, a number of new informatics challenges must be addressed to improve the current state of the art and fulfill the promise of genome research. This review provides an overview of major NGS technologies and bioinformatics tools for NGS data analysis.


Proposed Method

Indeed, new technologies are producing data at a rate that outpaces our ability to analyze their biological meaning. Researchers are addressing this challenge by adopting mathematical and statistical software, computer modeling, and other computational and engineering methods. As a result, bioinformatics has become the latest engineering discipline.
However, only a handful of tools have been implemented [23, 28, 29, 35, 40] for SNP and small (1-5 bp) indel discovery. The goal of these programs is to judge the likelihood that a locus is a heterozygous or homozygous variant given the error rates of the platform, the probability of bad mappings, and the amount of coverage. For these reasons, all the available tools for SNP and indel discovery follow two main steps: the first prepares the data, and the second calls each nucleotide under a Bayesian framework.
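
The Bayesian calling step described above can be illustrated with a minimal sketch. It is not the algorithm of any of the cited tools: it assumes a single biallelic site, collapses platform error and bad mappings into one per-read error rate, and uses a hypothetical prior (`het_prior`) chosen only for illustration. The function names are likewise made up for this example.

```python
import math

def genotype_posteriors(ref_count, alt_count, error_rate=0.01, het_prior=0.001):
    """Posterior probability of genotypes RR, RA, AA at one biallelic site.

    Assumptions (illustrative only): reads are independent, every read has
    the same error rate, and the prior on a heterozygous site is het_prior.
    """
    if not 0.0 < error_rate < 0.5:
        raise ValueError("error_rate must be in (0, 0.5)")
    if not 0.0 < het_prior < 0.5:
        raise ValueError("het_prior must be in (0, 0.5)")

    # Probability that one read shows the alternate allele under each genotype.
    p_alt = {"RR": error_rate, "RA": 0.5, "AA": 1.0 - error_rate}
    # Hypothetical priors: homozygous variant assumed half as likely as het.
    priors = {"RR": 1.0 - 1.5 * het_prior, "RA": het_prior, "AA": het_prior / 2.0}

    # Log posterior up to a constant; the binomial coefficient is identical
    # for all genotypes, so it cancels on normalization.
    log_post = {}
    for genotype, p in p_alt.items():
        log_lik = alt_count * math.log(p) + ref_count * math.log(1.0 - p)
        log_post[genotype] = math.log(priors[genotype]) + log_lik

    # Normalize in a numerically stable way.
    top = max(log_post.values())
    unnorm = {g: math.exp(v - top) for g, v in log_post.items()}
    total = sum(unnorm.values())
    return {g: u / total for g, u in unnorm.items()}

def call_genotype(ref_count, alt_count, **kwargs):
    """Return the most probable genotype and its posterior probability."""
    post = genotype_posteriors(ref_count, alt_count, **kwargs)
    best = max(post, key=post.get)
    return best, post[best]
```

With these assumptions, a site covered by 10 reference and 10 alternate reads is called heterozygous, while a single alternate read among 20 is attributed to sequencing error. Deeper coverage sharpens the posterior, which is why coverage appears alongside error rate and mapping quality in the description above.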

References (51)

Adessi, C., Matton, G., Ayala, G., Turcatti, G., Mermod, J. J., Mayer, P. and Kawashima, E. 2000. Solid phase DNA amplification: characterisation of primer attachment and amplification mechanisms. Nucleic Acids Res. 28, e87.
Alkan, C., Coe, B. P. and Eichler, E. E. 2011. Genome structural variation discovery and genotyping. Nat. Rev. Genet. 12, 363-376.
Bao, H., Guo, H., Wang, J., Zhou, R., Lu, X. and Shi, S. 2009. MapView: visualization of short reads alignment on a desktop computer. Bioinformatics 12, 1554-1555.
Campbell, P. J., Stephens, P. J., Pleasance, E. D., O'Meara, S., Li, H., Santarius, T., Stebbings, L. A., Leroy, C. and Edkins, S. et al. 2008. Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nat. Genet. 40, 722-729.
Chiang, D. Y., Getz, G., Jaffe, D. B., O'Kelly, M. J. T., Zhao, X., Carter, S. L., Russ, C., Nusbaum, C., Meyerson, M. and Lander, E. S. 2009. High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat. Methods 6, 99-103.
Dalca, A. V. and Brudno, M. 2010. Genome variation discovery with high-throughput sequencing data. Brief. Bioinform. 11, 3-14.
Dalloul, R. A., Long, J. A., Zimin, A. V., Aslam, L. and Beal, K. et al. 2010. Multi-platform next generation sequencing of the domestic turkey (Meleagris gallopavo): Genome assembly and analysis. PLoS Biol. 8, e1000475. doi:10.1371/journal.pbio.1000475.
Dinsdale, E. A., Edwards, R. A., Hall, D., Angly, F., Breitbart, M., Brulc, J. M., Furlan, M., Desnues, C., Haynes, M. and Li, L. et al. 2008. Functional metagenomic profiling of nine biomes. Nature 452, 629-632.
Durbin, R. M., Abecasis, G. R., Altshuler, D. L., Auton, A. and Brooks, L. D. et al. 2010. A map of human genome variation from population-scale sequencing. Nature 467, 1061-1073.
Fedurco, M., Romieu, A., Williams, S., Lawrence, I. and Turcatti, G. 2006. BTA, a novel reagent for DNA attachment on glass and efficient generation of solid-phase amplified DNA colonies. Nucleic Acids Res. 34, e22.
Feuk, L., Carson, A. R. and Scherer, S. W. 2006. Structural variation in the human genome. Nat. Rev. Genet. 7, 85-97.
Flicek, P. and Birney, E. 2009. Sense from sequence reads: methods for alignment and assembly. Nat. Methods 6, S6-S12.
Giardine, B., Riemer, C., Hardison, R. C., Burhans, R., Elnitski, L., Shah, P., Zhang, Y., Blankenberg, D., Albert, I. and Taylor, J. et al. 2005. Galaxy: a platform for interactive large-scale genome analysis. Genome Res. 15, 1451-1455.
Gogol-Doring, A. and Chen, W. 2012. An overview of the analysis of next generation sequencing data. Methods Mol. Biol. 802, 249-257.
Grada, A. and Weinbrecht, K. 2013. Next-generation sequencing: methodology and application. J. Investig. Dermatol. 133, e11. doi:10.1038/jid.2013.248.
Hoberman, R., Dias, J., Ge, B., Harmsen, E., Mayhew, M., Verlaan, D. J., Kwan, T., Dewar, K., Blanchette, M. and Pastinen, T. 2009. A probabilistic approach for SNP discovery in high-throughput human resequencing data. Genome Res. 19, 1542-1552.
Huang, W. and Marth, G. 2008. EagleView: a genome assembly viewer for next-generation sequencing technologies. Genome Res. 9, 1538-1543.
Hyman, E. D. 1988. A new method of sequencing DNA. Anal. Biochem. 174, 423-436.
Jimenez-Lopez, J. C., Gachomo, E. W., Sharma, S. and Kotchoni, S. O. 2013. Genome sequencing and next-generation sequence data analysis: a comprehensive compilation of bioinformatics tools and databases. Am. J. Mol. Biol. 3, 115-130.
Kent, W. J. 2002. BLAT-the BLAST-like alignment tool. Genome Res. 4, 656-664.
Kosakovsky Pond, S., Wadhawan, S., Chiaromonte, F., Ananda, G., Chung, W. Y., Taylor, J. and Nekrutenko, A. 2009. Windshield splatter analysis with the Galaxy metagenomic pipeline. Genome Res. 19, 2144-2153.
Krawitz, P., Rödelsperger, C., Jäger, M., Jostins, L., Bauer, S. and Robinson, P. N. 2010. Microindel detection in short-read sequence data. Bioinformatics 26, 722-729. doi:10.1093/bioinformatics/btq027.
Langmead, B., Trapnell, C., Pop, M. and Salzberg, S. L. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 3, R25.
Lassmann, T., Hayashizaki, Y. and Daub, C. O. 2011. SAMStat: Monitoring biases in next generation sequencing data. Bioinformatics 27, 130-131. doi:10.1093/bioinformatics/btq614.
Li, H. and Durbin, R. 2010. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 5, 589-595.
Li, H. and Durbin, R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754-1760. doi:10.1093/bioinformatics/btp324.
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G. and Durbin, R. et al. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 16, 2078-2079.
Li, R., Li, Y., Kristiansen, K. and Wang, J. 2008. SOAP: short oligonucleotide alignment program. Bioinformatics 5, 713-714.
Li, H., Ruan, J. and Durbin, R. 2008. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 11, 1851-1858.
Li, R., Yu, C., Li, Y., Lam, T., Yiu, S., Kristiansen, K. and Wang, J. 2009. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 15, 1966-1967.
Lorenzi, H. A., Hoover, J., Inman, J., Safford, T., Murphy, S., Kagan, L. and Williamson, S. J. 2011. The Viral MetaGenome Annotation Pipeline (VMGAP): an automated tool for the functional annotation of viral metagenomic shotgun sequencing data. Stand. Genomic Sci. 4, 418-429.
Magi, A., Benelli, M., Gozzini, A., Girolami, F., Torricelli, F. and Brandi, M. L. 2010. Bioinformatics for next generation sequencing data. Genes 1, 294-307.
Magi, A., Benelli, M., Yoon, S. and Torricelli, F. Detecting common copy number variants in high-throughput sequencing data by using Joint SLM algorithm. Nucleic Acids Res., submitted for publication.
Malhis, N. and Jones, S. J. M. 2010. High quality SNP calling using Illumina data at shallow coverage. Bioinformatics 26, 1029-1035.
Marth, G. T., Korf, I., Yandell, M. D., Yeh, R. T., Gu, Z., Zakeri, H., Stitziel, N. O., Hillier, L., Kwok, P. Y. and Gish, W. R. 1999. A general approach to single-nucleotide polymorphism discovery. Nat. Genet. 23, 452-456.
McKenna, A., Hanna, M., Banks, E., Sivachenko, A. and Cibulskis, K. et al. 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297-1303.
Milne, I., Bayer, M., Cardle, L., Shaw, P., Stephen, G., Wright, F. and Marshall, D. 2010. Tablet-next generation sequence assembly visualization. Bioinformatics 3, 401-402.
Mitra, R. D. and Church, G. M. 1999. In situ localized amplification and contact replication of many individual DNA molecules. Nucleic Acids Res. 27, e34.
Nagalakshmi, U., Wang, Z., Waern, K., Shou, C., Raha, D., Gerstein, M. and Snyder, M. 2008. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320, 1344-1349.
Ning, Z., Cox, A. J. and Mullikin, J. C. 2001. SSAHA: a fast search method for large DNA databases. Genome Res. 11, 1725-1729.
Nothnagel, M., Herrmann, A., Wolf, A., Schreiber, S., Platzer, M., Siebert, R., Krawczak, M. and Hampe, J. 2011. Technology-specific error signatures in the 1000 Genomes Project data. Hum. Genet. 130, 505-516. doi:10.1007/s00439-011-0971-3.
Olshen, A. B., Venkatraman, E. S., Lucito, R. and Wigler, M. 2005. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5, 557-572.
Pareek, C. S., Smoczynski, R. and Tretyn, A. 2011. Sequencing technologies and genome sequencing. J. Appl. Genet. 52, 413-435.
Park, P. J. 2009. ChIP-seq: advantages and challenges of a maturing technology. Nat. Rev. Genet. 10, 669-680.
Schadt, E. E., Turner, S. and Kasarskis, A. 2010. A window into third generation sequencing. Hum. Mol. Genet. 19, R227-R240.
Scholz, M. B., Lo, C. and Chain, P. 2012. Next generation sequencing and bioinformatics bottlenecks: the current state of metagenomics data analysis. Curr. Opin. Biotechnol. 23, 9-15.
Shendure, J., Porreca, G. J., Reppas, N. B., Lin, X., McCutcheon, J. P., Rosenbaum, A. M., Wang, M. D., Zhang, K., Mitra, R. D. and Church, G. M. 2005. Accurate multiplex polony sequencing of an evolved bacterial genome. Science 309, 1728-1732.
Tawfik, D. S. and Griffiths, A. D. 1998. Man-made cell-like compartments for molecular evolution. Nat. Biotechnol. 16, 652-656.
Turcatti, G., Romieu, A., Fedurco, M. and Tairi, A. P. 2008. A new class of cleavable fluorescent nucleotides: synthesis and optimization as reversible terminators for DNA sequencing by synthesis. Nucleic Acids Res. 36, e25.
Whiteford, N., Skelly, T., Curtis, C., Ritchie, M. E., Löhr, A., Zaranek, A. W., Abnizova, I. and Brown, C. 2009. Swift: primary data analysis for the Illumina Solexa sequencing platform. Bioinformatics 25, 2194-2199.
Xie, W., Wang, F., Guo, L., Chen, Z., Sievert, S. M., Meng, J., Huang, G., Li, Y., Yan, Q. and Wu, S. et al. 2011. Comparative metagenomics of microbial communities inhabiting deep-sea hydrothermal vent chimneys with contrasting chemistries. ISME J. 5, 414-426.
