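[Thesis] ab initio Protein Structure Modeling Using the CSA Method: Protein Loop Structure Prediction Using the CSA Method and Development of an Energy Function for Accurate Protein Structure Prediction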

Seungryong Heo (허승룡)

ab initio Protein Structure Modeling using Conformational Space Annealing Method: Protein Loop Structure Prediction using Conformational Space Annealing (CSA) Method and Development of Energy Function for Accurate Protein Structure Prediction

Seungryong Heo (Graduate School of Soongsil University, Department of Bioinformatics (일원), domestic doctorate)

Abstract (translated from the Korean)

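Loops are one type of protein secondary structure and are found mainly on the outer part of the protein. In this study, we developed a protein loop structure prediction method that combines a new energy function, EPLM, with the global optimization algorithm CSA. The energy function includes stereochemistry, dynamic fragment assembly, distance-scaled finite ideal-gas reference (DFIRE), and generalized orientation-dependent and distance-dependent terms. The CSA algorithm, used here for the conformational search of loop structures, has been applied to many hard global optimization problems.
To assess the performance of EPLM, we compared it with the DFIRE function on the Jacobson and RAPPER loop decoy sets. Two tasks were compared: picking a structure close to the experimental structure from a given set of decoys, and de novo loop modeling, in which a randomly generated structure is modeled toward the experimental structure. In decoy selection, EPLM was more accurate on the Jacobson set and gave similar results on the RAPPER set. In the test of sampling loop structures closer to the experimental structure, EPLM outperformed DFIRE on both decoy sets.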

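The solvent accessible surface area (SASA) of a protein is one of its most important structural features and is often used as an analysis tool for describing protein-water interactions or for calculating the transfer free energy of proteins [58,59]. Information obtained from more accurate predictions of protein surface area is very useful for predicting protein structure and function. In 2012, our laboratory released the SANN server, which predicts the solvent accessibility of proteins with a nearest neighbor method applied to the sequence profile [60]. The overall performance of the SANN server was better than that of existing solvent accessibility prediction methods (FKNN, SABLE, PROF, ACCpro, NETASA, and others).
To use solvent accessibility in protein structure prediction, we developed the ESA energy function and implemented it in the TINKER molecular modeling package. Its performance was evaluated using seven benchmark sets and five knowledge-based statistical energy functions. To reduce the time needed to compute the protein surface area, we applied the ‘Neighbor Vector’ algorithm developed by the Jens Meiler group [61]. We also included the ESA energy function in the CASP11 energy function and, for the conformational search, used the CSA method, which has been applied to many hard global optimization problems.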

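Combining mass spectrometry with protein cross-linking experiments (XL-MS) yields structural information about proteins. Such limited experimental information is not sufficient to predict a protein's tertiary structure, but it reduces the search space during structure sampling. The effectiveness of XL-MS modeling, which combines MS with computational modeling, has been demonstrated, but several issues have still not been well established. The optimal type of cross-linker and the optimal cross-linker length for protein structure prediction are among the factors that need to be established systematically.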

Abstract

Loops are one type of secondary structure element in proteins. They are typically found on the surface of the protein, which is largely responsible for its shape, dynamics, and physicochemical properties. We have developed a protein loop structure prediction method by combining a new energy function, which we call EPLM (Energy for Protein Loop Modeling), with the conformational space annealing (CSA) global optimization algorithm. The energy function includes stereochemistry, dynamic fragment assembly, distance-scaled finite ideal-gas reference (DFIRE), and generalized orientation-dependent and distance-dependent terms. For the conformational search of loop structures, we used the CSA algorithm, which has been quite successful in dealing with various hard global optimization problems.
We assessed the performance of EPLM using two widely used loop-decoy sets, the Jacobson and RAPPER sets, and compared the results against the DFIRE potential. The accuracy of model selection from a pool of loop decoys and the accuracy of de novo loop modeling starting from randomly generated structures were examined separately. For the selection of a native-like structure from a decoy set, EPLM was more accurate than DFIRE on the Jacobson set and had similar accuracy on the RAPPER set. In terms of sampling more native-like loop structures, EPLM outperformed DFIRE on both decoy sets. The new approach equipped with EPLM and CSA can serve as a state-of-the-art de novo loop modeling method.
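
As a rough illustration of the bank-based search idea behind CSA, the following Python sketch runs a heavily simplified variant on a toy two-dimensional function. It is not the thesis implementation: the Rastrigin function stands in for a real energy such as EPLM, the local minimizer is a plain greedy random walk, the single-coordinate crossover is a simplification, and the distance cutoff schedule (from half to one fifth of the average bank distance) and all parameter values are assumptions made for this example.

```python
import math
import random

def energy(x):
    """Toy rugged landscape (Rastrigin); a stand-in for a real energy function."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

def local_minimize(x, rng, steps=100, sigma=0.05):
    """Greedy random-walk descent: accept a small move only if it lowers the energy."""
    best, e_best = list(x), energy(x)
    for _ in range(steps):
        cand = [xi + rng.gauss(0.0, sigma) for xi in best]
        e_cand = energy(cand)
        if e_cand < e_best:
            best, e_best = cand, e_cand
    return best

def update_bank(bank, trial, d_cut):
    """Compare the trial with its nearest bank member if similar, else with the worst one."""
    nearest = min(range(len(bank)), key=lambda i: math.dist(bank[i], trial))
    if math.dist(bank[nearest], trial) < d_cut:
        target = nearest
    else:
        target = max(range(len(bank)), key=lambda i: energy(bank[i]))
    if energy(trial) < energy(bank[target]):
        bank[target] = trial

def csa(dim=2, n_bank=20, n_seeds=5, n_rounds=50, seed=0):
    """Simplified CSA loop; returns (best conformation, best energy of the initial bank)."""
    rng = random.Random(seed)
    bank = [local_minimize([rng.uniform(-5, 5) for _ in range(dim)], rng)
            for _ in range(n_bank)]
    initial_best = min(energy(b) for b in bank)
    pairs = [(a, b) for i, a in enumerate(bank) for b in bank[i + 1:]]
    d_avg = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
    d_cut, d_min = d_avg / 2, d_avg / 5            # assumed annealing schedule
    shrink = (d_min / d_cut) ** (1.0 / n_rounds)
    for _ in range(n_rounds):
        for s in rng.sample(bank, n_seeds):
            partner = rng.choice(bank)
            trial = list(s)
            i = rng.randrange(dim)
            trial[i] = partner[i]                  # crossover: borrow one coordinate
            trial = local_minimize(trial, rng)
            update_bank(bank, trial, d_cut)
        d_cut = max(d_min, d_cut * shrink)         # anneal the diversity cutoff
    return min(bank, key=energy), initial_best

best, initial_best = csa()
print(energy(best), initial_best)
```

Because a bank member is only ever replaced by a lower-energy trial, the best energy in the bank cannot get worse as the rounds proceed; the shrinking cutoff shifts the search from keeping diverse conformations toward refining the low-energy ones.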

The solvent accessible surface area (SASA) of a protein is one of its most important structural features. SASA is often used as an analysis tool for describing protein-water interactions or for calculating the transfer free energy of proteins [58,59]. Insights gained from more accurate predictions of protein surface area are highly useful for predicting a protein’s structure as well as its function. In 2012, we presented SANN, a method for predicting the solvent accessibility of proteins that is based on a nearest neighbor method applied to the sequence profile [60]. The overall performance of SANN was shown to be superior to that of currently available methods (e.g., FKNN, SABLE, PROF, ACCpro, NETASA).
To use solvent accessibility in protein structure prediction, a restraining energy term, ESA, was designed and implemented within the TINKER molecular modeling package. The performance of ESA was measured using seven benchmark sets and five knowledge-based statistical potential functions. To reduce the time needed to calculate the surface area of each residue, we employed the ‘Neighbor Vector’ algorithm published by Jens Meiler et al. [61]. In addition, ESA was incorporated into the CASP11 energy function for conformational sampling, and the search was carried out with the conformational space annealing (CSA) method, which has been successfully applied to various hard combinatorial optimization problems.
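
To give a feel for why a neighbor-vector style measure is cheaper than an explicit surface calculation, the Python sketch below estimates burial from residue coordinates alone. It follows only the general idea: sum the unit vectors pointing to nearby residues, weight them by distance, and take the length of the weighted mean. The cosine-ramp weight, the 4.0 and 11.0 angstrom bounds, and the convention for a residue with no neighbors are assumptions for this example, not the parameters used in the thesis or in [61].

```python
import math

def weight(d, lower=4.0, upper=11.0):
    """Smooth neighbor weight: 1 within `lower`, 0 beyond `upper`, cosine ramp between."""
    if d <= lower:
        return 1.0
    if d >= upper:
        return 0.0
    return 0.5 * (math.cos(math.pi * (d - lower) / (upper - lower)) + 1.0)

def neighbor_vector(coords, i, lower=4.0, upper=11.0):
    """Length (0..1) of the weighted mean unit vector from residue i to its neighbors."""
    xi, yi, zi = coords[i]
    sx = sy = sz = w_sum = 0.0
    for j, (x, y, z) in enumerate(coords):
        if j == i:
            continue
        dx, dy, dz = x - xi, y - yi, z - zi
        d = math.sqrt(dx * dx + dy * dy + dz * dz)
        if d == 0.0:
            continue                      # skip coincident points
        w = weight(d, lower, upper)
        sx += w * dx / d
        sy += w * dy / d
        sz += w * dz / d
        w_sum += w
    if w_sum == 0.0:
        return 1.0                        # no neighbors in range: treated as exposed (assumption)
    return math.sqrt(sx * sx + sy * sy + sz * sz) / w_sum

# Hypothetical coordinates: a point surrounded on all sides vs. one with neighbors on one side.
surrounded = [(0, 0, 0), (5, 0, 0), (-5, 0, 0), (0, 5, 0), (0, -5, 0), (0, 0, 5), (0, 0, -5)]
one_sided = [(0, 0, 0), (5, 0, 0), (6, 0, 0)]
print(neighbor_vector(surrounded, 0), neighbor_vector(one_sided, 0))
```

A value near 0 means the neighbors surround the residue (buried), while a value near 1 means they all lie in one direction (exposed). The cost is one pass over residue pairs, with no surface construction.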

Protein cross-linking experiments combined with mass spectrometry (XL-MS) make it possible to obtain structural information about proteins. Although such limited experimental data can be insufficient to predict protein tertiary structure, they help to reduce the search space during the sampling procedure. Restraints derived from cross-linking/mass spectrometry experiments have been applied successfully to protein-protein interaction and structure refinement. However, although the potential of combining XL-MS with computational modeling has been demonstrated and many technical problems of XL-MS have been solved, several questions have not yet been evaluated systematically. Answering them requires not only understanding the effect of XL-MS data on sampling and scoring, but also determining the optimal cross-linker length and cross-linking reagent for protein structure prediction. A longer cross-linker can yield more restraints, but the value of each restraint decreases.
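
The trade-off between cross-linker length and restraint value can be made concrete with a small Python sketch. It assumes a flat-bottom harmonic penalty on a residue-pair distance, which is a common way to encode a distance restraint but is not confirmed as the form used in the thesis. The force constant, the pair distances, and the two cross-linker reaches are hypothetical values chosen for illustration.

```python
def crosslink_penalty(distance, d_max, k=1.0):
    """Flat-bottom restraint: zero up to d_max, harmonic in the excess beyond it."""
    excess = distance - d_max
    return 0.0 if excess <= 0 else k * excess * excess

def fraction_satisfied(distances, d_max):
    """Fraction of residue-pair distances compatible with a cross-link of reach d_max."""
    return sum(d <= d_max for d in distances) / len(distances)

pair_distances = [5.0, 15.0, 25.0, 35.0]   # hypothetical distances in angstroms
for d_max in (20.0, 30.0):                 # hypothetical cross-linker reaches
    print(d_max, fraction_satisfied(pair_distances, d_max))
```

With the longer reach, more pair distances incur no penalty, so a given cross-link excludes fewer conformations. This is the sense in which a longer cross-linker supplies more restraints that are each less informative.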

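Thesis information

Author	Seungryong Heo (허승룡)
Degree-granting institution	Graduate School of Soongsil University
Degree type	Domestic doctorate
Department	Department of Bioinformatics (일원)
Advisor	Hang-Cheol Shin (신항철)
Year of publication	2017
Total pages	xvii, 117 p.
Language	eng
Full-text URL	http://www.riss.kr/link?id=T14545141&outLink=K
Source	Korea Education and Research Information Service (한국교육학술정보원)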
