[Paper] Data Augmentation for Enhancing LLM Human Alignment (LLM Human Alignment의 성능 향상을 위한 데이터 증강 방법)

고근영

[Thesis] LLM Human Alignment의 성능 향상을 위한 데이터 증강 방법
Data Augmentation for Enhancing LLM Human Alignment

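고근영 (Konkuk University Graduate School, Department of Artificial Intelligence, Artificial Intelligence major, domestic master's degree)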

Abstract (Korean version)

Recently, with the arrival of ChatGPT, Large Language Models (LLMs) have come to prominence as an innovative methodology in natural language processing research. When an existing LLM is used on its own, it sometimes talks about something unrelated to the question or generates an inappropriate answer. To address these problems, research on applying Human Alignment methods to LLMs is actively under way. Most existing studies apply Human Alignment using large amounts of English data, whereas almost no studies use Korean data. In this thesis, we collect data from NAVER KIN (네이버 지식인) in order to apply LLM Human Alignment to Korean data. Because the amount of collected data is small, we propose a data augmentation method to overcome the shortage of Korean data and improve the performance of LLM Human Alignment. To validate the proposed method, we apply Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO) as the Human Alignment methods. The experiments confirm that the collected NAVER KIN data alone yields a certain level of performance, and that augmenting the data with the proposed method yields better performance than the collected NAVER KIN data alone.

Abstract (English version)

In recent years, driven by the arrival of ChatGPT, Large Language Models (LLMs) have gained prominence as an innovative methodology in natural language processing research. When an existing LLM is used on its own, however, it sometimes fails to answer the question or produces an inappropriate answer. To solve these problems, researchers are actively working on applying human alignment methods to LLMs. Most existing studies apply human alignment using large amounts of English data, and few studies use Korean data.
In this paper, we collect data from NAVER KIN in order to apply LLM Human Alignment to Korean data. Because the amount of collected data is small, we propose a data augmentation method to overcome the shortage of Korean data and improve the performance of LLM Human Alignment. To validate the proposed method, we apply RLHF (Reinforcement Learning from Human Feedback) and DPO (Direct Preference Optimization) as human alignment methods. The experiments confirm that the collected NAVER KIN data alone yields a certain level of performance, and that data augmented with the proposed method outperforms the collected NAVER KIN data alone.

Thesis Information

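Author	고근영
Degree-granting institution	Konkuk University Graduate School (건국대학교 대학원)
Degree	Master's (domestic)
Department	Department of Artificial Intelligence, Artificial Intelligence major
Advisor	민덕기
Year of publication	2024
Total pages	42
Keywords	data augmentation; large language model; Human Alignment
Language	Korean (kor)
Full-text URL	http://www.riss.kr/link?id=T16958672&outLink=K
Information source	Korea Education and Research Information Service (한국교육학술정보원)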

