[Paper] A Study on Selecting Bitmap Join Index to Speed up Complex Queries in Relational Data Warehouses

안형근; 고재진

doi:10.3745/kipstd.2012.19d.1.001


정보처리학회논문지 (The KIPS Transactions: Part D), v.19D no.1, 2012, pp.1-14

Abstract

Because data warehouses are very large, the choice of indices has a considerable effect on query processing efficiency. Indices reduce query processing cost, but they consume storage and incur maintenance cost whenever the database is updated. Bitmap join indices are well suited to optimizing star join queries, which join one fact table with several dimension tables, and selections on dimension tables. Because a bitmap join index is stored in binary form, its storage cost is low; however, a large number of candidate attributes are generated, so choosing which of them to index is a difficult task. Index selection proceeds in two steps: first the number of candidate attributes is reduced, and then the attributes to index are chosen from those that remain. In this paper we use data mining to reduce the number of candidate attributes in the bitmap join index selection problem. Whereas existing methods prune candidates using only the frequency of attributes in the queries, our method considers attribute frequency together with the size of the dimension tables, the tuple size of the dimension tables, and the disk page size. We mine frequent itemsets to reduce the number of candidate attributes effectively. We then apply a cost function to the bitmap join indices of the candidate attribute sets to find those with the lowest cost that fit within the storage constraint. To evaluate the efficiency of our method, we compare it with existing methods.
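
As a rough illustration of the two-step process the abstract describes (prune candidate attributes by mining frequent itemsets from the query workload, then choose indices under a storage limit), the following Python sketch may help. It is not the authors' algorithm. Pruning here uses query frequency only; the paper additionally weighs dimension table size, tuple size, and disk page size, and the abstract does not give that formula. The greedy benefit-per-size selection is a stand-in for the paper's cost function. All function names, attribute names, sizes, and thresholds are hypothetical.

```python
from itertools import combinations


def frequent_attribute_sets(workload, min_count):
    """Level-wise (Apriori-style) mining of attribute sets.

    `workload` is a list of sets; each set holds the dimension attributes
    that one query selects on. Returns {frozenset(attrs): count} for every
    attribute set that appears in at least `min_count` queries.
    """
    queries = [frozenset(q) for q in workload]

    def count(itemset):
        return sum(itemset <= q for q in queries)

    # Level 1: single attributes.
    current = {}
    for attr in {a for q in queries for a in q}:
        c = count(frozenset([attr]))
        if c >= min_count:
            current[frozenset([attr])] = c
    frequent = dict(current)

    k = 2
    while current:
        # Join step: unions of frequent (k-1)-sets that contain exactly k attributes.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        next_level = {}
        for cand in candidates:
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(s) in current for s in combinations(cand, k - 1)):
                c = count(cand)
                if c >= min_count:
                    next_level[cand] = c
        frequent.update(next_level)
        current = next_level
        k += 1
    return frequent


def select_under_budget(candidates, size_of, benefit_of, budget):
    """Greedy choice by benefit per unit of storage (an assumed heuristic,
    not the paper's cost function) until the storage budget is used up."""
    ranked = sorted(candidates, key=lambda c: benefit_of(c) / size_of(c), reverse=True)
    chosen, used = [], 0
    for cand in ranked:
        if used + size_of(cand) <= budget:
            chosen.append(cand)
            used += size_of(cand)
    return chosen


# Hypothetical workload: attributes referenced by five star join queries.
workload = [
    {"city", "year"},
    {"city", "year", "category"},
    {"city", "category"},
    {"year"},
    {"city", "year"},
]
frequent = frequent_attribute_sets(workload, min_count=3)

# Made-up index sizes (in pages) for each surviving candidate.
sizes = {
    frozenset({"city"}): 40,
    frozenset({"year"}): 10,
    frozenset({"city", "year"}): 60,
}
chosen = select_under_budget(
    frequent, size_of=sizes.__getitem__, benefit_of=frequent.__getitem__, budget=50
)
```

With this workload, `category` appears in only two queries and is pruned, so three candidate sets survive instead of all seven non-empty subsets of the three attributes. The selection step then keeps the candidates that fit in the 50-page budget.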

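Q&A

Keyword	Question	Answer extracted from the paper
	What is the drawback of redundant structures?	Redundant structures such as materialized views and join indices are effective at improving the performance of joins across multiple tables, and they are a necessary technique for optimizing data warehouse indices [3]. However, redundant structures have the drawback of requiring additional storage space and maintenance cost. Optimization techniques are therefore unavoidable in a data warehouse, and the use of indices is the choice made for this purpose.
	Why have efficient optimization techniques for fast response times become necessary?	Queries against a data warehouse support decision making in business activities and therefore require fast response times. Without efficient optimization techniques, query processing can take a long time, so the need for optimization has recently come to the fore, and efficient and accurate physical design techniques have become necessary to cope with complex, time-consuming decision-support queries [1].
	Why does index selection in a data warehouse affect query processing efficiency?	Because a data warehouse is very large, the choice of indices has a considerable effect on query processing efficiency. Indices reduce query processing cost, but they incur costs of their own: the storage they occupy and the maintenance required when the database changes.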

References (18)

S. Chaudhuri, and V. Narasayya, "Self-tuning database systems: A decade of progress," Proc. of the Intl. Conf. on VLDB, pp.3-14, 2007.
M. Golfarelli, S. Rizzi, and E. Saltarelli, "Index Selection for data warehousing," Proc. 4th Intl. Workshop on Design and Management of Data Warehouse, pp.33-42, 2002.
P. O'Neil, and G. Graefe, "Multi-table joins through bitmapped join indices," SIGMOD Record 24, No.3, pp.8-11, 1995.

C. Y. Chan, and Y. E. Ioannidis, "Bitmap index design and evaluation," Proc. of the ACM SIGMOD Intl. Conf. on Management of Data, pp.355-366, 1998.
C. Chee-Yong, "Indexing techniques in decision support systems," Ph.D. Thesis, University of Wisconsin-Madison, 1999.
P. Valduriez, "Join Indices," ACM Trans. On Database Systems 12, 2, pp.218-246, June, 1987.

S. Chaudhuri, "Index selection for databases: A hardness study and a principled heuristic solution," IEEE Trans. On Knowledge and Data Eng., pp.1313-1323, 2004.
K. Aouiche, O. Boussaid, and F. Bentayeb, "Automatic selection of bitmap join indices in data warehouse," 7th Intl. Conf. on Data Warehousing and Knowledge Discovery, pp.64-73, 2005.
S. Chaudhuri, and V. Narasayya, "An efficient cost-driven index selection tool for Microsoft SQL server," Proc. of the Intl. Conf. on VLDB, pp.146-155, 1997.
R. Agrawal and R. Srikant, "Mining Sequential Patterns," Proc. of the 11th International Conference on Data Engineering(ICDE'95), pp.3-14, 1995.
D. Burdick, M. Calimlim, and J. Gehrke, "Mafia: a maximal frequent itemset algorithm for transaction databases," ICDE'01, pp.443-452, 2001.
J. Han, J. Pei, Y. Yin, R. Mao, "Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach," Data Mining and Knowledge Discovery, Vol.8, pp.53-87, 2004.

N. Pasquier, Y. Bastide, R. Taouil, L. Lakhal, "Discovering frequent closed itemsets," ICDT, pp.398-416, 1999.
J. Han, J. Pei, and Y. Yin, "Mining Frequent Patterns without Candidate Generation," In Proceedings of the ACM-SIGMOD 2000 Conference, pp.1-12, 2000.
Yi-Hung Wu, Chai-Ming Chiang, and Arbee L. P. Chen, "Hiding Sensitive Association Rules with Limited Side Effects," IEEE Transactions on Knowledge and Data Engineering, Vol.19, Issue 1, pp.29-42, 2007.

H. Mannila and H. Toivonen, "Levelwise search and borders of theories in knowledge discovery," Data Mining and Knowledge Discovery, Vol.1, No.3, pp.241-258, 1997.

Ladjel Bellatreche, "A Data Mining Approach for Selecting Bitmap Join Indices," Journal of Computing Science and Engineering, Vol.1, No.2, December, 2007.

http://www.almaden.ibm.com/cs/projects/iis/hdb/Projects/data_mining/mining.shtml
