[Thesis] Designing OLAP cube structures for transaction-based analysis

유한주

트랜잭션기반 분석용 OLAP 큐브구조 설계
Designing OLAP cube structures for transaction-based analysis

유한주 (Soongsil University Graduate School, Department of Industrial and Information Systems Engineering, domestic doctorate)

Abstract (from the Korean original)

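OLAP (Online Analytical Processing) and data mining can be regarded as complementary technologies from the standpoint of CRM (Customer Relation Management). OLAP uses a special data structure based on dimension hierarchies, in which measures are stored in pre-aggregated form. An OLAP cube is a multidimensional database built to answer the kinds of questions that arise in decision support. A typical cube is defined by several dimensions such as customer, product, store, and time, and the members of each dimension are organized into hierarchies. Because current OLAP cubes based on an RDB (Relational Database) are very complex and very large, finding useful information in them is difficult. Data mining techniques are therefore needed to extract CRM-relevant information, such as customer purchase patterns, from these cubes.

An itemset is a set of items, and its size is the number of items it contains. Sales transactions can be divided into two groups by size: Group Ⅰ transactions are itemsets consisting of a single item only, and Group Ⅱ transactions are itemsets that may also consist of two or more items. In the current OLAP schema, known as the star schema, the sales transaction data table becomes the fact table. As far as the RDB is concerned, this fact table must be a relation, and for it to be a relation the order of the sales transaction rows in it must be immaterial. For Group Ⅰ transactions the rows can be arranged in the fact table without regard to order, but Group Ⅱ transactions must be arranged in transaction row order and so cannot satisfy this requirement. Because Group Ⅱ transactions cannot form a relation, serious problems arise.

Whenever customers buy products, purchase patterns emerge, and the process of finding them is called market basket analysis. In the Microsoft Association Algorithm, market basket analysis consists of two steps: the first finds the frequent itemsets, and the second is a simple calculation that compares the importance of the frequent itemsets found in the first step. Although the first step is the core of market basket analysis, applying it to a fact table made up of Group Ⅱ transactions causes several problems, such as longitudinal analysis becoming impossible and fictitious frequent itemsets being generated. This study proposes a new method of designing OLAP cube structures that makes longitudinal analysis possible in market basket analysis and generates only real frequent itemsets.

A transaction row in the fact table originally consists of several measures, the primary key of the time dimension, and the primary keys of the other dimensions. When longitudinal analysis is performed over a given period, however, the time dimension loses its meaning, so the fact table ends up with fewer transaction rows than there were actual transactions, whether the transactions are Group Ⅰ or Group Ⅱ. Analyzing a fact table that holds fewer transactions than actually occurred causes serious problems. To solve the problems caused by this reduction in the number of transaction rows, this study also proposes another new data structuring technique that adds a transaction column to the fact table.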
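The row-count problem described at the end of this abstract can be illustrated with a small sketch. It is not the thesis's design: the row layout, the key values, and the rule that a transaction is identified by the pair (time key, customer key) are all assumptions made for illustration. The sketch shows only that projecting away the time key from a relation (a set of rows) merges rows that differed only in time, and that carrying a transaction identifier column keeps them distinct.

```python
# Hypothetical fact rows: (time_key, customer_key, product_key, quantity).
fact_rows = [
    ("2008-01-05", "C1", "milk", 1),
    ("2008-01-12", "C1", "milk", 1),
    ("2008-01-12", "C1", "bread", 1),
    ("2008-01-20", "C2", "milk", 1),
]


def drop_time(rows):
    """Project away the time key. A relation holds no duplicate rows,
    so rows that differed only in time collapse into one."""
    return {(customer, product, qty) for _, customer, product, qty in rows}


def drop_time_with_txn(rows):
    """Same projection, but first attach a transaction identifier.
    Assumption: one transaction per (time_key, customer_key) pair."""
    txn_ids = {}
    projected = set()
    for time_key, customer, product, qty in rows:
        txn_id = txn_ids.setdefault((time_key, customer), len(txn_ids) + 1)
        projected.add((txn_id, customer, product, qty))
    return projected


without_txn = drop_time(fact_rows)
with_txn = drop_time_with_txn(fact_rows)
print(len(fact_rows), len(without_txn), len(with_txn))
```

In this toy data the two milk purchases by customer C1 become indistinguishable once the time key is dropped, so the projected table has fewer rows than there were real transaction rows. With the identifier column every original row survives, and rows that share an identifier can still be grouped back into one basket.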

Abstract

OLAP (Online Analytical Processing) and data mining are two complementary technologies for CRM (Customer Relation Management). OLAP aggregates measures along dimension hierarchies and stores these precalculated aggregations in a special data structure. An OLAP cube is a multidimensional database built for decision-support queries. A typical cube contains a set of well-defined dimensions such as Customer, Product, Store, and Time. Each dimension contains many members, which are organized by hierarchies. Current OLAP cubes based on an RDB (Relational Database) are very complicated and very large, so finding useful information in them is challenging. Data mining techniques are therefore required to dig patterns out of these cubes.

An itemset is a set of items. Each itemset has a size, which is the number of items it contains. Sales transactions can be divided into two groups: Group Ⅰ transactions, whose itemsets are all of size one, and Group Ⅱ transactions, with itemsets of size one or greater. In the current OLAP schema, the so-called star schema, the base sales transaction data table becomes the fact table. This fact table should be a relation as far as relational databases are concerned, and for a fact table to be a relation the order of the sales transaction rows must be immaterial. In a fact table composed of Group Ⅰ transactions the order of the transaction rows is immaterial, but in a fact table composed of Group Ⅱ transactions the order of the transaction rows is required. Fact tables with Group Ⅱ transactions, which cannot be relations, therefore cause serious problems.

Every purchase a customer makes builds patterns of how products are purchased together. The process of finding these patterns, called market basket analysis, consists of two steps in the Microsoft Association Algorithm. The first step is to find frequent itemsets. The second step, which requires much less time than the first, is to generate association rules based on the frequent itemsets. Even though the first step is the core of market basket analysis, applying it to fact tables with Group Ⅱ transactions always raises several issues, such as longitudinal analysis becoming impossible and many impractical transactions being built up. This paper proposes a new method for designing OLAP cube structures for market basket analysis that allows longitudinal analysis and identifies only real customers' purchase patterns.

When a longitudinal analysis over a period of time is carried out on a fact table containing variables, a time key, and other dimension keys, the transaction rows originally dimensioned by time can be thought of as no longer dimensioned by time. As a result, the number of transaction rows in the fact table, whether it holds Group Ⅰ or Group Ⅱ transactions, can be smaller than the number of real transaction rows. Fact tables with fewer transaction rows may cause serious problems. To solve this problem of a decreasing number of transaction rows, this paper also suggests a new data structuring technique.
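The first step named above, finding frequent itemsets, can be sketched with plain support counting. This is a minimal brute-force illustration, not the Microsoft Association Algorithm and not the thesis's method. The function name `frequent_itemsets`, the `min_support` and `max_size` parameters, and the sample baskets are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count in how many transactions each itemset (up to max_size items)
    occurs, and keep those that reach min_support."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))  # an itemset is a set: ignore repeats
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical baskets; each inner list is one sales transaction.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]

result = frequent_itemsets(baskets, min_support=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

The counts are meaningful only if each inner list really is one transaction. If the rows of a multi-item transaction cannot be tied back together, co-occurrence counts such as the one for bread and milk cannot be trusted, which is the kind of issue the abstract raises for Group Ⅱ fact tables.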

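Thesis information

Author	유한주
Degree-granting institution	Soongsil University Graduate School
Degree type	Doctorate (domestic)
Department	Industrial and Information Systems Engineering
Advisor	최인수
Year of publication	2008
Total pages	xii, 93 p
Language	kor
Full-text URL	http://www.riss.kr/link?id=T11274567&outLink=K
Source	한국교육학술정보원 (Korea Education and Research Information Service)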

