[Paper] Combining Multiple Sources of Evidence to Enhance Web Search Performance

Yang, Kiduk

doi:10.16981/kliss.45.3.201409.5


Journal of Korean Library and Information Science Society (한국도서관 정보학회지), v.45 no.3, 2014, pp. 5-36

Yang, Kiduk (Department of Library and Information Science, Kyungpook National University)

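Abstract (translated from the Korean)

The Web is rich in sources of information beyond document content, such as hyperlinks and manually classified Web directories like Yahoo. This study extends earlier fusion retrieval research, which holds that combining multiple sources of information can improve retrieval performance, by fusing three approaches: a text-based approach that uses the content of Web documents, a link-based approach that uses hyperlinks, and a classification-based approach that uses Yahoo categories. Fusion results, produced by combining the text-, link-, and classification-based retrieval results with several linear combination formulas, were compared with the individual retrieval results using conventional retrieval evaluation metrics, and the significance of overlap among retrieval results was also examined. The study concludes that the diversity of the solution spaces of text-, link-, and classification-based retrieval makes fusion appropriate, and it analyzes important characteristics of the fusion environment, such as the effects of system parameters and the interrelationships among overlap, document ranking, and relevance.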

Abstract

The Web is rich in sources of information that go beyond the contents of documents, such as hyperlinks and manually classified directories of Web documents like Yahoo. Past fusion IR studies have repeatedly shown that combining multiple sources of evidence (i.e., fusion) can improve retrieval performance. This research extends those studies by investigating the effects of combining three distinct retrieval approaches for Web IR: the text-based approach that leverages document texts, the link-based approach that leverages hyperlinks, and the classification-based approach that leverages Yahoo categories. Retrieval results of text-, link-, and classification-based methods were combined using variations of the linear combination formula to produce fusion results, which were compared to individual retrieval results using traditional retrieval evaluation metrics. Fusion results were also examined to ascertain the significance of overlap (i.e., the number of systems that retrieve a document) in fusion. The analysis suggests that the solution spaces of text-, link-, and classification-based retrieval methods are diverse enough for fusion to be beneficial, and it reveals important characteristics of the fusion environment, such as the effects of system parameters and the relationship among overlap, document ranking, and relevance.
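
As a rough illustration of the linear-combination fusion and the overlap count described in the abstract, the following Python sketch combines three score lists by a weighted sum. The abstract does not state the formula variants, weights, or score normalization the study used, so the min-max normalization, the weights, the document IDs, and the scores below are all assumptions chosen for illustration, not the author's method.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one system's scores to [0, 1] so systems are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def linear_fusion(runs, weights):
    """Weighted sum of normalized scores; a document a system did not
    retrieve contributes 0 for that system. Also counts overlap, i.e.
    the number of systems that retrieved each document."""
    fused = defaultdict(float)
    overlap = defaultdict(int)
    for name, scores in runs.items():
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += weights[name] * s
            overlap[doc] += 1
    # Sort by descending fused score; break ties by document ID.
    ranking = sorted(fused, key=lambda d: (-fused[d], d))
    return ranking, dict(fused), dict(overlap)


# Hypothetical scores from three retrieval methods (illustrative only).
runs = {
    "text": {"d1": 12.0, "d2": 9.0, "d3": 3.0},
    "link": {"d2": 0.8, "d4": 0.5, "d1": 0.2},
    "class": {"d2": 2.0, "d5": 1.0},
}
# Assumed weights, not taken from the paper.
weights = {"text": 0.5, "link": 0.3, "class": 0.2}

ranking, fused, overlap = linear_fusion(runs, weights)
print(ranking)
print(overlap)
```

In this toy input, the document retrieved by all three methods (`d2`) has the highest overlap and also ranks first, which is the kind of relationship between overlap and ranking the abstract says the study examined.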


