FoDoSu: Multi-document summarization utilizing folksonomy

  FoDoSu: Multi-Document Summarization Exploiting Semantic Analysis based on Social Folksonomy

Multi-document summarization techniques aim to condense a set of documents into a small set of words or paragraphs that conveys the main meaning of the originals. Many approaches to multi-document summarization have used probability-based methods and machine learning techniques to simultaneously summarize multiple documents that share a common topic. However, these techniques fail to semantically analyze proper nouns and newly coined words, because most of them depend on an out-of-date dictionary or thesaurus.

 

To overcome these drawbacks, we propose a novel multi-document summarization system called FoDoSu (Folksonomy-based Multi-Document Summarization), which employs tag clusters from Flickr, a type of folksonomy system, to detect key sentences in multiple documents. We first create a word frequency table and use the HITS algorithm to analyze the semantics and contribution of each word. Next, we exploit the tag clusters to analyze the semantic relationships between the words in the word frequency table. Finally, we generate a summary of the documents by analyzing the importance of each word and the semantic relatedness among the words. Experimental results on the TAC 2008 and 2009 data sets demonstrate the improvement of our proposed framework over existing summarization systems.
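The three steps above can be illustrated with a minimal Python sketch. It is not the FoDoSu implementation: the abstract does not say how the HITS graph is built or how tag clusters enter the score, so the sketch assumes a bipartite graph in which sentences act as hubs and words as authorities, and it assumes that words in the same tag cluster simply pool their scores. The function names and the `tag_clusters` dictionary (a stand-in for clusters that would come from Flickr) are hypothetical.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Deliberately crude tokenizer: lowercase, drop basic punctuation,
    # keep alphabetic tokens only.
    cleaned = sentence.lower().replace(".", " ").replace(",", " ")
    return [w for w in cleaned.split() if w.isalpha()]


def hits_word_scores(sentences, iterations=50):
    """HITS on an assumed bipartite graph: sentences are hubs, words are authorities."""
    # Word frequency table: one Counter of word counts per sentence.
    counts = [Counter(tokenize(s)) for s in sentences]
    vocab = sorted({w for c in counts for w in c})
    hub = [1.0] * len(sentences)
    auth = {w: 1.0 for w in vocab}
    for _ in range(iterations):
        # Authority update: a word scores highly if high-scoring sentences contain it.
        auth = {w: sum(hub[i] * c[w] for i, c in enumerate(counts)) for w in vocab}
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {w: v / norm for w, v in auth.items()}
        # Hub update: a sentence scores highly if it contains high-scoring words.
        hub = [sum(auth[w] * n for w, n in c.items()) for c in counts]
        norm = math.sqrt(sum(v * v for v in hub)) or 1.0
        hub = [v / norm for v in hub]
    return auth, hub


def rank_sentences(sentences, tag_clusters, top_k=1):
    """Rank sentences by word importance plus an assumed tag-cluster bonus."""
    auth, _ = hits_word_scores(sentences)
    # Assumed relatedness rule: words in the same cluster pool their scores.
    cluster_mass = Counter()
    for word, score in auth.items():
        if word in tag_clusters:
            cluster_mass[tag_clusters[word]] += score

    def sentence_score(sentence):
        words = set(tokenize(sentence))
        if not words:
            return 0.0
        total = sum(
            auth[w] + cluster_mass.get(tag_clusters.get(w), 0.0) for w in words
        )
        return total / len(words)  # average, so long sentences are not favored

    return sorted(sentences, key=sentence_score, reverse=True)[:top_k]


if __name__ == "__main__":
    docs = [
        "The storm flooded the coastal city.",
        "Hurricane damage closed the coastal city port.",
        "Officials reported a quiet weekend.",
    ]
    # Hypothetical cluster assignment; real clusters would come from a folksonomy.
    clusters = {"storm": "weather", "hurricane": "weather"}
    print(rank_sentences(docs, clusters, top_k=1))
```

In this toy input, "storm" and "hurricane" share no surface form, so a dictionary-free frequency count would treat them as unrelated. The assumed cluster bonus is one simple way to let folksonomy tags link such words when ranking sentences.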
