Latent Dirichlet Allocation Based Multi-Document Summarization


Association for Computing Machinery — Jul 24, 2008


Datasource
Association for Computing Machinery
Copyright
Copyright © 2008 by ACM Inc.
ISBN
978-1-60558-196-5
doi
10.1145/1390749.1390764

Abstract

Latent Dirichlet Allocation Based Multi-Document Summarization

Rachit Arora (rachitar@cse.iitm.ernet.in)
Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai - 600 036, India

Balaraman Ravindran (ravi@cse.iitm.ernet.in)
Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai - 600 036, India

ABSTRACT

Extraction based Multi-Document Summarization Algorithms consist of choosing sentences from the documents using some weighting mechanism and combining them into a summary. In this article we use Latent Dirichlet Allocation to capture the events being covered by the documents and form the summary with sentences representing these different events. Our approach is distinguished from existing approaches in that we use mixture models to capture the topics and pick up the sentences without paying attention to the details of grammar and structure of the documents. Finally we present the evaluation of the algorithms on the DUC 2002 Corpus multi-document summarization tasks using the ROUGE evaluator to evaluate the summaries. Compared to DUC 2002 winners, our algorithms gave significantly better ROUGE-1 recall measures.

Categories and Subject Descriptors

I.2.7 [Artificial Intelligence]: Natural Language Processing: Text analysis, Multi-Document Summarization

Keywords

Latent Dirichlet Allocation, Multi-Document Summarization

1. INTRODUCTION

Multi-Document Summarization consists of computing the summary
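
The abstract reports results as ROUGE-1 recall. As a rough illustration of what that measure captures, the sketch below computes unigram recall of a candidate summary against a single reference summary. It is a simplified reading of the metric and not the ROUGE evaluator used in the paper: the function names `tokenize` and `rouge1_recall` are hypothetical, tokenization is a plain lowercase alphanumeric split, and stemming, stopword handling, and multiple references are all left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric unigrams (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that also appear in the candidate.

    Counts are clipped: a word repeated n times in the reference is
    credited at most as many times as it occurs in the candidate.
    """
    cand_counts = Counter(tokenize(candidate))
    ref_counts = Counter(tokenize(reference))
    total = sum(ref_counts.values())
    if total == 0:
        # No reference unigrams to recall; return 0.0 by convention in this sketch.
        return 0.0
    overlap = sum(min(count, cand_counts[word]) for word, count in ref_counts.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the cat was on the mat"
    candidate = "the cat sat on the mat"
    print(rouge1_recall(candidate, reference))
```

Because the measure is recall oriented, a longer candidate can only keep or raise its score, which is why summarization evaluations of this kind fix a summary length limit.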
