A Generative Theory of Relevance: Generative Density Allocation


Part of The Information Retrieval Series (volume 26)

Publisher: Springer Berlin Heidelberg
Copyright: © Springer Berlin Heidelberg 2009
ISBN: 978-3-540-89363-9
Pages: 71–102
DOI: 10.1007/978-3-540-89364-6_4

Abstract

The present chapter plays a special role in this book. In this chapter we will not be talking about relevance, documents or queries. In fact, this chapter will have very little to do with information retrieval. The subject of our discussion will be generative models for collections of discrete data. Our goal is to come up with an effective generative framework for capturing interdependencies in sequences of exchangeable random variables. One might wonder why a chapter like this would appear in a book discussing relevance. The reason is simple: a generative model lies at the very heart of the main assumption in our model. Our main hypothesis is that there exists a generative model that is responsible for producing both documents and queries. When we construct a search engine based on this generative relevance hypothesis (GRH), its performance will be affected by two factors. The first factor is whether the hypothesis itself is true. The second factor is how accurately we can estimate this unknown generative process from very limited amounts of training data (e.g. a query, or a single document). Assuming the GRH is true, the quality of our generative process will be the single most important influence on retrieval performance. When we assume the generative hypothesis, we are in effect reducing the problem of information retrieval to a problem of generative modeling. If we want good retrieval performance, we will have to develop effective generative models.
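The abstract's claim that retrieval reduces to generative modeling can be illustrated with the simplest instantiation of that view: estimate a smoothed unigram language model from each document and rank documents by the likelihood they assign to the query. The sketch below assumes Dirichlet-smoothed query likelihood; it is an illustrative baseline under those assumptions, not the chapter's generative density allocation model, and the function and parameter names (query_likelihood, mu) are hypothetical.

```python
import math
from collections import Counter

def query_likelihood(query_terms, doc_terms, collection_counts, collection_len, mu=2000.0):
    # Estimate a Dirichlet-smoothed unigram model from one document and
    # return the log-probability that it generates the query terms.
    # This is only the simplest "one generative process produces both
    # documents and queries" baseline, not the chapter's model.
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for w in query_terms:
        p_coll = collection_counts.get(w, 0) / collection_len   # background estimate
        p_doc = (doc_counts.get(w, 0) + mu * p_coll) / (doc_len + mu)
        if p_doc == 0.0:
            return float("-inf")  # term unseen in both document and collection
        score += math.log(p_doc)
    return score

# Toy usage: rank two short documents against a query.
docs = {
    "d1": "generative models of text produce documents and queries".split(),
    "d2": "inverted indexes map query terms to postings lists".split(),
}
collection_counts = Counter(w for terms in docs.values() for w in terms)
collection_len = sum(collection_counts.values())
query = "generative models of queries".split()
ranking = sorted(docs, key=lambda d: query_likelihood(query, docs[d],
                                                      collection_counts, collection_len),
                 reverse=True)
print(ranking)  # d1 should outrank d2
```

The design choice here mirrors the abstract's framing: the only "training data" available for each document model is the document itself, so smoothing against the collection stands in for the harder estimation problem the chapter addresses.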

Published: Jan 1, 2009

Keywords: Mixture Model; Topic Model; Latent Dirichlet Allocation; Generative Density; Dirichlet Kernel
