Reading tea leaves: how humans interpret topic models
|Conference paper|
|Authors:||Jonathan Chang, Jordan Boyd-Graber, Sean Gerrish, Chong Wang, David M. Blei|
|Citation:||Advances in Neural Information Processing Systems 22 : 288-296. 2009|
|Editors:||Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, A. Culotta|
|Meeting:||Neural Information Processing Systems 22|
Reading tea leaves: how humans interpret topic models describes a method for human evaluation of the topics produced by topic models.
Three topic models are evaluated:
- Probabilistic latent semantic indexing (pLSI), with pseudocount
- Latent Dirichlet allocation (LDA)
- Correlated topic model (CTM)
Two human tasks, run on Amazon Mechanical Turk, are used for the evaluation:
- Word intrusion task
- Topic intrusion task
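In the word intrusion task, a judge is shown a topic's high-probability words together with one low-probability "intruder" word and asked to pick the odd one out; the fraction of judges who find the true intruder is the paper's model precision measure. A minimal sketch of that scoring, with hypothetical words and judgments (the function name and data are illustrative, not from the paper):

```python
import random


def model_precision(judgments, intruder):
    """Fraction of human judgments that identify the true intruder word."""
    if not judgments:
        return 0.0
    return sum(1 for choice in judgments if choice == intruder) / len(judgments)


# Hypothetical example: a topic's top words plus one out-of-topic intruder.
topic_words = ["dog", "cat", "horse", "pig", "cow"]
intruder = "apple"

# The words would be shuffled before being shown to a judge.
shown = topic_words + [intruder]
random.shuffle(shown)

# Hypothetical choices from eight Mechanical Turk workers.
judgments = ["apple", "apple", "pig", "apple", "apple", "apple", "cat", "apple"]
print(model_precision(judgments, intruder))  # 0.75
```

A coherent topic yields high model precision, since the intruder stands out against semantically related words; a poor topic makes the choice close to random.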
"Traditional metrics are, indeed, negatively correlated with the measures of topic quality developed in this paper".
While CTM was better when measured with statistical model performance metrics, LDA and pLSI performed better when topic interpretability was measured by the human tasks.