Journal of Zhejiang University-SCIENCE A (Applied Physics & Engineering)  2009, Vol. 10 Issue (6): 858-867    DOI: 10.1631/jzus.A0820796
Electrical & Electronic Engineering     
Hierarchical topic modeling with nested hierarchical Dirichlet process
Yi-qun DING, Shan-ping LI, Zhen ZHANG, Bin SHEN
School of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; State Street Hangzhou, Hangzhou 310000, China

Abstract  This paper addresses the statistical modeling of latent topic hierarchies in text corpora. The height of the topic tree is assumed to be fixed, while the number of topics on each level is unknown a priori and is inferred from the data. Taking a nonparametric Bayesian approach, we propose a new probabilistic generative model based on the nested hierarchical Dirichlet process (nHDP) and present a Markov chain Monte Carlo sampling algorithm that infers the topic tree structure, the word distribution of each topic, and the topic distribution of each document. Our theoretical analysis and experimental results show that this model produces a more compact hierarchical topic structure and captures more fine-grained topic relationships than the hierarchical latent Dirichlet allocation model.
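
The idea of a tree with fixed height but an unbounded number of topics per level can be illustrated with a small sketch. The code below is not the authors' nHDP model or their sampler; it is a nested Chinese restaurant process prior of the kind used by the hierarchical latent Dirichlet allocation baseline, shown only to make the setting concrete. The function name `sample_paths` and the parameters `depth` and `gamma` are hypothetical names chosen for this illustration.

```python
import random

def sample_paths(n_docs, depth, gamma, seed=0):
    """Draw one root-to-leaf path per document from a nested CRP prior.

    Illustrative sketch only: depth is fixed in advance, while the number
    of children under each node grows with the data.
    """
    rng = random.Random(seed)
    # child_counts[node] lists how many documents chose each child of node,
    # where a node is identified by the tuple of choices leading to it.
    child_counts = {}
    paths = []
    for _ in range(n_docs):
        node = ()
        for _level in range(depth):
            counts = child_counts.setdefault(node, [])
            # An existing child is chosen in proportion to its count;
            # a new child is opened with weight gamma.
            weights = counts + [gamma]
            k = rng.choices(range(len(weights)), weights=weights)[0]
            if k == len(counts):
                counts.append(1)
            else:
                counts[k] += 1
            node = node + (k,)
        paths.append(node)
    return paths, child_counts

if __name__ == "__main__":
    paths, child_counts = sample_paths(n_docs=20, depth=3, gamma=1.0)
    print("topics directly under the root:", len(child_counts[()]))
```

Larger values of `gamma` tend to open more branches at every level, so the width of the tree is driven by the data and the concentration parameter rather than set by hand.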

Key words:  Topic modeling, Natural language processing, Chinese restaurant process, Hierarchical Dirichlet process, Markov chain Monte Carlo, Nonparametric Bayesian statistics
Received: 15 November 2008     
CLC:  O212.8; H03
Cite this article:

Yi-qun DING, Shan-ping LI, Zhen ZHANG, Bin SHEN. Hierarchical topic modeling with nested hierarchical Dirichlet process. Journal of Zhejiang University-SCIENCE A (Applied Physics & Engineering), 2009, 10(6): 858-867.

URL:

http://www.zjujournals.com/xueshu/zjus-a/10.1631/jzus.A0820796     OR     http://www.zjujournals.com/xueshu/zjus-a/Y2009/V10/I6/858
