# Some experiences with the Java version of LDA

LDA is a popular topic model, widely used in both research and applications. To date, the original paper describing LDA has been cited 7150 times. Luckily, some generous people have shared their code, so we can use it easily. Thanks for the spirit of sharing. Here are some of my experiences with the Java version of LDA:

First, download the project of the Java version of LDA. Its author, Xuan-Hieu Phan, is very enthusiastic.

In my experience (though it may be a problem on my side), the arguments have no effect when they are entered in Eclipse. So I suggest packaging the project as a jar file. The command line then looks like this:

    java -jar jgibblda.jar -est -alpha 0.5 -beta 0.1 -ntopics 400 -dir ./ -dfile train.txt

Any argument that is left out, such as -alpha or -beta, falls back to its default value. Note that -dir must not be omitted, even if the jar file and the training data are in the same directory. Also note that the default -niters for estimation is 1000, not the 2000 described in the user’s guide, and the default -niters for inference on new data is 100, not 20.
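Because the defaults may differ from what the guide says, it can be safer to always spell out -niters and -dir. Below is a minimal sketch of that idea. The class `LdaCommand` and its method `estimate` are hypothetical names made up for this example and are not part of the LDA project; only the flags already shown above (plus -niters) are used, and -alpha and -beta are left out on purpose so that they take their defaults.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of the LDA project.
final class LdaCommand {

    // Builds the estimation command with -dir and -niters always given explicitly.
    static List<String> estimate(String dir, String dataFile, int topics, int iterations) {
        return Arrays.asList(
                "java", "-jar", "jgibblda.jar",
                "-est",
                "-ntopics", String.valueOf(topics),
                "-niters", String.valueOf(iterations),
                "-dir", dir,
                "-dfile", dataFile);
    }

    public static void main(String[] args) {
        // Prints the command line; the same list could be handed to java.lang.ProcessBuilder.
        System.out.println(String.join(" ", estimate("./", "train.txt", 400, 2000)));
    }
}
```

The printed line can be pasted into a terminal in the directory that holds the jar file and the training data.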

The input data should be formatted like this:

    [total number of documents]
    [document 1]
    ...
    [document N]

Pre-processing (e.g. removing stop words and rare words, stemming, etc.) must be done by the user.
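The layout above can be produced with a few lines of Java. This is only a sketch: the class `LdaInputFormatter` and its method `format` are hypothetical names, not part of the LDA project. It assumes that each document goes on a single line with its words separated by spaces, and that the documents have already been pre-processed. Dropping empty documents is also my own assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of the LDA project.
final class LdaInputFormatter {

    // Returns the file content: the document count on the first line,
    // then one document per line.
    static String format(List<String> documents) {
        List<String> cleaned = new ArrayList<>();
        for (String doc : documents) {
            // Collapse all whitespace (including line breaks) so a document stays on one line.
            String oneLine = doc.trim().replaceAll("\\s+", " ");
            if (!oneLine.isEmpty()) {
                cleaned.add(oneLine); // assumption: empty documents are skipped
            }
        }
        StringBuilder sb = new StringBuilder();
        sb.append(cleaned.size()).append('\n');
        for (String doc : cleaned) {
            sb.append(doc).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "topic model gibbs sampling",
                "  java   jar command line ",
                "");
        // Prints the formatted content; redirect it to a file such as train.txt.
        System.out.print(format(docs));
    }
}
```

The count on the first line is computed after empty documents are dropped, so it always matches the number of document lines that follow.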

To infer topics for new data with the trained model, the command line looks like this:

    java -jar jgibblda.jar -inf -dir ./ -model model-final -dfile test.txt
