Topic Modeling with Latent Dirichlet Allocation (LDA)
This DSI workshop, led by Associate Director Dr. Carl Stahmer, takes an in-depth look at hyperparameters: the math behind the algorithms and the effects of the tuning parameters.
Prerequisites: beginner R skills and a working R environment with the following packages installed: tm, topicmodels, ggplot2. If you have a corpus you want to work on, bring it in ASCII form. If not, a practice corpus will be available for you to use.
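If you want to check your setup before the workshop, the following is a minimal sketch, assuming the three packages are installed from CRAN under the names tm, topicmodels, and ggplot2 (R package names are case-sensitive). The variable names are illustrative only and are not part of the workshop materials.

```r
# Packages named in the prerequisites (CRAN names, case-sensitive)
required <- c("tm", "topicmodels", "ggplot2")

# Work out which of them are not yet installed in this R library
to_install <- setdiff(required, rownames(installed.packages()))

# Install only the missing ones
if (length(to_install) > 0) {
  install.packages(to_install)
}

# Attach each package; library() stops with an error if one cannot be loaded
invisible(lapply(required, library, character.only = TRUE))
```

If the last line runs without an error, the environment has the packages listed above.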
"You shall know a word by the company it keeps." Firth, J. R. (1957:11)
- GitHub Repo
- Blei et al. 2003, "Latent Dirichlet Allocation," Journal of Machine Learning Research