Text Mining Fundamentals

This hands-on workshop will cover the theory and practice of topic modeling as a method of unsupervised text classification. We will run a variety of models on the same corpus, identifying and discussing the function of each model's parameters and their effect on the output. Validation practices will also be discussed. Participants should have moderate experience with R and Git. Familiarity with the R “tm” package will be beneficial but is not required. (Participation in the previous four workshops in this series will prepare you well for this one.) Please come to the workshop with a working R development environment and with Git already installed and operational on your system.