Clustering and Classifying Textual Corpora (online RCR; GS717.10)

This workshop will equip students with a general understanding of document clustering and classification techniques for research. We'll use the Orange data analysis platform and the open-source MALLET toolkit to explore ways of characterizing or sorting large corpora. Participants will learn how to build workflows for classifying texts, how to interpret the results of document classification and clustering, and how to apply such techniques to their own research. 

Students who wish to continue their study of these topics with a hands-on lab session may register separately for GS717.11.

Date: Thursday, October 19, 2023
Time: 9:00am - 11:00am
Categories: Digital Humanities, Digital Scholarship, Scholarly Communications
Registration has closed.

Event Organizer

Will Shaw

Digital Humanities Consultant, Duke University Libraries