Workshop with H. Andrew Schwartz: "Automatic Content Analysis using the Differential Language Analysis ToolKit"
The Differential Language Analysis ToolKit
DLATK (Differential Language Analysis ToolKit) is an end-to-end language analysis package, particularly suited to social media and social scientific research. It has been used in research published in over 50 peer-reviewed papers across psychology, computer science, public health, medicine, and political science. Although the heart of DLATK is a Python library, it is typically used through a command-line interface (requiring no programming). This tutorial will cover the fundamentals of automated content analysis using DLATK:
- Differential language analysis (producing linguistic insights into psychosocial phenomena)
- Predictive analytics (machine and statistical learning using text data)
- The ingredients of automatic content analysis
About the instructor
H. Andrew Schwartz is a faculty member of the Computer Science Department and the Institute for AI-Driven Discovery and Innovation at Stony Brook University, New York. He was previously Lead Research Scientist for the interdisciplinary “World Well-Being Project” at the University of Pennsylvania, where he created the Differential Language Analysis ToolKit.
To register, please follow the link:
To participate in the tutorial, you will need a laptop that can connect to the Internet (a Windows PC, Mac, or Linux PC are all fine). The training venue provides a free Wi-Fi connection for the day, and power to keep your laptop charged.
During the tutorial, participants will connect to a server that already has the analysis software (DLATK) installed. After you register, you will receive instructions on how to access this server and run a basic test command. This test needs to be completed at least 24 hours before the tutorial.
Desirable but non-essential expertise:
- an entry-level understanding (or higher) of quantitative research methods
- basic scripting (R, Python, or syntax/code in SPSS/SAS/STATA)
Differential Language Analysis ToolKit:
Kern, M. L., Park, G., Eichstaedt, J. C., Schwartz, H. A., Sap, M., Smith, L. K., & Ungar, L. H. (2016). Gaining insights from social media language: Methodologies and challenges.
Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267-297.
Schwartz, H. A., Giorgi, S., Sap, M., Crutchley, P., Ungar, L., & Eichstaedt, J. (2017). DLATK: Differential Language Analysis ToolKit. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 55-60).
Schwartz, H. A., & Ungar, L. H. (2015). Data-driven content analysis of social media: A systematic overview of automated methods. The ANNALS of the American Academy of Political and Social Science, 659(1), 78-94.