Course number: CS 5984
Instructor: E Rho
Title: Computational Social Science

The exponential increase in the production and consumption of information and interactions on the internet over the past decade has led to a vast growth of social and behavioral data. This availability of data, combined with rapid advancements in natural language processing (NLP) and machine learning (ML), has led to the rise of Computational Social Science, an interdisciplinary field that provides the opportunity to study human behavior empirically by computationally analyzing large volumes of data.

In this course, students will take an interdisciplinary approach to empirically examining social phenomena, with applications to social sciences such as political science, linguistics, and communications research. The empirical approach will span a variety of quantitative methods, including the application of existing ML tools and NLP techniques. In this seminar-style class, students will investigate and present several academic readings from this emerging field. They will also work on a semester-long research project in groups of 2-3 students. The goal of the project is to identify a question or problem that can be addressed by analyzing data available online, with the broader aim of addressing larger societal issues such as (but not limited to) polarization, deviant behavior, and misinformation.

Course goals and learning objectives

After successful completion of this course, you will be able to:

Prerequisite skills

Students need basic knowledge of statistics and NLP/ML, and a willingness to do interdisciplinary research. Concepts and tools will be reviewed briefly in class; however, in-depth coverage of the fundamentals is outside the scope of this course. This is NOT a machine learning or data mining course. Students will need basic proficiency in programming and should be prepared to apply what they have learned in prior computer science classes. You are expected to come with, or quickly learn, the skills required for your group project. For example, your project may require you to collect Twitter data using the Twitter API or to analyze posts from Reddit using pre-existing libraries (such as NLTK, scikit-learn, or Hugging Face), which should not be too challenging if you already know a high-level language like Python. Please make sure you are comfortable with this.
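
As a rough self-check of the programming level assumed here, the sketch below counts the most frequent terms in a handful of posts. It is only an illustration, not course material: the posts are made up, the stopword list is a tiny hand-picked assumption, and the helper names `tokenize` and `top_terms` are hypothetical. It uses only the Python standard library; a real project would more likely rely on the libraries named above.

```python
import re
from collections import Counter

# Hypothetical, made-up posts standing in for data collected from a platform API.
posts = [
    "The new policy is great, great news for everyone!",
    "I do not trust this policy at all.",
    "Great turnout at the town hall on the new policy.",
]

# Tiny hand-picked stopword list (an assumption for this example only).
STOPWORDS = {"the", "is", "for", "i", "do", "not", "this", "at", "all", "on", "a"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_terms(texts, k=3):
    """Return the k most frequent non-stopword tokens across all texts."""
    counts = Counter(
        token
        for text in texts
        for token in tokenize(text)
        if token not in STOPWORDS
    )
    return counts.most_common(k)


print(top_terms(posts))
```

If reading and modifying a script like this feels comfortable, picking up the project-specific tooling during the semester should be manageable.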