course number | instructor | title |
CS 5984 | E Rho | Computational Social Science |
The exponential increase in the production and consumption of information and interactions on the internet over the past decade has led to a vast growth of social and behavioral data. Such availability of data, combined with rapid advancements in natural language processing (NLP) and machine learning (ML), has led to the rise of Computational Social Science, an interdisciplinary field that provides the opportunity to empirically study human behavior by computationally analyzing large volumes of data.
In this course, students will adopt an
interdisciplinary approach to empirically examine different social phenomena
with applications to social science, such as political science, linguistics, and
communications research. The empirical approach will span a variety of
quantitative methods, including applying existing ML tools and NLP techniques.
In this seminar-style class, students will investigate and present several
academic readings from this emerging field. They will also work on a
semester-long research project in groups of 2-3 students. The goal of the
project is to identify a question or problem that can be addressed by analyzing
data available online, with the broader goal of addressing larger societal
issues such as (but not limited to) polarization, deviant behavior, and
misinformation.
Course goals and learning objectives
After successful completion of this course, you will be able
to:
Prerequisite skills
In terms of the required skills, students need to have basic
knowledge of statistics and NLP/ML, and a willingness to do interdisciplinary
research. Relevant concepts and tools will be reviewed briefly in class;
however, in-depth coverage of the fundamentals is not in the scope of this
course. This is NOT a machine learning or data mining course. Students will need
to have basic proficiency in programming.
Students should be prepared to apply what they have learned in prior
computer science classes in this course. You are expected to come with, or
quickly learn, the skills required for your group project. For example, your project
may require you to collect Twitter data using the Twitter API or analyze posts
from Reddit using pre-existing libraries (such as NLTK, scikit-learn, and Hugging
Face), which should not be too challenging if you already know high-level
languages like Python. Please make sure you are comfortable with this.
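As a rough gauge of the programming level assumed, the sketch below counts frequent terms across a handful of posts. It is only an illustration: the sample posts, the stopword list, and the function names `tokenize` and `term_counts` are hypothetical, and it uses Python's standard library in place of NLTK or scikit-learn so that it runs without extra installs.

```python
import re
from collections import Counter

# Hypothetical sample posts; a real project would load collected data instead.
posts = [
    "Misinformation spreads faster than corrections.",
    "Polarization online is rising, and polarization offline follows.",
    "Corrections rarely reach the people who saw the misinformation.",
]

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "is", "and", "than", "who", "a", "of", "to"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def term_counts(docs):
    """Count how often each token appears across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


counts = term_counts(posts)
# Print the three most frequent terms with their counts.
print(counts.most_common(3))
```

If reading and modifying a script like this feels comfortable, picking up the libraries named above for a project should be manageable.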