CS 6824

course number	instructor	title
CS 5984	B Viswanath	Security Analytics

The cyber threat landscape is diverse and includes principal threats such as malware, botnets, spam, compromised accounts, fake accounts, and phishing, to name a few. These threats are constantly evolving and can negatively impact how we interact on the web, with other people, and with our personal devices, and can even threaten our safety at home (with the proliferation of IoT devices). How can we better understand such threats and build protective measures? Given the diverse nature of these threats, where do we even start? An effective approach is to start from data: most systems leave vast traces of data when they operate, e.g., logs of user activity, machine activity, and communication. In this class, we will explore how such data, combined with appropriate algorithms, can provide powerful tools to analyze security threats.

We will start by covering the threat landscape from a data-driven perspective, following research that takes a measurement and analysis approach to understanding real-world threats. This will help us understand the incentives of attackers today, their attack strategies, and how attacks evolve over time. Next, we will learn to apply techniques from machine learning, graph analysis, and natural language processing in a security context. This includes understanding the strengths and limitations of different families of algorithms, and how certain combinations of data and algorithms may strengthen or weaken the “arms race” between attackers and defenders. Finally, we will cover the emerging space of data-driven attacks, where we consider malicious adversaries capable of leveraging data and machine learning (especially deep learning) to launch powerful attacks.

Topics covered:

Understanding common threats: Measurement studies of various threats, e.g., malicious crowdsourcing services, large-scale botnets, fake news, spam and phishing campaigns, reputation manipulation on social media, fake accounts, click fraud, and denial-of-service attacks.
Threats against machine learning systems: Topics in adversarial machine learning, e.g., adversarial samples to fool classifiers and model poisoning attacks.
Data and algorithms for better security: Application of machine learning (including deep learning), graph-based approaches, and NLP schemes to build robust defenses.
Data and algorithms for evil: Misuse of deep learning for attacks, e.g., deep learning to bypass existing defenses and deep learning to generate fake online content to mislead users.

Prerequisites:
Undergraduate courses on information systems and high-level programming languages. Students are expected to have a basic understanding of graph theory, algorithms, networks, and distributed systems, and to be ready to learn concepts from machine learning, NLP, and information retrieval. Knowledge of a scripting language such as Python or Perl would greatly aid you in your work. Students who enroll in the course are expected to be highly motivated to learn and work hard, and to be ready to make up for any prerequisite deficiencies they may have.