Title: Understanding and Managing Cascades on Large Graphs Abstract How do contagions spread in population networks? Which group should we market to, for maximizing product penetration? Will a given YouTube video go viral? Who are the best people to vaccinate? What happens when two products compete? The objective of this tutorial is to provide an intuitive and concise overview of most important theoretical results and algorithms to help us understand and manipulate such propagation-style processes on large networks. The tutorial will contain three parts: (a) Theoretical results on the behavior of fundamental models; (b) Scalable Algorithms for changing the behavior of these processes e.g., for immunization, marketing etc.; and (c) Empirical Studies of diffusion on blogs and on-line websites like Twitter. We finally conclude with future research directions. The problems we focus on are central in surprisingly diverse areas: from computer science and engineering, epidemiology and public health, product marketing to information dissemination. Our emphasis is on intuition behind each topic, and guidelines for the practitioner. Intended Duration 2 hrs + 0.5hr break Outline 1. Theory [Prakash, 45min] - Various diffusion models - Tipping Points of epidemics (static networks) - Tipping Point of epidemics (time-varying dynamic networks) - Competing contagions - what happens in the end? Related References: 1. R. M. Anderson and R. M. May. Infectious Diseases of Humans. Oxford University Press, 1991 2. H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42, 2000 3. D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, 2010 4. F. M. Bass. A new product growth for model consumer durables. Management Science, 15(5):215–227, 1969. 5. M. Granovetter. Threshold models of collective behavior. Am. Journal of Sociology, 83(6):1420–1443, 1978 6. P. S. Dodds and D. J. Watts. A generalized model of social and biological contagion. Journal of Theoretical Biology, 232:587–604, September 2004. 7. D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real networks. ACM TISSEC, 10(4), 2008 8. A. Ganesh, L. Massoulie, and D. Towsley. The effect of network topology on the spread of epidemics. In INFOCOM, 2005. 9. B. A. Prakash, D. Chakrabarti, M. Faloutsos, N. Valler, and C. Faloutsos. Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks. In ICDM 2011. 10. B. A. Prakash, H. Tong, N. Valler, M. Faloutsos, and C. Faloutsos. Virus propagation on time-varying networks: Theory and immunization algorithms. In ECML-PKDD, 2010 11. N. Valler, B. A. Prakash, H. Tong, M. Faloutsos, and C. Faloutsos. Epidemic Spread in Mobile Ad Hoc Networks: Determining the Tipping Point. In IFIP NETWORKING, 2011. 12. M. E. J. Newman. Threshold effects for two pathogens spreading on a network. Physical Review Letters, 95(10):108701, September 2005. 13. N. Pathak, A. Banerjee, and J. Srivastava. A generalized linear threshold model for multiple cascades. In ICDM, 2010. 14. B. A. Prakash, A. Beutel, R. Rosenfeld, and C. Faloutsos. Winner-takes-all: Competing Viruses on fair-play networks. In WWW 2012. 15. A. Beutel, B. A. Prakash, R. Rosenfeld, and C. Faloutsos. Interacting Viruses on a Network---Can both survive? In SIGKDD 2012. 2. Algorithms [Prakash, 45min] - Immunization - various flavors - Marketing - Finding ‘culprits’ Related References: 1. R. Cohen, S. Havlin, and D. ben Avraham. Efficient immunization strategies for computer networks and populations. Physical Review Letters, 91(24), Dec. 2003 2. H. Tong, B. A. Prakash, C. Tsourakakis, T. E. Rad, and C. Faloutsos. On the Vulnerability of Large Graphs. In ICDM 2010 3. B. A. Prakash, H. Tong, N. Valler, M. Faloutsos, and C. Faloutsos. Virus propagation on time-varying networks: Theory and immunization algorithms. In ECML-PKDD, 2010 4. M. Richardson and P. Domingos. Mining knowledge-sharing sites for viral marketing. In KDD, 2002. 5. D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In KDD, 2003. 6. J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. S. Glance. Cost-effective outbreak detection in networks. In KDD 2007. 7. W. Chen, C. Wang, and Y. Wang. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In KDD 2010. 8. A. Goyal, W. Lu, L. V. S. Lakshmanan, CELF++: Optimizing the Greedy Algorithm for Influence Maximization in Social Networks. In WWW 2011 (Companion Volume). 9. T. Lappas, E. Terzi, D. Gunopoulos, H. Mannila: Finding Effectors in Social Networks. In KDD 2010. 10. D. Shah and T. Zaman. Detecting sources of computer viruses in networks: Theory and experiment. In SIGMETRICS 2010. 3. Empirical Studies [Faloutsos, 45min] - Burst Analysis - Blog-cascades - Tweets and more Related References: 1. R. Crane and D. Sornette. Robust dynamic classes revealed by measuring the response function of a social system. In PNAS, 2008. 2. J. Yang and J. Leskovec. Patterns of temporal variation in online media. In WSDM, 2011. 3. J. Leskovec, L. Backstrom, and J. M. Kleinberg. Meme-tracking and the dynamics of the news cycle. In KDD, 2009. 4. J. Leskovec, M. McGlohon, C. Faloutsos, N. S. Glance, and M. Hurst. Patterns of cascading behavior in large blog graphs. In SDM, 2007. 5. M. McGlohon, J. Leskovec, C. Faloutsos, M. Hurst, and N. Glance. Finding patterns in blog shapes and blog evolution. In ICWSM, 2007. 6. J. Leskovec, L. A. Adamic, and B. A. Huberman. The dynamics of viral marketing. In EC, 2006. 7. Y. Matsubara, Y. Sakurai, B. A. Prakash, L. Li, and C. Faloutsos. Rise and Fall Patterns of Information Diffusion---Model and Implications. In SIGKDD, 2012. Related Tutorials: To the best of our knowledge, none of the tutorials specifically focused on propagation/diffusion-style processes on large networks as we do. Remotely related are tutorials by Leskovec [KDD 2011] on Social Media Analytics and by Faloutsos [WWW 2008, KDD 2009] on Graph Mining in general. Who should attend? The target audience is data mining and data management researchers, who wish to learn more about models and tools for dealing with cascade-like processes on large datasets. There will be special emphasis on the cross-disciplinary aspect of the concepts and tools involved. Prerequisites For maximum benefit, the expected prerequisite is an undergraduate degree in Computer Sc. or a related field. However, the tutorial’s emphasis is on the intuition behind the results and tools. Speaker Bios B. Aditya Prakash (http://www.cs.vt.edu/~badityap) is an Assistant Professor in the Computer Science Department at Virginia Tech. He graduated with a Ph.D. from the Computer Science Department at Carnegie Mellon University in Sept. 2012, and got his B.Tech (in CS) from the Indian Institute of Technology (IIT) - Bombay. He has published 16 refereed papers in major venues and holds two U.S. patents. His interests include Data Mining, Applied Machine Learning and Databases, with emphasis on large real-world networks and time-series. Christos Faloutsos (http://www.cs.cmu.edu/~christos) is a Professor at Carnegie Mellon University. He has received the Presidential Young Investigator Award by the National Science Foundation (1989), the Research Contributions Award in ICDM 2006, the SIGKDD Innovations Award (2010), eighteen “best paper” awards (including two “test of time” awards), and four teaching awards. He is an ACM Fellow, he has served as a member of the executive committee of SIGKDD; he has published over 200 refereed articles, 11 book chapters and one monograph. He holds six patents and he has given over 30 tutorials and over 10 invited distinguished lectures. His research interests include data mining for graphs and streams, fractals, database performance, and indexing for multimedia and bio-informatics data. Contact Info B. Aditya Prakash Computer Science Department, 2202 Kraft Avenue, Virginia Tech., Blacksburg VA 24060 USA Phone: +1-540-231-0906 Email: badityap@cs.vt.edu Christos Faloutsos Computer Science Department, 5000 Forbes Avenue, Carnegie Mellon University, Pittsburgh PA 15213 USA Phone: +1-412-268-1457 Email: christos@cs.cmu.edu