(c) Naren Ramakrishnan and the students of CS6604, Spring 2001. Permission to use ideas about the organization of topics, slides, and discussion notes is granted, provided suitable acknowledgements and citations are made.

CS 6604 Lectures

Strand Diagram

Inclusion of a paper in the reading list does not constitute endorsement by the instructor. Outlines and topics are tentative and subject to being pushed around.

Jan 15: [Introduction Slides], Strand Diagram, and Basic Dichotomies of Recommender Systems. Reading assignment: Take a peek at the Communications of the ACM March 1997 and Aug 2000 issues and classify the systems there according to the various dimensions induced by the dichotomies.



The Modeling Dichotomy

Jan 17: Review of IR perspectives. Basic problems of recommendation. Details of content-based and collaborative approaches. Examples from search engines. Endemic problems with ratings and evaluations. [Slides]

Jan 19: Reading Assignments: [Discussion Notes]

    P. Resnick and H. Varian, Recommender Systems, Communications of the ACM, Vol. 40, No. 3, pages 56-58, March 1997. [Read all Rec papers from this issue]

Jan 22: Evaluation and comparison of collaborative filtering algorithms. [Discussion Notes]

Jan 24: The use of data mining and machine learning techniques to learn mappings and internal representations. Implications for maintaining and updating mappings when the data is dynamic. How the choice of technique (unfortunately) affects evaluation criteria. Explainability and believability of recommendations. Motivations from PYTHIA. We will survey all the articles we have seen so far from these perspectives.



Targeting a Recommender System

Jan 26: Options and opportunities. Reading assignments from the Aug 2000 CACM (place the various other systems we have seen so far in this context). Exercise: Find 5-10 web sites that target customers at various levels of the targeting dichotomy (see first day's slides for more info). [Discussion Notes]

Jan 29: How data mining algorithms and techniques influence (and are influenced by) targeting dichotomies. The role of clustering in recommender systems. Can recommender systems be designed independently of the decided level of targeting? [Discussion Notes]

Jan 31: Improving targeting by observing browsing behavior. One example is the following reading assignment. Exercise: Identify 2-3 sites that use indicators such as these to improve their targeting. [Discussion Notes]



The Matview

Feb 2: On why everything is a matrix. Things that can be done with matrices. Connections with age-old IR research. Review of linear algebra and algorithmics. [Discussion Notes]

Feb 5: Introduction to Latent Semantic Indexing [Discussion Notes]:

Feb 7: Minor tweaks to this idea [Discussion Notes]:

Feb 9: Generalizations of the idea [Discussion Notes]:

Feb 12: Interesting Variations [Discussion Notes]:
    D. D. Lee and H. S. Seung, Learning the Parts of Objects by Non-Negative Matrix Factorization, Nature, Vol. 401, pages 788-791, 1999. Not available electronically (I think); I will hand out copies in class. A web tutorial on learning dynamic systems is available that covers a variety of pertinent algorithms, such as SVD, EM, and neural networks. A discussion site for this paper is also online.

Feb 14: Eigenvectors in the real world [Discussion Notes]:
    S. Chakrabarti, B.E. Dom, S. Ravi Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, D. Gibson, and J. Kleinberg, Mining the Web's Link Structure, IEEE Computer, Vol. 32, No. 8, pages 60-67, August 1999. This paper describes the algorithm behind the much-acclaimed CLEVER project at IBM Almaden.



The Graphview

Feb 16: On why everything is a graph. Things that can be done with graphs. Graph perspectives in recommender systems. Details of this strand of research [Discussion Notes].

Feb 19: Mining for graph-based communities (this is really an expansion of the sidebar from Feb 14's reading) [Discussion Notes]:

Feb 21: Modeling small world networks [Discussion Notes]:

Feb 23: More about small world networks [Discussion Notes]:
    R. Albert, H. Jeong, and A.-L. Barabási, Diameter of the World-Wide Web, Nature, Vol. 401, pages 130-131, 1999.

    L.A.N. Amaral, A. Scala, M. Barthelemy, and H.E. Stanley, Classes of Behavior of Small-World Networks, cond-mat/0001458, January 2000.

Feb 26: Mapping the Web: Read the following two papers and determine how/if search engines could exploit the information mined from the first study. [Discussion Notes]
    A. Broder et al., Graph Structure in the Web, in Proceedings of the International World Wide Web Conference, 2000.

    An analysis of the coverage of search engines: S. Lawrence and C. Lee Giles, Searching the World Wide Web, Science, Vol. 280, No. 5360, pages 98-100, 1998.

    Optional Reading: A survey of link analysis as used in search engines: M. Henzinger, Hyperlink Analysis for the Web, IEEE Internet Computing, pages 45-50, Jan-Feb 2001 (we have covered most of this already).

Feb 28: Applications of Graph Theory in Recommender Systems - An example of mining, modeling, and exploiting. I will describe Batul Mirza's thesis research.


Midterm Class Presentations

Mar 2: Class Presentations.

SPRING BREAK! :) :) :)

Mar 12: Class Presentations (contd.).


Content Modeling, Information Integration, and Interaction

Mar 14: Introduction to this strand. Overview of content modeling, web data extraction, information integration. A good tutorial on content modeling (some topics only), pertaining to web-DB integration, is available in:

Mar 16: Modeling web resources [Discussion Notes]:

Mar 19: Mining Semistructure [Discussion Notes]:

Mar 21: More on Content Modeling:
    Learning to correct for single-spelling errors (e.g. in search engine queries). A presentation by Rob Capra.

Mar 23: Contextual Abstractions [Discussion Notes]:

Mar 26: Modeling Interaction [Discussion Notes]:

Mar 28: Laws of Surfing and their Uses:

Mar 30: Task-Based System Designs:

Apr 2: Integrated Approaches to Building Hot-Rods:

Transcoding, Intermediaries, and Functional Indirection

Apr 4: Introduction to the role and nature of intermediaries on the web. A good starting point is this (rather industry-ish) IBM article. Think specifically about the role of recommender systems as intermediaries in a larger personalization context.

Apr 6: Indirection as a design principle. A good example is the recently proposed solution for broken hyperlinks (again, relate this back to recommender systems):

Apr 9: Personalization as a necessary ingredient in mobile systems:

Apr 11: Location-sensitive Personalization (e.g., recommending a McDonald's near your current location). I have been unable to find good technical papers that describe this topic. Here's a description of a project that can be discussed in class:

Apr 13: Standards and Conventions for Large-Scope Solutions. We will look at words that end with F, capability description standards, and the role that "personalization protocols" can play. I will also throw in Ethics, Privacy, and Business Stuff (ideally, they deserve their own classes, but we are running out of slots). Some references for standards:

Misc.

Apr 16: Detailed Project Presentations Start! :)

