Discussion Notes

(courtesy Saverio Perugini and Balaji Krishnamachari-Sampath)

focusing on accuracy and learning curve - the Achilles heel of collaborative filtering
directed graph model: nodes - users, directed edges - predictability
introduces the idea of making an indirect recommendation
concepts of horting and predicting are not transitive in that a direct hammock is not implied
recommendations are not symmetric - à la Siteseer
evaluation is tricky
takes care of the effusivity and negation aspects of making recommendations (e.g. making use of left shifting)
items presented for evaluation to users are partitioned into a "hot set" - to increase commonality in order to recommend better; and a "cold set" - to increase coverage
incorporates a hierarchical classification and creative links which violate that hierarchical classification
one of several techniques of the IRA, situation analyzer
tested against mythical e-commerce site / artificial data
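
The directed-graph idea in the notes above can be sketched roughly as follows. This is a toy illustration, not the paper's algorithm: the horting rule (a fraction of commonly rated items), the thresholds `factor` and `max_error`, the restriction of the transform to a sign s in {+1, -1} plus an integer shift t, and all names and data are assumptions made for this example. It shows why edges are directed, why horting is not transitive, and how a prediction can be routed through an intermediate user.

```python
from collections import deque
from itertools import product

# Invented ratings on a 1..5 scale: user -> {item: rating}.
ratings = {
    "ann": {"a": 1, "b": 2, "c": 3},
    "bob": {"b": 4, "c": 5, "d": 4, "e": 2},   # rates like ann, shifted up by 2
    "cat": {"c": 1, "d": 2, "e": 4, "g": 1},   # rates opposite to bob (negation)
}

def horts(u, v, factor=0.6):
    """Assumed rule: u horts v if v has rated a large enough fraction of u's items."""
    common = ratings[u].keys() & ratings[v].keys()
    return len(common) / len(ratings[u]) >= factor

def best_transform(u, v, shifts=range(-6, 7)):
    """Sign s and shift t minimizing the mean of |r_u - (s * r_v + t)| over common items."""
    common = ratings[u].keys() & ratings[v].keys()

    def err(st):
        s, t = st
        return sum(abs(ratings[u][j] - (s * ratings[v][j] + t)) for j in common) / len(common)

    s, t = min(product((1, -1), shifts), key=err)
    return s, t, err((s, t))

def predicts(v, u, max_error=0.5):
    """v predicts u if u horts v and some transform maps v's ratings close to u's.
    Returns the transform (s, t), or None."""
    if not horts(u, v):
        return None
    s, t, e = best_transform(u, v)
    return (s, t) if e <= max_error else None

def predict(user, item, max_hops=3):
    """Breadth-first search along 'is predicted by' edges, then map the rating
    found at the end of the path back through each transform."""
    queue = deque([(user, [])])
    seen = {user}
    while queue:
        cur, path = queue.popleft()
        if cur != user and item in ratings[cur]:
            value = ratings[cur][item]
            for s, t in reversed(path):   # undo the hops from the far end inward
                value = s * value + t
            return value
        if len(path) == max_hops:
            continue
        for other in ratings:
            if other not in seen:
                st = predicts(other, cur)
                if st is not None:
                    seen.add(other)
                    queue.append((other, path + [st]))
    return None
```

In this toy data ann horts bob and bob horts cat, but ann does not hort cat (they share only one item), so a rating for "g" can reach ann only indirectly through bob.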

second to Resnick and Varian as the most widely referenced paper in recommender systems - an indication of the immaturity of the field
a purely statistical model (correlation, Pearson's r, vector similarity, inverse user frequency)
use default votes to handle sparsity
an extreme end of the spectrum of methods of evaluating a recommender system
focused on accuracy of predictions, not efficiency
distinguishes between memory-based CF (lazy learning, non-parametric) and model-based CF (eager learning, parametric); the latter involves a lot of preprocessing, so incrementality becomes an issue - almost impossible with a neural network or Bayesian network
item by item recommendations versus ranked lists
make use of protocols developed for other domains
makes vivid a very important aspect of recommender systems, that is, human satisfaction cannot be replaced with functions
do not address incrementality and do not capture the underlying social process
the other end of the spectrum is going out and doing satisfaction surveys, studies, etc. (HCI) - purely social
Is there something in between these two extremes that can make CS folks happy?
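
As a concrete picture of the memory-based, correlation-weighted end of the spectrum described above, here is a minimal sketch. It assumes one common formulation: the predicted vote is the active user's mean plus a weighted, normalized sum of other users' deviations from their own means, with Pearson's r over commonly rated items as the weight. The data, the function names, and the choice to take each mean over all of a user's votes are assumptions for illustration; default votes and inverse user frequency are left out.

```python
from math import sqrt

# Invented votes on a 1..5 scale: user -> {item: vote}.
votes = {
    "u1": {"m1": 5, "m2": 3, "m3": 4},
    "u2": {"m1": 4, "m2": 2, "m3": 3, "m4": 5},
    "u3": {"m1": 1, "m2": 5, "m4": 2},
}

def mean_vote(u):
    """Mean over every item the user has voted on."""
    return sum(votes[u].values()) / len(votes[u])

def pearson(a, i):
    """Correlation-style weight between users a and i over their common items."""
    common = votes[a].keys() & votes[i].keys()
    da = [votes[a][j] - mean_vote(a) for j in common]
    di = [votes[i][j] - mean_vote(i) for j in common]
    num = sum(x * y for x, y in zip(da, di))
    den = sqrt(sum(x * x for x in da) * sum(y * y for y in di))
    return num / den if den else 0.0

def predict(a, item):
    """Active user's mean plus the weighted average deviation of the other voters.
    The result is not clipped to the rating scale."""
    terms = [(pearson(a, i), votes[i][item] - mean_vote(i))
             for i in votes if i != a and item in votes[i]]
    norm = sum(abs(w) for w, _ in terms)
    if norm == 0:
        return mean_vote(a)
    return mean_vote(a) + sum(w * d for w, d in terms) / norm
```

Nothing is learned ahead of time here: every call to `predict` scans the whole vote table, which is the lazy-learning trade-off the notes contrast with model-based methods.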

accurate and efficient recommendations to users in constant time
of course, not really constant time, because of all the work they do up front
distinguishes between so-called universal queries and user-selected queries
Universal queries (i.e. every user rates the same n items, called the "gauge" set - arbitrarily chosen?) yield a dense matrix, a solution to sparsity.
validated their work in the domain of jokes (Jester) - a domain in which there is minimal variation in thoughts between the different people who are providing evaluations
not dynamic
evaluation done once again via a function
function could be a black box (e.g. a neural network)
function misses other roles
One has to realize that in a two-mode model (e.g. people and movies), the second mode (e.g. movies) brings the first mode together.
concept of an "affiliation" network:
- people - primary
- movies, books, cds, etc. - secondary
While functional mapping (using a training set and a test set) is the most widely accepted and easiest evaluation technique in the field of recommender systems, it is also the most shallow.
The bottom line is that there is a social element to recommender systems.
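
The two-mode observation above (the secondary mode brings the primary mode together) can be sketched as a one-mode projection of an affiliation network. The data and the function name are invented for illustration; the edge weight used here, the number of shared items, is just one simple choice.

```python
from itertools import combinations

# Invented two-mode data: person (primary mode) -> items (secondary mode).
affiliations = {
    "ann": {"Alien", "Brazil"},
    "bob": {"Brazil", "Casablanca"},
    "cat": {"Casablanca"},
}

def project_people(aff):
    """One-mode projection onto people: link two people if they share an item,
    weighting the link by how many items they share."""
    edges = {}
    for p, q in combinations(sorted(aff), 2):
        shared = aff[p] & aff[q]
        if shared:
            edges[(p, q)] = len(shared)
    return edges

people_network = project_people(affiliations)
```

Here ann and cat share no movie, yet both are tied to bob, so the projected network connects them through him. This is the kind of social structure a train/test error function does not look at.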