Discussion Notes
Jan 22
(courtesy Saverio Perugini and Balaji
Krishnamachari-Sampath)
Horting Hatches an Egg: A New Graph-Theoretic
Approach to Collaborative Filtering
- focuses on accuracy and the learning curve, the
Achilles' heel of collaborative filtering
- directed graph model: nodes are users,
directed edges represent predictability
- introduces the idea of
making an indirect recommendation
- concepts of horting and predicting are not
transitive in that a direct hammock is not
implied
- recommendations are not symmetric, à la
Siteseer
- evaluation is tricky
- takes care of the effusivity and negation
aspects of making recommendations (e.g. making use
of left shifting)
- items presented for evaluation to users are
partitioned into a "hot set" - to increase
commonality in order to recommend better; and a
"cold set" - to increase coverage
- incorporates a hierarchical classification
and creative links which violate that
hierarchical classification
- one of several techniques of the IRA,
situation analyzer
- tested against a mythical e-commerce site /
artificial data
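The directed-graph idea in the notes above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the paper's actual definitions of horting and predictability: an edge u -> v here means that v's ratings predict u's through a hypothetical linear map (slope, intercept), and an indirect recommendation follows the shortest chain of such edges to a user who has rated the item. A negative slope loosely mirrors the "negation" aspect and the intercept loosely mirrors an offset for effusive raters. The names `edges`, `ratings`, and `predict_indirect` are illustrative only.

```python
from collections import deque

# Hypothetical data: edges[u][v] = (slope, intercept) means "v predicts u",
# i.e. u's rating is approximated as slope * (v's rating) + intercept.
edges = {
    "ann": {"bob": (1.0, 1.0)},    # ann rates about one point higher than bob
    "bob": {"cat": (-1.0, 6.0)},   # negative slope: bob tends to disagree with cat
    "cat": {},
}
ratings = {"cat": {"item42": 2.0}}


def predict_indirect(user, item):
    """Breadth-first search along predictability edges for a rater of `item`.

    Returns the rating mapped back along the shortest path, or None if no
    reachable user has rated the item.
    """
    # Each queue entry holds a node and the linear maps collected on the way.
    queue = deque([(user, [])])
    seen = {user}
    while queue:
        node, maps = queue.popleft()
        if node != user and item in ratings.get(node, {}):
            value = ratings[node][item]
            # Apply the maps from the rater back toward the querying user.
            for slope, intercept in reversed(maps):
                value = slope * value + intercept
            return value
        for neighbor, linear_map in edges.get(node, {}).items():
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, maps + [linear_map]))
    return None
```

In this toy graph, ann has no direct edge to cat, yet a prediction for ann can still be made through bob, which is the sense of "indirect recommendation" used in the notes.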
Empirical Analysis of Predictive Algorithms for
Collaborative Filtering
- second only to Resnick and Varian as the most
widely referenced paper in recommender systems -
an indication of the immaturity of the
field
- a purely statistical model (correlation,
Pearson's r, vector similarity, inverse user
frequency)
- uses default votes to handle sparsity
- an extreme end of the spectrum of methods of
evaluating a recommender system
- focused on accuracy of predictions, not
efficiency
- distinguishes between memory-based CF (lazy
learning, non-parametric) and model-based CF (eager
learning, parametric); the latter involves a lot of
preprocessing, so incrementality becomes an issue
(almost impossible with a neural network or
Bayesian network)
- item by item recommendations versus ranked
lists
- makes use of protocols developed for other
domains
- makes vivid a very important aspect of
recommender systems, that is, human satisfaction
cannot be replaced with functions
- does not address incrementality and
does not capture the underlying social process
- the other end of the spectrum is going out
and doing satisfaction surveys, studies, etc.
(HCI) - purely social
- Is there something in between these two
extremes that can make CS folks happy???
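As a rough illustration of the memory-based approach noted above, the following sketch computes Pearson correlation weights over co-rated items and predicts a vote as the active user's mean plus a normalized, weighted sum of the neighbors' deviations from their own means. The toy `votes` data and the function names are assumptions made for illustration; default voting and inverse user frequency are omitted for brevity.

```python
from math import sqrt

# Hypothetical toy votes: user -> {item: vote}
votes = {
    "a": {"i1": 5, "i2": 3, "i3": 4},
    "b": {"i1": 4, "i2": 2, "i3": 3, "i4": 5},
    "c": {"i1": 1, "i2": 5, "i3": 2, "i4": 1},
}


def mean_vote(user):
    """Mean over all items the user has voted on."""
    return sum(votes[user].values()) / len(votes[user])


def pearson(a, b):
    """Pearson correlation between users a and b over co-rated items."""
    common = set(votes[a]) & set(votes[b])
    if len(common) < 2:
        return 0.0
    ma, mb = mean_vote(a), mean_vote(b)
    num = sum((votes[a][j] - ma) * (votes[b][j] - mb) for j in common)
    den = sqrt(sum((votes[a][j] - ma) ** 2 for j in common)
               * sum((votes[b][j] - mb) ** 2 for j in common))
    return num / den if den else 0.0


def predict(active, item):
    """Active user's mean plus the weighted deviation of neighbors' votes."""
    neighbors = [u for u in votes if u != active and item in votes[u]]
    weights = {u: pearson(active, u) for u in neighbors}
    norm = sum(abs(w) for w in weights.values())
    if norm == 0:
        return mean_vote(active)  # no usable neighbors: fall back to the mean
    deviation = sum(w * (votes[u][item] - mean_vote(u))
                    for u, w in weights.items())
    return mean_vote(active) + deviation / norm
```

Nothing is precomputed here: every call to `predict` scans the raw votes, which is what makes the approach "lazy" and trivially incremental, at the cost of work at query time.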
Eigentaste: A Constant Time Collaborative
Filtering Algorithm
- accurate and efficient recommendations to
users in constant time
- of course, not really constant time, because
of all the work they do up front
- distinguishes between so-called universal queries and
user-selected queries
- Universal queries (i.e. every user
rates the same n items, called the
"gauge" set - arbitrarily chosen?)
yield a dense matrix, a solution to
sparsity.
- validated their work in the domain of jokes
(Jester), a domain in which there is minimal
variation in opinion among the
people who are providing evaluations
- not dynamic
- evaluation done once again via a
function
- function could be a black box (e.g. a neural
network)
- function misses other roles
- One has to realize that in a two-mode model
(e.g. people and movies), the
second mode (e.g. movies) brings the
first mode together.
- concept of an "affiliation" network:
- people - primary
- movies, books, CDs, etc. -
secondary
- While functional mapping (using a training
set and a test set) is the most widely accepted
and easiest evaluation technique in the field of
recommender systems, it is also
the shallowest.
- Bottom line is that there is a social element
to recommender systems
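The two-mode observation above, that the second mode brings the first mode together, can be made concrete with a small sketch. It assumes a hypothetical `rated` mapping from people to the items they evaluated and projects that affiliation network onto people alone, weighting each pair by the number of items they share. The data and the name `project_onto_people` are illustrative only.

```python
from itertools import combinations

# Hypothetical two-mode data: person (primary) -> set of movies (secondary).
rated = {
    "ann": {"Alien", "Brazil", "Casablanca"},
    "bob": {"Alien", "Brazil"},
    "cat": {"Casablanca"},
    "dan": {"Dune"},
}


def project_onto_people(affiliations):
    """One-mode projection: weight of (p, q) is the number of shared items."""
    weights = {}
    for p, q in combinations(sorted(affiliations), 2):
        shared = len(affiliations[p] & affiliations[q])
        if shared:
            weights[(p, q)] = shared
    return weights


people_graph = project_onto_people(rated)
```

Here ann is linked to both bob and cat only through the movies they have in common, while dan, who shares nothing with anyone, stays isolated in the people-only graph.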