Discussion Notes
Feb 19, 2001
(courtesy Saverio Perugini)
Trawling the web for emerging
cyber-communities
- an example of mining for structure: bipartite graphs /
cores
- in a core, the fans are specialized
hubs and the centers are
authorities
- a lot of preprocessing / data cleaning up
front, heuristics
- organized data (sequences) such that main
memory was enough
- Descriptive versus constructive view of algorithms:
random graph models allow us to quantify the probability of
finding some type of structure, or the resources needed by
an algorithm to find it.
- Observe the smoke signal effect in the power-law graphs: the solution is to
use the cumulative frequency distribution rather than the raw frequency distribution
- Pruning criterion / property: Useful for constraining
search. Here, the idea is that if an itemset does not have the
property (sufficient support), then no superset of it can have
the property.
- read the CACM article `Discovering Shared Interests Using Graph
Analysis,' a paper well ahead of its time that addresses more
or less the same ideas
- Prelude to next class: observation-based
systems - can one figure out what a system
(e.g., a search engine) is doing by simply issuing
queries to it?
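
The pruning property from the notes can be tried out on the
bipartite-core problem itself. The sketch below is not the paper's
algorithm. It assumes each fan page's out-links are treated as a
"transaction", so that a set of centers co-cited by at least min_fans
fans plays the role of an itemset with enough support. The function
name frequent_center_sets, its parameters, and the toy link data are
all hypothetical.

```python
from itertools import combinations


def frequent_center_sets(links, min_fans, set_size):
    """Level-wise search for sets of `set_size` centers that are
    co-cited by at least `min_fans` fans.

    links: dict mapping a fan page to the set of pages it points to.
    Returns a dict mapping frozenset(centers) -> set(fans).
    """
    # Level 1: single centers that have enough in-links.
    fans_of = {}
    for fan, targets in links.items():
        for target in targets:
            fans_of.setdefault(frozenset([target]), set()).add(fan)
    level = {c: f for c, f in fans_of.items() if len(f) >= min_fans}

    for size in range(2, set_size + 1):
        candidates = {}
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) != size or union in candidates:
                continue
            # Pruning step: if any (size - 1)-subset failed the support
            # test, no superset of it can pass, so skip this candidate.
            if any(frozenset(sub) not in level
                   for sub in combinations(union, size - 1)):
                continue
            # Fans pointing to every center in the candidate set.
            fans = level[a] & level[b]
            if len(fans) >= min_fans:
                candidates[union] = fans
        level = candidates
    return level


# Hypothetical toy data: three fans that all cite a, b and c.
links = {
    "f1": {"a", "b", "c"},
    "f2": {"a", "b", "c", "d"},
    "f3": {"a", "b", "c"},
    "f4": {"d", "e"},
}
cores = frequent_center_sets(links, min_fans=3, set_size=3)
# cores holds one entry: centers {a, b, c} with fans {f1, f2, f3}
```

Pages d and e are dropped at the first level, so no candidate set
containing them is ever generated. This is the constraint on search
that the pruning property provides.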
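
The remark about raw versus cumulative frequency distributions can be
made concrete with a small computation. This is a minimal sketch that
assumes the "smoke signal effect" refers to the scattered tail of a
raw frequency plot, where most large degrees occur once or not at all.
The function name raw_and_cumulative and the degree list are
hypothetical.

```python
from collections import Counter


def raw_and_cumulative(degrees):
    """Return (raw, cumulative) frequency tables for a list of degrees.

    raw[k]        = number of nodes with degree exactly k
    cumulative[k] = number of nodes with degree >= k
    """
    raw = Counter(degrees)
    cumulative = {}
    running = 0
    # Walk from the largest degree down, accumulating counts.
    for k in sorted(raw, reverse=True):
        running += raw[k]
        cumulative[k] = running
    return dict(raw), cumulative


# Hypothetical heavy-tailed degree sample.
degrees = [1] * 8 + [2] * 4 + [3] * 2 + [5, 9, 40]
raw, cum = raw_and_cumulative(degrees)
# raw -> {1: 8, 2: 4, 3: 2, 5: 1, 9: 1, 40: 1}
# cum -> {40: 1, 9: 2, 5: 3, 3: 5, 2: 9, 1: 17}
```

In the raw table the tail flattens into a run of counts equal to 1
separated by gaps, while the cumulative counts decrease steadily as
the degree grows, which gives a cleaner curve on a log-log plot.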