Brainmaker

Nanos gigantium humeris insidentes!

Recommendation System

  • November 4, 2013 6:22 am

From: http://www.cambridge.org/us/academic/subjects/computer-science/knowledge-management-databases-and-data-mining/mining-massive-datasets

The distinction between the physical and on-line worlds has been called the long tail phenomenon, and it is suggested in Fig. 9.2. The vertical axis represents popularity (the number of times an item is chosen). The items are ordered on the horizontal axis according to their popularity. Physical institutions provide only the most popular items, to the left of the vertical line, while the corresponding on-line institutions provide the entire range of items: the tail as well as the popular items.

Fig. 9.2: The long tail: physical institutions can only provide what is popular, while on-line institutions can make everything available

There are two basic architectures for a recommendation system:

1. Content-Based systems focus on properties of items. Similarity of items is determined by measuring the similarity in their properties.

2. Collaborative-Filtering systems focus on the relationship between users and items. Similarity of items is determined by the similarity of the ratings of those items by the users who have rated both items.

1. Content-based

Item Profile: In a content-based system, we must construct for each item a profile, which is a record or collection of records representing important characteristics of that item.

User Profile:

We not only need to create vectors describing items; we also need to create vectors with the same components that describe the user's preferences.

With profile vectors for both users and items, we can estimate the degree to which a user would prefer an item by computing the cosine distance between the user's and item's vectors. The smaller the distance, the more closely the item matches the user's preferences.
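As a minimal sketch of that comparison, the snippet below takes cosine distance to mean the angle between the two vectors (an assumption about the definition in use) and ranks two items for one user. The three-component profiles, the names `user`, `item_a`, and `item_b`, and the handling of all-zero vectors are made up for illustration.

```python
import math


def cosine_distance(u, v):
    """Angle in radians between two profile vectors with the same components."""
    if len(u) != len(v):
        raise ValueError("profile vectors must have the same components")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: an all-zero profile carries no signal, so treat it
        # as orthogonal to everything.
        return math.pi / 2
    # Clamp to [-1, 1] so floating-point rounding cannot break acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


# Hypothetical profiles over three made-up features.
user = [1.0, 0.0, 1.0]
item_a = [1.0, 0.0, 1.0]  # same direction as the user's profile
item_b = [0.0, 1.0, 0.0]  # shares no feature with the user

# Rank candidate items from smallest to largest distance.
ranked = sorted(
    [("item_a", item_a), ("item_b", item_b)],
    key=lambda pair: cosine_distance(user, pair[1]),
)
```

Here `ranked` lists `item_a` first, since its vector points in the same direction as the user's, while `item_b` is orthogonal to it.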

2. Collaborative Filtering

Measure Similarity of Users
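One possible way to measure user similarity, sketched here as an illustration and not as the method these notes prescribe, is the Jaccard similarity of the sets of items two users have rated. The `ratings` dictionary and the user and item names are hypothetical, and this measure ignores the rating values themselves.

```python
def jaccard_similarity(items_a, items_b):
    """Size of the intersection over size of the union of two item sets."""
    a, b = set(items_a), set(items_b)
    union = a | b
    if not union:
        # Assumption: two users with no ratings at all are not similar.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical utility matrix: user -> {item: rating}.
ratings = {
    "alice": {"item1": 5, "item2": 3},
    "bob": {"item1": 4, "item3": 2},
    "carol": {"item4": 1},
}


def most_similar_user(target, ratings):
    """Return the other user whose rated-item set best overlaps the target's."""
    others = [u for u in ratings if u != target]
    return max(others, key=lambda u: jaccard_similarity(ratings[target], ratings[u]))
```

With this data, `most_similar_user("alice", ratings)` picks `bob`, the only user who shares a rated item with `alice`.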

Cluster users or items