Fundamentals of the Recommendation System

Based on the [Advanced Machine Learning with TensorFlow on Google Cloud Platform](https://www.coursera.org/learn/recommendation-models-gcp) course, this article walks through the fundamental recommendation system models: content-based, collaborative filtering, and knowledge-based.


Content-Based Model

Using attributes of items to recommend new items to a user

How does it work?

Step 1. User-Item Rating Matrix

| 🙎‍♀️ / Movie | Rating |
|---|---|
| Minions | 7 |
| A Star is Born | 4 |
| Aladdin | 10 |

Let’s just assume that we are given the table above. In it, the user 🙎‍♀️ gave a 7-star rating to “Minions”, a 4-star rating to “A Star is Born”, and a 10-star rating to “Aladdin”; we have no rating information about any other movie.

Step 2. Item Feature Matrix

| Movie \ Genre | Fantasy | Action | Cartoon | Drama | Comedy |
|---|---|---|---|---|---|
| Minions | 0 | 0 | 1 | 0 | 1 |
| A Star is Born | 0 | 0 | 0 | 1 | 0 |
| Aladdin | 1 | 0 | 0 | 0 | 1 |

For the rated movies, we now build an item feature matrix of genres. We could also use themes, actors/directors, professional ratings, movie summary text, stills from the movie, or the movie trailer as features. In the table above, “Minions” belongs to the Cartoon and Comedy categories, “A Star is Born” is a Drama, and “Aladdin” is a Fantasy and Comedy movie.

Step 3. User Feature Vector

Then take the dot product of the user-item rating vector with the item feature matrix: weight each movie’s feature row by the user’s rating for it and sum the results column-wise, as below.

| Fantasy | Action | Cartoon | Drama | Comedy |
|---|---|---|---|---|
| 10 | 0 | 7 | 4 | 17 |

This row vector lives in the five-dimensional genre space we use to represent movies. Normalizing it (dividing each entry by the total, 10 + 0 + 7 + 4 + 17 = 38) gives the user feature vector.

| Fantasy | Action | Cartoon | Drama | Comedy |
|---|---|---|---|---|
| 0.26 | 0 | 0.18 | 0.11 | 0.45 |

This is the user feature vector. Note that the “0” for the Action genre does not mean the user dislikes Action; it only means that none of the movies they have rated so far has the Action feature.

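The arithmetic of Steps 1-3 fits in a few lines of NumPy. The sketch below just replays the numbers above; the variable names (`ratings`, `item_features`, `user_vector`) are illustrative placeholders rather than anything from the course.

```python
import numpy as np

genres = ["Fantasy", "Action", "Cartoon", "Drama", "Comedy"]

# Step 1: the user's ratings for the movies she has rated.
ratings = np.array([7, 4, 10])            # Minions, A Star is Born, Aladdin

# Step 2: item feature matrix (one row per rated movie, one column per genre).
item_features = np.array([
    [0, 0, 1, 0, 1],                      # Minions
    [0, 0, 0, 1, 0],                      # A Star is Born
    [1, 0, 0, 0, 1],                      # Aladdin
])

# Step 3: weight each movie's features by its rating and sum column-wise,
# i.e. the rating vector dotted with the item feature matrix.
weighted_sum = ratings @ item_features    # [10, 0, 7, 4, 17]

# Normalize so the entries sum to 1, giving the user feature vector.
user_vector = weighted_sum / weighted_sum.sum()
print(genres)
print(user_vector.round(2))               # [0.26 0.   0.18 0.11 0.45]
```
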
Step 4. User Rating Prediction

So how do we use these pieces to build the content-based recommendation? Now we have a new item feature matrix, shown below. The movies in it could be either seen or unseen by the user (in practice we can simply predict ratings for all of them and drop the already-seen movies at the final recommendation step). Here, however, we only consider four unseen movies.

| Movie \ Genre | Fantasy | Action | Cartoon | Drama | Comedy |
|---|---|---|---|---|---|
| Harry Potter | 1 | 1 | 0 | 0 | 1 |
| The Dark Knight Rises | 1 | 1 | 0 | 1 | 0 |
| Incredible | 0 | 1 | 1 | 0 | 1 |
| Memento | 0 | 0 | 0 | 1 | 0 |

Then multiply the user feature vector component-wise with each new movie’s feature vector and sum row-wise, i.e. take the dot product. This gives the dot-product similarity between the user and each of the four movies.

| Harry Potter | The Dark Knight Rises | Incredible | Memento |
|---|---|---|---|
| 0.71 | 0.37 | 0.63 | 0.11 |

These are the predicted ratings for each movie: the higher the score, the more likely (hopefully) the user is to enjoy the movie. So we should recommend “Harry Potter” and “Incredible” to this user, but not the other two. Simple, isn’t it?

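Step 4 is then a single matrix-vector product. Continuing the NumPy sketch (again, `unseen_features` and `scores` are made-up names used only for illustration):

```python
import numpy as np

# User feature vector from Step 3 (Fantasy, Action, Cartoon, Drama, Comedy).
user_vector = np.array([0.26, 0.0, 0.18, 0.11, 0.45])

unseen = ["Harry Potter", "The Dark Knight Rises", "Incredible", "Memento"]
unseen_features = np.array([
    [1, 1, 0, 0, 1],                      # Harry Potter
    [1, 1, 0, 1, 0],                      # The Dark Knight Rises
    [0, 1, 1, 0, 1],                      # Incredible
    [0, 0, 0, 1, 0],                      # Memento
])

# Step 4: component-wise multiply and sum row-wise == one dot product per movie.
scores = unseen_features @ user_vector    # [0.71, 0.37, 0.63, 0.11]

# Rank the unseen movies by predicted score, highest first.
for title, score in sorted(zip(unseen, scores), key=lambda x: -x[1]):
    print(f"{title}: {score:.2f}")
```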

Collaborative-Filtering Model

Using similarities between users and items simultaneously, learned from past user-item interactions, to recommend new items to a user

Matrix Factorization

Assume we have a user-item interaction matrix for multiple users, indicating whether or not each user has watched each movie.

| User \ Movie | Harry Potter | Incredible | Shrek | Dark Knight Rises | Memento |
|---|---|---|---|---|---|
| 🙎‍♀️ | O | | O | O | |
| 🙎 | | O | | | O |
| 🙎🏾‍♂️ | O | O | O | | |
| 🙍🏼‍♀️ | | | | O | O |

As more users and movies are collected, the user-item matrix grows larger and sparser, so it normally needs to be shrunk down to a more tractable size through matrix factorization. The factorization splits the matrix into row factors and column factors that are essentially user and item embeddings. Let A be the whole user-item interaction matrix; it is decomposed into U (user embeddings) and V (item embeddings), both much smaller:

\(A \approx U \times V^T\)

Each user and each item becomes a k-dimensional point in a shared embedding space. The embeddings can be learned from the data (for example via SVD, or related compression techniques such as PCA), keeping only the generalities the data supports, called latent factors. This also saves space: the factors need k × (#users + #items) numbers instead of #users × #items for the full matrix, a saving whenever k is smaller than half the harmonic mean of the number of users and items. The weak spot of collaborative filtering is the cold-start problem: we have no interaction data at all for brand-new users or items. In that case, we can fall back on averages over the existing users/items until enough interactions have been collected.

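To make the \(A \approx U \times V^T\) idea concrete, here is a minimal sketch that factorizes the small watch matrix above with a truncated SVD, one of the techniques mentioned for learning embeddings. Production collaborative filtering typically learns the factors differently (for example with alternating least squares on the sparse matrix), so treat this purely as an illustration; all names are placeholders.

```python
import numpy as np

# Binary watch matrix from the table above (1 = watched, 0 = not watched).
A = np.array([
    [1, 0, 1, 1, 0],                      # 🙎‍♀️
    [0, 1, 0, 0, 1],                      # 🙎
    [1, 1, 1, 0, 0],                      # 🙎🏾‍♂️
    [0, 0, 0, 1, 1],                      # 🙍🏼‍♀️
], dtype=float)

k = 2                                     # number of latent factors

# Truncated SVD: keep only the top-k singular values/vectors.
u, s, vt = np.linalg.svd(A, full_matrices=False)
U = u[:, :k] * s[:k]                      # user embeddings, shape (4, k)
V = vt[:k, :].T                           # item embeddings, shape (5, k)

# A is approximated by the product of the two much smaller factors.
A_hat = U @ V.T
print(np.round(A_hat, 2))
```

With k = 2, the factors store 2 × (4 + 5) = 18 numbers versus 4 × 5 = 20 for the full matrix, a modest saving here that grows quickly at realistic scale.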

Summary

pros and cons

| Method | Content-Based | Collaborative Filtering | Knowledge-Based |
|---|---|---|---|
| pros | - No need for data about other users<br>- Can recommend niche items | - No domain knowledge<br>- Serendipity<br>- Great starting point | - No interaction data needed<br>- Usually high-fidelity data from user self-reporting |
| cons | - Need domain knowledge<br>- Only safe recommendations | - Cold start<br>- Sparsity<br>- No context features | - Need user data<br>- Need to be careful with privacy concerns |

structured/unstructured information that can be used

| Type | Content-Based | Collaborative Filtering | Knowledge-Based |
|---|---|---|---|
| structured | - Genres<br>- Themes<br>- Actors/directors involved<br>- Professional ratings | - User ratings<br>- User views<br>- User wishlist/cart history<br>- User purchase/return history | - Demographic information<br>- Location/country/language<br>- Genre preferences<br>- Global filters |
| unstructured | - Movie summary text<br>- Stills from the movie<br>- Movie trailer<br>- Professional reviews | - User reviews<br>- User-answered questions<br>- User-submitted photos<br>- User-submitted videos | - User ‘about me’ snippets |