Correlation-based Similarity

In this case, the similarity between two items i and j is measured by computing the Pearson-r correlation $corr_{i,j}$. To make the correlation computation accurate, we must first isolate the co-rated cases (i.e., cases where a user rated both i and j), as shown in Figure 2. Let U denote the set of users who rated both i and j; the correlation similarity is then given by

\begin{displaymath}
sim(i,j) = corr_{i,j} = \frac{\sum_{u \in U} (R_{u,i} - \bar{R_{i}})(R_{u,j} - \bar{R_{j}})}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R_{i}})^{2}} \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R_{j}})^{2}}}.
\end{displaymath}

Here $R_{u,i}$ denotes the rating of user u on item i, and $\bar{R_{i}}$ is the average rating of the i-th item.
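
As an illustration, the computation can be sketched in Python as follows. This is a minimal sketch and not the implementation used in this work: the function name `correlation_similarity`, the nested-dictionary layout of `ratings`, and the sample data are all hypothetical. It also assumes that the item averages are taken over the co-rated users in U only, and that an undefined correlation (fewer than two co-rated users, or zero variance in either item) is reported as 0.

```python
from math import sqrt


def correlation_similarity(ratings, i, j):
    """Pearson-r correlation between items i and j over co-rated users.

    `ratings` maps user -> {item: rating}; this layout is an assumption
    made for the example. Returns 0.0 when the correlation is undefined.
    """
    # U: the users who rated both i and j (the co-rated cases).
    co_rated = [u for u, r in ratings.items() if i in r and j in r]
    if len(co_rated) < 2:
        return 0.0

    # Assumption: item averages are computed over the co-rated users only.
    mean_i = sum(ratings[u][i] for u in co_rated) / len(co_rated)
    mean_j = sum(ratings[u][j] for u in co_rated) / len(co_rated)

    # Numerator: sum of products of the deviations from the item averages.
    num = sum((ratings[u][i] - mean_i) * (ratings[u][j] - mean_j)
              for u in co_rated)

    # Denominator: product of the root sums of squared deviations.
    den_i = sqrt(sum((ratings[u][i] - mean_i) ** 2 for u in co_rated))
    den_j = sqrt(sum((ratings[u][j] - mean_j) ** 2 for u in co_rated))
    if den_i == 0 or den_j == 0:
        return 0.0  # one item has constant ratings over U

    return num / (den_i * den_j)


# Hypothetical sample data: four users, three items.
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 3, "B": 2, "C": 4},
    "u3": {"A": 1, "B": 1, "C": 5},
    "u4": {"B": 5},
}

sim_ab = correlation_similarity(ratings, "A", "B")
sim_ac = correlation_similarity(ratings, "A", "C")
```

With this sample data, user u4 is excluded from the comparison of A and B because u4 did not rate A, so only u1, u2, and u3 contribute. Items A and B come out strongly positively correlated (about 0.98), while A and C, co-rated only by u2 and u3, are perfectly negatively correlated.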

Badrul M. Sarwar
2001-02-19