Personal Recommendation Algorithm Combining Trust and Similarity based on Neural Network

Recommendation based on friend trust relationships in social networks has been extensively researched in recent years. However, the majority of trust-based models take only credibility or only similarity into account, ignoring the influence of other factors on the recommendation results. This paper proposes a random walk recommendation algorithm based on the conditional restricted Boltzmann machine in the trust network, namely CRBM_PrTW. The algorithm uses the conditional restricted Boltzmann machine to fill in the missing values in the training data, which improves the accuracy of the similarity calculation and effectively addresses the data sparsity problem. On this basis, a comprehensive weight that combines credibility, similarity and the trust factor is used as the trust level in the random walk algorithm. The resulting credibility-based random walk in the trust network enhances the accuracy of the recommendation system. Experimental results on the Epinions dataset demonstrate that our method provides better recommendations than existing methods in terms of the evaluation metrics.
Keywords: Conditional Restricted Boltzmann machine; Trust network; Random walking; Data sparsity


I. INTRODUCTION
With the rapid development of the Internet, the large amount of data generated by mobile terminals and Web services makes it difficult for users to find the information they need. Faced with innumerable online goods, users have to spend a lot of time selecting what they like. Recommendation systems came into being to meet this need. Major e-commerce sites such as Taobao, Jingdong and Amazon employ recommendation systems to offer users products that may interest them, providing a personalized recommendation service.
Collaborative filtering (CF) is the most widely applied and successful recommendation algorithm [1,2,3]. Mainstream collaborative filtering algorithms fall into two categories: user-based CF [4] and item-based CF [5]. User-based collaborative filtering calculates the similarity between users from their rating matrix and finds a set of users who share similar interests. It then recommends to the target user the items that are preferred by the other users in this set but have not yet been found by the target user. However, if only a few users have rated an item, the rating information is extremely sparse and it is hard to find a set of similar users. Item-based collaborative filtering predicts the rating of a target item from the user's ratings on similar items. However, its recommendation coverage is low because users' interests change constantly.
The Restricted Boltzmann Machine (RBM) can be regarded as an undirected graphical model [6]. In recent years, the RBM has achieved great success as a basic building block in deep learning, and it has also been shown to help address the cold start problem in recommendation systems. Salakhutdinov et al. [7] first applied the RBM model to recommendation and proposed the Conditional Restricted Boltzmann Machine (CRBM) model. The CRBM model makes full use of the rated/unrated information to mitigate the negative impact of sparse data on recommendation results. Liu et al. [8] extended the RBM model by incorporating content-based features such as user demographic information, item categories and other features. Their experimental results show that the Content-boosted Restricted Boltzmann Machine (CB-RBM) performs better than a pure RBM model and other content-boosted collaborative filtering methods.
In recent years, with the growth of social networks, the use of user relationships in social networks has become a research focus in the recommendation field. People are more inclined to accept recommendations from their closest friends in social networks. Golbeck [9] proposed the TidalTrust model, which uses a modified breadth-first search strategy in the trust network to infer the trust value between the source user and other users, but it ignores the influence of raters far away from the source user on the recommendation results. Massa et al. [10,11] presented the ModelTrust model, which is similar to TidalTrust. However, ModelTrust considers all users within a predefined range and uses the calculated trust among users as the weight value. The biggest challenge for trust-based recommendation is choosing how far to explore in the trust network. To solve this problem, Jamali and Ester [12] put forward a random walk algorithm in the trust network, namely the TrustWalker algorithm. However, the majority of trust models consider only a single factor, either trust or similarity, and ignore the combined effect of several factors on the recommendation result.
In order to tackle the problems mentioned above, this paper combines CRBM and TrustWalker and presents the CRBM_PrTW algorithm, which adapts to sparse data and performs a random walk in the trust network. The CRBM_PrTW algorithm completes the training data through the CRBM model and calculates the user similarities and the comprehensive weights of trust, similarity and credibility. During the random walk in the trust network, both ratings on the target item and ratings on items similar to the target item are considered, which enhances the accuracy and coverage of the recommendation results.
II. RELATED WORKS
Data sparsity means that the rating matrix gradually becomes sparse as the number of users and items grows, which leads to inaccurate calculation of item similarity. To handle this problem, some improved methods pre-fill the rating matrix. The singular value decomposition (SVD) model was proposed to reduce the dimension of the rating matrix and to use the resulting dense rating matrix for recommendation [13]. The paper [14] points out that applying Principal Component Analysis (PCA) can alleviate the data sparsity problem. A method based on K-means clustering [15,16] gathers users with similar interests into the same cluster and uses their average score to predict the missing scores in the rating matrix, which solves the sparse data problem to some extent. The paper [17] applies auto-encoders to learn low-dimensional features and predict item ratings. Koren et al. [18] proposed a matrix decomposition model, which decomposes a sparse matrix into two low-rank sub-matrices. Through continuous iterative training, the product of the two matrices gets closer and closer to the real rating matrix.
TidalTrust uses an improved breadth-first search method that considers the shortest paths in the trust network and obtains the trust value between users by weighting the trust values along them [9]. However, the model neglects the influence of users who are far away from the source user on the recommendation results. The ModelTrust algorithm [10,11] is similar to TidalTrust except that a maximum path length is preset. On the basis of data from real networks, Ziegler et al. [19] analyze the similarity and the trust relationship among users and demonstrate that there is a positive correlation between them. Jamali and Ester [12] propose a random walk model in the trust network. During the walk, both ratings on the target item and ratings on items similar to the target item are considered, which addresses the data sparsity and cold start problems. However, the next user in the random walk is selected with equal probability. In reality, a user who is credible and shares greater similarity with the source user is more likely to be selected during the walk. Users with closer social relationships to others are more trustworthy [20] and have more influence on others [21]. The paper [22] considers different factors of recommendation resources and proposes an electronic commerce recommendation system that incorporates trust and social relations.
The traditional solutions mentioned above all have shortcomings and cannot accurately calculate user similarity when the user-item rating matrix is extremely sparse. This paper uses the CRBM to extract features and predict the missing data in the rating matrix. It then builds a robust trust network and improves the recommendation accuracy by applying a modified random walk algorithm in the trust network.

III. ALGORITHM DESCRIPTION

A. Similarity calculation
The correlation between users and items is the key to collaborative filtering, so choosing an appropriate similarity measure is paramount for accurate recommendations. Several similarity measures are commonly used: the Jaccard similarity coefficient, the Pearson correlation coefficient [12], cosine similarity and modified cosine similarity [5]. The Pearson similarity between item $i$ and item $j$ is:
$$corr(i,j) = \frac{\sum_{u \in UC_{i,j}} (r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_{u \in UC_{i,j}} (r_{u,i} - \bar{r}_u)^2}\,\sqrt{\sum_{u \in UC_{i,j}} (r_{u,j} - \bar{r}_u)^2}}$$
Here, $UC_{i,j}$ represents the set of common users who have rated both item $i$ and item $j$, $r_{u,i}$ is the rating of user $u$ on item $i$, and $\bar{r}_u$ is the average of the ratings made by user $u$.
Pearson similarity considers only the rating differences among items and ignores the number of common users. For instance, if $|UC_{i,j}| > |UC_{i,l}|$, then item $i$ and item $j$ have more common users than item $i$ and item $l$, and the correlation between item $i$ and item $j$ should be regarded as stronger than that between item $i$ and item $l$. Therefore, the size of $UC_{i,j}$ is also taken into account in the similarity measure.
Cosine similarity does not take the users' rating scales into account. For instance, on the rating range [1, 5], user $u$ may give 3 points to express liking, whereas user $v$ gives more than 4 points. To tackle this problem, each user's average score is subtracted in the cosine similarity. The similarity between user $u$ and user $v$ is:
$$sim(u,v) = \frac{\sum_{c} (R_{u,c} - \bar{R}_u)(R_{v,c} - \bar{R}_v)}{\sqrt{\sum_{c} (R_{u,c} - \bar{R}_u)^2}\,\sqrt{\sum_{c} (R_{v,c} - \bar{R}_v)^2}}$$
Here, $R_{u,c}$ denotes user $u$'s rating on item $c$, and $\bar{R}_u$ and $\bar{R}_v$ are the average ratings on the items rated by user $u$ and user $v$ respectively; the sums are taken over the items $c$ rated by both users.
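The item similarity described above can be illustrated with a minimal Python sketch. The Pearson part follows the definitions in the text (common users, per-user mean ratings). The way the number of common users enters the measure is an assumption: the sketch damps the correlation with a sigmoid of the common-user count, in the style of TrustWalker [12], and the function names and the `ratings` dictionary layout are hypothetical.

```python
import math

def item_pearson(ratings, i, j):
    """Pearson correlation between items i and j over their common raters.

    ratings maps user -> {item: rating}; each user's mean is taken over
    all items that user has rated. Returns (correlation, number of common users).
    """
    common = [u for u, r in ratings.items() if i in r and j in r]
    num = den_i = den_j = 0.0
    for u in common:
        mean_u = sum(ratings[u].values()) / len(ratings[u])
        di = ratings[u][i] - mean_u
        dj = ratings[u][j] - mean_u
        num += di * dj
        den_i += di * di
        den_j += dj * dj
    if den_i == 0.0 or den_j == 0.0:
        return 0.0, len(common)
    return num / math.sqrt(den_i * den_j), len(common)

def item_similarity(ratings, i, j):
    """Size-aware similarity (assumed form): Pearson damped by a sigmoid
    of the number of common users, so larger overlaps count for more."""
    corr, n_common = item_pearson(ratings, i, j)
    return corr / (1.0 + math.exp(-n_common / 2.0))
```

With this form, two item pairs with the same Pearson correlation are ranked by how many users rated both items, which is the behaviour the paragraph above asks for.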

B. Conditional Restricted Boltzmann Machine
Traditional collaborative filtering based on the RBM fails to consider a significant factor: there are items that users are known to have rated or watched even though the rating values themselves are unavailable. This implicit information also provides additional insight into a user's preferences. For example, if a user has evaluated "Rocky 5", we can conjecture that the user likes this type of movie. The Conditional RBM model takes this information into consideration. Define $r$ as a binary vector indicating which items the user has rated. The model can be considered as an undirected graph model: V is the visible layer representing the data; h is the hidden layer, which acts as the feature extractor; W is the connection weight matrix between the visible and the hidden layers, and D is the connection weight matrix between the r layer and the hidden layer; C is the bias of the visible units and b is the bias of the hidden units.
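To make the role of the r layer concrete, the following Python sketch computes the activation probability of each hidden unit given the visible vector and the rated/unrated indicator. It is a simplified illustration under stated assumptions: the visible units are treated as binary rather than as softmax units over rating values, and the function name and list-of-lists weight layout are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probabilities(v, r, W, D, b):
    """Return p(h_j = 1 | v, r) for every hidden unit j.

    v: visible vector (binary here, a simplification of softmax rating units)
    r: binary rated/unrated indicator vector, same length as v
    W[i][j]: weight between visible unit i and hidden unit j
    D[i][j]: weight between indicator unit i and hidden unit j
    b[j]: bias of hidden unit j
    """
    probs = []
    for j in range(len(b)):
        activation = b[j]
        for i in range(len(v)):
            # The D term lets "rated but value unknown" items shift the hidden state.
            activation += v[i] * W[i][j] + r[i] * D[i][j]
        probs.append(sigmoid(activation))
    return probs
```

Even when the visible value of an item is zero, a non-zero entry in r still contributes through D, which is how the implicit rated/unrated information reaches the hidden features.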
Welling and Hinton proposed a fast learning algorithm for the RBM, namely the Contrastive Divergence (CD) algorithm [23], which is also used to update the parameters of the CRBM model. Following previous work [2,25], we train the CRBM with F=100 hidden units and C=40 features in the following experiments.

C. Recommendation Method of Random Walking in Trust
Here, the trust value between user $u$ and user $v$ is denoted by $t_{u,v}$.
In the Epinions dataset the trust network is binary: one means trust and zero means distrust.

3) Termination of a Single Random Walk
We have a stop-walk probability $\phi_{u,i,k}$ for staying at each user $u$, where $k$ is the walking depth. Furthermore, ratings on the target item $i$ from users far away from the source user $u_0$ are noisy, whereas ratings expressed by trusted users nearby in the network are more reliable. Another factor that influences
Each random walk stops in one of three ways:
(1) The walk arrives at a node that has expressed a rating on the target item $i$.
(2) When the walk moves to a node $u$ that has not rated the target item $i$, it stops with probability $\phi_{u,i,k}$ and returns the rating of a similar item.
(3) If there is no restriction on the walk depth, the walk may continue forever. To avoid this in our implementation, this paper sets the maximum depth to 6 steps based on the theory of "six degrees of separation" [23]. When the walk depth reaches 6 steps, the walk is terminated.
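The three stopping cases can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: the exact form of the stop probability is not recoverable from the text, so the sketch uses a TrustWalker-style form [12] (item similarity damped by a sigmoid of the depth); the next user is chosen uniformly for simplicity; returning `None` at the depth limit or at a dead end is a choice of this sketch; and all function and parameter names are hypothetical.

```python
import math
import random

MAX_DEPTH = 6  # maximum walk depth stated in the text

def stop_probability(item_similarity, depth):
    """Assumed TrustWalker-style form: similarity damped by a sigmoid of depth."""
    return item_similarity / (1.0 + math.exp(-depth / 2.0))

def single_walk(source, target_item, ratings, trusted, similar_item, rng):
    """Run one walk; return (user, item, rating), or None if nothing is found.

    ratings: user -> {item: rating}
    trusted: user -> list of directly trusted users
    similar_item: function (user, target_item) -> (item, similarity) or None,
                  where item must be one the user has rated
    rng: random.Random-like object providing choice() and random()
    """
    user = source
    for depth in range(1, MAX_DEPTH + 1):
        neighbours = trusted.get(user, [])
        if not neighbours:
            return None  # dead end: no trusted user to move to
        user = rng.choice(neighbours)
        user_ratings = ratings.get(user, {})
        # Case (1): the visited user has rated the target item.
        if target_item in user_ratings:
            return user, target_item, user_ratings[target_item]
        # Case (2): stop with the stop probability and return a similar item's rating.
        best = similar_item(user, target_item)
        if best is not None:
            item, sim = best
            if rng.random() < stop_probability(sim, depth):
                return user, item, user_ratings[item]
    # Case (3): the depth limit is reached and the walk is terminated.
    return None
```

A caller would typically pass `random.Random(seed)` as `rng` so that repeated walks are reproducible.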

4) Termination Condition of the Overall Walk
In order to get an accurate prediction rate

5) Recommendation Results Generation
At the termination of each random walk, we will choose a user v ' s rating Pr log / Pr log / 1 (12)

D. Improved Random Walk Algorithm Based on Conditional Restricted Boltzmann Machine
Trust-based recommendation algorithms have been extensively studied in recent years with regard to data sparsity and cold start. However, some problems remain.
1) Similarity calculation is commonly used in a variety of recommendation algorithms, but the results lack accuracy when the data is sparse.
2) Mutually trusted friends may hold diverse interests. Trusted users are not always similar, and vice versa. Traditional trust-based recommendation algorithms usually consider only trust or only similarity. For example, in the TrustWalker model, the target user $u$'s predicted score for the target item $i$ is calculated using only trust as a weighting coefficient, ignoring the effect of the users' preference similarity on recommendations.
In order to address the above shortcomings, this paper proposes a Random Walk Recommendation Algorithm Based on Conditional Restricted Boltzmann Machine in the trust network (CRBM_PrTW). The algorithm applies the CRBM to predict the missing values in the user-item rating matrix and the trust matrix, which improves the accuracy of the user similarity calculation and constructs a more comprehensive social network. During the random walk, we take the trust, the user similarity and the confidence factor into account to calculate the trust weight. The recommendation is then obtained with the upgraded random walk algorithm.
add the trust weight to the trust network
end for
The rating data and trust data in Algorithm 1 are used as the input of the CRBM model to predict the missing values, which addresses the sparsity of the training set. We calculate the trust weight by combining trust, similarity and credibility in order to build a sounder trust network. The motivation is that mutually trusted users do not always hold similar interests, while users with low similarity may still suit each other's tastes. Moreover, we tend to select users with higher credibility during the random walk. After K random walks, the weighted value of the results returned by the walks is taken as the final prediction value.
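The weighted walk described above can be sketched in Python. The trust weight formula itself is not recoverable from the text, so `trust_weight` below is a purely hypothetical stand-in (a convex mix of similarity and trust controlled by a coefficient `alpha`, scaled by credibility) and should not be read as the paper's definition. The other two helpers illustrate the parts the text does state: the next user is chosen in proportion to the trust weight, and the results of the K walks are combined by weight.

```python
import random

def trust_weight(trust, similarity, credibility, alpha):
    """Hypothetical combination, NOT the paper's formula: a convex mix of
    similarity and trust, scaled by the user's credibility."""
    return credibility * (alpha * similarity + (1.0 - alpha) * trust)

def choose_next_user(weights, rng):
    """Pick a neighbour with probability proportional to its trust weight.

    weights: non-empty dict neighbour -> non-negative trust weight
    rng: a random.Random instance
    """
    users = list(weights)
    total = sum(weights.values())
    if total <= 0.0:
        return rng.choice(users)  # fall back to a uniform choice
    return rng.choices(users, weights=[weights[u] for u in users], k=1)[0]

def weighted_prediction(results):
    """Combine (rating, weight) pairs returned by K walks into a weighted average."""
    total = sum(w for _, w in results)
    if total == 0.0:
        return None
    return sum(r * w for r, w in results) / total
```

Compared with the uniform selection of the original TrustWalker, a neighbour with zero trust weight is never visited here unless all neighbours have zero weight.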

E. Experimental Results and Analysis

1) Experimental data
is extremely sparse and can be used to evaluate the recommendation results of the model under the condition of utmost data sparsity.

2) Evaluation metrics
In this paper, Root Mean Square Error (RMSE) is one of the evaluation metrics. It is calculated as:
$$RMSE = \sqrt{\frac{\sum_{i} (r_{u,i} - \hat{r}_{u,i})^2}{Test}} \quad (13)$$
Here, $r_{u,i}$ denotes user $u$'s real rating on item $i$, $\hat{r}_{u,i}$ is the predicted rating, and $Test$ is the number of items rated by the target user $u$. The smaller the RMSE, the higher the accuracy of the recommendation results.
In the case of extreme data sparsity, some models recommend only a small number of movies to users. Therefore, this paper selects coverage as the second evaluation metric. $M$ represents all the movies in the test set, and $R(u)$ denotes the movies recommended to users. The coverage is calculated as:
$$Coverage = \frac{\left|\bigcup_{u} R(u)\right|}{|M|} \quad (14)$$
To combine RMSE and coverage into a single evaluation metric, we calculate the F-Measure [12]. To do so, RMSE must first be transformed into a precision metric in the range [0,1]. The precision and F-Measure are calculated as follows:
$$Precision = 1 - \frac{RMSE}{4} \quad (15)$$
$$F\text{-}Measure = \frac{2 \times Precision \times Coverage}{Precision + Coverage} \quad (16)$$
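The evaluation metrics can be computed with a few lines of Python. This is a minimal sketch: the function names are hypothetical, coverage is taken as an already computed value in [0, 1], and the divisor 4 is read as the largest possible error on a 1 to 5 rating scale, following the precision definition used with the F-Measure in [12].

```python
import math

def rmse(pairs):
    """pairs: non-empty list of (actual, predicted) ratings from the test set."""
    return math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))

def precision_from_rmse(rmse_value, max_error=4.0):
    """Map RMSE into [0, 1]; 4 is the largest possible error on a 1-5 scale."""
    return 1.0 - rmse_value / max_error

def f_measure(precision, coverage):
    """Harmonic mean of precision and coverage."""
    if precision + coverage == 0.0:
        return 0.0
    return 2.0 * precision * coverage / (precision + coverage)
```

Because the F-Measure is a harmonic mean, a model with very low coverage cannot obtain a high score through a low RMSE alone.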

3) Coefficient selection of the trust weight
In this paper, the coefficient α is important for calculating the trust weight TW, and it affects the recommendation accuracy, so we analyze the relationship between its value and the recommendation accuracy. In the experiment on CRBM_PrTW, α ranges from 0.1 to 1. Figure 4 shows how the RMSE varies as α takes different values. As shown in Figure 4, when α≤0.6 the RMSE gradually decreases as α increases; when α>0.6 the RMSE increases as α grows; the RMSE reaches its minimum at α=0.6. In the following experiments, we therefore set α=0.6 for the comparison.

4) Experimental results
Considering the rigor of the experiment, we randomly divide the dataset into a 75% training set and a 25% test set, and select the user-based CF, item-based CF, TrustWalker, ModelTrust and TidalTrust algorithms mentioned above for comparison. As Table 2 shows, the CRBM_PrTW algorithm proposed in this paper outperforms all the other algorithms in terms of the evaluation metrics. The traditional recommendation algorithms based on collaborative filtering perform worst. When the training set is extremely sparse, users rate only a small number of items and only a few items are rated by multiple users, so it is difficult to calculate item similarity or user similarity, which reduces the recommendation quality. When the trust network is used in the recommendation algorithm, items are recommended by a user's own trusted users, and the recommendation quality improves significantly compared with the traditional collaborative filtering algorithms. The experiment shows that TrustWalker achieves better coverage and RMSE than ModelTrust and TidalTrust. This is because the TrustWalker algorithm takes into account not only the trusted users' ratings of the target item, but also their ratings of items similar to the target item. Our method uses the CRBM to address data sparsity, which greatly improves the accuracy of the similarity calculation. Meanwhile, the trust value, user similarity and trust factor are combined into the trust weight that is applied in the random walk algorithm, which enhances the recommendation quality.
CONCLUSION
This paper proposed a random walk recommendation algorithm based on the CRBM. The CRBM model effectively tackles data sparsity and increases the accuracy of the similarity calculation. In addition, this paper takes credibility, user similarity and the trust factor into consideration. Therefore, the users selected next in the random walk are trusted users who share similar tastes. Meanwhile, the trust factor makes the recommendation result more reliable.
The experimental results show that the proposed CRBM_PrTW algorithm outperforms the other recommendation methods.
Several aspects need to be considered in the future. Firstly, the trust in this paper is binary, whereas in daily life trust between people can be divided into different levels, such as trust, general trust and strong trust. We intend to calculate the actual level of trust through trust propagation. Secondly, users have different interests in different types of movies, so we plan to introduce a category factor into the method.