# Introduction

s the third Internet revolution after the search engine, recommendation systems, without the need for exogenous information about either items or users, achieve the initiative to push personalized service by conducting analysis based on user past online behavior, such as ratings or usage. Collaborative filtering which is the classical recommendation algorithm, comprises memory and model based methods [1]. To solve the data sparsity and cold start conundrum, social recommendation has gradually become one of the important research fields in the recommender system. It usually introduces the corresponding user trust information in social network, and its underlying assumptions are: people tend to interact with the people who have similar interests and preferences, and in such process they would be more similar with each other [2].

Social network information can indeed be used to improve the accuracy of prediction score, but how to fully exploit the social network connection feature information becomes a hot topic among a number of researchers recent years. [3] proposed an approach (SoRec) based on probabilistic matrix factorization by assuming the user-specific and factor-specific latent feature follows Gaussian distribution, combined user trust information matrix T with user-item rating matrix R to constrain user latent feature vector, and then used the result to calculate rating matrix to predict; a combination of matrix factorization model and trust neighborhoodbased model was proposed in [4] (RSTE); [5] presented an method that through active neighbors feature vectors to obtain the current user latent feature, and all userspecific matrixU , and then multiply the latent itemspecific matrix V to forecast user-item ratings (Social MF).

These studies are mainly concentrated on the users directly connected each other in a social network, while ignoring the differences among user influence through out the network. [6] carried on a research from the perspective of trust and trusted, then combined each other with the addition (Trust MF). However, actually, trust and trusted information are not inseparable in a real a social network, and such trust information can be propagated, i.e. the user is not only influenced by neighbors, but also the users of greater influence in the network, in addition, during the stage that the trust and trusted models are mixed, since the trust and trusted detached from each other, tuning parameters becomes more difficult, resulting in poor interpretability even for a better prediction. Therefore, these algorithms above exist poor prediction accuracy issues.

In this paper we proposes a social recommendation algorithm based on the strength of trust influence: First of all, we conduct iterative calculation by using the connectivity of the trust network to obtain user-specific impact factors in the whole social trust network. Secondly, according to the influence differences among users, we constrain the user-specific latent factor matrix during the process of matrix decomposition, and by multiplying the user-specific and item-specific latent factor matrix to seek more accurate prediction scores.


# II.


# Preliminaries

Conveniently, we describe the basic algorithm by the following example, while denote user User1 abbreviated as U1, as well Item1 as I1. Figure 1 is an abstraction for the real social network, in which each circle represents a user and the arrow points to the ones that can be trusted, e.g. user U3 trusted user U1, while 6 Year 2016 ( ) the user U1 does not trust U3. We also use matrix to represent the trust information in the network, where 1 represents the trust, 0 mistrust, thus the matrix is nonsymmetric matrix, denoted by n T , n is the number of nodes, as shown in Table 1. User-item ratings information is shown as Table 2. We integrate the two datasets as input to get user-specific latent factor U and item-specific latent factor V by matrix decomposition, and then use the product of the two factors to get predicted ratings R , as shown in table 3.We distinguish predicted ratings from known ones, by using the notation R for the predicted value of R .   According to the literature [7], introducing the social network into the user-rating ratings matrix can be treated as the rating matrix decomposition with a joint trust network matrix decomposition, as figure 2, they both constrain the user-specific latent factor U . We first define user-item rating matrix decomposition method, specifically address how to introduce a social network approach in the next section. We assume that the recommendation system involving m users and n items, the user-items rating matrix denoted as
U1 U2 U3 U4 U5 U6 U1 0 0 0 0 0 0 U2 0 0 0 1 0 1 U3 1 0 0 0 0 0 U4 1 1 0 0 1 0 U5 0 0 1 0 0 1 U6 0 0 0 0 0 0 7 Year 2016 ( ) C U34ij mn R R ? ? = ? ? , in which each rating ij R indicates the
preference by user i of item j ,where high values mean stronger preference, e.g. values can be integers ranging from 1 to 5. We denote an labeled matrix to show whether the user rates the item in user-item rating matrix = ? ? , if the user i has rated item j , set ij I is 1, otherwise 0. By matrix factorization we can map the higher dimension matrix into a lower d dimension matrix. We denote user i latent factor as i U , while item j latent factor as j V , thus can get user-specific latent
factor d m U R × ?
, and item-specific latent factor
d n V R × ?
and then use the product of user -specific and itemspecific latent factor by T U V to fit the available useritem ratings matrix R , whereby item not rated by users can be filled with a predicted score. The main learning function is obtained by minimizing the loss function as follows:

In the formula, we denote as Frobenius norm, and in case of over-fitting, set parameters
1 2 > , 0
? ? to control model complexity. In addition, a regularization method for weighting ? was proposed in [8] to avoid the parameters above in the model learning process, 1 2 , ? ? respectively account for user-specific and item-specific latent factor in the whole training model, at the same time, the introduction both of number of user i U ratings and that of items rated is used to prevent the trained mode tilt to users who rated items too many or items rated by too many users, resulting in over-fitting problem, therefore, for the appropriate model items and users can be regularized as follows:
III.
Algorithm Description

(1)

(2)

Fig. 3 :
Our improved Algorithm T 2 2 2 1 2 1 1 ( ) m T ij i j ij F F n i j I U V R U V ? ? = = = ? + + ?? L 2 2 2 1 2 1 1 ( ) i j m m T ij i j ij u i F v j F i j i j I U V R n U n V ? ? = = = ? + + ?? ? ? L 9 Year 2016 ( ) C
The primary objective of this algorithm is to obtain the impact factors of user trust influence in the entire social network, which is utilized to measure the degree of users affected by their neighbors. We refer to the idea of Page Rank algorithm: the larger the number of user trusted throughout the trust network is, the bigger the trust influence such user has. We denote that there are m users involved in the ratings matrix R and trust matrix T , therefore, we can simultaneously both matrix decomposition operation by the same userspecific latent factor U . As mentioned in the section 1, for a given user trust network, denoted as : ( , )
G M N = , which M represents user nodes, , , m n M M T t ? ? = ? ? is
denoted as adjacency labeled matrix, if user m trust user n , then , 1 m n t = , , 0 m n t = and vice versa. Thus the user i trust influence can be defined as:
1 ( ) ( ) i j T j d PR j PR i d M O ? ? = + ? ? (3)
M represents the number of whole nodes in the network, and is denoted as user set that user i trusts, i O is the number trusted by user i , and (0 1) d d ? ? is the damping factor, usually set to 0.85 [9]. Equation (3) only describes solution to achieve the influence of a particular user, so as to calculate the influence of the overall user-specific latent vector, it can be expressed as the formula (4): T . When n ? +? , i.e. the matrix steps into the steady state:
0 1 d E dT E I N + ? = +(5)
Whereby the equation ( 5) can be simplified as equation ( 6):
1 0 1 ( ) d E I dT I N ? + ? = ?(6)
When 0 1 d < < , the solution exists. Using the iterative calculation described as equation ( 5), with the number increases, the model will gradually converge [14], then we can get entire user trust influence vector
1 [ ] i m E e × =
, in the section above, we proposed a regularized method on social networking information. As for social network research, the majority of studies are based on the similarity of adjacent users, while ignore the fact that user trust someone does not mean they share a reliable presence of the same preference. If user trusted also has the same presence of target user behavior history information, the two are more likely to have same similar preferences, the objective function can be adjusted to the formula (7):

( , ) Sim i j is on behalf of cosine similarity between the user i and adjacent user j , ? is a labeled matrix indicating whether the user is directly connected to all the others in the matrix. Taking into account the diversity of user preferences, we thus make regularization to balance user similarity and userspecific latent factor, due to the fact that when two users are similar, ( , ) Sim i j will become very large, and the user latent factor will become very small, and vice versa. In addition, users are more inclined to follow the similar users who are more influential and keep consistent in behavior with them, and that is the reason why we introduce the strength of trust influence, equation ( 8) is defined as below.

(
)8
ij Q indicates the trust proportion of user j in all user i directly connected users, we hope that userspecific latent factor tends to the overall average users, while taking into account the influence of adjacent users throughout the trust network, so we proposed a method which integrates both neighborhood similar user characteristics and their trust influence factors to associate a user-item ratings and trust network information, resulting in a more accurate model. In order to facilitate learning and weight adjustment, at first, we handle the data with normalization, as the trust value ij Q is between 0 and 1, we normalize the original data by ( 7)
ij ik k j e e Q ?? = ? 2 2 2 2 1 2 3 1 1 1 ( ) ( , ) i j m n m T ij i j ij u i F v j F i j i j i j i j I U V R n U n V Sim i j U U ? ? ? = = = ?? = ? + + + ? ?? ? ? ? ? L 0 ( ) 1 N i PR i = ? = , i T 2 i j U U ?

# Global Journal of Computer Science and Technology

Volume XVI Issue I Version I 


# instead. ( , )

Sim i f is the similarity value for user i and user f , ij R represents the rating of user i for item j , i R indicates user i average rating score, ( ) I i is a labeled matrix indicating whether user i rate the item or not, as well as user f . ( 12)

IV.


# Complexity Analysis

The model training time consists two parts, the first part is the algorithm calculate the user trust influential feature vector, the second parts is for solving the latent factors by using the gradient descent method. We conduct iterative calculation to get the trust influential feature vector, if the entire number of nodes in the trust network is m , the average number of users trusted by per user is n , usually n is a relatively small number. Given 1 t times of iterations to achieve convergence, time complexity is 1 ( ) O mnt , when the data is in large scale, the literature [10] proposed a distributed solution that can significantly reduce the time.

V.


# Experiments and Results


# a) Dataset Selection

The classical data set were Epinions (665KB) [11], which contains not only the user-item ratings data, but also trust relationship between users, and thus it is usually recommended as baseline data set to test social recommendation algorithms. We use it for testing and validating the currently mainstream recommendation algorithms as well as our improved algorithm. First of all, we made the basic statistics of the data set as shown in Table 4. Trust (degree) and trusted (out-degree) statistics information shows that when the trust or be trusted number of persons increases, the corresponding statistics number gradually reduces, which follows the power law distribution.


# b) Cross-Validation

We use 5-fold cross-validation methods for training and testing the models. For each test we randomly selected 80% of the whole data as the training data set and the remaining 20% is used for testing. The next experiment results discussed in the final comparison is obtained by averaging the results of tests from five repeated times.


# c) Evaluation

Evaluation criteria used in the Experiments are based on the average absolute error MAE and root mean square error RMSE:

The time complexity of solving L is   
( ( ) ) ( , ) i j m n m n T ij i j ij u i F v j F i ij j i j i j i j I g U V R n U n V Sim i j U Q U ? ? ? = = = ?? = ? + + + ? ?? ? ? ?? L 2 1 1 ( )( ) 2 j m T T ij i j i j ij i v j i i j I g U V U V R U n V V ? = ? ? ? = ? + ? ? ? L Sim 1 3 1 1 ( )( ) ( , )( ) 2 i n T T ij i j i j ij j u i i ij j j j j i I g U V U V R V n U i j U Q U U ? ? = ?? ? ? ? = ? + + ? ? ? ? ? L ( ) ( )2 2 ( ) ( ) ( ) ( ) ( ) ( ) ( , ) ( ) ( )ij i fj f j I i I f ij i fj f j I i I f j I i I f R R R R Sim i f R R R R ? ? ? ? ? ? ? ? ? = ? ? ? ? ? , î j ij ij R R MAE N ? ? =

# d) Comparison

In consideration of the similar algorithms, our main work is integrating users trust influence in the social network into the user-item ratings. Thus while making the lateral comparisons, in addition to the original matrix decomposition algorithms, we also made comparisons among social recommendation algorithms which are focus on trust.


# e) Contrast Algorithms and Reasons

When training the models, we set parameters of each algorithms by referring to the literature [8] [9], which introduced the optimal parameters of each algorithms. Our algorithm is denoted as T and more details described as the following table 5: For all the algorithms above based on matrix decomposition, the dimension of latent factors are set to 5 and 10 respectively, and during model training stage?the same initialized strategy is adopted: the original matrixes involved are filled with a random number uniformly distributed values between 0 and 1.


# f) Results

Experiment results show that with the increase of the number of iterations, the index value of RMSE and MAE keeps declining, which indicates the introduction of the trust influence factors has a positive impact on enhancing the overall effectiveness of recommendation and improves the accuracy of predicted results. On the dataset, we have achieved results as showed in Table 6, while our algorithm is denoted as T. Cross-validation results show the performance of the our proposed algorithm is slightly better than all the other compared algorithms. In addition, we conducted statistical analysis of RMSE and MAE index value during the iterative calculation and found that with increasing number of iterations, RMSE and MAE index value gradually reduce and eventually keep stabilized, indicating that the algorithm utilizes influence characteristics among users in the trust network to predict user rating propensity is relatively effective and eventually enhances the accuracy of the prediction score of recommendation.


# VI.


# Conclusion

As the common recommendation system algorithm, collaborative filtering encountered sparse data and cold start problems, which leads to poor accuracy. Given the fact that the impact of opinion leaders for individuals in the social network, this paper proposed one social recommendation algorithm based on trust influence to alleviate such problems above. Compared to the previous research merely on adjacent user preferences, we consider the trust propagation mechanism and try out iterative calculations based the connectivity of social trust network to obtain user influence values, and then integrate both similar neighbor characteristics and their trust influence factors. Experiments show that compared the most state-of-art social algorithms, our approach optimizes the recommended results and improves the prediction accuracy. However, obviously, this algorithm complexity positively correlates with the number of iterations, when 
2![Fig. 2 : Factorization Diagram based on Rating Matrix and Social Network](image-2.png "Fig. 2 :")
![represents the whole users trust influence in the i time iterative calculation, 0 T represents the original large-scale sparse trust matrix, I + is on behalf of a matrix filled with 1, and has the same dimension with 0](image-3.png "E")
![its final value is between [0,1] . According to the literature[5], a logistic function ( ) 1 to make the product of latent vectors to [0,1], so after the training, we can get the final predicted result by of keeping the same benchmarks among user-item ratings, social network data and the same constraint variable U , the new objective function is updated to equation (9): During the training phase, we adopt the gradient descent method to minimize the objective function for solving the above-mentioned latent factors matrix, respectively U ?V . correlation coefficients to conduct calculation of cosine similarity, since the scope of its value between [ 1,1] ? , for taking a positive number, we use the mapping function ( ) ) / 2 f x x = +](image-4.png "")
2![is a specified number of iterations, . d is the dimension of latent factors, is the scale of the observed ratings, N represents time complexity of similarity calculation, since the matrix is so sparse that](image-5.png "2 t")
5![Contrast](image-6.png "Table 5 :")
![](image-7.png "")
![](image-8.png "")
1Fig. 1: Abstraction of Social Trust Network
1I1I2I3I4I5I6U15234U2435U342U4U551243U64324
2: predicted user-item rating matrix ?ij RI1I2I3I4I5I6U1522.534.84U2432.42.954.1
4Year 201611Volume XVI Issue I Version I)C(Global Journal of Computer Science and Technology
6Basic DataScenesOthersUsersItemsRating ScaleDensityUsersLink TypesItemstags40,163139,738664,824[1,5]0.00011845866,807487,183TrustGeneralNoneEvaluationSoRecRSTESocialMFTrustMFTd=5MAE0.91970.86350.88260.82120.8010RMSE1.1511.10711.11071.05851.0320d=10MAE0.91520.85720.85670.81480.8059RMSE1.17731.14831.11131.07711.0408
			© 2016 Global Journals Inc. (US)
		
		
faced with large-scale data, the computation time is too long, especially for single node. Therefore, the next research focus on the implementation of distributed algorithm for the iterative calculations involving largescale data.
			
			
* 
	
		A survey of collaborative filtering techniques
		
			XSu
		
		
			TMKhoshgoftaar
		
	
		Advances in Artificial Intelligence
		
			12
			2009. 2009
		
	
* 
	
		Referral web: combining social networks and collaborative filtering
		
			HKautz
		
		
			BSelman
		
		
			MShah
		
	
		Communications of the Acm
		
			40
			3
			
			1997
		
	
* 
	
		
			HMa
		
		
			HYang
		
		
			MRLyu
		
		
			IKing
		
		So Rec: Social Recommendation Using Probabilistic Matrix Factorization. International Conference on Information & Knowledge Management
				
			ACM
			2008
			
		
* 
	
		Learning to recommend with social trust ensemble
		
			HMa
		
		
			IKing
		
		
			MRLyu
		
	
		Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
				the 32nd international ACM SIGIR conference on Research and development in information retrieval
		
			ACM
			2009
			
		
* 
	
		A matrix factorization technique with trust propagation for recommendation in social networks
		
			MJamali
		
		
			MEster
		
	
		Proceedings of the fourth ACM conference on Recommender systems
				the fourth ACM conference on Recommender systems
		
			ACM
			2010
			
		
* 
	
		Social collaborative filtering by trust
		
			BYang
		
		
			YLei
		
		
			DLiu
		
		
			JLiu
		
	
		Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
				the Twenty-Third international joint conference on Artificial Intelligence
		
			AAAI Press
			2013
			
		
* 
	
		
			RSalakhutdinov
		
		
			AMnih
		
		Probabilistic matrix factorization. Nips
				
			2012
			
		
* 
	
		Recommender systems with social regularization
		
			HMa
		
		
			DZhou
		
		
			CLiu
		
		
			MRLyu
		
		
			IKing
		
	
		Fourth Acm International Conference on Web Search & Data Mining
				
			ACM
			2011
			
		
* 
	
		Deeper inside pagerank
		
			AmyNLangville
		
		
			DCarl
		
		
			Meyer
		
	
		Internet Mathematics
		
			1
			3
			
			2003
		
	
* 
	
		Fast Distributed Page Rank Computation. Distributed Computing and Networking
		
			ADSarma
		
		
			ARMolla
		
		
			GPandurangan
		
		
			EUpfal
		
		
			2013
			Springer
			Berlin Heidelberg