To address the sparsity problem and improve prediction accuracy, and in particular to handle uncertainty in customer data and make full use of business knowledge in recommendation, this study develops an approach that integrates Item-based Collaborative Filtering (IBCF) and User-based Collaborative Filtering (UBCF) with fuzzy set techniques and a knowledge-based method (business rules). It first uses IBCF to predict missing ratings and so form a dense user-item rating matrix; UBCF is then applied to this matrix to generate recommendations. The approach thereby takes advantage of both the horizontal and vertical information in the user-item rating matrix, which helps to overcome the sparsity problem. It also uses fuzzy techniques to handle the linguistic variables that describe customer preferences, so it can generate recommendations from uncertain information.
Complex business rules exist in the telecom industry (as in other organizations), and they directly affect recommendation accuracy and user acceptance of recommendations. This study designed and applied five types of business rules: 1) bundle rules, 2) fleet rules, 3) discount rules, 4) product rules and 5) special offers. The business rules are created and maintained by product or sales managers in the industry and are taken as an input of the approach. Each business rule is described in an if-then-else structure. For example, assume that x and y are similar products, and that v and w are similar products in function and price. A customer needs x (or y) and v (or w). Since purchasing x+w attracts an additional discount, the system may recommend that combination.
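The x+w discount example can be sketched as a rule with an "if" part and a "then" part. This is only an illustration: the `BusinessRule` class, the `fired_actions` and `candidate_baskets` helpers, and the rule content are hypothetical names introduced here, not the knowledge-base format used in the study.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class BusinessRule:
    """Hypothetical if-then rule: `condition` is the "if", `action` the "then"."""
    name: str
    condition: Callable[[FrozenSet[str]], bool]
    action: str


# Assumed discount rule: buying x together with w attracts an extra discount.
discount_rule = BusinessRule(
    name="discount: x + w",
    condition=lambda basket: {"x", "w"} <= basket,
    action="apply additional discount",
)


def fired_actions(basket: FrozenSet[str], rules: List[BusinessRule]) -> List[str]:
    """Return the actions of all rules whose condition holds for the basket."""
    return [rule.action for rule in rules if rule.condition(basket)]


def candidate_baskets(first_options, second_options):
    """Enumerate every way to cover both needs with one product each."""
    return [frozenset({a, b}) for a in first_options for b in second_options]


# The customer needs x (or y) and v (or w).
baskets = candidate_baskets(["x", "y"], ["v", "w"])
# Keep only the combinations for which a discount rule fires.
preferred = [b for b in baskets if fired_actions(b, [discount_rule])]
```

Here `preferred` contains only the x+w combination, which is the one the system may recommend in the example.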
The proposed approach is described in eight steps as follows.
Input:
$n$: the number of products (could be services or product bundles) provided.
$m$: the number of users (existing customers) in the system.
$r_{ij}$: the rating of customer $i$ for product $j$, which may be described in linguistic terms ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$).
Business rules: described as if-then rules and stored in a knowledge base.
Output:
$p_1, p_2, \ldots, p_k$: the $k$ most appropriate products (could be services or product bundles) recommended.
Step 1: Generate a user-item linguistic term based rating matrix
Each user is represented by a set of item-rating pairs, and all of these pairs can be collected into a user-item rating matrix in which a rating $r_{ij}$ is given for user $i$ on item $j$. These ratings are described in the linguistic terms shown in Table 2. There are $m$ customers in total in the system and $n$ products are provided. If user $i$ has not rated item $j$, then the cell $r_{ij}$ is left empty.
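A minimal sketch of such a matrix is given below. The three linguistic terms and their triangular fuzzy numbers are assumptions made for illustration (the actual scale is the one in Table 2), and `TERM_TO_TFN` and `build_rating_matrix` are hypothetical names.

```python
# Assumed linguistic scale; the real terms are those listed in Table 2.
# Each term maps to a hypothetical triangular fuzzy number (left, peak, right).
TERM_TO_TFN = {
    "low": (0.0, 0.25, 0.5),
    "medium": (0.25, 0.5, 0.75),
    "high": (0.5, 0.75, 1.0),
}


def build_rating_matrix(ratings, users, items):
    """Return an m x n list of lists; cells the user has not rated hold None."""
    return [
        [TERM_TO_TFN[ratings[(u, it)]] if (u, it) in ratings else None
         for it in items]
        for u in users
    ]


users = ["u1", "u2"]
items = ["broadband", "mobile", "tv"]
# Sparse input: (user, item) -> linguistic term.
ratings = {
    ("u1", "broadband"): "high",
    ("u1", "tv"): "low",
    ("u2", "mobile"): "medium",
}
matrix = build_rating_matrix(ratings, users, items)
```

Representing unrated cells as `None` keeps them distinct from a genuinely low rating, which matters when co-rated users or items are collected in later steps.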
Step 2: Calculate fuzzy item similarity
There are several methods of calculating the similarity between products, among which the Pearson correlation and the cosine vector are two popular methods that are applied widely across the field. The cosine vector measures the similarity between two products by calculating the cosine of the angle between the rating vectors of the target item and the comparison item [20]. The Pearson correlation measures the similarity between two items by calculating the linear correlation between the two vectors [38]. In this study, the Pearson correlation is selected for measuring the similarity between two items $j$ and $j'$. Because the similarity between two items (telecom services) is naturally uncertain and the collected ratings are linguistic terms, fuzzy numbers are used in the measure. We therefore give the following fuzzy similarity measure based on the definitions given in Section 3:
(7)
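where $U_{jj'}$ represents the set of users who rated both items $j$ and $j'$; $r_{ij}^{\alpha}$ and $r_{ij'}^{\alpha}$ represent the ratings of user $i$ on items $j$ and $j'$ under the $\alpha$-cut, respectively; $r^{\alpha-}$ and $r^{\alpha+}$ are the left end and right end of the $\alpha$-cut, respectively; and $\bar{r}_{j}$ and $\bar{r}_{j'}$ are the average ratings of the users in $U_{jj'}$ on $j$ and $j'$, respectively. This step aims to obtain the similarity between products.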
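One way to picture a Pearson similarity over $\alpha$-cuts is sketched below. It is an illustration under stated assumptions, not the measure of Eq. (7): ratings are assumed to be triangular fuzzy numbers, the correlation is computed separately on the left ends and right ends of each $\alpha$-cut, and the results are averaged over a few $\alpha$ levels. All function names are hypothetical.

```python
import math


def alpha_cut(tfn, alpha):
    """Interval [left, right] of a triangular fuzzy number (a, b, c) at level alpha."""
    a, b, c = tfn
    return (a + alpha * (b - a), c - alpha * (c - b))


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 when it is undefined."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs)) * \
        math.sqrt(sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0


def fuzzy_item_similarity(col_a, col_b, levels=(0.0, 0.5, 1.0)):
    """Assumed aggregation: mean of left-end and right-end correlations over alpha levels."""
    # Keep only users who rated both items (None marks an unrated cell).
    co_rated = [(a, b) for a, b in zip(col_a, col_b)
                if a is not None and b is not None]
    total = 0.0
    for alpha in levels:
        cuts_a = [alpha_cut(a, alpha) for a, _ in co_rated]
        cuts_b = [alpha_cut(b, alpha) for _, b in co_rated]
        left = pearson([c[0] for c in cuts_a], [c[0] for c in cuts_b])
        right = pearson([c[1] for c in cuts_a], [c[1] for c in cuts_b])
        total += (left + right) / 2.0
    return total / len(levels)


low, medium, high = (0.0, 0.25, 0.5), (0.25, 0.5, 0.75), (0.5, 0.75, 1.0)
item_a = [low, high, medium, None]
item_b = [low, high, medium, high]   # agrees with item_a on co-rated users
item_c = [high, low, medium, None]   # opposite ordering to item_a
sim_ab = fuzzy_item_similarity(item_a, item_b)
sim_ac = fuzzy_item_similarity(item_a, item_c)
```

In this toy data the two items rated identically by their common users obtain a similarity close to 1, and the pair with opposite orderings a similarity close to -1.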
Step 3: Item neighbours selection
In most CF methods, a number of neighbours will be selected as references when predicting ratings [39]. According to Shi et al. [38], two approaches are possible for this task: the threshold-based selection or top-N techniques. In our approach, we use the top-N technique for neighbour selection. By using this method, a certain number of most similar items will be selected as neighbours. The number of neighbours is predetermined before the item neighbour selection process.
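The two selection strategies can be sketched as follows, assuming the similarities of one target item to all other items are already available as a dictionary. The function names and the sample values are hypothetical.

```python
import heapq


def top_n_neighbours(similarities, n):
    """Top-N selection: keep the n items with the highest similarity."""
    return heapq.nlargest(n, similarities.items(), key=lambda kv: kv[1])


def threshold_neighbours(similarities, threshold):
    """Threshold-based alternative: keep every item at or above the threshold."""
    return [(item, s) for item, s in similarities.items() if s >= threshold]


# Hypothetical similarities between a target item and the other items.
sims = {"mobile": 0.9, "tv": 0.2, "broadband": 0.7}
selected = top_n_neighbours(sims, 2)
```

With top-N selection the neighbourhood size is fixed in advance, whereas with a threshold it varies from item to item.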
Step 4: Predict empty fuzzy ratings using item-based CF with fuzzy number calculation
In this step, all missing ratings are predicted using the item-based CF method, so that every empty cell in the user-item rating table is filled, except the ratings of new items that have been rated fewer than two times. The prediction is calculated as follows:
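$$\hat{r}_{ij} = \frac{\sum_{j'=1}^{N} \mathrm{sim}(j, j')\, r_{ij'}}{\sum_{j'=1}^{N} \lvert \mathrm{sim}(j, j') \rvert} \tag{8}$$
where $\hat{r}_{ij}$ refers to the predicted rating of user $i$ on item $j$, $N$ is the number of selected neighbours, $r_{ij'}$ is the rating of user $i$ on neighbour item $j'$, and $\mathrm{sim}(j, j')$ is the similarity between item $j$ and item $j'$. This step aims to predict users' ratings for unrated items.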
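A sketch of an item-based prediction with fuzzy ratings follows, assuming the common similarity-weighted average form and triangular fuzzy numbers. Because multiplying a fuzzy number by a negative weight would swap its left and right ends, this sketch simply skips neighbours with non-positive similarity; that restriction and the function name are assumptions of the example.

```python
def predict_item_based(user_ratings, neighbours):
    """Weighted average of the user's fuzzy ratings on the neighbour items.

    user_ratings: item -> triangular fuzzy number (left, peak, right) or None
    neighbours:   list of (item, similarity) pairs for the target item
    """
    num = [0.0, 0.0, 0.0]
    den = 0.0
    for item, sim in neighbours:
        rating = user_ratings.get(item)
        # Assumption: ignore unrated neighbours and non-positive similarities.
        if rating is None or sim <= 0:
            continue
        for k in range(3):
            num[k] += sim * rating[k]
        den += sim
    if den == 0:
        return None  # no usable neighbour, the cell stays empty
    return tuple(x / den for x in num)


user_ratings = {"mobile": (0.0, 0.25, 0.5), "tv": (0.5, 0.75, 1.0), "voip": None}
neighbours = [("mobile", 1.0), ("tv", 1.0), ("voip", 0.8)]
predicted = predict_item_based(user_ratings, neighbours)
```

With equal weights on the two rated neighbours, `predicted` is the component-wise midpoint of their fuzzy ratings, so the result is again a triangular fuzzy number.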
Step 5: Calculate fuzzy user similarity
Besides predicting the ratings based on the similarities of items, we can also predict the ratings by analysing the similarities between users. Since the similarity between two customers/users is naturally uncertain as well, fuzzy numbers are used in the similarity measure, similar to Step 2. We use the Pearson correlation algorithm for calculating the user similarity by
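$$\mathrm{sim}(i, i') = \frac{\sum_{j \in I_{ii'}} (r_{ij} - \bar{r}_{i})(r_{i'j} - \bar{r}_{i'})}{\sqrt{\sum_{j \in I_{ii'}} (r_{ij} - \bar{r}_{i})^{2}}\,\sqrt{\sum_{j \in I_{ii'}} (r_{i'j} - \bar{r}_{i'})^{2}}} \tag{9}$$
where $\mathrm{sim}(i, i')$ is the similarity between user $i$ and user $i'$, $I_{ii'}$ is the set of items rated by both user $i$ and user $i'$, $r_{ij}$ is the rating of item $j$ from user $i$, $r_{i'j}$ is the rating of item $j$ from user $i'$, $\bar{r}_{i}$ is the average of all ratings from user $i$, and $\bar{r}_{i'}$ is the average of all ratings from user $i'$. This step aims to obtain the similarity between users, which is then used to predict users' ratings of items.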
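The role of the two kinds of averages in the user similarity can be seen in a small crisp sketch: the sums run over the items rated by both users, while each mean is taken over all ratings of that user. For brevity the ratings are assumed to be already defuzzified to numbers, which is a simplification of this example; `user_similarity` is a hypothetical name.

```python
import math


def user_similarity(ratings_a, ratings_b):
    """Pearson correlation between two users (item -> crisp rating dicts)."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if not common:
        return 0.0
    # Means over ALL ratings of each user, not only the co-rated items.
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_b = sum(ratings_b.values()) / len(ratings_b)
    num = sum((ratings_a[j] - mean_a) * (ratings_b[j] - mean_b) for j in common)
    den = math.sqrt(sum((ratings_a[j] - mean_a) ** 2 for j in common)) * \
        math.sqrt(sum((ratings_b[j] - mean_b) ** 2 for j in common))
    return num / den if den else 0.0


alice = {"broadband": 1.0, "mobile": 2.0, "tv": 3.0}
bob = {"broadband": 2.0, "mobile": 4.0, "tv": 6.0}
carol = {"voip": 5.0}
sim_alice_bob = user_similarity(alice, bob)
```

Bob's ratings are a scaled copy of Alice's, so their similarity is 1 up to rounding, while users without a common item get 0.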
Step 6: Select top-N similar users
Similar to Step 3, we need to select a number of neighbour users to predict ratings. The Top-N technique is used in the proposed approach.
Step 7: Recommendation generation with fuzzy number calculation
This step is to predict the ratings of every unrated telecom product/service for target users using user-based CF. The new predicted ratings will replace the ratings predicted in Step 4 and will be regarded as the final results. The applied algorithm is as follows:
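$$\hat{r}_{ij} = \bar{r}_{i} + \frac{\sum_{i'=1}^{N} \mathrm{sim}(i, i')\,(r_{i'j} - \bar{r}_{i'})}{\sum_{i'=1}^{N} \lvert \mathrm{sim}(i, i') \rvert} \tag{10}$$
where $\hat{r}_{ij}$ is the final predicted rating of item $j$ from user $i$, $\bar{r}_{i}$ is the average of all ratings from user $i$, $\bar{r}_{i'}$ is the average of all ratings from user $i'$, $N$ is the number of neighbours selected in Step 6, $r_{i'j}$ is the rating of item $j$ from user $i'$, and $\mathrm{sim}(i, i')$ is the similarity between user $i$ and user $i'$.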
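A crisp sketch of a user-based prediction is given below, assuming the widely used mean-centred weighted form: the target user's average is adjusted by the neighbours' deviations from their own averages, weighted by similarity. The ratings are again assumed to be defuzzified numbers, and the function name and sample values are hypothetical.

```python
def predict_user_based(target_mean, neighbours):
    """Mean-centred prediction for one (user, item) pair.

    neighbours: list of (similarity, neighbour_rating, neighbour_mean) triples
                for the neighbour users who rated the item.
    """
    num = sum(sim * (rating - mean) for sim, rating, mean in neighbours)
    den = sum(abs(sim) for sim, _, _ in neighbours)
    # Fall back to the user's own average when no neighbour carries weight.
    return target_mean if den == 0 else target_mean + num / den


# Two neighbours: one rated the item above their own mean, one below.
prediction = predict_user_based(3.0, [(1.0, 5.0, 4.0), (0.5, 2.0, 3.0)])
```

Centring each neighbour's rating on that neighbour's mean compensates for users who rate systematically high or low.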
Step 8: Final recommendation
The unrated products for the target user are ranked according to the predicted ratings calculated in Step 7, and the top-K products (could be services or product bundles) are selected. Each selected product is then checked against the related business rules. For example, if a special fixed line product is a candidate for recommendation to a customer and it has to be bundled with a fixed broadband product, this step checks whether it is bundled.
If all related rules are checked and satisfied, the top-K products are recommended directly; otherwise, the products that violate a rule are revised accordingly. Finally, a set of the most suitable products/services/bundles, $p_1, p_2, \ldots, p_k$, is recommended to the target user (customer).
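The ranking and rule check of this step might be sketched as follows. Only a bundle rule is modelled, the "revision" is assumed to mean recommending the product together with its required companion, and all names and values (`recommend`, `requires`, the product identifiers) are hypothetical.

```python
def recommend(predicted, k, requires, owned):
    """Rank unrated products, keep the top k, and enforce bundle rules.

    predicted: product -> crisp predicted rating from Step 7
    requires:  product -> companion product it must be bundled with
    owned:     products the customer already has
    """
    ranked = sorted(predicted, key=predicted.get, reverse=True)[:k]
    result = []
    for product in ranked:
        companion = requires.get(product)
        if companion is None or companion in owned:
            result.append(product)  # rule satisfied, recommend directly
        else:
            # Assumed revision: recommend the product as a bundle.
            result.append(f"{product}+{companion}")
    return result


predicted = {"special_fixed_line": 0.9, "mobile_plan": 0.8, "tv_pack": 0.3}
requires = {"special_fixed_line": "fixed_broadband"}
for_new_customer = recommend(predicted, 2, requires, owned=set())
for_broadband_customer = recommend(predicted, 2, requires, owned={"fixed_broadband"})
```

A customer without fixed broadband is offered the fixed line product as a bundle, while a customer who already has broadband is offered it on its own.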