Assignment Description

This assignment will provide you with practice using arrays.

Your job is to build a simple recommender system, similar to the one that Amazon uses to recommend books to customers. The basic idea is to find out some books that a user likes, and then recommend other books that the user might also like.

Your Task

Your program should behave as follows:

• 18 points Load the 20 book names and the book ratings from 30 people into two arrays in memory. These can be read by your program using the Scanner class. You are not required to handle FileNotFoundException.
• 10 points Ask the user to enter a rating (between 1 and 5, or -1 if they haven’t read it) for each book.
• 18 points Create a method that determines for each of the 30 people a score, which represents how similar that person’s tastes are to the taste of the user of the program. Store these similarity scores in an array of 30 doubles. The similarity scores should be between 0 and 1 each.
• 18 points Create an array that represents recommended ratings for the user. There should be 20 numbers in this array, one for each book. The higher the number, the more strongly your program thinks the user will like the book. Each number should be an average of the ratings for that book, counting only ratings greater than 0 (that is, only ratings from people who have actually rated the book). However, it should be a weighted average: people who are more similar to the current user should have a higher weight than people who are less similar.
• 16 points Display the name of the top book (according to the recommended ratings from the previous step) that the user has not yet read.
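The loading step might be sketched as follows. The assignment does not specify the layout of the input, so this sketch assumes that the titles appear one per line and that the ratings are whitespace-separated integers, 20 per person, one person after another. The names RatingsLoader, readTitles, and readRatings are made up for illustration.

```java
import java.util.Scanner;

final class RatingsLoader {
    // Reads numBooks titles, ASSUMING one title per line.
    static String[] readTitles(Scanner in, int numBooks) {
        String[] titles = new String[numBooks];
        for (int b = 0; b < numBooks; b++) {
            titles[b] = in.nextLine().trim();
        }
        return titles;
    }

    // Reads numPeople * numBooks integers, ASSUMING they are ordered
    // person by person: ratings[p][b] is person p's rating of book b,
    // and -1 means "has not read it".
    static int[][] readRatings(Scanner in, int numPeople, int numBooks) {
        int[][] ratings = new int[numPeople][numBooks];
        for (int p = 0; p < numPeople; p++) {
            for (int b = 0; b < numBooks; b++) {
                ratings[p][b] = in.nextInt();
            }
        }
        return ratings;
    }
}
```

Because both methods take a Scanner, the same code works whether the Scanner wraps a file, System.in, or a short String that you use for testing.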

Suggestions and hints

Making a Recommendation

The goal is to come up with a recommendation for a book that the user might like. If you and I have similar preferences, and there’s a book that I haven’t read that you like, chances are good that I might like it. Suppose there’s another person, whose tastes kind of match mine, who also happens to like the book. We now have even more evidence that I might like the book (but because my tastes only kind of match the other person’s, it only lends a little bit of weight to the decision).

Imagine, now, that I don’t just have information from a couple of friends upon which to base a recommendation. In this assignment, we’ll have information from 30 people about their preferences. We put all of that information together to form a single score for each book, which can be calculated as a weighted average of the ratings from all the other users. We assign more weight to the ratings of people whose preferences are similar to ours, and less weight to the ratings of people whose preferences are dissimilar.

We calculate a score for each book, and the one we recommend is the one with the highest score.
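Picking the book to recommend could look something like the sketch below. It assumes you already have an array of scores (one per book) and the user’s own ratings, with -1 marking a book the user has not read. The names TopBook and bestUnread are hypothetical, and returning -1 when the user has read everything is just one possible choice.

```java
final class TopBook {
    // Returns the index of the highest-scoring book that the user has
    // not read (userRatings[b] == -1), or -1 if there is no such book.
    static int bestUnread(double[] scores, int[] userRatings) {
        int best = -1;
        for (int b = 0; b < scores.length; b++) {
            boolean unread = userRatings[b] == -1;
            if (unread && (best == -1 || scores[b] > scores[best])) {
                best = b;
            }
        }
        return best;
    }
}
```

For example, with scores {4.5, 3.0, 4.0} and user ratings {5, -1, -1}, the method skips book 0 (already read) and returns index 2.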

The mathematical formula for a weighted average, where there are N numbers stored in an array called a, and N corresponding weights stored in an array called w, goes like this:

weighted_average(a[], w[]) = (a[0]*w[0] + a[1]*w[1] + … + a[N-1]*w[N-1]) / (w[0] + w[1] + … + w[N-1])
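In Java, the formula might be sketched like this. Following the task description, the sketch leaves out any rating that is not greater than 0, so people who have not read the book contribute to neither the top nor the bottom of the fraction. The class and method names are hypothetical, and returning 0 when nobody has rated the book is an assumption, not a requirement.

```java
final class WeightedAverage {
    // a[i] is person i's rating of one book (-1 if unread);
    // w[i] is person i's similarity to the current user.
    static double weightedAverage(int[] a, double[] w) {
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > 0) {            // only people who rated the book
                weightedSum += a[i] * w[i];
                weightTotal += w[i];
            }
        }
        // ASSUMPTION: score 0 if no one with nonzero weight rated the book.
        return weightTotal == 0.0 ? 0.0 : weightedSum / weightTotal;
    }
}
```

For ratings {5, 3, -1} with weights {0.9, 0.1, 0.5}, the third person is skipped and the result is (5*0.9 + 3*0.1) / (0.9 + 0.1), which is about 4.8. The very similar person’s rating of 5 dominates.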

Calculating Similarity

Your program is going to try to decide whether or not you might like a book that you haven’t yet read. It’s going to come up with a score, which represents the likelihood that you’ll enjoy it. If people who have tastes similar to yours seem to like it, the program will assign that book a higher score. You might ask, How do you determine similarity, and how do you boil it down to a single number?

You can come up with your own way of judging how similar two people’s ratings are. One suggestion is to compute what’s called cosine similarity:

• For person 1, compute the square of each rating for the books they have read, add these squares up, and then take the square root. Store the result in a variable called p1. For example, if person 1 read 3 books and rated them 4, 4, and 2, then p1 = sqrt(4*4 + 4*4 + 2*2) = sqrt(36) = 6.
• Do the same for person 2, and store the result in a variable called p2.
• For each book that both people have read, compute the product of their ratings. Add up all of these products, and store the result in a variable called both. For example, if person 1 and person 2 both read books 7 and 14 (out of 20), and person 1 rated book 7 as 4 and book 14 as 2, and person 2 rated book 7 as 2 and book 14 as 3, then both = 4*2 + 2*3 = 14.
• The cosine similarity score between person 1 and person 2 is (both / (p1 * p2)). Because the ratings of books that have been read are all positive, this score always falls between 0 and 1.
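The steps above can be sketched as a single method. This is only one way to write it. The names Similarity and cosine are made up, a rating of -1 is taken to mean "not read", and returning 0 when either person has read nothing (which would otherwise divide by zero) is an assumption.

```java
final class Similarity {
    // r1 and r2 hold two people's ratings of the same books, in the
    // same order; a value of -1 means the person has not read the book.
    static double cosine(int[] r1, int[] r2) {
        double sumSq1 = 0.0;  // sum of squares of person 1's ratings
        double sumSq2 = 0.0;  // sum of squares of person 2's ratings
        double both = 0.0;    // sum of products over books both have read
        for (int b = 0; b < r1.length; b++) {
            boolean read1 = r1[b] > 0;
            boolean read2 = r2[b] > 0;
            if (read1) sumSq1 += r1[b] * r1[b];
            if (read2) sumSq2 += r2[b] * r2[b];
            if (read1 && read2) both += r1[b] * r2[b];
        }
        double p1 = Math.sqrt(sumSq1);
        double p2 = Math.sqrt(sumSq2);
        // ASSUMPTION: similarity is 0 if either person has read nothing.
        if (p1 == 0.0 || p2 == 0.0) return 0.0;
        return both / (p1 * p2);
    }
}
```

Two people with identical ratings, such as {4, 4, 2} and {4, 4, 2}, get a score of 36 / (6 * 6) = 1. Two people with no books in common get a score of 0.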

As always, don’t try to program everything all at once. Do it in parts, by writing some methods that accomplish part of the whole assignment. Write some println’s that show what’s going on in memory after you call a method that you’ve just written, and run your program to make sure that the new method is working correctly. Even better, you could write a JUnit test (bonus: you’d get extra credit). Repeat this for each new method you write.

Extra Credit up to +10 points

Write JUnit tests for at least two methods. Recall that to create the skeleton for tests in Eclipse, under the package tab, right-click on your .java file (Control-click on a Mac), select New, then JUnit Test Case. Check off the methods for which you’d like some starter tests written, and then click Finish. We’ve used assertTrue(), assertFalse(), and assertEquals(). In addition, you might find assertArrayEquals() to be useful.

The 20 book names:

A Walk in the Woods
Salem’s Lot
John Adams
Illustrated Guide to Snowboarding
Dracula
Your Mountain Bike and You: Why Does It Hurt to Sit?
1776
Dress Your Family in Corduroy and Denim
High Fidelity
Guns of August
Triathlete’s Training Bible
Jaws
Schwarzenegger’s Encyclopedia of Modern Bodybuilding
It
What’s That?
Team of Rivals
No Ordinary Time: Franklin and Eleanor Roosevelt: The Home Front in World War II
Truman
Basic Fishing: A Beginner’s Guide
Life and Times of the Thunderbolt Kid

Ratings:

-1 2 3 5 -1 5 3 3 1 4 2 2 5 -1 1 3 3 5 4 3 -1 1 1 4 1 3 3 1 2 3 4 -1 4 1 2 4 5 4 2 3 3 -1 2 3 -1 2 5 -1 3 3 5 2 2 1 2 3 5 3 4 2 -1 1 -1 4 1 3 5 2 1 5 3 -1 5 2 1 3 4 5 3 2 -1 -1 3 2 -1 5 5 2 2 4 4 2 3 2 -1 3 4 4 3 1 2 1 1 5 2 2 4 2 3 4 3 -1 5 2 2 5 3 5 2 1 3 -1 3 4 -1 2 5 -1 -1 4 3 -1 3 -1 2 5 5 5 4 2 4 -1 4 2 3 -1 1 3 4 -1 1 4 4 4 -1 2 -1 1 4 4 4 3 3 3 -1 2 2 4 3 -1 2 4 3 4 2 -1 -1 2 2 3 3 -1 3 -1 3 4 -1 5 5 -1 -1 -1 1 -1 -1 1 1 2 -1 5 3 -1 3 4 3 4 -1 5 5 2 3 3 4 1 1 -1 -1 -1 -1 4 4 -1 4 4 1 3 -1 5 4 -1 1 3 4 1 -1 1 -1 1 -1 5 5 -1 3 1 4 3 -1 5 4 1 3 2 1 -1 4 2 1 -1 2 4 3 -1 5 1 4 4 2 5 5 1 2 3 1 1 -1 1 -1 1 -1 5 4 1 5 4 3 -1 1 3 4 -1 -1 3 3 -1 1 1 2 -1 3 5 -1 1 1 3 -1 3 1 3 -1 -1 3 -1 5 2 2 1 4 -1 5 -1 3 -1 2 3 1 5 4 3 3 -1 5 -1 5 2 -1 4 4 3 3 3 1 1 1 3 2 4 1 -1 -1 -1 5 -1 3 -1 -1 1 -1 2 5 2 -1 2 3 5 -1 4 3 1 1 3 3 -1 4 -1 -1 4 3 2 5 1 -1 1 3 3 -1 3 3 1 -1 -1 3 -1 5 -1 -1 3 1 2 4 -1 3 -1 2 4 1 4 3 -1 2 3 4 1 3 -1 2 -1 4 3 5 -1 -1 1 3 5 -1 4 2 1 -1 3 3 2 3 2 -1 3 1 -1 3 -1 3 2 2 3 -1 5 -1 -1 2 3 4 -1 4 1 -1 -1 -1 -1 4 2 -1 3 -1 -1 4 -1 2 -1 2 2 2 5 -1 3 4 -1 -1 2 -1 2 1 4 3 -1 3 2 1 -1 -1 -1 1 3 1 3 3 1 -1 -1 -1 3 4 3 3 -1 4 2 -1 4 -1 -1 2 4 -1 3 4 2 -1 -1 -1 4 -1 5 1 -1 4 1 -1 3 2 2 -1 4 1 3 3 1 -1 -1 -1 3 -1 4 2 1 5 -1 -1 2 1 1 -1 5 -1 5 4 1 2 2 -1 1 2 5 2 -1 3 -1 -1 1 -1 2 -1 4 2 4 3 -1 2 1 -1 -1 2 5 1 1 4 -1 2 1 -1 -1 2 4 -1 3 4 2 -1 -1 -1 4