My Trials and Tribulations with Machine Learning

Part Four: Building a Recommendation Engine with Amazon Personalize

This is the final part of a four-part machine learning series covering the subtopics I’ve learned, projects where I’ve used machine learning, the mistakes I’ve made along the way, and how you can learn machine learning, too. Read the series from the beginning here.

“You may also like” screenshot from Amazon.com
Once you add something to your cart, Amazon sometimes finds things similar to that item.

“You may also like…”

“…spending more money and time on our platform while we collect more of your data and interests!”

All over the world, Amazon sends customers real-time personalized recommendations based on their buying trends, search history, browsing history and so much more. You may not think much of it, but it takes a pretty (really) strong deep learning engine to understand what you have liked and what you may like based on that.

What’s in a recommendation?

Think about it. When was the last time you recommended something to someone?

Maybe it was a hair product. Maybe it was a movie. Maybe it was a restaurant that you really enjoyed, and think your friend might like it too.

Why did you recommend these things? Was it based on your friend’s interests and opinions, or was it based on the things to which you’re partial? Oftentimes, it’s a combination of both.

Another thought experiment: When was the last time you recommended something to someone that was a total dud?

I’m not pointing any fingers (Gossip Girl was not a good show Brianna, let it go), but sometimes we recommend things to others more because we like them than because we think the other person might.

That’s because we’re full of biases. And it’s only human: we rarely recommend things we don’t like, even if the person we’re talking to might like them.

I’m no shrinking violet when my friends recommend movies I dislike.

How technology can save us all (again)

With recommendation tools like Amazon Personalize, the human bias and error are removed from the equation. Amazon Personalize doesn’t have a favorite B-movie. Amazon Personalize doesn’t just love this one actor that you can’t stand. Amazon Personalize’s only goal is to ensure it knows you well enough to find hidden treasures for you to enjoy.

By paying close attention to your trends without adding in unconscious (or conscious) bias, machine learning allows for seamless, carefully tailored recommendations that are continuously updated and delivered in real time.

How do you use Amazon Personalize to recommend stuff?

How Amazon Personalize works, according to AWS:
  1. Store inventory and user demographics on Amazon S3.
  2. Stream user activity from your application using the Amazon Personalize API or the JavaScript library.
  3. Amazon Personalize automatically processes and examines the data, identifies what is meaningful, selects the right algorithms, and trains and optimizes a personalization model that is customized for your data.
  4. (Output) Keep providing Amazon Personalize with an activity stream to generate real-time recommendations, or request recommendations in bulk, via a customized personalization API.

What is Amazon S3?

You know, in case you’ve been living under a rock.

Amazon S3, or Amazon Simple Storage Service, is a service offered by AWS providing object storage through a web service interface. Amazon S3 uses the same scalable storage infrastructure that Amazon.com uses to run its global e-commerce network.

What is the Amazon Personalize API?

The Amazon Personalize API has a group of supported actions for organizing the data coming from S3. Two companion APIs have their own actions: Amazon Personalize Events, for recording user activity, and Amazon Personalize Runtime, for fetching recommendations. Together, they help keep your data clean and well-structured as it is prepared to go through the Amazon Personalize service.

What is Amazon Personalize?

By this point in the article, you should have a high-level, conceptual understanding of this. This is the part where your data is processed and the model is trained to find recommendations based on trends and patterns.

What is the Customized Personalization API?

This is the end result of the Personalize process: the API your application calls to get recommendations. The activity that follows is streamed right back into Amazon Personalize, which is how it keeps up the accurate, real-time recommendations Amazon boasts.

Finally, an Example

We’re going to make a simple recommendation engine to show you just how easy Amazon Personalize makes it. Make sure you have an activated AWS account before starting this walkthrough; if you don’t have an account yet, you should make one here.

Create the Training Data

To create the training data for this example, download the movie ratings data that Amazon uses as its example, modify it, and save it to an Amazon S3 bucket. Then give Amazon Personalize permission to read from the bucket.

  • Download Amazon’s example zip file from MovieLens. Unzip the file. The user-interactions data is in the ratings.csv file. Open it!
  • Inside that file, delete the rating column.
  • Replace the header row with: USER_ID,ITEM_ID,TIMESTAMP (these headers must be exactly as shown for Amazon Personalize to recognize the data).
  • Save the file and upload it to your Amazon S3 bucket. For more information, see Uploading Files and Folders by Using Drag and Drop in the Amazon S3 Console User Guide.
  • Grant Amazon Personalize permission to read the data in the bucket. For more information, see Uploading to an S3 Bucket, and check out the screenshot below to make sure you’re set up correctly.
This is how it should look to give permissions to Amazon Personalize to access the data in Amazon S3.
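
If you’d rather script these steps than edit the file by hand, here is a minimal sketch using only the Python standard library. It assumes the MovieLens ratings.csv header is userId,movieId,rating,timestamp (check your copy). The function names and the default key are hypothetical, and `upload` expects a boto3 S3 client that you create yourself. The bucket policy mirrors the read permissions described above.

```python
import csv
import io
import json


def to_personalize_csv(ratings_csv: str) -> str:
    """Drop the rating column and rename the header for Amazon Personalize."""
    reader = csv.DictReader(io.StringIO(ratings_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["USER_ID", "ITEM_ID", "TIMESTAMP"])
    for row in reader:
        # The rating column is simply not copied over.
        writer.writerow([row["userId"], row["movieId"], row["timestamp"]])
    return out.getvalue()


def bucket_policy(bucket_name: str) -> str:
    """Build a bucket policy that lets Amazon Personalize read the bucket."""
    policy = {
        "Version": "2012-10-17",
        "Id": "PersonalizeS3BucketAccessPolicy",
        "Statement": [{
            "Sid": "PersonalizeS3BucketAccessPolicy",
            "Effect": "Allow",
            "Principal": {"Service": "personalize.amazonaws.com"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{bucket_name}",
                f"arn:aws:s3:::{bucket_name}/*",
            ],
        }],
    }
    return json.dumps(policy)


def upload(s3, bucket_name: str, csv_text: str, key: str = "ratings.csv") -> None:
    """Upload the prepared file and attach the policy (s3 is a boto3 S3 client)."""
    s3.put_object(Bucket=bucket_name, Key=key, Body=csv_text.encode("utf-8"))
    s3.put_bucket_policy(Bucket=bucket_name, Policy=bucket_policy(bucket_name))


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   with open("ratings.csv", newline="") as f:
#       upload(boto3.client("s3"), "your-bucket-name", to_personalize_csv(f.read()))
```

Keeping the transformation separate from the upload means you can inspect the converted file locally before anything touches your bucket.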

Setting up the AWS SDKs

Download and install the AWS SDKs that you want to use. The AWS guide provides examples for Python and for JavaScript (using the AWS Amplify library). For this example, we are using the AWS SDK for Python (Boto 3).

To confirm that your Python environment is properly configured for use with Amazon Personalize, ask the service for its list of recipes; a correctly configured environment displays them.
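
Here is a minimal sketch of that check. The helper name `recipe_names` is my own; `list_recipes` is the Boto 3 call, and the client is passed in so you decide which region and credentials it uses.

```python
def recipe_names(personalize):
    """Return the names of the recipes visible to this Personalize client."""
    response = personalize.list_recipes()
    return [recipe["name"] for recipe in response["recipes"]]


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   for name in recipe_names(boto3.client("personalize")):
#       print(name)
```

If the call fails with a credentials or endpoint error instead of returning recipes, fix your AWS configuration before moving on.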

You can also use JavaScript with the AWS library Amplify, but as stated above, we’re not doing that in this example. Click the link for more information on how to get started in JavaScript.

Import the training data

After you verify that your Python environment is configured correctly, import your data. To use a dataset for training, you need to do the following:

  • Add a schema. The schema allows Amazon Personalize to parse the training dataset. Define the Avro-format schema that you want to use and save it in a JSON file in the default Python folder. For more on this step, check out the AWS documentation for Datasets and Schemas.
  • Import the data. Create a dataset group, which contains one or more datasets that Amazon Personalize can use for training.
  • (Optional) Add an event tracker. To use events when training a model, you need a tracking ID that associates each event with your dataset group. For more information, check out Getting a Tracking ID.
  • (Also optional) Add an event record. To add more training data and create a better model, you can use events. Events are recorded user activities such as a search, a view, or a purchase.
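
The steps above can be sketched roughly as follows. Treat it as an outline, not the official sample. The schema matches the three headers used earlier; the function names, resource-name suffixes, and polling defaults are hypothetical. `role_arn` is an assumption on my part: the import job expects an IAM role that Amazon Personalize can assume. `personalize` and `events` are Boto 3 clients for the `personalize` and `personalize-events` services.

```python
import json
import time

INTERACTIONS_SCHEMA = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}


def wait_for_dataset_group(personalize, arn, delay=20, attempts=60):
    """Poll until the dataset group is ACTIVE; raise if creation fails."""
    for _ in range(attempts):
        status = personalize.describe_dataset_group(
            datasetGroupArn=arn)["datasetGroup"]["status"]
        if status == "ACTIVE":
            return
        if status == "CREATE FAILED":
            raise RuntimeError("dataset group creation failed")
        time.sleep(delay)
    raise TimeoutError("dataset group did not become ACTIVE")


def import_interactions(personalize, name, s3_path, role_arn, delay=20):
    """Create schema, dataset group and dataset, then start the import job."""
    schema_arn = personalize.create_schema(
        name=f"{name}-schema", schema=json.dumps(INTERACTIONS_SCHEMA)
    )["schemaArn"]
    group_arn = personalize.create_dataset_group(
        name=f"{name}-group")["datasetGroupArn"]
    wait_for_dataset_group(personalize, group_arn, delay=delay)
    dataset_arn = personalize.create_dataset(
        name=f"{name}-interactions", schemaArn=schema_arn,
        datasetGroupArn=group_arn, datasetType="Interactions",
    )["datasetArn"]
    job_arn = personalize.create_dataset_import_job(
        jobName=f"{name}-import", datasetArn=dataset_arn,
        dataSource={"dataLocation": s3_path}, roleArn=role_arn,
    )["datasetImportJobArn"]
    return group_arn, job_arn


def create_tracking_id(personalize, name, group_arn):
    """(Optional) Create an event tracker and return its tracking ID."""
    response = personalize.create_event_tracker(
        name=name, datasetGroupArn=group_arn)
    return response["trackingId"]


def record_event(events, tracking_id, user_id, session_id, item_id,
                 event_type="click", sent_at=None):
    """(Optional) Send one user activity through a personalize-events client."""
    events.put_events(
        trackingId=tracking_id,
        userId=str(user_id),
        sessionId=session_id,
        eventList=[{
            "eventType": event_type,
            "itemId": str(item_id),
            "sentAt": sent_at if sent_at is not None else int(time.time()),
        }],
    )


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   personalize = boto3.client("personalize")
#   group_arn, job_arn = import_interactions(
#       personalize, "movies", "s3://your-bucket-name/ratings.csv", "your-role-arn")
```

The import job runs asynchronously, so give it time to finish before you train anything on the data.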

Create a solution

After importing your data, create a solution and solution version. The solution contains the configurations to train a model. A solution version is a trained model. For a code sample, see AWS’s documentation on Creating a Solution.

After you create a solution version, evaluate its performance before proceeding. For a code sample, see Evaluating a Solution Version.
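
As a rough sketch of both steps (not AWS’s own sample), the functions below create a solution, train one solution version, and fetch its evaluation metrics. The function names and polling defaults are hypothetical. You pass in the recipe ARN yourself, for example one of the ARNs returned by `list_recipes`.

```python
import time


def wait_until_active(fetch_status, delay=60, attempts=120):
    """Call fetch_status() until it returns ACTIVE; raise on failure or timeout."""
    for _ in range(attempts):
        status = fetch_status()
        if status == "ACTIVE":
            return
        if status == "CREATE FAILED":
            raise RuntimeError("resource creation failed")
        time.sleep(delay)
    raise TimeoutError("resource did not become ACTIVE in time")


def train_solution(personalize, name, group_arn, recipe_arn, delay=60):
    """Create a solution, train one solution version, and return its ARN."""
    solution_arn = personalize.create_solution(
        name=name, datasetGroupArn=group_arn, recipeArn=recipe_arn
    )["solutionArn"]
    wait_until_active(
        lambda: personalize.describe_solution(
            solutionArn=solution_arn)["solution"]["status"],
        delay=delay,
    )
    version_arn = personalize.create_solution_version(
        solutionArn=solution_arn)["solutionVersionArn"]
    # Training happens while the solution version is being created,
    # so this wait can take a long time.
    wait_until_active(
        lambda: personalize.describe_solution_version(
            solutionVersionArn=version_arn)["solutionVersion"]["status"],
        delay=delay,
    )
    return version_arn


def solution_metrics(personalize, version_arn):
    """Return the evaluation metrics for a trained solution version."""
    response = personalize.get_solution_metrics(solutionVersionArn=version_arn)
    return response["metrics"]


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   personalize = boto3.client("personalize")
#   version_arn = train_solution(
#       personalize, "movies-solution", "your-dataset-group-arn", "your-recipe-arn")
#   print(solution_metrics(personalize, version_arn))
```

The metrics come back as a dictionary of scores, which you can compare across solution versions before deciding which one to deploy.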

Create a campaign

After you train and evaluate your solution version, you can deploy it using a campaign. A campaign is an endpoint used to host a solution version and make recommendations to users.

The campaign isn’t ready for use until its status is set to “active.” To get the current status, call DescribeCampaign and check that the status field is ACTIVE.
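
A minimal sketch of deploying and waiting might look like this. The function name, the `min_tps` default of 1, and the polling defaults are my own choices, not values from AWS’s sample.

```python
import time


def deploy_campaign(personalize, name, version_arn, min_tps=1,
                    delay=30, attempts=60):
    """Create a campaign and block until DescribeCampaign reports ACTIVE."""
    campaign_arn = personalize.create_campaign(
        name=name,
        solutionVersionArn=version_arn,
        minProvisionedTPS=min_tps,
    )["campaignArn"]
    for _ in range(attempts):
        status = personalize.describe_campaign(
            campaignArn=campaign_arn)["campaign"]["status"]
        if status == "ACTIVE":
            return campaign_arn
        if status == "CREATE FAILED":
            raise RuntimeError("campaign creation failed")
        time.sleep(delay)
    raise TimeoutError("campaign did not become ACTIVE in time")


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   campaign_arn = deploy_campaign(
#       boto3.client("personalize"), "movies-campaign", "your-solution-version-arn")
```

A campaign is billed while it exists, so delete it when you are done experimenting.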

And finally, getting recommendations

After you create a campaign, you can use it to get recommendations, including recommendations based on contextual metadata.

When you request recommendations, set the value of the context key-value pair to that of a metadata field that is in your training data. The response contains a list of recommended items for the user.
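
Here is a small sketch of that request. The helper name `recommend` is hypothetical, and `runtime` is a Boto 3 `personalize-runtime` client. The `DEVICE` key in the usage note is only an illustration: it assumes your interactions data has such a metadata column, which the three-column file from this walkthrough does not.

```python
def recommend(runtime, campaign_arn, user_id, context=None, num_results=10):
    """Return the recommended item IDs for one user from an active campaign."""
    kwargs = {
        "campaignArn": campaign_arn,
        "userId": str(user_id),
        "numResults": num_results,
    }
    if context:
        # Only send context when the model was trained with that metadata.
        kwargs["context"] = context
    response = runtime.get_recommendations(**kwargs)
    return [item["itemId"] for item in response["itemList"]]


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   runtime = boto3.client("personalize-runtime")
#   for item_id in recommend(runtime, "your-campaign-arn", 1,
#                            context={"DEVICE": "mobile-phone"}):
#       print(item_id)
```

The item IDs that come back are the ITEM_ID values from your training data, so you still need your own lookup to turn them into movie titles.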

Expand On It: (Optional) Explore the Amazon Personalize APIs with a Jupyter (iPython) Notebook

AWS provides a Jupyter (iPython) notebook to help you explore the Amazon Personalize APIs. With one exception, the Jupyter notebook has the same prerequisites as the Python examples in this guide: the notebook uses different source data, so you don’t need to create training data.

To get the Jupyter notebook, clone or download the notebook from the Amazon Personalize Samples repository.

Now you can see how Amazon Personalize can create easy recommendation systems for you. If you want to look more under the hood, there are tons of great resources to check out.

AWS breaks down the timeline for custom recommendations for businesses regarding their consumers.

Recommendation engines are extremely important and profitable to businesses, and with the proper setup, your business could use this powerful service, too.

I hope you enjoyed learning about personalized recommendations with me. If you like what you read or have any questions, comments or would like to collaborate with me on an article (or speak with me about job opportunities at your workplace), feel free to tweet me @mackied0g or connect with me on LinkedIn — don’t be shy, I love feedback and collaboration.

Check out my new four-part series on the Mathematics of Machine Learning!

https://docs.aws.amazon.com/

https://docs.aws.amazon.com/personalize/index.html

NY-based techie. Passionate about STE(a)M, security, AI/ML, memes and omitting the Oxford comma.