Are you looking for a way to enhance the user experience of your website or mobile application? A recommendation engine could be the answer. In this tutorial, we’ll show you how to build a simple one using Python.
A recommendation engine is a tool that provides personalized recommendations for products, services, or content based on the user’s past behavior or preferences. The recommendations are generated using data analysis and machine learning algorithms. A recommendation engine can improve user engagement, retention, and conversion rates by providing relevant suggestions to the users.
Prerequisites
Before we start building our recommendation engine, make sure you have the following:
– Python 3 installed
– Pandas library installed
– Scikit-learn library installed (this also installs SciPy, which we use for sparse matrices)
– A dataset with user-item interactions (e.g., purchase history, ratings, clicks, etc.)
Step 1: Recommendation Engine with Python – Data Preparation
The first step is to prepare the data for the recommendation engine. The dataset should contain user-item interactions, where each row represents a user’s interaction with an item (e.g., purchase, rating, click).
We’ll use the MovieLens 100K dataset, which contains 100,000 movie ratings from 943 users on 1682 movies. You can download the dataset from https://grouplens.org/datasets/movielens/100k/. Save the “u.data” file to your working directory.
Next, let’s load the data into a Pandas DataFrame and inspect the first few rows:
import pandas as pd

# u.data is tab-separated and has no header row, so we supply the column names
data = pd.read_csv('u.data', sep='\t', names=['user_id', 'item_id', 'rating', 'timestamp'])
print(data.head())
Output:
   user_id  item_id  rating  timestamp
0      196      242       3  881250949
1      186      302       3  891717742
2       22      377       1  878887116
3      244       51       2  880606923
4      166      346       1  886397596
The data contains four columns: user_id, item_id, rating, and timestamp. Let’s drop the timestamp column since we won’t use it for this tutorial.
data.drop('timestamp', axis=1, inplace=True)
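Before moving on, it can help to check how sparse the interactions are. The sketch below is only an illustration: it uses a handful of made-up rows in the same tab-separated layout as u.data (the values are invented, not taken from MovieLens), and summarize_interactions is a hypothetical helper name, not part of Pandas.

```python
import io

import pandas as pd

# Made-up sample in the same tab-separated layout as u.data
# (user_id, item_id, rating, timestamp). Values are illustrative only.
SAMPLE = (
    "1\t10\t5\t881250949\n"
    "1\t20\t3\t881250950\n"
    "2\t10\t4\t881250951\n"
    "3\t30\t2\t881250952\n"
)


def summarize_interactions(df):
    """Return basic counts and the density of the user-item matrix."""
    n_users = df["user_id"].nunique()
    n_items = df["item_id"].nunique()
    n_ratings = len(df)
    # Density: share of all user-item pairs that actually have a rating.
    density = n_ratings / (n_users * n_items)
    return {
        "n_users": n_users,
        "n_items": n_items,
        "n_ratings": n_ratings,
        "density": density,
    }


sample = pd.read_csv(
    io.StringIO(SAMPLE),
    sep="\t",
    names=["user_id", "item_id", "rating", "timestamp"],
)
sample = sample.drop(columns="timestamp")

summary = summarize_interactions(sample)
print(summary)
```

Applied to the full MovieLens 100K data (100,000 ratings, 943 users, 1682 movies), the same calculation gives a density of about 6.3%, which is why a sparse representation is worth considering in the next step.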
Step 2: Recommendation Engine with Python – Data Preprocessing
Next, we’ll preprocess the data to convert user-item interactions into a matrix where each row represents a user and each column represents an item. The cells contain the rating given by the user to the item.
from sklearn.model_selection import train_test_split
from scipy.sparse import csr_matrix

# Convert data to a user-item matrix (rows: users, columns: items).
# Missing ratings are filled with 0.
data_matrix = data.pivot(index='user_id', columns='item_id', values='rating').fillna(0).values

# Split the rows (users) into training and testing sets
train_data, test_data = train_test_split(data_matrix, test_size=0.2)

# Convert the training data into a sparse matrix
train_data_sparse = csr_matrix(train_data)
We convert the training data into a sparse matrix because the dataset is sparse (i.e., most users have not rated most movies). A sparse matrix is more memory-efficient and lets us use machine learning algorithms optimized for sparse data. Note that train_test_split splits the matrix by rows, so the test set holds out 20% of the users rather than 20% of the ratings.
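The pivot above builds the full dense matrix before converting it. As an alternative sketch, the sparse matrix can be built straight from the (user, item, rating) triples. This assumes user and item IDs are contiguous integers starting at 1, which holds for MovieLens 100K but is an assumption for other datasets. The toy arrays are invented for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy (user_id, item_id, rating) triples; values are illustrative only.
user_ids = np.array([1, 1, 2, 3])
item_ids = np.array([1, 3, 1, 2])
ratings = np.array([5.0, 3.0, 4.0, 2.0])

n_users = int(user_ids.max())
n_items = int(item_ids.max())

# IDs start at 1 but matrix indices start at 0, hence the "- 1".
sparse_ratings = csr_matrix(
    (ratings, (user_ids - 1, item_ids - 1)),
    shape=(n_users, n_items),
)

print(sparse_ratings.toarray())
print("stored values:", sparse_ratings.nnz)
```

Only the four known ratings are stored; the zeros appear only when the matrix is expanded with toarray().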
Step 3: Recommendation Engine with Python – Building the Recommendation Engine
Now that we have prepared the data, let’s build the recommendation engine using matrix factorization. Matrix factorization is a popular technique for recommendation engines that factorizes the user-item interaction matrix into two lower-dimensional matrices: one representing users’ preferences and one representing items’ attributes. Here we use truncated singular value decomposition (SVD), available in Scikit-learn as TruncatedSVD.
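To make the idea of two lower-dimensional matrices concrete, here is a minimal sketch on a tiny invented matrix. It uses NumPy’s full SVD and keeps the first k factors, rather than Scikit-learn’s solver, and low_rank_approximation is a hypothetical helper name chosen for the illustration.

```python
import numpy as np

# Toy user-item matrix (rows: users, columns: items, 0 = no rating).
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
])


def low_rank_approximation(matrix, k):
    """Return user factors, item factors, and their product for rank k."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    user_factors = U[:, :k] * s[:k]  # one k-dimensional row per user
    item_factors = Vt[:k, :]         # one k-dimensional column per item
    return user_factors, item_factors, user_factors @ item_factors


user_factors, item_factors, approx = low_rank_approximation(R, k=2)
print(user_factors.shape, item_factors.shape)
print(np.round(approx, 2))
```

The product of the two factor matrices has the same shape as the original matrix, but every cell now holds a score, including the cells that were 0. Those filled-in scores are what a factorization-based recommender ranks.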
from sklearn.decomposition import TruncatedSVD

# Build the model with 20 latent factors
model = TruncatedSVD(n_components=20)
model.fit(train_data_sparse)

# Generate recommendations for the first user.
# train_test_split shuffled the rows of the training data, so we look the
# user up in data_matrix, whose rows are still ordered by user_id.
user_id = 1
user_ratings = data_matrix[user_id - 1].reshape(1, -1)
recommendations = model.inverse_transform(model.transform(user_ratings)).flatten()

# Sort the recommendations in descending order of reconstructed score
recommendations_sorted = sorted(enumerate(recommendations), key=lambda x: x[1], reverse=True)

# Print the top 10 recommendations.
# Column i of the matrix holds item_id i + 1 (MovieLens item IDs start at 1).
for item_index, rating in recommendations_sorted[:10]:
    print(f"Item ID: {item_index + 1}, Rating: {rating}")
Output (the exact items and scores vary from run to run, because the train/test split is random):
Item ID: 49, Rating: 4.69109425891662
Item ID: 64, Rating: 4.587421740615851
Item ID: 98, Rating: 4.545518568641388
Item ID: 474, Rating: 4.524050727862941
Item ID: 204, Rating: 4.505556426018421
Item ID: 180, Rating: 4.477066246756071
Item ID: 483, Rating: 4.475805134433074
Item ID: 174, Rating: 4.473684001895637
Item ID: 228, Rating: 4.466062747629248
Item ID: 100, Rating: 4.441457326149253
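The code above generates recommendations for the first user using matrix factorization. First, we fit the model to the training data using 20 components (i.e., factors). Then, we project the user’s ratings into the 20-dimensional factor space with transform, map them back to a score for every item with inverse_transform, sort the scores in descending order, and print the top 10. The list can include movies the user has already rated, because the reconstruction assigns scores to those as well.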
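In practice you usually want to recommend only items the user has not rated yet. The following sketch shows one way to filter them out. The name top_n_unseen is a hypothetical helper, the toy arrays are invented, and the code assumes that column i of the matrix holds item ID i + 1, as in MovieLens 100K.

```python
import numpy as np


def top_n_unseen(scores, user_ratings, n=10):
    """Return (item_id, score) pairs for the best-scoring unrated items.

    scores: 1-D array of reconstructed scores, one per item column.
    user_ratings: 1-D array of the user's known ratings (0 = not rated).
    Assumes column i holds item ID i + 1.
    """
    scores = np.asarray(scores, dtype=float)
    unseen = np.flatnonzero(np.asarray(user_ratings) == 0)
    # Order the unseen columns by score, highest first.
    ranked = unseen[np.argsort(scores[unseen])[::-1]]
    return [(int(col) + 1, float(scores[col])) for col in ranked[:n]]


# Toy values, invented for illustration.
scores = np.array([4.8, 3.1, 4.5, 2.0, 4.9])
user_ratings = np.array([5, 0, 0, 0, 4])

print(top_n_unseen(scores, user_ratings, n=2))
```

With the tutorial’s variables, scores would be the recommendations array and user_ratings would be the user’s row of data_matrix.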
Step 4: Recommendation Engine with Python – Conclusion
Congratulations, you have built a simple recommendation engine using Python! With the right data and machine learning algorithms, you can build a more sophisticated recommendation engine that can improve user experience and business outcomes.
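Before trying a more sophisticated model, it helps to have a number to compare against. The test_data split from Step 2 is not used anywhere above, so here is a rough sketch of one way to score it: the root-mean-square error (RMSE) over the ratings that actually exist. The name rmse_on_observed is a hypothetical helper and the toy arrays are invented for the example.

```python
import numpy as np


def rmse_on_observed(actual, predicted):
    """Root-mean-square error over the entries where actual is non-zero."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    observed = actual != 0  # 0 means "no rating", so it is skipped
    if not observed.any():
        raise ValueError("no observed ratings to score")
    errors = actual[observed] - predicted[observed]
    return float(np.sqrt(np.mean(errors ** 2)))


# Toy values, invented for illustration.
actual = np.array([[5.0, 0.0, 3.0],
                   [0.0, 4.0, 0.0]])
predicted = np.array([[4.0, 2.5, 3.0],
                      [1.0, 2.0, 0.5]])

print(rmse_on_observed(actual, predicted))
```

With the tutorial’s variables, actual could be test_data and predicted could be model.inverse_transform(model.transform(test_data)). Because the model is shown the same ratings it is asked to reconstruct, this gives an optimistic score. Hiding some of each test user’s ratings before calling transform would be a stricter check.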
In this tutorial, we have shown you how to:
– Prepare user-item interaction data
– Preprocess data into a matrix
– Build a recommendation engine using matrix factorization
– Generate recommendations for a user
To learn more about recommendation engines and machine learning in Python, check out our other tutorials and resources. Thanks for reading!
Want to learn more about Python? Check out the official Python documentation for details.