Recommendations for Ruby and Rails using collaborative filtering
Add this line to your application’s Gemfile:
gem 'disco'
Create a recommender
recommender = Disco::Recommender.new
If users rate items directly, this is known as explicit feedback. Fit the recommender with:
recommender.fit([ {user_id: 1, item_id: 1, rating: 5}, {user_id: 2, item_id: 1, rating: 3} ])
IDs can be integers, strings, or any other data type
If users don’t rate items directly (for instance, they’re purchasing items or reading posts), this is known as implicit feedback. Leave out the rating, or use a value like number of purchases, number of page views, or time spent on page:
recommender.fit([ {user_id: 1, item_id: 1, value: 1}, {user_id: 2, item_id: 1, value: 1} ])
Use value instead of rating for implicit feedback
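As a minimal sketch of turning raw interaction logs into implicit feedback, the snippet below counts how often each user interacted with each item and uses that count as the value. The events array and its contents are made up for illustration; only the user_id, item_id, and value keys come from the format above.

```ruby
# Hypothetical raw events: one [user_id, item_id] pair per page view or purchase.
events = [
  [1, "post-a"],
  [1, "post-a"],
  [1, "post-b"],
  [2, "post-a"]
]

# Count interactions per (user, item) pair and shape them for fit.
data = events.tally.map do |(user_id, item_id), count|
  { user_id: user_id, item_id: item_id, value: count }
end

data.each { |row| p row }
```

The resulting array can then be passed to recommender.fit in the same way as the example above.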
Get user-based recommendations - “users like you also liked”
recommender.user_recs(user_id)
Get item-based recommendations - “users who liked this item also liked”
recommender.item_recs(item_id)
Use the count option to specify the number of recommendations (default is 5)
recommender.user_recs(user_id, count: 3)
Get predicted ratings for specific users and items
recommender.predict([{user_id: 1, item_id: 2}, {user_id: 2, item_id: 4}])
Get similar users
recommender.similar_users(user_id)
Load the data
data = Disco.load_movielens
Create a recommender and get similar movies
recommender = Disco::Recommender.new(factors: 20)
recommender.fit(data)
recommender.item_recs("Star Wars (1977)")
Ahoy is a great source for implicit feedback
# properties->>'post_id' is Postgres syntax
views = Ahoy::Event.
  where(name: "Viewed post").
  group(:user_id).
  group("properties->>'post_id'").
  count

data = views.map do |(user_id, post_id), count|
  {
    user_id: user_id,
    item_id: post_id,
    value: count
  }
end
Create a recommender and get recommended posts for a user
recommender = Disco::Recommender.new
recommender.fit(data)
recommender.user_recs(current_user.id)
Disco makes it easy to store recommendations in Rails.
rails generate disco:recommendation
rails db:migrate
For user-based recommendations, use:
class User < ApplicationRecord
  has_recommended :products
end
Change :products to match the model you’re recommending
Save recommendations
User.find_each do |user|
  recs = recommender.user_recs(user.id)
  user.update_recommended_products(recs)
end
Get recommendations
user.recommended_products
For item-based recommendations, use:
class Product < ApplicationRecord
  has_recommended :products
end
Specify multiple types of recommendations for a model with:
class User < ApplicationRecord
  has_recommended :products
  has_recommended :products_v2, class_name: "Product"
end
And use the appropriate methods:
user.update_recommended_products_v2(recs)
user.recommended_products_v2
For Rails < 6, speed up inserts by adding activerecord-import to your app.
If you’d prefer to perform recommendations on-the-fly, store the recommender
bin = Marshal.dump(recommender)
File.binwrite("recommender.bin", bin)
You can save it to a file, database, or any other storage system
Load a recommender
bin = File.binread("recommender.bin")
recommender = Marshal.load(bin)
Alternatively, you can store only the factors and use a library like Neighbor
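To illustrate what working with stored factors can look like, here is a rough sketch that ranks items by cosine similarity between factor vectors in plain Ruby. The item_factors hash and its numbers are invented for the example, and the helper names are hypothetical. This is not a description of how Disco or Neighbor computes similarity internally; a nearest-neighbor library would replace the brute-force loop.

```ruby
# Hypothetical factors, keyed by item ID (values are made up).
item_factors = {
  "A" => [1.0, 0.0],
  "B" => [0.9, 0.1],
  "C" => [0.0, 1.0]
}

# Cosine similarity: dot product divided by the product of the vector lengths.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  dot / (norm_a * norm_b)
end

# Brute-force nearest neighbors: score every other item, highest score first.
def similar_items(factors, item_id, count: 5)
  target = factors.fetch(item_id)
  factors
    .reject { |id, _| id == item_id }
    .map { |id, vec| { item_id: id, score: cosine_similarity(target, vec) } }
    .max_by(count) { |rec| rec[:score] }
end

p similar_items(item_factors, "A", count: 2)
```

Here item "B" ranks above "C" because its vector points in nearly the same direction as "A".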
Disco uses high-performance matrix factorization.
Specify the number of factors and epochs
Disco::Recommender.new(factors: 8, epochs: 20)
If recommendations look off, try changing factors. The default is 8, but 3 could be good for some applications and 300 good for others.
Pass a validation set with:
recommender.fit(data, validation_set: validation_set)
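One way to obtain a validation set is to hold out part of the data before fitting. The sketch below is an assumption about how you might do that in plain Ruby, with made-up ratings, an arbitrary 80/20 split, and a fixed seed; none of these choices come from Disco itself.

```ruby
# Hypothetical ratings; in practice this is the same data you would pass to fit.
data = (1..10).map { |i| { user_id: i, item_id: i % 3, rating: (i % 5) + 1 } }

# Shuffle with a fixed seed so the split is reproducible, then hold out 20%.
shuffled = data.shuffle(random: Random.new(42))
split_at = (shuffled.size * 0.8).floor
train_set = shuffled.first(split_at)
validation_set = shuffled.drop(split_at)

puts "train: #{train_set.size}, validation: #{validation_set.size}"
```

The training portion would then take the place of data in the call above, with the held-out portion passed as validation_set.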
Collaborative filtering suffers from the cold start problem. It’s unable to make good recommendations without data on a user or item, which is problematic for new users and items.
recommender.user_recs(new_user_id) # returns empty array
There are a number of ways to deal with this, but here are some common ones:
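A widely used fallback for new users is to recommend the most popular items. The following is a minimal sketch of that idea in plain Ruby; popular_items and recs_with_fallback are hypothetical helpers, the data is made up, and the score here is just an interaction count rather than anything Disco produces.

```ruby
# Hypothetical interaction data in the same shape passed to fit.
data = [
  { user_id: 1, item_id: "A" },
  { user_id: 2, item_id: "A" },
  { user_id: 2, item_id: "B" },
  { user_id: 3, item_id: "A" },
  { user_id: 3, item_id: "B" },
  { user_id: 3, item_id: "C" }
]

# Rank items by how many interactions they have, most popular first.
def popular_items(data, count: 5)
  data
    .map { |row| row[:item_id] }
    .tally
    .max_by(count) { |_, n| n }
    .map { |item_id, n| { item_id: item_id, score: n } }
end

# Use personalized recommendations when available, otherwise fall back.
def recs_with_fallback(personalized, data, count: 5)
  personalized.empty? ? popular_items(data, count: count) : personalized
end

# An empty array stands in for the result of user_recs for a new user.
p recs_with_fallback([], data, count: 2)
```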
Data can be an array of hashes
[{user_id: 1, item_id: 1, rating: 5}, {user_id: 2, item_id: 1, rating: 3}]
Or a Rover data frame
Rover.read_csv("ratings.csv")
Or a Daru data frame
Daru::DataFrame.from_csv("ratings.csv")
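If neither data frame library is in use, a CSV file can also be converted to the array-of-hashes form with Ruby's csv library. This is a sketch that assumes a header row with user_id, item_id, and rating columns; the inline string stands in for the contents of ratings.csv.

```ruby
require "csv"

# Stand-in for the contents of ratings.csv; the column names are an assumption.
csv_text = <<~CSV
  user_id,item_id,rating
  1,1,5
  2,1,3
CSV

# converters: :numeric parses "5" as 5; header_converters: :symbol gives symbol keys.
table = CSV.parse(csv_text, headers: true, converters: :numeric, header_converters: :symbol)
data = table.map(&:to_h)

p data
```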
If you have a large number of users/items, you can use an approximate nearest neighbors library like NGT to speed up item-based recommendations and similar users.
Add this line to your application’s Gemfile:
gem 'ngt', '>= 0.3.0'
Speed up item-based recommendations with:
recommender.optimize_item_recs
Speed up similar users with:
recommender.optimize_similar_users
These should be called after fitting or loading the recommender.
Get ids
recommender.user_ids
recommender.item_ids
Get the global mean
recommender.global_mean
Get factors
recommender.user_factors
recommender.item_factors
Get factors for specific users and items
recommender.user_factors(user_id)
recommender.item_factors(item_id)
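In matrix factorization generally, a user's affinity for an item is estimated from the dot product of their factor vectors. The sketch below shows that idea with made-up vectors and a hypothetical dot helper. Whether and how Disco combines this with other terms, such as the global mean, is not covered here, so treat it as an illustration rather than the library's formula.

```ruby
# Hypothetical factor vectors (real ones would come from user_factors / item_factors).
user_vector = [0.5, 1.0, -0.25]
item_vector = [2.0, 0.5, 4.0]

# Dot product: multiply matching components and sum the results.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

score = dot(user_vector, item_vector)
puts score
```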
Thanks to:
View the changelog
Everyone is encouraged to help improve this project. Here are a few ways you can help:
To get started with development:
git clone https://github.com/ankane/disco.git
cd disco
bundle install
bundle exec rake test