Build a recommender system #4

Open
SebastinSanty opened this issue Dec 22, 2016 · 14 comments

Comments

@SebastinSanty
Member

A recommender system for suggesting songs to users. For this we would also need a login system for the users. We also need to decide which attributes we will be working with (genre, etc.). @kaivalyar Can you give some insight on this and start with a basic model? We'll catch up :). Also, suggest what you would require for building such a system, and we'll try to provide an API.

@kaivalyar
Member

kaivalyar commented Dec 22, 2016

Typical algorithms for (oversimplistic) recommender systems might use KNN (k-nearest neighbours). Unfortunately, for all my ML talk and interest, I have little knowledge of this field. The concept of KNN is simple though, and I can elaborate on the theory anytime. The problem with such a recommender system, though, is that there is no tangible way to measure its success. How do we tell when a recommendation is good? How do we train the model? Can any solution to that be provided via an API?
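One common way to put a number on "is a recommendation good?" is offline evaluation: hide some of a user's real plays, generate recommendations from the rest, and count how many hidden plays show up in the top results. The sketch below computes precision at k under that assumption. The function name, the song ids, and the held-out set are all made up for illustration, and this is only one possible metric, not something the project has settled on.

```python
def precision_at_k(recommended, held_out, k):
    """Fraction of the top-k recommendations that appear in the held-out plays."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for song in top_k if song in held_out)
    return hits / len(top_k)


# Hypothetical data: a ranked recommendation list and the plays hidden from the model.
recommended = ["s4", "s9", "s5", "s1"]
held_out = {"s4", "s5"}

score = precision_at_k(recommended, held_out, k=3)
print(f"precision@3 = {score:.2f}")
```

Averaging this score over many users gives a single figure that can be compared between candidate models.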

@kaivalyar
Member

kaivalyar commented Dec 22, 2016

KNN basics:

  • figure out the parameters a recommender system would depend on (metadata about current and past listening trends)
  • quantify said parameters
  • plot each candidate song (one that could be recommended in the future), along with the songs from the past listening history, as points in a vector space
  • calculate some number (K) of nearest neighbours
  • display these as recommendations
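The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming each song has already been quantified into a small numeric vector. The feature choice (scaled tempo, energy, acousticness), the song names, and the values are hypothetical, and Euclidean distance is just one possible distance measure.

```python
import math

# Hypothetical feature vectors: (tempo scaled to 0-1, energy, acousticness).
SONGS = {
    "song_a": (0.80, 0.90, 0.10),
    "song_b": (0.78, 0.85, 0.15),
    "song_c": (0.30, 0.20, 0.90),
    "song_d": (0.35, 0.25, 0.80),
    "song_e": (0.75, 0.95, 0.05),
}


def euclidean(p, q):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_nearest(query, songs, k):
    """Return the names of the k songs closest to `query`, excluding `query` itself."""
    target = songs[query]
    distances = [
        (euclidean(target, vec), name)
        for name, vec in songs.items()
        if name != query
    ]
    distances.sort()  # smallest distance first
    return [name for _, name in distances[:k]]


recommendations = k_nearest("song_a", SONGS, k=2)
print(recommendations)
```

With this toy data the two nearest neighbours of `song_a` are `song_b` and `song_e`, the other two fast, high-energy tracks.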

@mukkachaitanya
Member

So to implement this algorithm, we need enough metadata for the songs the user has listened to, right?

@kaivalyar
Member

Yes, of course, we need some info per song: artist name, genre, tempo, style, etc.

@coditva
Member

coditva commented Dec 26, 2016

Can we use the Last.fm API or the Spotify API for this? We can get related artists, songs, etc. from them.

@0xRampey
Member

@utkarshme That's a good idea. We can get good quality album artwork and music categories from the Spotify API, in case the music from DC doesn't have those. I'd suggest you add it to the feature-list in Projects.

@krishnacharya

krishnacharya commented Dec 29, 2016

@kaivalyar I feel a music recommendation system should use collaborative filtering or some similar unsupervised learning algorithm (we would then use KNN on this data). @mukkachaitanya Collaborative filtering would even allow DC users' playlists to recommend songs to others.
Check this: http://www.holehouse.org/mlclass/16_Recommender_Systems.html
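As a rough illustration of playlists recommending songs to other users, here is a minimal user-based sketch. It assumes each user's playlist is available as a set of song ids and uses Jaccard overlap as the similarity measure. The user names, song ids, and the choice of Jaccard are assumptions for the example, not part of any existing Encore code.

```python
def jaccard(a, b):
    """Overlap between two sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_from_neighbours(user, playlists, k=2):
    """Suggest songs that the k most similar users have but `user` does not."""
    own = playlists[user]
    neighbours = sorted(
        ((jaccard(own, songs), other)
         for other, songs in playlists.items() if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for similarity, other in neighbours:
        for song in playlists[other] - own:
            # A song scores higher when more similar neighbours have it.
            scores[song] = scores.get(song, 0.0) + similarity
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda s: (-scores[s], s))


# Hypothetical playlists keyed by user.
PLAYLISTS = {
    "alice": {"s1", "s2", "s3"},
    "bob": {"s1", "s2", "s4"},
    "carol": {"s2", "s3", "s5"},
    "dave": {"s6", "s7"},
}

suggestions = recommend_from_neighbours("alice", PLAYLISTS, k=2)
print(suggestions)
```

Here `alice` gets `s4` and `s5`, the songs her two closest neighbours have that she lacks, while `dave`'s unrelated playlist contributes nothing.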

@wazeerzulfikar

@kaivalyar Recommender systems can be built without using song metadata. There are two approaches to this:

  1. Connect similar user profiles using their likes. This would recommend music from another like-minded user. It can be implemented with the KNN algorithm working on profile similarity.

  2. Build associations between music tracks based on every user's choices. E.g., two songs can be labeled as similar when a majority of users have liked both tracks. This is a workaround that derives a feature for music tracks instead of using song metadata. It can also be implemented using KNN, and is termed collaborative filtering.
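The second approach can be sketched as item-to-item similarity computed purely from likes. The example below treats each song as the set of users who liked it and scores a pair of songs by cosine similarity of those sets. The like data is invented, and cosine is one reasonable choice among several (a simple "majority liked both" rule would be another).

```python
import math
from collections import defaultdict

# Hypothetical likes: user -> set of liked song ids.
LIKES = {
    "u1": {"s1", "s2"},
    "u2": {"s1", "s2", "s3"},
    "u3": {"s2", "s3"},
    "u4": {"s4"},
}


def item_similarities(likes):
    """Cosine similarity between songs, each song being the set of users who liked it."""
    fans = defaultdict(set)
    for user, songs in likes.items():
        for song in songs:
            fans[song].add(user)
    sims = {}
    for a in fans:
        for b in fans:
            if a == b:
                continue
            shared = len(fans[a] & fans[b])
            if shared:  # only keep pairs with at least one common fan
                sims[(a, b)] = shared / math.sqrt(len(fans[a]) * len(fans[b]))
    return sims


def similar_songs(song, sims, k=2):
    """Return up to k songs most similar to `song`, best first."""
    ranked = sorted(
        ((score, other) for (s, other), score in sims.items() if s == song),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [other for _, other in ranked[:k]]


sims = item_similarities(LIKES)
print(similar_songs("s1", sims))
```

No song metadata is involved: `s2` ranks as most similar to `s1` only because the same users liked both.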

If APIs are used, then to ensure the recommender system works offline we would need to load all the metadata into a local database (HDF5 can be used for large volumes of data) and build the recommender system on top of that.

@kaivalyar
Member

@wazeerzulfikar if we have to choose between metadata and user tracking - I would prefer the former.

@wazeerzulfikar

How about collaborative filtering, as it uses neither individual user tracking nor metadata? That is, a particular user's details are not needed to make recommendations to that user.

@kaivalyar
Member

That is one option. Even tracking users isn't off the table yet, just to clarify.

Also, do you think we'll reach volumes so high as to require HDF5? I doubt it. ~500 concurrent users is a good estimate to work with, accessing songs that all fit into 200 GB. Metadata wouldn't exceed a few MB, so normal file operations should be good enough, I suppose.

@wazeerzulfikar

That's true, I don't think we will need HDF5. I was just giving an upper bound. Direct storage of metadata might suffice.
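For a few MB of metadata, plain file storage could look like the sketch below, which writes and reads a single JSON file with the standard library. The file name, the record fields, and the choice of JSON are assumptions for illustration only.

```python
import json
import os
import tempfile


def save_metadata(path, metadata):
    """Write the whole metadata dict to one JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2)


def load_metadata(path):
    """Read the metadata dict back from the JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical record layout: song id -> a few descriptive fields.
metadata = {
    "song_a": {"artist": "Example Artist", "genre": "rock", "tempo": 120},
    "song_b": {"artist": "Another Artist", "genre": "jazz", "tempo": 90},
}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "metadata.json")
    save_metadata(path, metadata)
    restored = load_metadata(path)

print(restored["song_a"]["genre"])
```

At this scale the whole file can be loaded into memory at startup, so no database is needed for lookups.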

@kaivalyar
Member

kaivalyar commented Apr 14, 2017

@wazeerzulfikar and I have been discussing this extensively, and we think that metadata might be too complicated and less useful than user plays (collaborative filtering). We should use those instead. However, we first need some way of integrating a (Python-based?) recommender system with the Node.js backend of Encore.
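One possible integration route is to run the Python recommender as a small local HTTP service that the Node.js backend queries for JSON. The sketch below uses only the standard library. The `/recommend` path, the `user` query parameter, the port, and the canned `recommend` function are all hypothetical placeholders, since nothing about Encore's actual backend interface has been decided here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def recommend(user_id):
    """Placeholder model: returns canned results instead of real recommendations."""
    canned = {"alice": ["s4", "s5"]}
    return canned.get(user_id, [])


def build_payload(query_string):
    """Turn a query string such as 'user=alice' into the JSON-ready response dict."""
    params = parse_qs(query_string)
    user_id = params.get("user", [""])[0]
    return {"user": user_id, "recommendations": recommend(user_id)}


class RecommendHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        parsed = urlparse(self.path)
        if parsed.path != "/recommend":
            self.send_error(404)
            return
        body = json.dumps(build_payload(parsed.query)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the service on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), RecommendHandler).serve_forever()


print(build_payload("user=alice"))
```

Calling `serve()` starts the service, after which the backend could request `http://127.0.0.1:8000/recommend?user=alice` and parse the JSON reply. Spawning the Python script as a child process and exchanging JSON over stdin/stdout would be an alternative with no open port.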

@kaivalyar kaivalyar removed their assignment May 17, 2017
@kaivalyar
Member

@wazeerzulfikar Have a look here. We may not need to build this up from scratch.

7 participants