Finding similar users in the ml-20m dataset, using min-hashing to build the signature matrix and LSH to retrieve the nearest neighbour.

Nanthini10/Similar-Duo

Min-Hashing

Authors: Harshat, Nanthini

  • Computing the min-hash signature matrix of the ml-20m dataset from users and the movies they have rated above 2.5

  • Generating a set representation of users based on the set of movies they like

  • Using the signature matrix to efficiently retrieve similar pairs of users

  • Using LSH to retrieve the nearest neighbour (the most similar user) of an input user
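
The steps above can be sketched end to end on a toy example. This is a minimal illustration, not the code in this repository: the function names, the handful of `(user, movie, rating)` rows, the hash family `(a*x + b) mod p`, and the choice of 20 bands of 2 rows are all assumptions made for the sketch. On the real data the rows would come from the ml-20m `ratings.csv` file.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # 2^31 - 1, larger than any movie id used here

def liked_sets(ratings, threshold=2.5):
    """Set representation: user -> set of movies rated above `threshold`."""
    liked = defaultdict(set)
    for user, movie, rating in ratings:
        if rating > threshold:
            liked[user].add(movie)
    return dict(liked)

def make_hashes(n, seed=0):
    """n random hash functions h(x) = (a*x + b) mod PRIME, stored as (a, b)."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(n)]

def signature(movies, hashes):
    """Min-hash signature: the smallest hash value of the (non-empty) set per function."""
    return [min((a * m + b) % PRIME for m in movies) for a, b in hashes]

def estimated_similarity(sig_a, sig_b):
    """Fraction of agreeing positions; this estimates the Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)

def lsh_buckets(signatures, bands, rows):
    """Hash each band of each signature to a bucket keyed by (band index, band values)."""
    buckets = defaultdict(set)
    for user, sig in signatures.items():
        for b in range(bands):
            buckets[(b, tuple(sig[b * rows:(b + 1) * rows]))].add(user)
    return buckets

def nearest_neighbor(user, signatures, buckets, bands, rows):
    """Most similar candidate sharing at least one bucket with `user`, or None."""
    sig = signatures[user]
    candidates = set()
    for b in range(bands):
        candidates |= buckets.get((b, tuple(sig[b * rows:(b + 1) * rows])), set())
    candidates.discard(user)
    if not candidates:
        return None
    return max(candidates, key=lambda u: estimated_similarity(sig, signatures[u]))

# Hypothetical toy ratings: (user id, movie id, rating).
ratings = [
    (1, 10, 4.0), (1, 20, 5.0), (1, 30, 3.5), (1, 40, 1.0),
    (2, 10, 4.5), (2, 20, 3.0), (2, 30, 4.0),
    (3, 50, 5.0), (3, 60, 4.0), (3, 10, 2.0),
    (4, 10, 3.0), (4, 20, 4.0), (4, 70, 5.0), (4, 80, 3.5),
]

BANDS, ROWS = 20, 2                      # illustrative values only
liked = liked_sets(ratings)
hashes = make_hashes(BANDS * ROWS)
signatures = {u: signature(m, hashes) for u, m in liked.items()}
buckets = lsh_buckets(signatures, BANDS, ROWS)

print(nearest_neighbor(1, signatures, buckets, BANDS, ROWS))
```

Users 1 and 2 like exactly the same movies, so their signatures agree in every position and they share every bucket. User 3 likes nothing in common with the others, so it shares no bucket and has no candidate neighbour. Using more bands with fewer rows per band makes moderately similar users more likely to become candidates, at the cost of more false positives to check.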
