espotifai

Automatic Playlist Recommender.

We studied and implemented some algorithms to deal with the playlist continuation problem. Check out our website with the report of this work and our screencast.

This is our final project for Foundations of Data Science, a Mathematical Modelling Master's subject at Getulio Vargas Foundation (FGV).

Group: Lucas Emanuel Resck Domingues and Lucas Machado Moschen. Professor: Dr. Jorge Poco.

Abstract

This repository contains our approach to the playlist continuation problem. We scraped data from Spotify and Last.fm and carried out an exploratory data analysis. We also implemented playlist continuation models, which showed good results, and developed a website to present our work.

Summary repository structure

├─ documents -------------------- Deliverables of our project
├─ images ----------------------- Images for our deliverables and README
├─ notebooks
│  ├─ data_scrapping ------------ Notebooks to scrape data
│  ├─ eda ----------------------- Notebooks of EDA
│  ├─ playlist_similarity_model - Model based on playlist similarity
│  └─ track_similarity_model ---- Model based on track similarity
├─ report ----------------------- Our website documents
└─ scripts ---------------------- Scripts to generate data

Usage example

You can:

Get a list of Last.fm users
Scrape their public data from Last.fm and Spotify
Make an exploratory data analysis of these datasets
Analyse the recommendation models

The notebooks are documented, and each model is explained in its own notebook.

List of users

Users are gathered from Last.fm in a network-propagation fashion. To do this, use the script generate_lastfm_users.py; its -h flag lists the available options:

python generate_lastfm_users.py -h
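
To illustrate what gathering users "in a network propagation fashion" can look like, here is a minimal sketch of a breadth-first expansion from a few seed users. It is not the code of generate_lastfm_users.py: the function `gather_users`, the `get_friends` callable and the toy graph are all hypothetical names introduced for this example. In practice `get_friends` would be a wrapper around a Last.fm lookup; here it reads from an in-memory dictionary so the example runs offline.

```python
from collections import deque
from typing import Callable, Iterable, List


def gather_users(
    seeds: Iterable[str],
    get_friends: Callable[[str], Iterable[str]],
    max_users: int,
) -> List[str]:
    """Collect up to max_users names by breadth-first propagation.

    get_friends is an assumption of this sketch: any callable that
    returns the users connected to a given user.
    """
    seen = set()
    order: List[str] = []
    queue = deque()

    # Start from the seed users.
    for seed in seeds:
        if seed not in seen and len(order) < max_users:
            seen.add(seed)
            order.append(seed)
            queue.append(seed)

    # Expand to neighbours, one layer at a time.
    while queue and len(order) < max_users:
        user = queue.popleft()
        for friend in get_friends(user):
            if friend not in seen:
                seen.add(friend)
                order.append(friend)
                queue.append(friend)
                if len(order) >= max_users:
                    break
    return order


# Toy friendship graph standing in for Last.fm (hypothetical data).
TOY_GRAPH = {
    "ana": ["bia", "caio"],
    "bia": ["ana", "davi"],
    "caio": ["ana"],
    "davi": ["bia", "eva"],
    "eva": [],
}

if __name__ == "__main__":
    print(gather_users(["ana"], lambda u: TOY_GRAPH.get(u, []), max_users=4))
```

The `seen` set keeps each user from being visited twice, and `max_users` bounds the crawl, which matters when every lookup is a rate-limited API request.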

Data scraping

To scrape data from Spotify and Last.fm, run the notebooks in the folder notebooks/data_scraping/.

Exploratory Data Analysis

The template for an EDA of both datasets is inside the folder notebooks/eda/. Feel free to edit and adapt it to your own needs.

Analyse the models

Three models are implemented and documented inside notebooks/.

The first is a baseline model, a random walk on a bipartite graph (the simplest similarity matrix). The second is a model based on track similarity, and the third is based on playlist similarity. Each model has its own notebook detailing the math behind it, as well as the code.
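
As a rough illustration of the baseline idea, the sketch below scores tracks with one round of a random walk on a playlist-track bipartite graph: from a seed track, move to a uniformly chosen playlist containing it, then to a uniformly chosen track of that playlist. Instead of simulating walks, it computes these two-step probabilities exactly. This is a sketch under our own assumptions, not the notebooks' implementation; the names `two_step_scores`, `recommend` and the toy `PLAYLISTS` data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def two_step_scores(
    playlists: Dict[str, Sequence[str]], seed_tracks: Sequence[str]
) -> Dict[str, float]:
    """Probability of landing on each track after track -> playlist -> track."""
    # Index the bipartite graph: track -> playlists that contain it.
    track_to_playlists = defaultdict(list)
    for pid, tracks in playlists.items():
        for track in set(tracks):
            track_to_playlists[track].append(pid)

    # Seeds that never appear in the data cannot start a walk.
    known_seeds = [t for t in seed_tracks if t in track_to_playlists]
    scores: Dict[str, float] = defaultdict(float)
    for seed in known_seeds:
        containing = track_to_playlists[seed]
        for pid in containing:
            unique = set(playlists[pid])
            # Uniform seed choice, then uniform playlist, then uniform track.
            step = 1.0 / (len(known_seeds) * len(containing) * len(unique))
            for track in unique:
                scores[track] += step
    return dict(scores)


def recommend(
    playlists: Dict[str, Sequence[str]], seed_tracks: Sequence[str], k: int
) -> List[str]:
    """Return the k highest-scoring tracks that are not already seeds."""
    scores = two_step_scores(playlists, seed_tracks)
    seeds = set(seed_tracks)
    candidates = [t for t in scores if t not in seeds]
    # Sort by descending score, breaking ties alphabetically.
    candidates.sort(key=lambda t: (-scores[t], t))
    return candidates[:k]


# Toy data (hypothetical): playlist id -> track ids.
PLAYLISTS = {
    "p1": ["a", "b", "c"],
    "p2": ["a", "b"],
    "p3": ["c", "d"],
}

if __name__ == "__main__":
    print(recommend(PLAYLISTS, ["a"], k=2))
```

With seed track "a", track "b" ranks above "c" because it shares two playlists with the seed, while "d" is unreachable in a single round and gets no score.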

Development setup

We used the packages Spotipy and Pylast to scrape data from Spotify and Last.fm. Just install the requirements:

pip install -r requirements.txt
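
Both APIs require credentials before any scraping can happen. The sketch below shows one possible way to read them from environment variables and build the two clients. The variable names and the helpers `load_credentials` and `make_clients` are assumptions made for this example and may differ from how the notebooks handle credentials.

```python
import os
from typing import Dict, Mapping

# Hypothetical variable names chosen for this sketch.
REQUIRED_VARS = (
    "SPOTIFY_CLIENT_ID",
    "SPOTIFY_CLIENT_SECRET",
    "LASTFM_API_KEY",
    "LASTFM_API_SECRET",
)


def load_credentials(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the API credentials, failing early if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def make_clients(creds: Mapping[str, str]):
    """Build a Spotipy client and a Pylast network from the credentials."""
    # Imported here so load_credentials works without the packages installed.
    import pylast
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    spotify = spotipy.Spotify(
        auth_manager=SpotifyClientCredentials(
            client_id=creds["SPOTIFY_CLIENT_ID"],
            client_secret=creds["SPOTIFY_CLIENT_SECRET"],
        )
    )
    lastfm = pylast.LastFMNetwork(
        api_key=creds["LASTFM_API_KEY"],
        api_secret=creds["LASTFM_API_SECRET"],
    )
    return spotify, lastfm
```

Keeping the keys in the environment rather than in a notebook cell avoids committing them to the repository by accident.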
