This repo builds binary classification models on top of any feature table generated by RudderStack Profiles. It can build predictive features such as:
- Churn prediction: Whether a user will churn or not in the next 30 days (or any other time period)
- Conversion prediction: Whether a user will convert or not in the next 30 days (or any other time period)
- Any other problem that can be framed as a yes/no question (for example, whether a customer is going to make a purchase in the next n days)
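To make the yes/no framing concrete, here is a minimal sketch of how a binary label for the purchase example could be derived. The function name `purchase_label` and its arguments are hypothetical and only illustrate the idea; this is not necessarily how the repo computes its labels.

```python
from datetime import date, timedelta


def purchase_label(purchase_dates, as_of, n_days=30):
    """Return 1 if any purchase falls in (as_of, as_of + n_days], else 0.

    Hypothetical helper for illustration only.
    """
    window_end = as_of + timedelta(days=n_days)
    return int(any(as_of < d <= window_end for d in purchase_dates))


if __name__ == "__main__":
    as_of = date(2024, 1, 1)
    # One purchase inside the 30-day window, so the label is "yes".
    print(purchase_label([date(2024, 1, 15)], as_of, n_days=30))
```

Churn and conversion follow the same pattern: pick a reference date, pick a time window, and record whether the event happened inside it.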
The expected way to run this repo is through a RudderStack Profiles project, by linking this GitHub repo URL in a Python model. One such project can be found here.
Once this repo is linked in a python_model inside a Profiles project, you can run that project just like any other project, by running the command "pb run". But before that, you need to perform two steps (you can skip them if you want to run the models directly through the RudderStack web app and not locally):
You can create a virtual environment either through Conda or through the venv module that ships with Python. Both approaches are outlined below.
conda create -n pysnowpark --override-channels -c https://repo.anaconda.com/pkgs/snowflake python=3.8
NOTE - There is a known issue with running Snowpark Python on Apple silicon chips due to memory handling in pyOpenSSL. The error message displayed is, “Cannot allocate write+execute memory for ffi.callback()”.
As a workaround, set up a virtual environment that uses x86 Python using these commands:
CONDA_SUBDIR=osx-64 conda create -n pysnowpark python=3.8 --override-channels -c https://repo.anaconda.com/pkgs/snowflake
conda activate pysnowpark
conda config --env --set subdir osx-64
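After activating the environment, a quick check like the one below can confirm that the interpreter really is an x86 build. This is a sketch rather than part of the repo; the helper names `is_x86_64` and `describe_interpreter` are made up for illustration. It relies on the standard-library `platform.machine()`, which typically reports `x86_64` for an x86 Python and `arm64` for a native Apple silicon build.

```python
import platform
import sys


def is_x86_64(machine):
    """Return True if the reported machine type is 64-bit x86."""
    return machine.lower() in {"x86_64", "amd64"}


def describe_interpreter():
    """Collect the details that matter for the Apple silicon workaround."""
    return {
        "machine": platform.machine(),
        "python": platform.python_version(),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print(info)
    if not is_x86_64(info["machine"]):
        print("This interpreter is not an x86 build.")
```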
After creating the environment, you need to install the requirements inside the environment using "pip install -r requirements.txt".
NOTE - If you are running the code on a Mac with an M1/M2 chip, you need to install xgboost separately using the lines below:
brew install libomp
conda install -c conda-forge py-xgboost==1.5.0
On macOS, install the Python 3.8 runtime:
brew install python@3.8
Run the following command to create the environment
python3.8 -m venv pysnowpark
Activate the environment and install the dependencies
source pysnowpark/bin/activate
pip install -r requirements.txt
NOTE - You might need to install another dependency, libomp, separately using the line below:
brew install libomp
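Whichever approach you used, a short sanity check run inside the activated environment can catch a wrong Python version or a missing package before you run the project. The sketch below is not part of the repo. The module names `xgboost` and `snowflake.snowpark` are assumptions based on the packages mentioned above, so adjust the list to match requirements.txt.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or has no usable spec.
        return False


if __name__ == "__main__":
    print("Python 3.8:", sys.version_info[:2] == (3, 8))
    # Assumed module names; check them against requirements.txt.
    for name in ("xgboost", "snowflake.snowpark"):
        print(name, "importable:", is_importable(name))
```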
Python models are disabled by default in Profiles. You can enable them by adding the following lines to the site_config file:
py_models:
  enabled: true
  python_path: <path_to_env> # You can get this by running `which python` in the terminal after you activate your virtual env
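As an alternative to `which python`, the interpreter can report its own path through `sys.executable`. The sketch below prints a ready-to-paste block when run inside the activated environment. The helper `render_py_models_block` is hypothetical, and the nesting of `enabled` and `python_path` under `py_models` is an assumption about the expected layout.

```python
import sys


def render_py_models_block(python_path=None):
    """Build the py_models lines, defaulting to the running interpreter."""
    path = python_path or sys.executable
    return "\n".join(
        [
            "py_models:",
            "  enabled: true",  # assumed nesting under py_models
            f"  python_path: {path}",
        ]
    )


if __name__ == "__main__":
    print(render_py_models_block())
```

If the path contains spaces or characters that are special in YAML, quote it by hand before pasting.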
Refer to our docs page for how to set up a Python model in your project. The model_configs.yaml file lists various advanced config options. You can add these options to the Python model of your Profiles project to override the defaults.