Scikit-learn Smithy

Scikit-learn Smithy is a tool that helps you forge scikit-learn compatible estimators with ease.

WebUI | Documentation | Repository | Issue Tracker

How can you use it?

✅ Directly from the browser via a Web UI.

Available at sklearn-smithy.streamlit.app
It requires no installation.
Powered by streamlit

✅ As a CLI (command line interface) in the terminal.

Available via the smith forge command.
It requires installation: python -m pip install sklearn-smithy
Powered by typer.

✅ As a TUI (terminal user interface) in the terminal.

Available via the smith forge-tui command.
It requires installing extra dependencies: python -m pip install "sklearn-smithy[textual]"
Powered by textual.

Each of these tools asks a series of questions about the estimator you want to create, and then generates the boilerplate code for you.

Why ❓

Writing scikit-learn compatible estimators might be harder than expected.

While everyone knows about fit and predict, there are other behaviours, methods and attributes that scikit-learn might expect from your estimator, depending on:

The type of estimator you're writing.
The signature of the estimator.
The signature of the .fit(...) method.
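To make these expectations concrete, here is a minimal sketch of a hand-written regressor. MeanRegressor and its scale parameter are hypothetical names chosen for illustration; this is not the code that the tool generates, and it is not claimed to pass every scikit-learn check. It only shows a few of the conventions involved: parameters stored untouched in __init__, input validation in fit, learned attributes with a trailing underscore, and fit returning self.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class MeanRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical regressor that predicts the (scaled) mean of the training target."""

    def __init__(self, scale=1.0):
        # __init__ only stores parameters: no validation and no computation here.
        self.scale = scale

    def fit(self, X, y):
        # Validate the inputs and record how many features were seen during fit.
        X, y = check_X_y(X, y)
        self.n_features_in_ = X.shape[1]
        # Attributes learned from data end with a trailing underscore.
        self.mean_ = float(np.mean(y)) * self.scale
        return self  # fit returns the estimator itself

    def predict(self, X):
        # Refuse to predict before fit has been called.
        check_is_fitted(self)
        X = check_array(X)
        return np.full(X.shape[0], self.mean_)


X = np.array([[0.0], [1.0], [2.0]])
y = np.array([1.0, 2.0, 3.0])
model = MeanRegressor(scale=2.0).fit(X, y)
print(model.get_params())  # parameters are discovered from the __init__ signature
print(model.predict(X))
```

Even this toy example needs several details that are easy to forget, and the exact set changes with the estimator type and the fit signature.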

Scikit-learn Smithy to the rescue: this tool aims to help you craft your own estimator by asking a few questions about it, and then generating the boilerplate code.

This way you can focus fully on the core implementation logic, and not on the nitty-gritty details of the scikit-learn API.

Sanity check

Once the core logic is implemented, the estimator should be ready to test against the somewhat official, pytest-compatible parametrize_with_checks decorator:

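from sklearn.utils.estimator_checks import parametrize_with_checks

# parametrize_with_checks expects estimator instances, not classes.
@parametrize_with_checks([
    YourAwesomeRegressor(),
    MoreAwesomeClassifier(),
    EvenMoreAwesomeTransformer(),
])
def test_sklearn_compatible_estimator(estimator, check):
    # Each generated test case runs one scikit-learn check on one estimator.
    check(estimator)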

and it should be compatible with scikit-learn Pipeline, GridSearchCV, etc.
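As a rough illustration of what that compatibility means in practice, the sketch below places a hand-written transformer inside a Pipeline and tunes its parameter with GridSearchCV. ColumnScaler and its factor parameter are hypothetical names used only for this example; the data is synthetic. The point is that get_params and set_params, inherited from BaseEstimator, are what allow cloning and the step__parameter syntax to work.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


class ColumnScaler(TransformerMixin, BaseEstimator):
    """Hypothetical transformer that multiplies every feature by `factor`."""

    def __init__(self, factor=1.0):
        self.factor = factor

    def fit(self, X, y=None):
        # Nothing is learned besides the number of features seen during fit.
        X = np.asarray(X, dtype=float)
        self.n_features_in_ = X.shape[1]
        return self

    def transform(self, X):
        return np.asarray(X, dtype=float) * self.factor


# Synthetic data: the target is an exact linear function of the features.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5])

pipe = Pipeline([("scale", ColumnScaler()), ("model", LinearRegression())])

# "scale__factor" addresses the `factor` parameter of the "scale" step.
search = GridSearchCV(pipe, param_grid={"scale__factor": [0.5, 1.0, 2.0]}, cv=3)
search.fit(X, y)
print(search.best_params_)
```

If the custom estimator broke the parameter conventions (for example by renaming or transforming arguments inside __init__), the cloning done by GridSearchCV would typically be where the problem surfaces.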

Official guide

Scikit-learn documentation on how to develop estimators.

Supported estimators

The following types of scikit-learn estimator are supported:

✅ Classifier
✅ Regressor
✅ Outlier Detector
✅ Clusterer
✅ Transformer
- ✅ Feature Selector
🚧 Meta Estimator

Installation

sklearn-smithy is available on PyPI, so you can install it directly from there:

python -m pip install sklearn-smithy

Remark: The minimum Python version required is 3.10.

This will make the smith command available in your terminal, and you should be able to run the following:

smith version

sklearn-smithy=...
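A similar check can be done from Python itself, which may be handy in a script or notebook. The sketch below uses only the standard library; it assumes the installed distribution is named sklearn-smithy, as in the pip command above, and the helper name installed_smithy_version is made up for this example.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_smithy_version():
    """Return the installed sklearn-smithy version string, or None if it is absent."""
    try:
        # Assumes the distribution name matches the one used with pip install.
        return version("sklearn-smithy")
    except PackageNotFoundError:
        return None


# The README states that Python 3.10 is the minimum supported version.
python_ok = sys.version_info >= (3, 10)
print(f"Python >= 3.10: {python_ok}")
print(f"sklearn-smithy: {installed_smithy_version()}")
```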

Extra dependencies

To run the TUI, you need to install the textual dependency as well:

python -m pip install "sklearn-smithy[textual]"

User guide 📚

Please refer to the dedicated user guide documentation section.

Origin story

The idea for this tool originated from scikit-lego #660, which I cannot explain better than by quoting the PR description itself:

So the story goes as the following:

The CI/CD fails for scikit-learn==1.5rc1 because of a change in the check_estimator internals

In the scikit-learn issue I got a better picture of how to run test for compatible components

In particular, rolling your own estimator suggests to use parametrize_with_checks, and of course I thought "that is a great idea to avoid dealing manually with each test"

Say no more, I enter a rabbit hole to refactor all our tests - which would be fine

Except that these tests failures helped me figure out a few missing parts in the codebase

FBruzzesi/sklearn-smithy
