Bug in ICE Plot with R 4.4 #16283

Open
tomasfryda opened this issue May 30, 2024 · 0 comments

tomasfryda commented May 30, 2024

There is a bug that makes the ICE plot look weird and likely incorrect. I used a red rectangle and a red arrow to highlight the differences I noticed. The red arrow points to an issue with the empty-string category: in the plot it is now split off into NAs, and the NA category has some values in the rug plot while the empty-string category has none.

result

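library(h2o)
h2o.init()

# Titanic data; "boat" is a categorical column with many empty strings
train <- h2o.importFile("http://s3.amazonaws.com/h2o-public-test-data/smalldata/titanic/titanic_expanded.csv")
y <- "fare"

# One column name per distinct column type (not used in the call below)
col_types <- setNames(unlist(h2o.getTypes(train)), names(train))
col_types <- col_types[names(col_types) != y]
cols_to_test <- names(col_types[!duplicated(col_types)])

aml <- h2o.automl(y = y,
                  max_models = 5,
                  training_frame = train,
                  seed = 1234)

# ICE plot of the best GBM for the "boat" column
h2o.ice_plot(h2o.get_best_model(aml, "gbm"), train, "boat")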

It used to look like:
baseline

Note that the old version has a bug too. The empty-string category is the most common one, so the histogram in the background should show that as well.

> h2o.table(train$boat)
  boat Count
1        823
2    1     5
3   10    29
4   11    25
5   12    19
6   13    39
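
To make the expectation concrete, here is a minimal base-R sketch of how the empty string and NA should be tallied as two separate categories. The `boat` vector is toy data made up for illustration (it is not the Titanic column), and this does not use H2O or reproduce the plotting code; it only shows the counts a background histogram would be expected to reflect.

```r
# Toy stand-in for the boat column (hypothetical values, not the Titanic data).
boat <- c("", "", "", "1", "10", NA)

# Tally every category, keeping NA as its own bucket.
counts <- table(boat, useNA = "ifany")
print(counts)

# The empty string is an ordinary category, distinct from NA.
empty_count <- sum(boat == "", na.rm = TRUE)  # 3 in this toy vector
na_count <- sum(is.na(boat))                  # 1 in this toy vector

# The most frequent category is the one the histogram should show as tallest.
most_common <- names(counts)[which.max(counts)]
```

In this toy vector the empty string is the most frequent category, which mirrors the situation in the h2o.table output above.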

pd_plot also seems to be affected by the same issue. IIRC Zuzana refactored the code so that the logic shared by ICE and PDP lives in one function, so it's possible this is just one bug.
