Avoid copy on Refit #6478

Open: wants to merge 13 commits into base: master

@cbourjau cbourjau commented Jun 12, 2024

The C-API exposes LGBM_BoosterRefit, which receives the predicted leaf indices as a flat buffer laid out in the shape nrow x ncol. The Boosting::RefitTree function, on the other hand, expects these indices as a nested vector object (std::vector<std::vector<int>>). Creating this nested object requires many additional allocations and amounts to a full copy of the initial buffer, doubling the memory requirements.

This PR changes the API of Boosting::RefitTree to take a pointer to a flat buffer of int32s, just like the C-API, thus avoiding the copy. This assumes that this part of the API is not regarded as stable. The C-API remains unchanged.
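
To illustrate the difference described above, here is a minimal sketch, not the code from this PR. `ToNested` and `LeafIndexView` are hypothetical names invented for this example, and the buffer is assumed to be row-major (one row of ncol leaf indices per data row), which the description does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper showing the copy-based approach: one heap allocation
// per row, plus a full copy of every element in the flat buffer.
std::vector<std::vector<int>> ToNested(const int32_t* flat,
                                       std::size_t nrow,
                                       std::size_t ncol) {
  std::vector<std::vector<int>> nested(nrow, std::vector<int>(ncol));
  for (std::size_t i = 0; i < nrow; ++i) {
    for (std::size_t j = 0; j < ncol; ++j) {
      nested[i][j] = flat[i * ncol + j];  // assumes row-major layout
    }
  }
  return nested;
}

// Hypothetical non-owning view: reads the caller's buffer in place, so no
// extra allocation or copy is needed.
struct LeafIndexView {
  const int32_t* data;
  std::size_t nrow;
  std::size_t ncol;

  int32_t at(std::size_t row, std::size_t col) const {
    assert(row < nrow && col < ncol);
    return data[row * ncol + col];  // same row-major assumption as above
  }
};
```

Both forms return the same value for a given (row, col); the view simply computes the offset on each access instead of materializing nrow separate vectors.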

@jameslamb (Collaborator) left a comment

Thanks for your interest in LightGBM. It'll be at least a few days until someone's able to review this, as we're currently focused on finishing a release (#6439 (comment)).

@jameslamb (Collaborator)

[trigger ci]

Every CI run on this PR will require a maintainer manually approving it, because you've never contributed here before. Sorry for the inconvenience, but GitHub introduced this as a security measure a few years ago and we've decided to leave it enabled. We do occasionally receive malicious pull requests trying to use our CI resources 🙃

@cbourjau (Author)

Thanks for the feedback! Some CI does appear to run, though, and some of it exhibited IO failures the first time around, which I was trying to address 🤷.

@borchero (Collaborator) left a comment

Nice, looking forward to this performance improvement! I left a few comments, esp. regarding the changes to the application code.

src/application/application.cpp (outdated)
src/application/application.cpp (outdated)
src/application/application.cpp (outdated)
src/boosting/gbdt.cpp (outdated)
include/LightGBM/boosting.h (outdated)
@cbourjau cbourjau requested a review from borchero June 17, 2024 15:33
@borchero (Collaborator) left a comment

Thanks for the updates, I think we're getting there! 😁

src/application/application.cpp (outdated)
src/application/application.cpp (outdated)
src/application/application.cpp
src/application/application.cpp (outdated)
src/application/application.cpp (outdated)
@borchero (Collaborator)

Thanks @cbourjau! AppVeyor CI will be fixed by #6490, so that only leaves the linting job 😁

@borchero (Collaborator) left a comment

Nice! Thanks a lot for your contribution @cbourjau 🚀

3 participants