training stops many seconds to create new queue of data #50

Open
Adibian opened this issue Jan 18, 2022 · 0 comments

Adibian commented Jan 18, 2022

Hi,
When I start training, the time per step is good and the GPU is being used, but after 32 steps (with _batches_per_group=32 in the datafeeder) GPU utilization drops to 0, and only after many seconds does the next queue of data become ready and training resume.
I looked at datafeeder.py and saw that it uses threading to fill the data queue, so why does training still stop? How can I increase GPU utilization?
I tried increasing _batches_per_group in datafeeder.py, but that only makes each stall longer. The picture below shows that with _batches_per_group=128 training stops for 89 seconds!

[Screenshot: training log showing an 89-second pause while the data queue is rebuilt]
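
For anyone hitting the same stall: this symptom usually means the feeder thread prepares an entire group of _batches_per_group batches (for example, to sort examples by length before batching) before it refills the queue, so the training loop drains the queue and then waits for the whole next group to be built. Below is a minimal sketch of an alternative, assuming examples can simply be read one at a time; load_one_example() and make_batch() are hypothetical placeholders, not functions from this repository. The idea is to push each batch into a bounded queue as soon as it is ready, so the GPU only waits when the feeder genuinely falls behind.

```python
# Minimal sketch (not this repository's datafeeder): a background thread that
# enqueues each batch as soon as it is ready, instead of building a whole
# group of _batches_per_group batches before the first enqueue.
import threading
import queue
import time
import random

_batch_size = 32
_prefetch_batches = 8  # bounded queue so prefetching cannot exhaust memory

batch_queue = queue.Queue(maxsize=_prefetch_batches)

def load_one_example():
    # Placeholder for reading one training example (e.g. text + spectrogram).
    time.sleep(0.001)  # simulate disk I/O
    return [random.random()] * 10

def make_batch(examples):
    # Placeholder for padding/stacking a list of examples into one batch.
    return examples

def feeder():
    while True:
        examples = [load_one_example() for _ in range(_batch_size)]
        # put() blocks the feeder thread (not the training loop) when the
        # queue is already full.
        batch_queue.put(make_batch(examples))

threading.Thread(target=feeder, daemon=True).start()

# Training loop: get() only blocks if the feeder has fallen behind.
for step in range(5):
    batch = batch_queue.get()
    print("step", step, "batch size", len(batch))
```

The trade-off is that this drops whatever per-group work the original feeder does (such as sorting a group of examples by length to reduce padding). Keeping _batches_per_group small, or double-buffering so the next group is built while the current one is still being consumed, would preserve that behaviour while shortening the stall.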
