training stops many seconds to create new queue of data #50

Open
Adibian opened this issue Jan 18, 2022 · 0 comments

Adibian commented Jan 18, 2022

Hi,
When I start training, the time per step is good and the GPU is being used, but after 32 steps (with _batches_per_group=32 in the datafeeder) GPU utilization drops to 0, and only after many seconds does the next queue of data become ready and training resume.
I looked at datafeeder.py and saw that it uses threading to fill the data queue, so why does training still stop? How can I increase GPU utilization?
I tried increasing _batches_per_group in datafeeder.py, but that only makes each stall longer. The picture below shows that with _batches_per_group=128 training stops for 89 seconds!

[Screenshot: training log showing an 89-second pause while the data queue is rebuilt]
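
For anyone hitting the same stall: this symptom usually means the feeder thread prepares an entire group of _batches_per_group batches (for example, to sort examples by length before batching) before it refills the queue, so the training loop drains the queue and then waits for the whole next group to be built. Below is a minimal sketch of an alternative, assuming examples can simply be read one at a time; load_one_example() and make_batch() are hypothetical placeholders, not functions from this repository. The idea is to push each batch into a bounded queue as soon as it is ready, so the GPU only waits when the feeder genuinely falls behind.

```python
# Minimal sketch (not this repository's datafeeder): a background thread that
# enqueues each batch as soon as it is ready, instead of building a whole
# group of _batches_per_group batches before the first enqueue.
import threading
import queue
import time
import random

_batch_size = 32
_prefetch_batches = 8  # bounded queue so prefetching cannot exhaust memory

batch_queue = queue.Queue(maxsize=_prefetch_batches)

def load_one_example():
    # Placeholder for reading one training example (e.g. text + spectrogram).
    time.sleep(0.001)  # simulate disk I/O
    return [random.random()] * 10

def make_batch(examples):
    # Placeholder for padding/stacking a list of examples into one batch.
    return examples

def feeder():
    while True:
        examples = [load_one_example() for _ in range(_batch_size)]
        # put() blocks the feeder thread (not the training loop) when the
        # queue is already full.
        batch_queue.put(make_batch(examples))

threading.Thread(target=feeder, daemon=True).start()

# Training loop: get() only blocks if the feeder has fallen behind.
for step in range(5):
    batch = batch_queue.get()
    print("step", step, "batch size", len(batch))
```

The trade-off is that this drops whatever per-group work the original feeder does (such as sorting a group of examples by length to reduce padding). Keeping _batches_per_group small, or double-buffering so the next group is built while the current one is still being consumed, would preserve that behaviour while shortening the stall.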
