Add PL Lightning to Enable Distributed Training and Deep Speed #59

aribornstein · 2021-04-05T07:26:25Z

This work is awesome! I see that PL is already being used for some of the DataModules. It would be great to see LightningModule integration as well, to make training more robust: https://pytorch-lightning.readthedocs.io/en/stable/starter/converting.html

hypnopump · 2021-04-05T08:30:15Z

Hi there! It is always nice to find new people stopping by... We're glad you're finding it interesting!

So yes, in principle we plan to use PyTorch Lightning to write the dataloaders and training scripts... will keep you posted!

lucidrains · 2021-04-05T21:35:49Z

@aribornstein hello! yes, we are leaning toward pytorch-lightning, as long as it can support a use-case of ours: we need to curriculum-learn the folding, starting from short sequences and moving to longer ones

otherwise, we will also highly consider deepspeed!

aribornstein · 2021-04-11T07:03:46Z

That's awesome! I believe PyTorch Lightning should support that. Also, DeepSpeed is fully integrated in Lightning, and its features are accessible in just a few lines of code. Let me know if you have any questions.

https://medium.com/pytorch-lightning/accessible-multi-billion-parameter-model-training-with-pytorch-lightning-deepspeed-c9333ac3bb59
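For reference, the short-to-long curriculum mentioned above could be sketched roughly like this. This is only a minimal, framework-free illustration, not code from this repo: the names `max_len_for_epoch` and `curriculum_subset`, the linear schedule, and the default lengths (64, 32, 512) are all assumptions made for the example. Wiring it into a Lightning DataModule or dataloader is left out.

```python
from typing import List, Sequence


def max_len_for_epoch(epoch: int, start_len: int = 64, step: int = 32, cap: int = 512) -> int:
    """Hypothetical schedule: grow the allowed sequence length linearly per epoch, up to `cap`."""
    return min(cap, start_len + step * epoch)


def curriculum_subset(sequences: Sequence[str], epoch: int, **schedule: int) -> List[str]:
    """Keep only the sequences short enough for the current epoch's length limit."""
    limit = max_len_for_epoch(epoch, **schedule)
    return [s for s in sequences if len(s) <= limit]


if __name__ == "__main__":
    # Toy "protein sequences" of different lengths (placeholder data).
    toy = ["A" * n for n in (50, 100, 300, 600)]
    for epoch in (0, 2, 14):
        kept = curriculum_subset(toy, epoch)
        print(epoch, max_len_for_epoch(epoch), [len(s) for s in kept])
```

The idea would be to rebuild the training subset at the start of each epoch so that longer chains only enter once the model has seen the shorter ones; a non-linear schedule or a loss-based trigger could be swapped in the same way.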
