does FSDP support AMSP (a new DP shard strategy) #128706
Labels
oncall: distributed
Add this issue/PR to distributed oncall triage queue
triaged
This issue has been looked at a team member, and triaged and prioritized into an appropriate module
馃殌 The feature, motivation and pitch
there's a new DP shard strategy which is more flexible and general, see more detail at https://arxiv.org/abs/2311.00257 AMSP: Reducing Communication Overhead of ZeRO for Efficient LLM Training
Does FSDP support similar feature? If not, any plan to support it? thanks.
Alternatives
No response
Additional context
No response
cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang @d4l3k
The text was updated successfully, but these errors were encountered: