Add Swiglu activation function #128712
Labels
module: nn
Related to torch.nn
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🚀 The feature, motivation and pitch
Hey team, I love building things from scratch, and while implementing the LLaMA paper by Meta in PyTorch, I noticed that PyTorch does not have an nn.swiglu activation function. I ended up implementing it on my own, but since SwiGLU is used in multiple other recent language models, LLaMA being the most popular one, I feel it should be implemented in PyTorch and be a part of it.
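For reference, here is a minimal sketch of what such a module might look like. It assumes the gated form in which the input's last dimension is split into two halves, with SiLU (Swish) applied to one half and the result multiplied elementwise by the other. The class name `SwiGLU`, its `dim` argument, and the choice of which half acts as the gate are assumptions for illustration, not an existing `torch.nn` API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLU(nn.Module):
    """Hypothetical SwiGLU activation (not part of torch.nn).

    Splits the input in two along `dim` and returns silu(gate) * value.
    The output is half the size of the input along `dim`.
    """

    def __init__(self, dim: int = -1) -> None:
        super().__init__()
        self.dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumption: the first half is the gate, the second half is the value.
        gate, value = x.chunk(2, dim=self.dim)
        return F.silu(gate) * value


# Usage: the size along `dim` must be even so it can be split in half.
x = torch.randn(4, 16)
y = SwiGLU()(x)
```

In a feed-forward block, a linear layer that projects to twice the hidden size could precede this module so that the split produces the gate and value branches.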
Alternatives
There are many other activation functions already in PyTorch, but as results show, SwiGLU has been really impactful in the success of LLaMA, so I think the community would benefit from having it in PyTorch.
Additional context
Also, I've been using PyTorch for a long time and I've always built things from scratch. Contributing to PyTorch has always been on my to-do list, and I feel like this might be it.
cc @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki