Is this a new feature, an improvement, or a change to existing functionality?
Improvement
How would you describe the priority of this feature request?
Low (would be nice)
Please provide a clear description of problem you would like to solve.
Currently, `Mlp` is hardcoded to use `BatchNorm` as its only type of normalization. It would be great to extend this to other norms, such as `LayerNorm`, and possibly to accept user-supplied norm layers, such as those from TransformerEngine.
Stems from discussion here with @coreyjadams @laserkelvin:
#1401 (comment)
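For concreteness, here is a minimal sketch of one way the request could look in PyTorch. It is not the existing `Mlp` implementation: the `norm` argument, the `make_norm` helper, the `_NORMS` registry, and the layer layout (Linear, optional norm, GELU) are all hypothetical names and choices made for illustration. The idea is that `norm` accepts `None`, a built-in name, or any callable that builds a norm module from a feature count.

```python
from typing import Callable, Optional, Sequence, Union

import torch
from torch import nn

# Hypothetical registry of built-in norm names (illustration only,
# not the library's actual API).
_NORMS = {
    "batchnorm": nn.BatchNorm1d,
    "layernorm": nn.LayerNorm,
}

# A norm can be disabled (None), named (str), or user-supplied
# as a callable mapping a feature count to a module.
NormSpec = Union[str, Callable[[int], nn.Module], None]


def make_norm(norm: NormSpec, num_features: int) -> Optional[nn.Module]:
    """Build a norm layer for `num_features`, or return None if disabled."""
    if norm is None:
        return None
    if isinstance(norm, str):
        key = norm.lower()
        if key not in _NORMS:
            raise ValueError(
                f"Unknown norm '{norm}'; expected one of {sorted(_NORMS)}"
            )
        return _NORMS[key](num_features)
    # User-supplied factory, e.g. a norm class from another library,
    # assuming it can be constructed from the feature count alone.
    return norm(num_features)


class Mlp(nn.Module):
    """Sketch of an MLP with a configurable norm after each hidden Linear."""

    def __init__(
        self,
        in_features: int,
        hidden_features: Sequence[int],
        out_features: int,
        norm: NormSpec = None,
    ) -> None:
        super().__init__()
        layers = []
        prev = in_features
        for width in hidden_features:
            layers.append(nn.Linear(prev, width))
            norm_layer = make_norm(norm, width)
            if norm_layer is not None:
                layers.append(norm_layer)
            layers.append(nn.GELU())
            prev = width
        layers.append(nn.Linear(prev, out_features))
        self.layers = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x)


# Usage: a built-in name, or a user-supplied factory.
mlp_ln = Mlp(8, [16, 16], 4, norm="layernorm")
mlp_custom = Mlp(
    8, [16], 4, norm=lambda n: nn.LayerNorm(n, elementwise_affine=False)
)
out = mlp_ln(torch.randn(5, 8))  # input of shape (batch, features)
```

Accepting a callable keeps `Mlp` free of any hard dependency on external libraries: a caller who wants a TransformerEngine norm would pass its class or a small wrapper around it. One caveat with this sketch is that `nn.BatchNorm1d` and `nn.LayerNorm` treat inputs with more than two dimensions differently (channel dimension second versus last), so a real implementation would need to decide how to handle that.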
Describe any alternatives you have considered
No response