
If you have the option to disable the mask, isn't it then a generic nn.Block?

What's missing is the ability to disable the mask in nn.MultiHeadSelfAttention, and then an nn.DecoderBlock and an nn.EncoderBlock built on top of it.
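To make the mask toggle concrete, here is a rough sketch in plain Python. It is not the library's code: `self_attention_weights` and its `causal` flag are hypothetical names, and it assumes a single head with no learned projections. With `causal=True` each position only sees itself and earlier positions (decoder style); with `causal=False` every position sees the whole sequence (encoder style).

```python
import math


def softmax(xs):
    # Subtract the max for numerical stability; exp(-inf) evaluates to 0.0.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_weights(x, causal=True):
    """Attention weights for a list of vectors x (hypothetical helper).

    Assumes one head and identity Q/K projections, purely for illustration.
    """
    d = len(x[0])
    weights = []
    for i, q in enumerate(x):
        scores = []
        for j, k in enumerate(x):
            if causal and j > i:
                # Masked: position i may not look at a later position j.
                scores.append(float("-inf"))
            else:
                scores.append(sum(a * b for a, b in zip(q, k)) / math.sqrt(d))
        weights.append(softmax(scores))
    return weights


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_style = self_attention_weights(x, causal=True)   # future positions get weight 0
encoder_style = self_attention_weights(x, causal=False)  # every position gets some weight
```

The only difference between the two calls is the flag, which is why a single block with a switchable mask could back both an encoder and a decoder variant.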

I think that’s it. I’ll probably add that soon.

It would be with some simple tweaks. For instance, the current block does not support Cross-Attention, just Self-Attention.
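For anyone wondering what the cross-attention gap means in practice, here is a small sketch, again in plain Python and not taken from the library. `attend` is a hypothetical single-head function with no learned projections. In self-attention the queries, keys, and values all come from the same sequence; in cross-attention the queries come from one sequence (e.g. decoder states) and the keys and values from another (e.g. encoder outputs), so the two sequences can have different lengths.

```python
import math


def attend(queries, keys, values):
    """Scaled dot-product attention (hypothetical helper, single head)."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        # Weighted average of the value vectors, one output per query.
        out.append([sum(wi * v[c] for wi, v in zip(w, values))
                    for c in range(len(values[0]))])
    return out


decoder_states = [[1.0, 0.0], [0.0, 1.0]]              # 2 positions
encoder_states = [[1.0, 1.0], [0.5, 0.0], [0.0, 2.0]]  # 3 positions

# Self-attention: one sequence plays all three roles.
self_out = attend(decoder_states, decoder_states, decoder_states)

# Cross-attention: queries from the decoder, keys/values from the encoder.
cross_out = attend(decoder_states, encoder_states, encoder_states)
```

A block that only accepts one input sequence can express the first call but not the second, which is the tweak being described.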

