Other research from Meta FAIR actually suggests that you should prune the deeper layers if you want to improve efficiency while maintaining accuracy [1]. So there must be a cutoff point for smaller networks where this approach still works; otherwise the two results contradict each other. Or both hold, and we could drastically improve these new models even further.

[1] https://arxiv.org/html/2403.17887v1



