
A dense 70B model will be slower, since DeepSeek is an MoE that only activates 37B parameters per token. That's what makes CPU inference remotely feasible here.
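A rough back-of-the-envelope sketch of why active parameter count matters. It assumes decoding is purely memory-bandwidth-bound, i.e. every active weight is read from RAM once per generated token, and it ignores KV cache traffic, expert routing, and compute limits. The bandwidth and quantization figures are hypothetical placeholders, not measurements from any real machine, and the function name is made up for illustration.

```python
def decode_tokens_per_second(active_params_billions: float,
                             bytes_per_param: float,
                             mem_bandwidth_gb_s: float) -> float:
    """Upper-bound estimate of decode speed under a bandwidth-bound assumption.

    Assumes each active weight is streamed from memory exactly once per token.
    """
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return (mem_bandwidth_gb_s * 1e9) / bytes_per_token


# Hypothetical hardware and quantization, chosen only for illustration.
BANDWIDTH_GB_S = 200.0   # assumed memory bandwidth of the CPU box
BYTES_PER_PARAM = 1.0    # assumed 8-bit weights

# MoE: only the 37B active parameters are read per token.
moe_tps = decode_tokens_per_second(37, BYTES_PER_PARAM, BANDWIDTH_GB_S)

# Dense 70B: all 70B parameters are read per token.
dense_tps = decode_tokens_per_second(70, BYTES_PER_PARAM, BANDWIDTH_GB_S)

print(f"MoE, 37B active: {moe_tps:.2f} tok/s (upper bound)")
print(f"Dense 70B:       {dense_tps:.2f} tok/s (upper bound)")
print(f"Ratio:           {moe_tps / dense_tps:.2f}x")
```

Under these assumptions the ratio is just 70/37, regardless of the bandwidth or quantization you plug in. The full MoE still has to fit in RAM, so this only describes speed, not memory footprint.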

