
When is this profitability report from? The cost per generated token has dropped significantly since.

When GPT4 launched last year, the API cost was about $36/M blended tokens, but you can now get GPT4o tokens for about $4.4/M, Gemini 1.5 Pro for $2.2/M, or DeepSeek-V2 (a 236B-parameter MoE model with 21B active parameters that matches GPT4 on coding) for as low as $0.28/M tokens — over 100X cheaper for the same quality of output, over the course of about 1.5 years.
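The "over 100X" figure follows directly from the prices quoted above (the comment's own numbers, not necessarily current list prices) — a quick sanity check:

```python
# Cost-drop arithmetic using the figures quoted in the comment above.
gpt4_launch_price = 36.00   # $/M blended tokens, GPT4 at launch
deepseek_v2_price = 0.28    # $/M tokens, DeepSeek-V2

ratio = gpt4_launch_price / deepseek_v2_price
print(f"~{ratio:.0f}x cheaper")  # ~129x cheaper
```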

The just-released Qwen2.5-Coder-7B-Instruct (Apache 2.0 licensed) also basically matches/beats GPT4 on coding benchmarks, and quantized it can run at a decent speed not only on just about any consumer gaming GPU, but on most new CPUs/NPUs as well. This is about a 250X smaller model than GPT4.
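Back-of-envelope math shows why a quantized 7B model fits on consumer hardware (a rough sketch that ignores KV cache and runtime overhead, so real usage is somewhat higher):

```python
# Approximate weight memory for a 7B-parameter model at 4-bit quantization.
params = 7e9   # parameter count
bits = 4       # bits per weight after quantization

bytes_total = params * bits / 8
gib = bytes_total / 2**30
print(f"~{gib:.1f} GiB of weights")  # ~3.3 GiB, well within an 8 GB GPU
```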

There are now a huge array of open weight (and open source) models that are very capable and that can be run locally/on the edge.


