is it really unthinkable that another oss/local model will be released by deepse...

zozbot234 · 2026-04-18T17:12:04 1776532324

> is it really unthinkable that another oss/local model will be released by deepseek, alibaba, or even meta that once again give these companies a run for their money

Plenty of OSS models being released as of late, with GLM and Kimi arguably being the most interesting for the near-SOTA case ("give these companies a run for their money"). Of course, actually running them locally for anything other than very slow Q&A is hard.

slowmovintarget · 2026-04-18T17:07:08 1776532028

Qwen released a new model the same day (3.6). The headline was kind of buried by Anthropic's release, though.

https://news.ycombinator.com/item?id=47792764

rectang · 2026-04-18T17:46:12 1776534372

For my working style (fine-grained instructions to the agent), Opus 4.5 is basically ideal. Opus 4.6 and 4.7 seem optimized for more long-running tasks with less back and forth between human and agent; but for me Opus 4.6 was a regression, and it seems like Opus 4.7 will be another.

This gives me hope that even if future versions of Opus continue to target long-running tasks and get more and more expensive while being less and less appropriate for my style, a competitor can build a model akin to Opus 4.5 which is suitable for my workflow, optimizing for other factors like cost.

DeathArrow · 2026-04-18T20:07:44 1776542864

Have you tried GLM 5.1?

amelius · 2026-04-18T17:05:37 1776531937

I'm betting on a company like Taalas making a model that is perhaps less capable but 100x as fast, where you could have dozens of agents looking at your problem from all different angles simultaneously, and so still have better results and faster.

andai · 2026-04-18T17:37:58 1776533878

Yeah, it's a search problem. When verification is cheap, reducing success rate in exchange for massively reducing cost and runtime is the right approach.
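A rough sketch of the arithmetic behind that trade-off, assuming attempts are independent and the verifier is perfect (both are simplifying assumptions, and the numbers in the usage note are made up for illustration):

```python
import math


def success_at_least_once(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds,
    given a per-attempt success rate p (assumes a perfect verifier)."""
    return 1.0 - (1.0 - p) ** n


def attempts_needed(p: float, target: float) -> int:
    """Smallest n for which success_at_least_once(p, n) >= target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))
```

For example, a hypothetical model that solves a task 30% of the time needs `attempts_needed(0.3, 0.9)` = 7 parallel attempts to reach a 90% chance that at least one passes verification. Whether that wins depends on the seven cheap attempts plus verification costing less than one attempt from the stronger model, and on failures not being correlated.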

never_inline · 2026-04-18T17:53:06 1776534786

You're underestimating the algorithmic complexity of such brute forcing, and the indirect cost of the brittle code that's produced by inferior models.

100ms · 2026-04-18T20:01:40 1776542500

I'm excited for Taalas, but the worry with that suggestion is that it would blow out energy per net unit of work, which kills a lot of Taalas' buzz. Still, it's inevitable: if you make something an order of magnitude faster, folk will just come along and feed it an order of magnitude more work. I hope the middle ground with Taalas is a cottage industry of LLM hosts with small-to-mid-sized budgets hosting last-gen models for quite cheap. Although if they're packed to max utilisation with all the new workloads they enable, latency might not be much better than what we already have today.

embedding-shape · 2026-04-18T17:06:41 1776532001

Nothing is unthinkable. I could imagine a Transformers.V2 that looks completely different, or iterations on Mamba turning out to be fruitful, or countless other scenarios.

casey2 · 2026-04-18T19:22:58 1776540178

This regression put Anthropic behind Chinese models actually.

pitched · 2026-04-18T17:05:26 1776531926

Now that Anthropic have started hiding the chain-of-thought tokens, it will be a lot harder for them.

zozbot234 · 2026-04-18T17:14:33 1776532473

Anthropic and OpenAI never showed the true chain of thought tokens. Ironically, that's something you only get from local models.