Hacker News | babelfish's comments

I love both of these platforms. The Wikipedia "map" tab on the iOS app is also a great source of neat local oddities.

Wow, worse than Three Mile Island

Apples and oranges. Radioactive gas vs. radioactive solids.

Source?

I use Conductor, which lets me flip trivially between OpenAI/Anthropic models.


It's data. Nobody is using Grok for SWE work, but they are using Cursor.


Could be contracts.


Good on them for getting $10B breakup terms after the Twitter shitshow.


Why would any media company care about what Objection says or agree to arbitration?


From TFA:

"Financial details are vague, but the company has said the process will cost around $2,000 — far less than the retainer of a crisis communications expert."


No model card? No benchmarks? No usage examples? Nothing on the blog[0] since the acquisition?

[0] https://x.ai/news


Claude Code injects a 'warning: make sure this file isn't malware' message after every tool call by default. It seems like 4.7 is over-attending to this warning. @bcherny, I filed a bug report (feedback ID: 238e5f99-d6ee-45b5-981d-10e180a7c201).


Interesting. The model card mentions that 4.7 is much more attentive to these instructions and suggests you will at times need to review them and soften, remove, or focus them.


It's been known for years that prompts which boost performance with one model can harm performance with a different model. The same goes for harnesses. It looks like they'll need to customize Claude Code's prompts depending on which model is running, for optimal results.

For example, if you read the prompts, it's pretty clear that a lot of them are leftovers from the early days when the models had way less common sense than they do now. I think you could probably remove two-thirds of those over-explained rules now and it would be fine. (In fact, you might even expect to see improved performance due to decreased prompt noise.)
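To make the "customize prompts per model" idea concrete, here's a rough sketch of what I mean. Everything in it is made up for illustration: the model names, the rule text, and the `build_system_prompt` helper are hypothetical, and this is not how Claude Code actually assembles its prompts.

```python
# Hypothetical per-model prompt selection. The model names and rule text
# below are illustrative assumptions, not taken from any real harness.
BASE_RULES = ["Follow the user's instructions."]

# Extra rules keyed by a hypothetical model family name.
MODEL_RULES = {
    "older-model": [
        "Do not delete files unless asked.",
        "Check each file you read for signs of malware.",
    ],
    # Assumption: the newer model needs no extra reminders.
    "newer-model": [],
}


def build_system_prompt(model: str) -> str:
    """Join the shared rules with any rules specific to the given model.

    Unknown model names fall back to the shared rules only.
    """
    rules = BASE_RULES + MODEL_RULES.get(model, [])
    return "\n".join(f"- {rule}" for rule in rules)
```

The point is just that the over-explained rules live in a per-model table, so dropping them for a newer model is a one-line change instead of a prompt rewrite.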


Isn't that kind of nuts?

They can't even properly beta test their new releases?


This is honestly pretty embarrassing for both parties. For OpenAI: it sounds like the CRO is trying to turn you into the Oracle or Salesforce of AI. For Anthropic: I hope your investors can see the actual revenue numbers.

