
		bakugo 5 days ago, on: “Car Wash” test with 53 models

		The article claims that every Claude model other than Opus 4.6 reliably fails. This is not true: Sonnet 3.5 answers correctly around half of the time, even though it is such an old model that it is no longer available on the main API.
