Not a fan of these kinds of arguments. The 'correct' token is entirely dependent on the dataset: an LLM could have perfect training loss on a given dataset, and that would still have no predictive power over its ability to 'answer' arbitrary prompts.
In natural language, many strings are equally valid; there are many ways to chain tokens together to arrive at the 'correct' answer to an in-sample prompt. A model with perfect loss will, for an ambiguous sequence of tokens, produce a distribution over next tokens that corresponds to the number of valid token paths in the corpus that follow each candidate next token.
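A toy sketch of that point (my own illustration, not from the thread, with a made-up mini-corpus): cross-entropy on a corpus is minimized exactly when the model's next-token distribution matches the empirical frequencies in that corpus, so "perfect loss" just means the model reproduces the dataset's own ambiguity, and says nothing about prompts outside it.

    import math
    from collections import Counter

    # Hypothetical corpus: the context "the cat" is followed by different
    # tokens in different documents.
    continuations = ["sat", "sat", "ran", "slept"]
    total = len(continuations)
    empirical = {tok: c / total for tok, c in Counter(continuations).items()}

    def corpus_loss(model_probs):
        # Average negative log-likelihood of the corpus continuations under the model.
        return -sum(empirical[t] * math.log(model_probs[t]) for t in empirical)

    # Matching the corpus frequencies gives the minimum possible loss
    # (the entropy of the data itself):
    print(corpus_loss(empirical))                                  # ~1.04 nats
    # A model that confidently picks only the most common token scores worse on loss,
    # even though greedy decoding from it would look "more decisive":
    print(corpus_loss({"sat": 0.98, "ran": 0.01, "slept": 0.01}))  # ~2.31 nats

Neither number tells you anything about what comes back for a prompt that never appears in the corpus, which is the point above.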
Compounding errors can certainly happen, but for many of the tokens upstream of the key tokens it's irrelevant. There are so many ways to phrase things that are equally correct; this is how language evolved (and continues to). Getting back to my first point, even if you assume an LLM with perfect loss on the training dataset, you can still get garbage back at test time, so I'm not sure thinking about 'compounding errors' is useful.
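For context, the compounding-error argument being pushed back on here is usually framed as: if each generated token independently has some chance of being "wrong", the probability that an n-token answer stays correct is (1 - e)^n, which decays exponentially. The rebuttal above is that most tokens admit many equally valid continuations, so only a handful of key tokens actually constrains correctness. A back-of-envelope comparison (numbers purely illustrative):

    # Naive compounding model: every one of 500 tokens can independently go wrong.
    per_token_error = 0.01
    print((1 - per_token_error) ** 500)   # ~0.007, looks hopeless

    # If only ~20 "key" tokens actually pin down correctness and the rest
    # admit many equally valid phrasings, the same per-token error rate
    # leaves the answer intact most of the time.
    print((1 - per_token_error) ** 20)    # ~0.82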
I suspect errors in LLM reasoning are more closely related to noisy training data, or an overabundance of low-quality training data. I've observed this in how the reasoning LLMs behave: given problems that are less common in the corpus (the internet and other digital assets) and require higher-order reasoning, they tend to fail, whereas advanced math or programming problems tend to go a bit better, likely because that input data is much cleaner.
But for something like "how do I change the fixture on this light?", I'll get back some kind of garbage from the SEO-verse. IMO the next step for LLMs is figuring out how to curate an extremely high-quality dataset at scale.
> ChatGPT is neat. For all we know we’re near a local maxima of what we’re capable of achieving without another completely new approach that will take 10 or 15 years to figure out. There’s no proof that the acceleration and capabilities we’ve seen over the last 2 to 3 years will continue like that.
Two issues here:
1) We are only about 10 years into the deep learning boom.
2) We've seen deep learning scale with compute over that whole 10-year span, not only over the last 2-3 years.
It could be that we've reached the end of the road for NLP; no one really knows. But generally we see breakthroughs in lockstep with big jumps in compute capability (typically GPU releases, occasionally architecture changes).
>... local maxima ... new approach that will take 10 or 15 years ...
I was listening to the recent interviews with Sam Altman and the Anthropic CEO, who are familiar with current research, and they don't sound like that at all. It's more "wow, we've got so much to build, AGI in a couple of years." (Though it seems to me a rather limited version of AGI: more "can code well" than "can fix your plumbing.")
Their future success is heavily tied to that set of opinions being correct, and to drumming up further investment. Even with the best will in the world, this type of quantitative opinion will be hugely positively biased.
They are CEOs; half the job is public cheerleading.
This will be the new "fusion in 10 years", but with the added downside of emitting a small country's worth of carbon per day while not actually getting us there.
Like, of course Sam Altman is going to talk about how close they are and how they need more money.
It's been a while since AMD had the top-tier offering, but it has been trading blows in the mid-tier segment the entire time. If you are just looking for a gaming card (i.e. not max AI performance), the AMD card is typically cheaper and less power-hungry than the equivalent Nvidia one.
But the fact that Nvidia cards command higher margins also reflects their better software stack, right? Nvidia “lets them” trade blows in the midrange, or, equivalently, Nvidia is reaping the reward of its software investments: even its midrange hardware commands a premium.
It was true with RDNA 2. RDNA 3 regressed on this a bit; supposedly a hardware hiccup prevented them from hitting the frequency and voltage targets they were hoping to reach.
In any case they're only slightly behind, not crazy far behind like Intel is.
Competitive with the H100 for inference: a 2-year-old product, and on just one half of the ML story. The H200 (and potentially the B100) is the appropriate comparison, since those are the parts being produced in volume.