An Interview with Nvidia CEO Jensen Huang About AI’s iPhone Moment (stratechery.com)
197 points by pps on March 25, 2023 | hide | past | favorite | 116 comments


Man, how lucky is Nvidia? Crypto winds down just as AI ramps up.


It's not just luck but good strategy. Over the past 10 to 15 years, Nvidia has been leveraging its core GPU technology to expand beyond gaming video cards into adjacent areas where massively parallel computing is needed, such as supercomputing, cloud computing, animation farms, CAD, visualization, simulation, automotive, VR, AI, and crypto. They have been able to catch/enable one wave after another because it's part of their roadmap.


I worked at NV on compilers when the CUDA toolchain was brand new. Nobody would have foreseen this magnitude of parallel applications. Maybe some computer vision stuff. At the time it seemed like diversification from gaming to try to break into HPC, etc.


They have been going pretty much everywhere with their CUDA runtime. LLMs were a random hit.

At the same time, it doesn't seem like a great moat - I think AMD should be able to compete pretty well soon. I think TSMC/Samsung/ASML will capture quite a lot of the profit in the boom to come.


CUDA not a moat? It’s physical hardware combined with extensive software libraries. How much more of a moat could they have?


In the end it's about who can do the most (probably low-precision) FLOPS/$ - CUDA becomes less of a moat the more established the field becomes and the more the software settles. There's nothing magical about the low-level arithmetic that powers this.

Nvidia will have a first-mover advantage, but AMD will be strongly motivated to win those customers, and they should be able to, eventually.

Anyway, that's my gut instinct based on how these things tend to play out. Now let's hear from e.g. actual experts in the field? :)
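To illustrate the parent's point that there is nothing magical about the low-level arithmetic: half precision is just ordinary floating point with fewer bits, and you can even emulate it with Python's standard library alone. A rough pure-Python sketch (illustrative only; GPUs do this in hardware, orders of magnitude faster):

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value
    by packing/unpacking with struct's 'e' (binary16) format."""
    return struct.unpack("e", struct.pack("e", x))[0]

def dot_fp16(a, b):
    """Dot product with fp16 inputs and fp16 rounding after every operation,
    mimicking a naive low-precision accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_fp16(acc + to_fp16(to_fp16(x) * to_fp16(y)))
    return acc

a = [0.1 * i for i in range(8)]
b = [0.2 * i for i in range(8)]
exact = sum(x * y for x, y in zip(a, b))
approx = dot_fp16(a, b)
print(exact, approx)  # close but not identical: the usual precision trade-off
```

For ML workloads the small rounding error is usually tolerable, which is exactly why low-precision FLOPS/$ is the metric that matters.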


There's a winner-take-all dynamic at play, though.

The better CUDA is, the more things get built on top of it.

The more things get built on top of CUDA, the better the CUDA ecosystem of tooling, frameworks, etc. gets.

The better the CUDA ecosystem is, the better CUDA is. GOTO 1.

But you're right -- lots and lots of folks will be motivated to compete.


I dunno; CUDA is a library. How many closed source libraries win winner-takes-all games?

Nvidia is a lonely proprietary ship in a hostile sea of open source packages. It might take 20 years if Intel and AMD fail to do anything useful, but sooner or later they'll either open source CUDA defensively or get ground down by an open library. There is a finite list of features to implement before "Runs on Nvidia" vs "Runs on any graphics card" becomes the only important box left on the bureaucrat's checklist.

The math here is literally 1st & 2nd year university material. That is a good short-term moat, but not a defensible one long term.


Except that CUDA is low level, so it's not hard to shim above it and write interoperable code. There are too many players who don't want to pay the Nvidia tax; this will play out like OpenGL vs Direct3d in reverse.


> this will play out like OpenGL vs Direct3d in reverse

Is that also like OpenCL vs CUDA in reverse?


I feel like abstractions don't work if we want to get the maximum performance. AFAIK Tensor cores are not usable from OpenCL, and on the other hand, even within the CUDA universe, cuBLAS (hand-optimized) seems to outperform CUTLASS (which uses abstractions).


> I think AMD should be to able compete pretty well soon.

AMD has been trying to catch up in the machine learning space for many years now, and it hasn't happened. For things to finally improve, I guess something needs to change internally at AMD. Even Apple Silicon, in the short time it has existed, has gained better support for ML (e.g. compare running PyTorch on Apple Silicon vs running PyTorch on AMD).


100% this. We haven't seen AMD make significant progress in ML against CUDA and other proprietary Nvidia libraries for years. The silicon is clearly capable but the software isn't. Perhaps it's time for a leadership change in this division at AMD.


They have been investing in this for a long time, CUDA is 15 years old. That said, they have plenty of competition now and that's only going to increase.

- Apple have got their own chips now for mobile AI applications for iPhones.

- Google have got their own chips for Android and servers.

- AMD are gaining on Intel in the server space, and have their own GPUs, being able to sell CPUs and GPUs that complement each other may be a good strategy, plus AMD have plenty of their own experience with OpenCL.


I'm not convinced AMD has a good AI play. (Disclaimer: I hold a long position in AMD).

1. AI hardware seems like it can be much simpler than GPGPUs, given the successful implementations by many companies, including small startups.

2. AI software seems extremely difficult: making a simple programming and ops interface over a massively parallel, distributed, memory-bandwidth-constrained system that needs to compile and run high-performance custom code out of customers' shifting piles of random Python packages.

AMD has continuously struggled at (2) and hasn't seemed to recommit to doing it properly. AMD certainly has silicon design expertise, but given (1) I don't think that is enough.

Xilinx is an interesting alternative path for products or for improving AMD's software/devex. I'm not sure what to expect from that yet.


How much of a moat is CUDA?

It's indeed ages ahead of any of its competitors. However, most ML/DS people interact with CUDA via a higher-level framework, and in recent years this community has consolidated around a few frameworks (arguably even a single platform, PyTorch). For some reason AMD has not invested in framework backends, but there is no network effect or vendor lock-in to hinder a shift from CUDA to ROCm if it were supported equally well.
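To make the "higher-level framework" point concrete: typical PyTorch user code touches CUDA only through a device string, and ROCm builds of PyTorch deliberately reuse the same "cuda" name so the same script runs on AMD hardware. A hedged sketch of the usual selection idiom (the function name is mine; it falls back to CPU when no framework or accelerator is present):

```python
def pick_device() -> str:
    """Pick an accelerator the way most framework users do: by name only.

    On ROCm builds of PyTorch, torch.cuda.is_available() returns True and
    the "cuda" device maps to the AMD GPU, so this code is vendor-agnostic
    as written. Falls back to CPU if torch is missing entirely.
    """
    try:
        import torch
    except ImportError:
        return "cpu"  # no framework installed; plain-CPU fallback
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA CUDA, or AMD ROCm masquerading as cuda
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"  # Apple Silicon backend
    return "cpu"

print(pick_device())
```

Nothing above mentions a vendor API directly, which is why a sufficiently good ROCm backend could in principle slot in underneath unchanged user code.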


There is an enormous investment beyond the training side. Once you have your model, you still need to run it. This is where Triton, TensorRT, and handcrafted CUDA kernels as plugins come in. There is no equivalent on ROCm for this (MIGraphX is not close).


Models are re-trained periodically (every few months, weeks, even days), and new architectures/implementations come along all the time. If a better algorithm appears, practitioners will adopt a new platform (e.g. Transformers for NLP models), so many systems can already plug in new tools. GPUs are very expensive, so there is also a strong incentive to make this small effort.


Yes, but this just makes a frictionless runtime for inference even more important (which is something that does not exist in a comparable form for AMD).


AMD GPUs are a joke for anything but gaming


At the moment maybe, but owning the CPU may allow for better integration.

Imagine if AMD launched a new bus implementation from CPU to GPU. That's not something Nvidia can do by themselves. Maybe Nvidia buys Intel and does it, though!


You mean a proprietary connection standard? Like G-Sync but for accelerator cards? ;)

https://www.tomshardware.com/news/amd-infinity-fabric-3

AMD already does that, and much like G-Sync there is also an open standard (CXL) that everyone else is converging around.

https://en.wikipedia.org/wiki/Compute_Express_Link


Yet. Their GPUs were not great for years, but they have managed to catch up (and even overperform for their price point), so other workloads are yet to come.


Unfortunately, with the way ROCm is developed, and how it's only intended for specific GPUs, I doubt it.


Still waiting for ROCm support on the 5700 XT, which they kept promising was ready any day now.


Did they ever promise that?


Yeah... but have you tried using AMD GPUs for any LLMs for example? All the interesting stuff that is publicly released is for Nvidia. I would love to be able to focus on AMD since Nvidia has been adding some anti-user features lately.


They've been investing for a long time, but it's only blown up in the past year due to recent breakthroughs. Good timing for Nvidia.


It's been blowing up since at least 2012-13 when deep convolutional neural nets started seeing massive success.


That's not blowing up. What's happening right now is blowing up.


What's going on now is the continuation of a growth trend that started a decade ago.


And Intel is, as usual, doing nothing


> Google have got their own chips for Android

What chips for Android?


The last few Pixels have Google’s own Tensor chipsets, developed in collaboration with Samsung. These include ML hardware acceleration along similar lines to Apple’s neural engine.


Thanks!


Nvidia is in an excellent position - they have CUDA as you point out, and they are moving that into server room dominance in this application space

Google has TPUs but have these even made a tiny dent in Nvidia's position?

I assume anything Apple is cooking is using Nvidia in the server room already

Intel seems completely absent from this market

AMD seems content to limit its ambitions to punching Intel

it's Nvidia's game to lose at this point... I wonder when they'll start moving in the other direction and realize they have the power to introduce their own client platform (I secretly wish they would try to mainstream a Linux laptop running on Nvidia ARM, but obviously this is just a fantasy)

if anything, I think Huang may not be ambitious enough!


Tegra and later Shield were attempts to get closer to a full end-user platform. The Nintendo Switch is their most successful such device (launched with a 2-year-old Tegra SKU). But going full force into consumer tech is a distraction for them right now. Even the enthusiast graphics market, which should be high margin, is losing their interest. They make much more selling to the big enterprise customers CEO Jensen mentions in the opening paragraph.


Gamers are going to be so pissed. They subsidized the advances in GPU compute and will now be ignored in favor of the much more lucrative enterprise AI customers.

Nvidia is making the right call, of course.


Have they ever considered that the subsidy goes the other way? The margin on an A100 card is probably 100% higher than on an RTX 4090. Gaming industry is also like THE first industry to be revolutionized by AI. Current stuff like DLSS and AI-accelerated path tracing are mere toys compared to what will come.

Nvidia will not give up gaming. When every gamer has a Nvidia card, every potential AI developer to spring up from those gamers, will use Nvidia by default. It also helps gaming GPUs are still lucrative.


> Gaming industry is also like THE first industry to be revolutionized by AI.

That's a great counter point.

> Nvidia will not give up gaming. When every gamer has a Nvidia card, every potential AI developer to spring up from those gamers, will use Nvidia by default. It also helps gaming GPUs are still lucrative.

Another. But Nvidia will have a lot of balancing to do and some very thirsty competitors. Though if competition arises, that too is good for gamers.


Nah, volume sales & the ability to bin. A100 would be a much more expensive product if they couldn't sell defective chips as consumer GPUs. Pretty sure that the R&D cost of the workstation cards means that those cards are technically sold at a loss with Nvidia knowing that consumer sales will make up for it.


It's OK, gaming is also having its AI moment.

I fully expect future rendering techniques to lean heavily on AI for the final scene. NeRF, diffusion models, et cetera are the thin end of the wedge.


Gamers are in heaven right now. Used 30-series cards are cheap as dirt, keeping the pressure on Intel/AMD/Apple to price their GPUs competitively. The 40-series cards are a hedged bet against anything their competitors can develop - manufactured at great cost on TSMC's 4nm node and priced out-of-reach for most users. Still, it's clear that Nvidia isn't holding out their best stuff, just charging exorbitant amounts for it.


Where are these cheap as dirt 30 series? A 10gb 3080 is still over $500 usd used ($750 aud) when I’ve looked. When did secondhand GPUs that still cost the same as a brand new PS5 start to be considered cheap?


A PS5 is _significantly_ slower than a 3080, it's more RTX 2070 tier.


Sure but the 3080 is a single component while the PS5 includes everything needed to run the game. The way GPU prices have inflated over the past couple generations has been absurd.


Yes, but it is a complete gaming system. My point is $500 is still a lot of money for a used GPU that is nearly 3 years old. It just retailed brand new for $699. So yes, prices were crazy over that period for a variety of reasons, but that shouldn't shift the gaming value proposition so dramatically.


> My point is $500 is still a lot of money for a used GPU that is nearly 3 years old.

Yeah. It's good hardware. You can get cheaper cards (even cost-competitive options) on PC but Nvidia won't sell them to you. Especially not now that they've got 10 billion dollars on their TSMC tab.


> I assume anything Apple is cooking is using Nvidia in the server room already

I don't think Apple's server side is big or interesting. Far more interesting is the client side, because it's 1bn devices, and they all run custom Apple silicon for this. Similarly Google has Tensor chips in end user devices.

Nvidia doesn't have a story for edge devices like that, and that could be the biggest issue here for them.


Nvidia has an install-base of several million desktop GPUs, though I'm not sure what that's worth in the long run. They've done a good job providing edge services like Shadowplay in the past, I wouldn't count them out on delivering client-side AI.


> I don't think Apple's server side is big or interesting. Far

Tangent but I wish they would!

Apple's e-cores would be great for servers, and they are very area/transistor efficient (even considering the node). 0.69mm2 for something with (broad strokes) Gracemont-ish/Skylake-ish performance (but no SMT) is really good even considering the 5nm node.

I think the real-world density shrinks on 5nm ended up being around 60%... so 0.69mm2 for Blizzard is like 1.1mm2 equivalent on 7nm and Avalanche is 4.1mm2, versus Zen3 at 3.1mm2 and Zen2 at 2.72mm2.

10ESF density is supposed to be similar to TSMC 5nm, dunno how true that really is in practice on actual products. But on paper that means you have Gracemont at 1.7mm vs Blizzard at 0.69mm2 and Golden Cove at 5.55mm2 vs Avalanche at 2.55mm2.

Or comparing to AMD using the 1.6x conversion factor, that gives you a 7nm-area-equivalent (assuming 5nm density on 10ESF) of 2.72mm2 for Gracemont (vs Zen2 at 2.72mm2) and 8.88mm2 for Golden Cove versus 3.1mm2 for Zen3. And that's why they're doing e-cores, and AMD is just squeezing the last little bit of space out of their existing uarch, lol.
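Writing out the normalization arithmetic above explicitly (all the areas and the 1.6x N5-to-N7 factor are this comment's own estimates, not official die measurements):

```python
N5_TO_N7 = 1.6  # assumed real-world density conversion, N5-class -> N7

# Core areas in mm^2 on (roughly) N5-class nodes, per the figures above
n5_cores = {"Blizzard": 0.69, "Avalanche": 2.55,
            "Gracemont": 1.7, "Golden Cove": 5.55}
# Reference cores already on N7, for comparison
n7_cores = {"Zen2": 2.72, "Zen3": 3.1}

for name, area in n5_cores.items():
    print(f"{name}: {area:.2f} mm^2 (N5-class) ~ "
          f"{area * N5_TO_N7:.2f} mm^2 N7-equivalent")
for name, area in n7_cores.items():
    print(f"{name}: {area:.2f} mm^2 (N7)")
```

This reproduces the figures quoted: Blizzard lands at ~1.1mm2 N7-equivalent (vs Zen2 at 2.72mm2) and Golden Cove at ~8.88mm2 (vs Zen3 at 3.1mm2).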

https://www.reddit.com/r/hardware/comments/qlcptr/m1_pro_10c...

The M1 Pro/Max dies are mostly consumed by a gigantic iGPU (in a way it's similar to the latter days of Intel quadcore era) but the cores themselves are actually quite svelte - it's actually not a case of Apple "just throwing more transistors at it", sure they are doing that in the GPU but the CPU cores themselves are very area-efficient (again, even considering the node).

https://en.wikichip.org/wiki/File:kaby_lake_(dual_core)_(ann...

https://en.wikichip.org/wiki/File:kaby_lake_r_die_shot_(anno...

A Sierra Forest-style product with multiple chiplets full of nothing but e-cores would be a fantastic thing. I completely agree that Apple doesn't have any notable presence in server, but, you could make some real good products with the pieces Apple has already demonstrated.

I don't have an exact source, but I recall the Asahi folks saying that based on their reverse engineering, Ultra/2-chiplets isn't the limit, the architecture is laid out to go higher on chiplets (I want to say 4 or 8) and they just aren't exploiting it right now.


The big problem is that Apple just can't really work with Linux. They've tried offering server hardware in the past, but it's a raw deal for datacenters and users alike.

...but, if core size is your jam (for whatever reason), keep an eye on Nvidia's Grace CPU. It's their stab at a datacenter-scale ARM SOC, and it should be releasing before EOY. Then there's the Ampere offerings that already have acceleration for PyTorch, ONNX and Tensorflow, along with Graviton for general-purpose efficiency... there's a lot of low-profile ARM cores in the datacenter today.

A good start for Apple would be updating the rackmount Mac Pro with an 80 core Double Ultra chip, but even that feels fairly pedestrian next to the 144-core-complex Grace is teasing. I'm sure it sounds silly to the readers of this website, but I genuinely don't think Apple is up to the task of competing in the datacenter. Obviously so on the software side, but arguably not even on the hardware front either.


Apple's business is squarely with the frontend of computing: The desktops, laptops, phones, tablets, and even watches that your mom and pop, artists, designers, and engineers use in daily life.

I don't see what Apple stands to gain from getting into the enterprise market, other than simple diversification of their portfolio.


>I don't see what Apple stands to gain from getting into the enterprise market, other than simple diversification of their portfolio.

They'd stand to win a market they have no foothold in and profit.


Like I said, I can see diversification of portfolio as something to gain.

On the other hand, Apple's non-existence in the enterprise space isn't due to a lack of trying. They've been there and done that already.


Apple explicitly failed in the enterprise space because they didn't care. They built a very well-decorated walled-garden, but it's not what enterprise customers wanted. Apple wanted to sell UNIX, the customers wanted to buy servers. When the dust settled, Apple made no attempt to respond to customer demands. They smothered the product with a pillow and told their enterprise partners to pound sand or buy a Trash Can Mac.


Do you mean tried and failed?


> "Google has TPUs but have these even made a tiny dent in Nvidia's position?"

This seems unknowable without Google's internal data. The salient question is: "how many Nvidia GPUs would Google have bought if they didn't have TPUs?"

The answer is probably "a lot", but realistically we don't know how many TPUs are deployed internally and how many Nvidia GPUs it displaced.


Tesla and the Dojo architecture is another interesting one - that's another Jim Keller project and frankly Dojo may be a little underappreciated given how everything Keller touches tends to turn into gold.

https://www.nextplatform.com/2022/08/23/inside-teslas-innova...

Much like Google, I think Tesla realized this is a capability they need, and at the scales they expect to operate, it's cheaper than buying a whole bunch of NVIDIA product.


If I recall correctly, Tesla went with Dojo because the founder's vision went way beyond what NVIDIA offered at their going rate.


> I assume anything Apple is cooking is using Nvidia in the server room already

For training, sure. For inference, Apple has been in a solid competitive position since M1. LLaMa, Stable Diffusion, etc, can all run on consumer devices that my tech-illiterate parents might own.


LLaMa and Stable Diffusion will run on almost any device with 4gb of free memory.
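Back-of-envelope math behind that 4 GB figure, assuming a 7B-parameter model (the smallest LLaMA) with 4-bit quantized weights, and ignoring activation/KV-cache overhead:

```python
params = 7e9          # 7B-parameter model
bits_per_weight = 4   # a common llama.cpp-style quantization level
weight_bytes = params * bits_per_weight / 8

print(f"{weight_bytes / 2**30:.2f} GiB of weights")  # ~3.26 GiB, under 4 GB
```

At 4 bits per weight the model fits; at the original 16 bits it would need roughly 13 GiB, which is why quantization is what put these models in reach of ordinary consumer hardware.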


> AMD seems content to limit its ambitions to punching Intel

What's the deal with that anyway? A lot of people want a real alternative to Nvidia, and AMD just... Doesn't care?

I guess we'll have to wait for intel to release something like CUDA and then AMD will finally do something about the GPGPU demand.


I was wondering the same thing and thinking about it.

When AMD bought ATI they viewed the GPU as a potential differentiator on CPUs. They've invested a lot of effort into CPU-GPU fusion with their APU products. That has the potential to start paying off in a big way sometime - especially if they figure out how to fuse a high-end GPU and CPU and just offer a GPGPU chip to everyone. I can see why AMD might put their bets here.

But the trade off was that Nvidia put a lot of effort in doing linear algebra quickly and easily on their GPUs and AMD doesn't have a response to that. Especially since they probably strategised on BLAS on an APU. But it turns out there were a lot of benefits to fast BLAS and Nvidia is making all the money from that.

In short, Nvidia solved a simpler problem that turned out to be really valuable, it would take AMD a long time to organise to do the same thing and it may be a misfit in their strategy. Hence ROCm sucks and I'm not part of the machine learning revolution. :(


AMD's graphics R&D is driven by consoles - literally. Microsoft and Sony pay huge sums in early-stage R&D and they get to set the direction of the R&D as a result. RDNA was run explicitly from the start as a semi-custom project (much to the chagrin of Raja Koduri, as this was not his fief). So was RDNA2.

https://www.pcgamesn.com/amd-sony-ps5-navi-affected-vega

https://www.pcgamesn.com/amd/rdna-2-sony-ps5-gpu-pc

As such, if the console market doesn't want it, it doesn't get built. AMD is not willing to put its own money into graphics research.

AMD does not really have the marketshare to get the PC market to adopt AMD-backed features that use accelerators that aren't present in the consoles. If AMD takes 20% of the market in a given year, and the PC market turns over every 6 years, this hardware support would be present in 0-3% of the PC market and 0% of the console market. So even if RDNA3 had a magic "DLSS-level" improvement that relied on some unique new accelerator they'd added in RDNA3, it'd be an uphill fight to get it adopted. Nor is AMD going to spend the money to just implement a bunch of software features anyway - they only even invested in FSR2 after it became a competitive disadvantage for them not to have something.

They won't even go the 16-series vs 20-series route of having consoles be a basic architecture (with size-reduced implementations of features) and then full-size/higher-performance implementations on PC dGPUs with more full-fledged accelerators bolted on, etc. For example, they could have done this with the ML accelerators on RDNA3 - they have a slower (microcoded?) ML instruction in basic RDNA3, and they could have thrown a more full-fledged implementation into dGPU variants where there's more space to spare.

But it's just not worth spending on any of that for them - it's a lot of R&D for a fairly narrow slice of the market that would be impacted.

https://www.anandtech.com/show/13973/nvidia-gtx-1660-ti-revi...

So yeah I mean she's just not that into you. Consoles set the direction of their graphics R&D. They'll tap a few other lucrative markets like HPC but they're not going to make big spends that don't have obvious ROI involved, and AMD doesn't really have the PC-gaming marketshare to care about dGPUs as an independent market worthy of R&D.

People ask "why does Intel need anything except iGPUs" and for AMD the question is "why do they need anything except consoles". The rest is interesting in a "someday" sense and potentially strategically important, but day-to-day it's pretty obvious which verticals are bringing in the bacon.

And for NVIDIA that's both dGPUs and datacenter - they still make a lot of money from consumer gaming, and it gives a foothold for development to progress from curiosity to research project to business deployment. AI accelerators and CUDA being on consumer hardware has been a huge boon to R&D (contrast ROCm/HIP being essentially unusable outside enterprise hardware) and the commercial market has found uses for RT cores as well. Because NVIDIA had the realization, a lot of years ago, that they are in fact a software company, that writes the software that sells the hardware.

People mocked Jensen for that for a lot of years, but he was completely right and that's why he's succeeded while AMD has spun their wheels on GPGPU for 15 years now.

And the problem for AMD is, consoles won't pay for a 5% more expensive chip based on blue-sky prospects of something maybe being useful in 3+ years. Or at least not unless it gets an internal backer, like DirectStorage/RDMA obviously has been adopted despite an extremely slow burn on actual usage.

Optical Flow Accelerator is probably the most recent iteration of this - GCN actually had this capability as "Fluid Motion" accelerator but consoles wanted it taken back out, because it was wasted space. Now it's the underpinning of DLSS3 and likely future work in DLSS4 - the principles of "variable temporal+spatial rate shading" AMD outlines in their recent GTC presentation seem like an obvious "DLSS2 for DLSS3". I have also spoken about this idea before and I think that is where NVIDIA is going with DLSS4, but AMD has to do it without the hardware optical flow engine (except on older GCN cards ironically).

https://gpuopen.com/gdc-presentations/2023/GDC-2023-Temporal...


> I assume anything Apple is cooking is using Nvidia in the server room already

I wouldn't be so quick to assume this. Apple already ship ML-capable chips in consumer products, and they've designed and built revolutionary CPUs in modern time. I'm of course not sure about it, but I have a feeling they are gonna introduce something that kicks things up a notch on the ML side sooner or later; the foundation for doing something like that is already in place.


> Apple already ship ML-capable chips in consumer products, and they've designed and built revolutionary CPUs in modern time.

Has Nvidia not done that too? They shipped ML-capable consumer hardware before Apple, and have revolutionary SOCs of their own. On top of that, they have a working relationship with the server/datacenter market (something Apple burned) and a team of researchers that basically wrote the rulebook on modern text and image generation. Then you factor in CUDA's ubiquity - it runs in cars, your desktop, your server, your Nintendo Switch - Nvidia is terrifying right now.

If the rest of your argument is a feeling that Apple will turn the tables, I'm not sure I can entertain that polemic. Apple straight-up doesn't compete in the same market segment as Nvidia anymore. They cannot release something that seriously threatens their bottom line.


> They cannot release something that seriously threatens their bottom line.

If they manage to move a significant part of ML compute from datacenter to on-device, and if others follow, that might hurt Nvidia's bottom line. Big if at this point, but not unthinkable.


There are a lot of problems here though. First of all being that inferencing isn't hard to do - iPhones were capable of running LLMs before LLaMa and even before it was accelerated. Anyone can inference a model if they have enough memory, I think Nvidia is banking on that part.

Then there's the issue of model size. You can fit some pruned models on an iPhone, but it's safe to say the majority of research and development is going to happen on easily provisionable hardware running something standard like Linux or FreeBSD.

And all this is ignoring the little things, too; training will still happen in-server, and the CDN required to distribute these models to a hundred million iPhone users is not priced attractively. I stand by what I said - Apple forced themselves into a different lane, and now Nvidia is taking advantage of it. Unless they intend to reverse their stance on FOSS and patch up their burned bridges with the community, Apple will get booted out of the datacenter like they did with Xserve.

I'm not against a decent Nvidia competitor (AMD is amazing) but the game is on lock right now. It would take a fundamental shift in computing to unseat them, and AI is the shift Nvidia's prepared for.


Why wouldn't they build a relatively small cluster for training tasks using Nvidia hardware? It's simply the industry standard, every researcher is familiar with it, and writing a custom back-end for PyTorch that scales to hundreds of nodes is no small task.

I doubt Apple cares about spending a few hundred million dollars on A100s as long as they make sure the resulting models run on billions of Apple silicon chips.


Apple has no present experience in building big servers (they had experience at one point, but all those people surely moved on)

Mac Minis don't count

Sure, they are super rich and could just buy their way into the space...but so far they are really far behind in all things AI with Siri being a punchline at this point

if anything, Apple proves that money alone isn't enough


I'm no Apple fanboy (closer to the opposite), so it pains me a bit to say, but they have a proven track record of having zero experience in something, then releasing something really good in that industry.

The iPhone was their first phone, and it really kicked the smartphone race into high gear. Same for the Apple Silicon processors. And those are just two relatively recent examples.


To be fair, Apple released their iPhone after building iPods for 6 years. So, it's not like they had zero experience with handheld devices at the time.

Also, while Apple did create their first chip (at least of their current families) in 2007, they did acquire 150 or so engineers when they bought PA Semi in 2008. So, that gave them a leg up compared to building a chip team completely from scratch.


Right, the reason they have a habit of releasing industry-redefining first products in a category is that they do the hard work to make it happen. It's not by accident.


The iPhone had a lot of prehistory at Apple, from the Newton to the iPod. Apple Silicon also has a long history, starting with humble beginnings as the Apple A4 in 2010, which relied on Samsung's Hummingbird for the CPU and PowerVR for the GPU (plus they acquired PA Semi in 2008).

So both are not very good examples, because they built up experience over long periods.


> So both are not very good examples, because they built up experience over long periods.

They are examples of something they could similarly do with the Apple Neural Engine, but at a bigger scale, in the future. They have experience deploying it at a smaller scale/in different versions; they would just have to apply it at a bigger scale in order to be able to compete with NVIDIA.


Apple's money can buy the relationships to get their ducks in a row, but true genius is willing to work for $1 a year and be rewarded with the upside when it comes; see SJ.


I assumed their server experience is still working in the iCloud division.


Nvidia is part of the reason why it happened and why it happened now; they didn't get lucky by any stretch of the imagination with respect to AI.


Is Nvidia taking advantage of the crypto/AI trends, or enabling them?


They're selling shovels in a gold rush; that's the best business to be in.


A better analogy is that they are selling hammers, and everything is looking like nails.


They nerfed hash rate on their cards multiple times


And drug manufacturers spend a lot of money to nerf their medications to make them harder to inject. Not all customers are good customers.


Don't forget Covid, when everyone and their grandmother was gaming. There was a 6-month wait for the RTX 3080. Although now that I think of it, some of that can be attributed to the disruption in the supply chain.


I mean NVIDIA designed the hardware and software for massively parallel computation. That's not really luck.


NVIDIA's top management may have had a vision, but in the coming years competition will rise rapidly in the AI/ML space, as AMD+Xilinx, Intel, and potential dedicated AI hardware startups could become a big threat to Nvidia's dominance. Nvidia has just a GPU, which they fully control from the architecture to the whole software ecosystem built around it, but AMD has complete ground-up control over all the components (x86 CPU, GPU, Xilinx FPGA) and could be a big challenger to NV.


AI (on GPUs) has been ramping up for more than the last 10 years; crypto was a fly-by-night thing in comparison.


GPU mining was only really a thing in the cryptocurrency space for a few years, maybe 2011-2013. Miners moved on to FPGAs for most things, and then ASICs shortly after. There were a few mildly GPU friendly cryptocurrencies, but the vast majority of GPU power used in the past decade has been for deep learning, which really took off in 2008, before Bitcoin was a thing.


The paper came out in 2009, then another breakthrough in 2010, but I don’t think it got to production until 2011 or so.


I wouldn’t call it luck, just a great product that’s enabling societal change


NVidia got lucky, at least in the long term. They spent years perfecting SIMD to make spaceships blow up and, by coincidence, that's the same technology that enabled coin mining and then deep learning.


Never mind TSMC…


I find it funny that a global trillion-dollar economy counts as being "wound down" to some people. The total economic value of the cryptocurrency space at the moment sits between the GDP of Russia (1.06 trillion) and Switzerland (1.26 trillion). Yes, that's a lot lower than it was, but still a virtual global superpower.

You're not wrong, it's just a fun sign of how far the cryptocurrency space has come.


The relevant fact is that GPU mining is dead, not how big crypto is overall.


Nvidia seems in a really great position. They are to AI what Intel was to the PC (and unlike the PC era there is not a single Microsoft here who controls the entire ecosystem). CUDA still has no alternative. Yes Google has TPUs but outside of Google NVIDIA still dominates and enjoys the network effects (support in all kinds of libs and framework). They face the same problems that Intel faced, as in the market just wants the software and if the software works the same, the hardware is replaceable. It will be interesting to see how they adapt.


Interesting comparison with Intel in the PC market:

> the market just wants the software and if the software works the same, the hardware is replaceable

And likely to be true, given how much competition is heating up in the AI hardware space. Granted, many of these competitors, and startups especially, have existed for years and haven’t made much of a dent. Even Google’s TPU doesn’t seem that much better than Nvidia’s stuff, based on their limited MLPerf score releases. Maybe this “iPhone moment” for AI will change that and force competitors to finally put some real effort into it.

As for Nvidia, looks like they are trying to adapt by selling their own enterprise software solutions such as Omniverse and their AI model customization stuff. Will be interesting to see if they can transform into more of a software solutions provider going forward.


AMD would have been a natural competitor, but they seem to be slow to catch up.

Also segmenting gaming GPUs from AI GPUs is just not the best idea: devs don't want to buy a gaming GPU and an AI GPU separately when they can just buy a stronger NVIDIA GPU that does both.


Lots of great quotes like:

  "Inference will be the way software is operated in the future. Inference is simply a piece of software that was written by a computer instead of a piece of software that was written by a human and every computer will just run inference someday."


You can argue compilers have been doing that. And JIT was the next step to enable runtime rewriting of software. AI directed JIT or compilers probably will be next.


Semiotics Lisp Machine Software 2.0: make it happen.


The piece of software that was written by a computer (by an ML algorithm, to be precise) is called a "model". "Inference" is the execution of this software.
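In concrete terms (a minimal hand-rolled sketch, not any particular framework's API, with made-up weights rather than a trained model): the "model" is just data — learned parameters — and "inference" is the fixed program that executes them.

```python
# The "software written by a computer": learned parameters.
# These values are illustrative, hand-picked, not trained.
weights = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.2]

def inference(x):
    """Execute the model: a fixed computation parameterized by the weights."""
    out = []
    for row, b in zip(weights, bias):
        s = sum(w * xi for w, xi in zip(row, x)) + b
        out.append(max(s, 0.0))  # linear layer followed by a ReLU
    return out

print(inference([1.0, 2.0]))
```

Swap in different weights and the "software" behaves completely differently, even though the inference code never changes — which is the sense in which the model, not the human-written runner, is the program.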


"It's just an auto complete"


We're all just auto-completes.
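Mechanically, "autocomplete" here means next-token prediction: given the text so far, emit a likely continuation. A toy bigram version makes the idea concrete (a deliberately tiny sketch — a real LLM learns from vastly more text with a vastly richer model):

```python
from collections import Counter

# Toy "training corpus"; the bigram counts below are the entire model.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which word.
follows = {}
for prev, nxt in zip(corpus, corpus[1:]):
    follows.setdefault(prev, Counter())[nxt] += 1

def autocomplete(word):
    """Predict the word most often seen after `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(autocomplete("the"))  # "cat" appears after "the" most often
```

The punchline of the thread's quip is that scaling this idea up — predict the next token, over and over, from enormous amounts of text — is what produces the behavior people are arguing about.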


Isn't this proof that the CEO doesn't understand the tech? Or am I misunderstanding something?


He's simplifying for the press.


He's the CEO messenger.


LOL.


Jensen Huang is awesome and in lots of ways underrated compared to a lot of his contemporaries. Per Wikipedia, he founded the company on his 30th birthday in February 1993. That means he's spent more of his life running Nvidia than not. What a stellar run.

The two Acquired episodes about Nvidia are fantastic and worth listening to:

Nvidia: The GPU Company (1993-2006): https://www.acquired.fm/episodes/nvidia-the-gpu-company-1993...

Nvidia: The Machine Learning Company (2006-2022): https://www.acquired.fm/episodes/nvidia-the-machine-learning...


On Nvidia seemingly being in perfect position to ride successive hype cycles of technology: their product category is one of the few types of hardware with significant technical advancement in the last decade, making it only natural that people would seek to leverage that category of hardware with new software efforts. So it's less that they have been brilliant strategically and more that they have been brilliant technically, and that has been creating the market, so to speak. What is not clear is whether, as an industry, we are constrained by this technology into seeking solutions that can be massively parallelized.

More simply, when every leap in compute is a massively parallel architecture, every problem seems like it needs to be solved by a massively parallel system. But I'm guessing before this cycle is over we start to see the limitations of that.


This is a lot to read. It says there is a podcast link in the email but I don't see a link. Do you need to be a subscriber?


Why is this text though? Is there video?


Looks like there’s a paid podcast available. I’m sure there’s some TTS apps that could take in the text and make it consumable that way, though.


NaturalReaders is a good one


What’s wrong with text?


You can't easily listen to text while doing something else like cooking or laundry. Text-to-speech is possible, but not amazing.



