
Or maybe use the best of both worlds: soldered-in ultra-fast RAM plus a large amount of DIMM RAM.

Same as you can have a large storage drive and a smaller NVMe.



> Or maybe use the best of both worlds, with soldered-in ultra fast ram

That's basically what L3 cache is on Intel and AMD's existing CPUs. You could add an L4, but at some point the number of cache levels you go through itself becomes a bottleneck, along with being a bit ridiculous.


The way I see it, you could have a Mac Pro with (let’s say) 32GB of super-fast on-package RAM and arbitrarily upgradable DIMM slots. The consequence would be that some RAM would be faster and some would be a bit slower.

They would be contiguous address ranges, not layered like caches.
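A minimal sketch of what that allocation policy could look like (the names and page counts here are made up for illustration, not any real OS API): new pages land in the fast on-package pool first and spill over to the slower DIMM pool once the fast pool is full.

```python
# Hypothetical two-tier page pool: prefer fast on-package RAM,
# fall back to the upgradable (slower) DIMM pool when it's full.

FAST_PAGES = 4   # stand-in for the 32GB on-package pool
SLOW_PAGES = 8   # stand-in for the DIMM pool

class TieredPool:
    def __init__(self, fast_pages, slow_pages):
        self.free = {"fast": fast_pages, "slow": slow_pages}

    def alloc(self):
        """Return the tier a new page lands in, preferring fast RAM."""
        for tier in ("fast", "slow"):
            if self.free[tier] > 0:
                self.free[tier] -= 1
                return tier
        raise MemoryError("both tiers exhausted")

pool = TieredPool(FAST_PAGES, SLOW_PAGES)
placements = [pool.alloc() for _ in range(6)]
print(placements)  # first four pages go fast, the rest spill to slow
```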


The non-uniform memory performance of such a solution would be a software nightmare.


Doesn't seem much different from various multichip or multisocket solutions where different parts of memory have different latencies, i.e. NUMA. Basically the OS keeps track of how busy pages are and rebalances things when heavily used pages are placed poorly.

Similarly, Optane (in DIMM form) is basically slow memory, and OSes seem to handle it fine. NUMA support seems pretty mature today and handles common use cases well.

With all that said, Apple could just add a second CPU to double the RAM and cores; that seems like a great fit for a Mac Pro.
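The rebalancing idea above can be sketched roughly like this (a toy model, not how any kernel actually implements it): count accesses per page, then periodically promote the hottest pages into the limited fast tier and leave the colder ones in the slow tier.

```python
# Toy model of access-frequency-based page placement: rank pages by
# how often they were touched, keep the hottest ones in the fast tier.

from collections import Counter

FAST_CAPACITY = 2  # how many pages fit in the fast tier

def rebalance(access_counts, fast_capacity=FAST_CAPACITY):
    """Return (fast_pages, slow_pages) given per-page access counts."""
    ranked = [page for page, _ in Counter(access_counts).most_common()]
    fast = set(ranked[:fast_capacity])
    slow = set(ranked[fast_capacity:])
    return fast, slow

# Simulated access trace: pages "a" and "c" are hot, "b" and "d" cold.
trace = ["a", "c", "a", "b", "c", "a", "d", "c"]
fast, slow = rebalance(Counter(trace))
print(fast, slow)  # hot pages end up in the fast tier
```

A real OS does this incrementally (sampling access bits and migrating a few pages at a time) rather than re-ranking everything, but the policy is the same shape.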


It doesn't seem any worse than existing NUMA systems today, where memory latency depends on what core you're running on. In contrast, the proposed system would have the same performance for on-board vs plugged DIMM regardless of which CPU is accessing it, which simplifies scheduling — from a scheduling perspective, it's all the same. I think that's easier to work with than e.g. Zen1 NUMA systems.


OSes have had this problem solved for decades; the solution is called "swap files". You could naively get any current OS working in a system with fast and slow RAM by simply creating a ramdisk on the slow memory address block and telling the OS to create a swap file there.


> OSes have had this problem solved for decades; the solution is called "swap files".

What operating systems handle NUMA memory through swapping? The only one I'm familiar with doesn't use a swapping design for NUMA systems, so I'm curious to learn more.


Swap isn't really the best approach for the kind of speed baselines and differences discussed here. There are better ideas, like putting GPU memory in the fast region first and everything else in the slow region. You know, like the Xbox Series consoles do.


Yet Apple is managing excellent performance with just an L1 + L2.


But the context of this thread is that it is being done with soldered RAM. I don't know how much that matters, just pointing out that you are taking the conversation in a circle.



