Exactly. CUDA is a huge moat, and all competitors must adopt a SOFTWARE-first approach similar to what tinycorp is trying to do.
Find one single thing that makes CUDA bad to use and TRIPLE DOWN on that.
I don't know. It seems like Cursor does the trick for me. I wrote 4k+ LoC of Vulkan API-heavy video decoding code for Linux ARM64. The key thing was not to YOLO but to specify the context carefully, down to tiny details.