Interesting point -- I've never written a scheduler. Is it really true that writing a scheduler for the multi-socket, multi-core case is that much more complicated than focusing on single-socket, multi-core? Seems possible to me. However, doesn't the existence of hyperthreading contradict that? Even for the single-socket, multi-core case, you need to account for the fact that the logical processors are not all equivalent: two hyperthreads on the same core share that core's execution resources.
I don't think it's a matter of difficulty so much as design tradeoffs. For a single socket you can probably get away with a single runqueue; there will be contention, but it won't be that expensive. For multi-socket you want multiple runqueues; this scales better, but you have to load-balance threads between queues.
And there are affinity issues to consider as well.
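To make the tradeoff a bit more concrete, here is a toy sketch in Python of the multiple-runqueue idea: one queue per socket, plus a balancer that moves tasks from the busiest queue to the idlest while respecting affinity. Everything in it (Task, RunQueue, balance, the "allowed" set) is a made-up name for illustration, and it is nothing like what a real kernel scheduler does; there is no locking, no priorities, and no cache awareness.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    # Hypothetical affinity set: indices of the runqueues this task may run on.
    allowed: frozenset


@dataclass
class RunQueue:
    tasks: deque = field(default_factory=deque)


def balance(queues):
    """Move tasks from the busiest queue to the idlest one.

    Stops when the two queue lengths differ by at most one, or when no task
    on the busiest queue is allowed to run on the idlest queue.
    Returns the number of migrations performed.
    """
    moved = 0
    while True:
        busiest = max(range(len(queues)), key=lambda i: len(queues[i].tasks))
        idlest = min(range(len(queues)), key=lambda i: len(queues[i].tasks))
        if len(queues[busiest].tasks) - len(queues[idlest].tasks) <= 1:
            return moved
        # Pick the first task whose affinity permits the idlest queue.
        candidate = next(
            (t for t in queues[busiest].tasks if idlest in t.allowed), None
        )
        if candidate is None:
            return moved  # affinity blocks any further migration
        queues[busiest].tasks.remove(candidate)
        queues[idlest].tasks.append(candidate)
        moved += 1


def demo():
    both = frozenset({0, 1})
    pinned = frozenset({0})  # task "b" may only run on queue 0
    q0 = RunQueue(deque([
        Task("a", both), Task("b", pinned), Task("c", both), Task("d", both),
    ]))
    q1 = RunQueue()
    moved = balance([q0, q1])
    return moved, [t.name for t in q0.tasks], [t.name for t in q1.tasks]
```

Even in this toy version you can see where the extra work comes from: the balancer has to scan the queues, and the pinned task constrains which migrations are legal. With a single shared runqueue none of that code exists, but every CPU would be contending for the one queue.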
Writing a scheduler is not for the faint of heart.
What bothers me about this whole affair is that the egos of both parties seem to preclude them from working together to get to a 'best of breed' situation.