Ingo Molnar: BFS vs. mainline scheduler benchmarks and measurements

psranga · on Sept 8, 2009

I would LOVE to use a scheduler tuned for my hardware if outperformed the generic scheduler. I'm not a kernel developer, but I can imagine how the mainstream single-socket multicore case may be quite different from the multi-socket multicore case.

As I software engineer, I realize that making different schedulers for different architectures is "inefficient" (unless development/test resources are increased). But if somebody is willing to put in the time and effort to make me a scheduler that works best on my hardware, I'll gladly take it.

I have to agree with Con's remarks here. Seems like an underhanded move by Ingo to give this thing a bad reputation (and hence kill developer and beta tester interest) before it comes out. And judging by the comments here, he may have succeeded. Looks like open source has lots of politics too. :)

dfranke · on Sept 8, 2009

I don't think I quite understand what misconception people hold which leads them to believe that any environment in which two or more human beings are in social contact could ever be free of politics.

dfranke · on Sept 7, 2009

I think CK's reply just put me off for good from taking the time to try out his patches:

/me sees Ingo run off to find the right combination of hardware and benchmark to prove his point.

[snip lots of bullshit meaningless benchmarks showing how great cfs is and/or how bad bfs is, along with telling people they should use these artificial benchmarks to determine how good it is, demonstrating yet again why benchmarks fail the desktop]

So, to paraphrase: BFS is vastly superior, but not in any way that you can actually measure.

Con made a valuable contribution back when the choice was between staircase-deadline and the old mainline O(1) scheduler, but he's since turned himself into a crank.

tsally · on Sept 8, 2009

So, to paraphrase: BFS is vastly superior, but not in any way that you can actually measure.

From my understanding, your interpretation is not correct. BFS is about latency, not throughput. Ingo chose all the wrong benchmarks.

Also, it's a dick move by Ingo. CK hasn't even made a public announcement about BFS yet.

Finally, you should use code that has technical merit, regardless of the personality behind the writer. I imagine the kernel you're running on has code in it written by some pretty unfriendly people.

dfranke · on Sept 8, 2009

From my understanding, your interpretation is not correct. BFS is about latency, not throughput. Ingo chose all the wrong benchmarks.

Then how about CK showing up the right ones?

Also, it's a dick move by Ingo. CK hasn't even made a public announcement about BFS yet.

He announced it publicly enough that I knew about it, and I haven't followed this stuff in years. Given that, I hardly think that commenting on his efforts on LKML constitutes a premature "outing".

Finally, you should use code that has technical merit, regardless of the personality behind the writer. I imagine the kernel you're running on has code in it written by some pretty unfriendly people.

Indeed: one of them murdered his wife. But you brought up personalities, not me. The part of CK's post that I quoted is rather dickish, but I deliberately avoided commenting on that.

tsally · on Sept 8, 2009

Then how about CK showing up the right ones?

Why? It's not as if CK is trying to get it into the mainline. I think the only reason why CK got angry is that Ingo's benchmarks appear to be deliberately chosen to make BFS look bad. Anyone who took the time to read the documentation would realize that testing throughput is not correct. That'd be like benchmarking Python and C using matrix multiplication and concluding that Python sucks. Two different tools for different purposes.

He announced it publicly enough that I knew about it, and I haven't followed this stuff in years. Given that, I hardly think that commenting on his efforts on LKML constitutes a premature "outing.".

I think the announcement was mostly the work of Reddit et al., but fair enough.

Indeed: one of them murdered his wife. But you brought up personalities, not me.

Reiser was a crazy one. ;) I actually only brought up personalities because I was under the impression that you wouldn't try CK's patch because of his personality. I say if there's interesting code out there, it shouldn't go to waste because of the quirks of the author. I apologize if I misinterpreted.

dfranke · on Sept 8, 2009

Why? It's not as if CK is trying to get it into the mainline.

What does one have to do with the other? If I'm going to spend the necessary 20 minutes to compile a new kernel using CK's patch, I want some evidence that I'm going to be gaining something.

antonovka · on Sept 8, 2009

What does one have to do with the other? If I'm going to spend the necessary 20 minutes to compile a new kernel using CK's patch, I want some evidence that I'm going to be gaining something.

Then perhaps you should pay Con Kolivas to prove the value of his freely provided no-warranty work-in-progress code for you.

dfranke · on Sept 8, 2009

Nothing is ever free unless your time has no value. I've published and contributed to a great deal of open source software, much of it successful. I've been among these communities for a very long time and understand the social expectations. One of them is this: if you want anyone to care about your work, then you'd better be prepared to defend it. I would find any contrary expectation to be unbearably narcissistic, and, whatever his other failings, I think CK would agree with me on this. Think of it this way: if you published an essay containing controversial opinions, would you demand that it be immune to criticism on the basis that you're giving it away for free? I don't think software should be considered different in any case, and certainly not in this one given that CK is representing his work as an improvement over someone else's.

antonovka · on Sept 8, 2009

I've published and contributed to a great deal of open source software, much of it successful.

I have also published a great deal of open source software, much of it successful, and some of it almost universally used. Unfortunately, that does absolutely nothing to bolster either of our arguments.

If you published an essay containing controversial opinions, would you demand that it be immune to criticism on the basis that you're giving it away for free

You asked for: "I want some evidence that I'm going to be gaining something".

If we try to apply that to your analogy, why is it the author's responsibility to not only produce the work, document the work, but then also supply their own literary criticism so that you can decide if the work is worthwhile, all before the essay is completed and ready for publication?

Open source developers regularly produce software just because they want to, and feel no impetus or desire to convince you of anything. That's not, as you put it, some sort of "narcissistic" behavior, but rather a desire to simply share without additional burden or demands by others.

scott_s · on Sept 8, 2009

For what it's worth, the kind of analysis dfranke expects is required in the academic community. If you provide no evidence that what you've created is better than the status-quo, then no one will care. I don't consider this an "additional burden."

tsally · on Sept 8, 2009

What does one have to do with the other? If I'm going to spend the necessary 20 minutes to compile a new kernel using CK's patch, I want some evidence that I'm going to be gaining something.

Ok, I can accept that. For me personally though, the quality of -ck patchset is enough for me to give CK's new stuff a shot.

rbanffy · on Sept 8, 2009

Personality can be a huge problem because the kernel and its parts are a long term project and I would prefer people that can work together for its entire duration.

Accusations like "/me sees Ingo run off to find the right combination of hardware and benchmark to prove his point." is far from the level of civility I would entrust my favorite OS's kernel. The proper response would be something in the line "it will suck on this benchmarks, but it will excell in these others and in these hardware configurations" and show numbers. It's always possible to act in a civilized way.

That said, CK seems only a bit off when compared to a couple others. There are a lot of people who like to be quite caustic from time to time. I must confess I have to refrain myself frequently and HN is not LKML.

And, more important: this whole discussion is more or less pointless. It doesn't matter how good a modern scheduler is, all of them impose a very little penalty on a modern kernel. Are we really going to fight this bitterly over a couple percentage points?

Would that be so hard as to include a "-scheduler=" switch in the kernel for it to be selectable at boot time (and so I can pick one that fits both my netbook and my SGI 4096 CPU monster when I want to)? This would pretty much end this discussion.

blasdel · on Sept 8, 2009

All good kernel micro-benchmarks will involve a ton of volume, and thus be exercising throughput. All the micro-benchmarks Ingo used are extremely latency-focused: a few processes blocking on Pipe I/O or messaging as frequently as possible -- the only subject of the test is how fast the scheduler can toggle between the processes.

BFS was dramatically worse across the board in all the benchmarks, except curiously when the # of processes == # of logical processors -- to me that's evidence of extreme naivete in its design.

wmf · on Sept 8, 2009

It's not that Ingo chose the wrong benchmarks. The right benchmarks don't exist, and we're not even sure how to build them. Unfortunately, building a scheduler optimized for "unbenchmarkable" use cases is a recipe for misunderstanding.

tsally · on Sept 8, 2009

The right benchmarks for the stated goals of BFS do exist. BFS is hardly "unbemchmarkable"; the only thing being misunderstood here is the goal of BFS.

spamizbad · on Sept 8, 2009

Yeah, CK came across as a major dick, but...

Ingo benched this on a two-socket, 8-CPU system. CK made it blatantly obvious he was targeting single-socket multi-core systems, or UMA SMP systems. Further, he's seems pretty adamant about not tweaking the scheduler for anything beyond single-socket multicore or UMA SMP, to say nothing of more exotic CPU/Memory architectures.

neilc · on Sept 8, 2009

Building a scheduler that doesn't scale to an 8-CPU system seems like madness to me, given the very predictable trends in desktop hardware.

wmf · on Sept 8, 2009

The cost of bouncing a cache line depends less on the number of cores and more on the number of sockets. Since the future of desktops is clearly 1-socket machines, BFS is optimized for that and it's not representative to test with a 2-socket machine. IOW, today's expensive 8-core 2S workstation is not a good proxy for tomorrow's cheap 8-core 1S desktop.

neilc · on Sept 8, 2009

Interesting point -- I've never written a scheduler. Is it really true that writing a scheduler for the multi-socket, multi-core case is that much more complicated than focusing on single-socket, multi-core? Seems possible to me. However, doesn't the existence of hyperthreading contradict that? Even for the single-socket, multi-core case, you need to account for the fact that the virtual processors are not identical.

wmf · on Sept 8, 2009

I don't know that it's a matter of difficulty, just design tradeoffs. For a single socket you can probably get away with a single runqueue; there will be contention but it won't be that expensive. For multi-socket you want to have multiple runqueues; this scales better but you have to load balance threads between queues.

jacquesm · on Sept 8, 2009

And there are affinity issues to consider as well.

Writing a scheduler is not for the faint of heart.

What bothers me about this whole affair is that the egos of both parties seem to preclude them from working together to get to a 'best of breed' situation.

Both Con & Ingo have a huge NIH problem.

joeyo · on Sept 8, 2009

  > Building a scheduler that doesn't scale to an 8-CPU
  > system seems like madness to me, given the very
  > predictable trends in desktop hardware.

Even if that is true, it might make sense in other settings. On linux powered cell phones, for example.

nailer · on Sept 8, 2009

Yep. http://twitter.com/cyanogen

blasdel · on Sept 8, 2009

http://thread.gmane.org/gmane.linux.kernel/886535

He did it again on a single-socket 4-CPU system with HT. He got relatively the same results, including the curious marginal win for BFS on kbuild -jN where N == the # of logical processors.

dfranke · on Sept 8, 2009

> two-socket, 8-CPU system

Which is exactly the kind of machine that I'm sitting in front of at this very moment and have owned for more than two years. I realize this puts me on the leading edge, but any scheduler that can't cope with this in a reasonable manner is a bit of a joke.

skorgu · on Sept 8, 2009

"Can't cope" may be overstating it, I parsed it as "not optimized for" which sounds fair.

blasdel · on Sept 7, 2009

The parallel LWN thread: http://lwn.net/Articles/351058/

nitrogen · on Sept 8, 2009

A lot of discussion on LWN and on the LKML seems to focus on the "unbenchmarkability" of UI interactiveness. This is just a guess, but maybe Con's background as an anesthesiologist (rather than a programmer) causes him to understand feelings better than measurements.

If I were to assign blame for the latency problems I experience, I would place it on the 100-call-deep stacks that are used to deliver events in Gtk+ and Qt, and on the binary Nvidia driver's lack of good 2D+compositing acceleration.

rbanffy · on Sept 8, 2009

"Con's background as an anesthesiologist (rather than a programmer) causes him to understand feelings better than measurements"

I would be pretty worried with an anesthesiologist that disregarded his measurements... Either you can measure or it doesn't exist. If it exists and you don't measure, you'd better write something to measure it.

jacquesm · on Sept 8, 2009

If there is ONE job where your ability to measure means the difference between life and death than it is being an anesthesiologist.

skorgu · on Sept 8, 2009

I'm not going to try to follow LKML just for this but if anyone at HN has a benchmark they'd like me to try out on 2.6.30 Arch vs 2.6.30-bfs on a quad-core Xeon I'll give it a shot.

illumen · on Sept 8, 2009

Run a compile job, or a video encode... then try web browsing.

Perhaps you could time clicks to open new tabs in your browser. Or switch virtual spaces.

With the background work, you should notice no or very little difference. Since the interactive work should get highest priority(this is what you are aiming for).

You can try these tests without bfs patches too. You can see how the default kernel/X11/driver is not optimal - and how things can be improved.

It's still much nicer than OSX, and windows however.

skorgu · on Sept 8, 2009

Yeah, I've done the standard "nice make -j 4 bzImage" and noticed no detectable slowdown in normal browsing tasks, window movement, chromium/flashplayer etc. It does feel snappier as CK was going for but since I know I'm running bfs I'm somewhat afraid of confirmation bias.

peregrine · on Sept 8, 2009

On a related thread http://news.ycombinator.com/item?id=810833

BFS had been ported to one of the Custom Roms for Android and has already shown increased speed improvements.