Why is reading lines from stdin much slower in C++ than Python? (stackoverflow.com)
93 points by luu on Jan 13, 2014 | hide | past | favorite | 34 comments


About 15 or more years ago, I wrote all my small utility programs in C. These were typically small programs, but I often had a lot of (text) input data in the form of netlists. I started reading about C++ and getline() and some of the containers (that I'd had to build from scratch in C), so I decided that C++ was for me. There were a number of disappointments, but a big one was that C++'s getline() was more than an order of magnitude slower than fgets() (on my system, etc., etc.).

With some experimenting, I discovered that even Perl was much faster than C++ with getline(). (Note that this was input of a file, not stdin as in this article.)

I've not used getline() since.


getline() works for (single) character delimited reads and lets you use a std::string. The latter is reason enough to use it: it means you don't have to worry about manual dynamic memory allocation and have zero risk of overflows.

In any case, the C++ equivalent to fgets is sgetn on the underlying streambuf or filebuf - http://en.cppreference.com/w/cpp/io/basic_streambuf/sgetn . I'd use this before I used fgets, if for no other reason than I then don't have to remember to call fclose()

The fact that you haven't used one of the most trivial functions in the standard library because of a slight performance penalty 15 years ago is worrying.


What are you worried about? An order of magnitude is not a slight performance penalty. The C++ standard library is vast and we've had C++98, C++03 and C++11 in the 15 or so years. There are few enough parts of it that you'll revisit regularly. You can't go around expecting people to be experts in the minutiae of all of it.


If both the C++ and the Perl code had the same performance I'd stick with the Perl version.


Slight? It would turn a 10-min run into over 2 hours.


Well, it's not wise to guess about why you experienced this problem. A proper investigation would require a test case that reproduces it.

My hunch would be that you were discarding the std::string between lines (thereby not reusing its buffer and causing an additional memory allocation for each line)... but I'm not sure even that would cause a 12-fold slowdown.


Nope. I doubt I did that, but even if I did, it couldn't explain the slowdown. Sometimes I needed to hold the netlist in memory (as a graph), and thus allocated space for every line anyways. (Actually, up to five times for each line: data structure, device name, source name, drain name, gate name.) Unless I ran out of memory, this never caused a significant slowdown.


Fun fact: the solution of using `cin.sync_with_stdio(false);` introduces a fairly unimportant memory leak that you'll see when you use Valgrind. The behavior was reported as a bug (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27931) but it's actually part of the C++ standard not to clean up standard streams. (Edit: Here's the page of the standard in question, see the end of the second paragraph of 27.3: http://imgur.com/P7wYcHn)


A "leak" normally implies something that will get worse: like, as you use this stream, the stream will slowly lose memory or something. The standard is simply saying that when the program exits the stream objects themselves--and therefore the buffers they use--will not be destroyed: but of course closing the program will destroy them... the developer who closed the bug even put "leak" in quotes as this is more an artifact of valgrind's definition of "leak" than even an "unimportant" issue in the behavior. (I want to clear this up, as I can imagine people reading what you are saying and then avoiding this feature as they are concerned that their extremely-long running program will eventually exhaust memory.)


You're right, of course. My original comment did put quotes around 'leak', but I figured someone would come along and argue a leak is a leak even if it's not going to crash your program. Oh well. :)


This isn't a "leak": a leak is a leak, and this isn't a leak... this is just something Valgrind is reporting because it could be a leak due to exhibiting a pattern indistinguishable from something that actually was leaked from the perspective of a tool that only is able to work from a dynamically maintained list of heap allocations and the event "program has now terminated". Saying this is a leak is somewhat equivalent to claiming that because the executable code itself wasn't deallocated by the program before it exited the code was "leaked": it just so happens that as an implementation detail this data structure is lazily dynamically allocated, but it should be considered a static part of the program. In a world where things Valgrind reported were "leaks" as opposed to simply "outstanding heap allocations" then I'd be happier claiming this was a bug in Valgrind than claiming that this feature of the language standard should have the word "leak" used to describe it ;P.


tl;dr: C++'s standard streams (cin, cout, cerr, clog) may be used alongside code using the underlying libc I/O streams API and therefore, by default, synchronize with libc's own buffers. At the very least, this means you don't end up with output whose characters are interleaved between individual stream operations from the two sources.

It's worth noting that even C's I/O APIs are typically synchronized across threads, so you can output lines to stdout without hitting the same issue in vanilla threaded C code.


I think you can unlink it:

    std::ios_base::sync_with_stdio(false);


That's what they did to fix it.


This is an old discussion, but I actually have this thread bookmarked so I can show young programmers the dangers of assuming that if you write in C++ it will be faster than everything else out there.


Seems like an odd conclusion to reach given that the end result was C++ being faster.

Nobody ever claimed that C++ didn't require more knowledge to use effectively.


Actually C was the fastest, by a fairly wide margin, but that doesn't matter much, since you can make the C++ faster and you can make C go just as slow. There are still lots of people who think that speed is solely a function of programming language, with very little impact from program design. Like I said, I like to bring this out when showing newer developers that no matter what language you pick, you need to be aware of what is going on in the background, and most importantly you should test your code before assuming which parts are fast and which parts are slow.


Yes indeed, and many don't realize that C streams are also buffered and can also be tuned:

http://www.gnu.org/software/libc/manual/html_node/Buffering-...

http://www.gnu.org/software/libc/manual/html_node/Controllin...

there are also threading issues:

http://www.gnu.org/software/libc/manual/html_node/Streams-an...

glibc even lets you implement custom streams (FILE* handles that will work with fgets etc. but call your own source and sink to do i/o)


Yes, but it's generally expected that if you do things the "usual way" in both Python and in C++, the C++ will always be faster.


Maybe the sensible takeaway is that if you can do a prototype in a simpler language first to establish a baseline, you should. I think most people expect that they can, due to their own ignorance, make a slow program even in C++.


You're right, it is an odd conclusion, but lots of folks think that way. That's why Negitivefrags needs to dissuade inexperienced folks of that notion.


I (re)learned the same lesson last year when I started looking at OpenCL. I saw all these awesome benchmarks claiming it was 40-50 times faster than C. I then did a bunch of simple tutorials and tests and got great performance. Then I rewrote a piece of performance-critical C code I had, and after a couple of days of hard work I had something that ran 5 times slower than my original C code.


tl;dr: when performance is important, don't just use defaults; optimize. And learn how libraries and your OS work deep down at low levels. Even Python's default I/O performance can be improved in many cases, by changing buffer sizes or even bypassing the file I/O subsystem and using memory-mapped files. But no solution is right for all use cases.

Like they tell you in school, premature optimization is the root of all evil. So don't worry about this until you need it.


Thought I'd throw a point(er) alongside these comments...

Isn't the Python interpreter written in C?


The CPython interpreter, aptly named, is written in C. There are other implementations written in Java, etc.


Straightforward C is slow; straightforward python is medium speed; expertly written C (the foundations of python) is fast


> Straightforward C is slow;

I've never found it to be so, and it's not a common complaint.


Citation needed. My C compiler can optimize even awful code very well.


> Straightforward C is slow;

Would you care to elaborate? I typed "man straightforward" into terminal and nothing was thrown back at me.



Thanks for reminding me of that, I will have a look.


Yes, but so are the interpreters and compilers of many other languages.


tl;dr It is not.


One of the first questions I asked myself years ago in C was: what's the difference between open, read, write and fopen, fread, fwrite? Buffering.



