Back about 15 years ago, I wrote all my small utility programs in C. They were typically small programs, but I often had a lot of (text) input data in the form of netlists. I started reading about C++, getline(), and some of the containers (which I had to build from scratch in C), so I decided that C++ was for me. There were a number of disappointments, but a big one was that C++'s getline() was more than an order of magnitude slower than fgets() (on my system, etc., etc.).
With some experimenting, I discovered that even Perl was much faster than C++ with getline(). (Note that this was input of a file, not stdin as in this article.)
getline() works for (single) character delimited reads and lets you use a std::string. The latter is reason enough to use it: it means you don't have to worry about manual dynamic memory allocation and have zero risk of overflows.
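For what it's worth, a minimal sketch of the kind of single-character-delimited read getline() supports (the `split` helper here is my own illustration, not anything from the thread):

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split on a single-character delimiter with std::getline. The std::string
// token manages its own storage, so there is no manual allocation and no
// overflow risk; it is also reused each iteration, so it rarely reallocates.
std::vector<std::string> split(const std::string& input, char delim) {
    std::vector<std::string> fields;
    std::istringstream stream(input);
    std::string token;  // reused buffer across fields
    while (std::getline(stream, token, delim)) {
        fields.push_back(token);
    }
    return fields;
}
```

The same loop works unchanged on a std::ifstream for file input.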
In any case, the C++ equivalent to fgets() is sgetn() on the underlying streambuf or filebuf: http://en.cppreference.com/w/cpp/io/basic_streambuf/sgetn . I'd use it before fgets(), if for no other reason than that I then don't have to remember to call fclose().
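A quick sketch of that sgetn() call, hedged with one caveat: unlike fgets(), sgetn() does not stop at a newline, so it is a plain block read rather than a line read (the `read_chunk` wrapper is mine):

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Pull up to n raw characters through the stream's underlying streambuf
// with sgetn(). Returns the number of characters actually read, which may
// be less than n at end of input.
std::streamsize read_chunk(std::istream& in, char* buf, std::streamsize n) {
    return in.rdbuf()->sgetn(buf, n);
}
```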
The fact that you haven't used one of the most trivial functions in the standard library because of a slight performance penalty 15 years ago is worrying.
What are you worried about? An order of magnitude is not a slight performance penalty.
The C++ standard library is vast, and we've had C++98, C++03, and C++11 in those 15 or so years. There are few enough parts of it that you'll revisit regularly. You can't go around expecting people to be experts in the minutiae of all of it.
Well, it's not wise to guess about why you experienced this problem. A proper investigation would require a test case that reproduces it.
My hunch would be that you were discarding the std::string between lines (thereby not reusing the buffer and causing an additional memory allocation for each line)... but I'm not sure even that would cause a 12-fold slowdown.
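To make the hunch concrete, here is the reuse pattern in question, as a hypothetical sketch: declare the std::string once outside the loop, so its capacity grows to the longest line seen and later reads allocate nothing.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// One std::string stays alive across all getline() calls. Declaring `line`
// inside the loop instead would construct and destroy it every iteration,
// throwing away the grown buffer each time.
long count_lines(std::istream& in) {
    std::string line;  // declared once, outside the loop
    long n = 0;
    while (std::getline(in, line)) ++n;
    return n;
}
```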
Nope. I doubt I did that, but even if I had, it couldn't explain the slowdown. Sometimes I needed to hold the netlist in memory (as a graph), and thus allocated space for every line anyway. (Actually, up to five times for each line: data structure, device name, source name, drain name, gate name.) Unless I ran out of memory, this never caused a significant slowdown.
Fun fact: the solution of using `cin.sync_with_stdio(false);` introduces a fairly unimportant memory leak that you'll see when you use Valgrind. The behavior was reported as a bug (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27931) but it's actually part of the C++ standard not to clean up standard streams. (Edit: Here's the page of the standard in question, see the end of the second paragraph of 27.3: http://imgur.com/P7wYcHn)
A "leak" normally implies something that will get worse: like, as you use this stream, the stream will slowly lose memory or something. The standard is simply saying that when the program exits the stream objects themselves--and therefore the buffers they use--will not be destroyed: but of course closing the program will destroy them... the developer who closed the bug even put "leak" in quotes as this is more an artifact of valgrind's definition of "leak" than even an "unimportant" issue in the behavior. (I want to clear this up, as I can imagine people reading what you are saying and then avoiding this feature as they are concerned that their extremely-long running program will eventually exhaust memory.)
You're right, of course. My original comment did put quotes around 'leak', but I figured someone would come along and argue a leak is a leak even if it's not going to crash your program. Oh well. :)
This isn't a "leak": a leak is a leak, and this isn't a leak... this is just something Valgrind is reporting because it could be a leak due to exhibiting a pattern indistinguishable from something that actually was leaked from the perspective of a tool that only is able to work from a dynamically maintained list of heap allocations and the event "program has now terminated". Saying this is a leak is somewhat equivalent to claiming that because the executable code itself wasn't deallocated by the program before it exited the code was "leaked": it just so happens that as an implementation detail this data structure is lazily dynamically allocated, but it should be considered a static part of the program. In a world where things Valgrind reported were "leaks" as opposed to simply "outstanding heap allocations" then I'd be happier claiming this was a bug in Valgrind than claiming that this feature of the language standard should have the word "leak" used to describe it ;P.
tl;dr: C++'s standard streams (cin, cout, cerr, clog) may be used alongside code using the underlying libc I/O stream API and therefore, by default, synchronize with libc's own buffers. At the very least, this means you don't end up with output whose characters are interleaved between individual stream operations from the two sources.
It's worth noting that even C's I/O APIs are typically synchronized across threads, so you can output lines to stdout without experiencing the same issue in vanilla threaded C code.
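The usual idiom for opting out of that synchronization, sketched as a small helper (the function name is mine; the two calls are the standard ones):

```cpp
#include <iostream>

// When a program uses only C++ streams, drop the libc synchronization and
// untie cin from cout. Call once, before any I/O; mixing printf/scanf with
// cin/cout after this point is no longer safe.
void use_unsynchronized_streams() {
    std::ios_base::sync_with_stdio(false);  // stop mirroring into stdio buffers
    std::cin.tie(nullptr);                  // don't flush cout before each cin read
}
```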
This is an old discussion, but I actually have this thread bookmarked so I can show young programmers the dangers of assuming that if you write in C++ it will be faster than everything else out there.
Actually, C was the fastest, by a fairly wide margin, but that doesn't matter much, since you can make the C++ faster and you can make the C go just as slow as the input-buffered C++. There are still lots of people who think that speed is solely a function of programming language, with very little impact from program design. Like I said, I like to bring this out when showing newer developers that no matter what language you pick, you need to be aware of what is going on in the background, and most importantly, you should test your code before assuming which parts are fast and which are slow.
Maybe the sensible takeaway is that if you can do a prototype in a simpler language first to establish a baseline, you should.
I think most people expect that they can, due to their own ignorance, make a slow program even in C++.
You're right, it is an odd conclusion, but lots of folks think that way. That's why Negitivefrags needs to disabuse inexperienced folks of that notion.
I (re)learned the same lesson last year when I started looking at OpenCL. I saw all these awesome benchmarks claiming it was 40-50 times faster than C. I then did a bunch of simple tutorials and tests and got great performance. Then I rewrote a piece of performance-critical C code I had, and after a couple of days of hard work I had something that ran 5 times slower than my original C code.
tl;dr: when performance is important, don't just use defaults; optimize. And learn how libraries and your OS work deep down at low levels. Even Python's default I/O performance can be improved in many cases, by changing buffer sizes or even bypassing the file I/O subsystem and using memory-mapped files. But no solution is right for all use cases.
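As one concrete instance of the buffer-size knob, here is a hedged sketch in C/C++ terms rather than Python (the `enlarge_buffer` helper is my own naming): setvbuf() lets you hand a FILE* a much larger buffer before heavy sequential I/O.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Give a stdio stream a bigger, fully buffered (_IOFBF) buffer. setvbuf
// must run after fopen but before the first read or write on the stream,
// and the buffer must outlive the stream. Returns true on success.
bool enlarge_buffer(std::FILE* f, std::vector<char>& storage, std::size_t size) {
    storage.resize(size);
    return std::setvbuf(f, storage.data(), _IOFBF, storage.size()) == 0;
}
```

Whether a bigger buffer actually helps depends on the access pattern, which is the thread's whole point: measure before assuming.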
Like they tell you in school, premature optimization is the root of all evil. So don't worry about this until you need it.
I've not used getline() since.