This is a cool exercise! After completing it, I wanted to find out exactly what each NN hidden node represented. I trained a tiny (10 hidden node) NN on an OCR dataset and created a visualization here: https://rawgithub.com/tashmore/nn-visualizer/master/nn_visua... .
Can anyone figure out what each hidden node represents?
You can also select a node and press "A" (Gradient Ascent). This will change the input in a way that increases the selected node's value. By selecting an output node and mashing "A", you can run the NN in reverse, causing it to "hallucinate" a digit.
Cool project! Anyone trying to run it on *nix will need to open Main.java and replace all instances of the Windows-specific file separator "\\" with "/".
This is a fringe viewpoint at best. With our current policies, the Congressional Budget Office predicts SS will become completely insolvent between 2036 and 2038 (p. 54 of http://cbo.gov/sites/default/files/cbofiles/attachments/06-2...). The "easy" fix will require significant payroll tax increases, which seems politically unlikely considering we just REDUCED the FICA tax by 2%.
Under its extended-baseline scenario, CBO estimates that over the next 75 years, the program has an actuarial shortfall equal to 1.6 percent of taxable payroll, or
0.6 percent of GDP (see Table 4-1). Thus, to bring the program into actuarial balance through 2085, payroll taxes could be increased immediately by 1.6 percent of taxable payroll and kept at that higher rate, or scheduled benefits could be reduced by an equivalent amount.
The fix is easy (not taking into account the Republican party's irrational view on tax increases): repeal the tax cut. The FICA cut was 2 percentage points of payroll, and the CBO shortfall is 1.6 percent of taxable payroll, so restoring the old rate covers it. And not increasing taxes at all leaves only a small actuarial imbalance.
I'm confused. What is the "defect" in K&R's "copy(char to[], char from[])" function?
The author notes that "the second this function is called...without a trailing '\0' character, then you'll hit difficult to debug errors", but no function with that signature could possibly work in this case.
The built-in "strcpy" function has the exact same limitation. Does the author have a problem with it as well? Null-termination is a fundamental concept of C strings; there's no reason to shield C students from it.
The other example of "bugs and bad style" in this "destruction" of K&R C is a minor complaint about not using an optional set of braces.
I hope the remainder of the [incomplete] chapter demonstrates some actual bugs in the book's code, because it currently doesn't live up to the first paragraph's bluster.
It's a nice strawman, right? Especially when he points out that the original code, in context, is perfectly fine. His later complaint about the assignment-in-if statement is certainly something shared by modern C programmers (see compiler warnings about same), but it perfectly fits the original style and accomplishes its task.
His criticisms seem to be rooted so far in stylistic issues and in taking the code out of context (design context, usage guarantees, etc.). Then again, how are you to nerdbait while still being fair to original sources?
My criticisms are only partially stylistic; the bigger issue is that people copy this code into other situations and it breaks. So I'm showing them why it only works in that one specific context, and how to break the code.
If I can make a suggestion, then. I apologize if you've already covered this elsewhere in your book, but:
Please encourage the use of the "restrict" keyword to encourage proper aliasing optimization by compilers.
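As a minimal sketch of what that might look like (C99 and later; `copy_n` is a hypothetical name, not from the book):

```c
#include <stddef.h>

/* With `restrict`, the caller promises the compiler that `to`
   and `from` never alias, which allows more aggressive
   optimization of the loop. Passing overlapping pointers then
   becomes undefined behavior, so the contract must be documented. */
void copy_n(char *restrict to, const char *restrict from, size_t n)
{
    for (size_t i = 0; i < n; i++)
        to[i] = from[i];
}
```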
Also, would it be worth considering (perhaps in a chapter on how-to-deal-with-C-style-strings-if-you-must) including a section on writing compiler-independent macros to emulate safer calls like the str*_s variants in VS? These functions usually differ slightly in incantation between VS and GCC, so it might be helpful to abstract them. It's also a relatively simple exercise in preprocessor macros with a clear benefit.
The built-in "strcpy" function has the exact same limitation. Does the author have a problem with it as well?
Yes. From the linked chapter:
we avoided classic style C strings in this book
From an earlier chapter on strings:
The source of almost all bugs in C come from forgetting to have enough space, or forgetting to put a '\0' at the end of a string. In fact it's so common and hard to get right that the majority of good C code just doesn't use C style strings. In later exercises we'll actually learn how to avoid C strings completely.
This is the author's opinion, of course – it's from a book, that should go without saying – but it's not as if the idea of avoiding C strings in general, and "strcpy" in particular, is an oddball or unique point of view. See e.g.:
"the majority of good C code just doesn't use C style strings..."
Nice. In the 23 years I have worked on C language products, I've never worked on "good C code" by this definition.
The cool thing about this guy's book, I guess, is that by avoiding all the things about the language he doesn't like, any reader will be wholly unprepared for C in the Real World after this book.
This chapter is explicitly teaching people what the author's idea of bad C code is: It has them read some, then tells them specifically what the gotchas are, then asks them to code up test cases for the flaws and run them through Valgrind.
Where's the "avoiding" here?
And which of the skills being exercised - imagining what kinds of bad things could happen, writing executable test cases, detecting segfaults - are not useful in the real world?
That's basically the point of this chapter. It's getting people to think like a hacker and try to break the code in unintended ways. That makes them better programmers and helps when avoiding common mistakes in C code.
Using K&R to do this is to give people a set of known good C code samples and show how even those can be broken and misused.
It was a rhetorical question. What's the sense in having functions that operate on "strings" if you can't figure out what a "string" is at runtime? It's much saner to have functions operate on "strings that are 80 characters or less" or "a structure containing an integer `length` and an array of `length` chars."
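The second option is easy to sketch (this is just an illustration of the idea, with hypothetical names, not any particular library's API):

```c
#include <stdlib.h>
#include <string.h>

/* A length-prefixed string: the length travels with the data,
   so functions never need to scan for a terminator. */
struct lstring {
    size_t length;
    char  *data;     /* `length` bytes; no '\0' required */
};

/* Build an lstring from a classic C string. On allocation
   failure, returns an empty lstring with data == NULL. */
struct lstring lstring_from(const char *s)
{
    struct lstring ls;
    ls.length = strlen(s);
    ls.data = malloc(ls.length);
    if (ls.data)
        memcpy(ls.data, s, ls.length);
    else
        ls.length = 0;
    return ls;
}
```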
What's the sense in having functions that operate on "pointers to valid memory" if you can't figure out if it's "valid memory" at runtime?
The point is that the function is not buggy. You may not like the specification for it. You may think it should be designed differently. This is not the same as the code being buggy.
How about being able to work on a team of other programmers without insisting that everything ought to be done your own, completely unique, way?
If he truly had a better, more correct, way to write C code I would join him in trying to change the world. But the first example he gives is just factually wrong.
There's the avoiding. The skills being exercised are not a problem, it's the skill not being exercised that is a problem. You cannot be a competent C programmer without understanding strings and their relevant library functions.
Thanks for the code review of my last 23 years of work on C projects. Nice ego that your book must be the only way to learn this.
I mostly disagree with your statement about "most good C code" not using stdlib strings/apis.
To be fair, from your comments here, it looks like you aren't really saying that -- you are saying we have to be careful when using them. This I know, and carefully guard against, and I do try to break my code regularly. But the assertion that "most good C code" doesn't use them doesn't match my experience.
That said, _all_ bad C code (ab)uses them, probably due to carelessness and/or misunderstanding of how things work in C.
The approach looks similar to the one in "JavaScript: The Good Parts". I worried about the same thing while reading that book: OK, here's the right way to do e.g. inheritance, but how do people do it in the real world?
I didn't see it yet from a quick skim of the available chapters, but I'm going to guess that his preferred alternative is 'bstr - The Better String Library' - http://bstring.sourceforge.net/
I've used it in the past, and I try to include it in any project that isn't immediately trivial, although the to/from integration with just about all library code ever is a bit of a hassle.
It's not just that the to/from is a hassle, it's also that you have to deeply understand the gotchas of ASCIIZ strings to safely do that bridging at all --- so pretending it doesn't exist is probably not a good teaching strategy. (I have no opinion about the article and am only commenting on this thread because I like C).
I do not pretend anything about null-terminated strings in the book. In fact, I have many exercises and assignments where they use Valgrind to buffer overflow C strings repeatedly and don't introduce bstring until much later.
> but no function with that signature could possibly work in this case.
This is the source of the bugs in C. People write functions that only work as long as every call to them is never changed, which is absurd. Good modern C code tries to protect against bad usage by adding defensive checks.
So yes, the built-in strcpy is crap which is why most competent C doesn't use it except in a few rare cases where it's required.
And this does demonstrate actual bugs in the code. I wrote a test case that causes it, which incidentally is a common bug in C code called a buffer overflow. It's because of code examples like this that get copied to other situations that we have these defects.
From my codebase/third-party directory on my laptop (a bit random, I admit), from those projects I'd consider "competent C" (ie, not OpenSSL or MRI ruby):
* dovecot uses ASCIIZ strings and libc string functions
* redis uses ASCIIZ strings and libc string functions
* libevent uses ASCIIZ strings and libc string functions
* qmail uses djb's string library
* memcached uses ASCIIZ strings and libc string functions
It's probably good to be comfortable with both approaches.
I don't know that you actually made this claim, but you seem to have given people here the impression that you believe functions that work with ASCIIZ strings should be bulletproofed to handle non-ASCIIZ inputs. I couldn't agree with that argument, especially as an argument about K&R's code being rusty.
People here are jumpy though (they're commenting, like me, mostly because they're bored).
Hmm, reading the source it looks like he is using his sds string library, which has a len, a size, and an ASCIIZ char* member. When I last checked he does pass the char* around (because it is null-terminated), but he also sometimes does pointer math to get back to the sds header.
You're right; I was reacting to the count of char\s+star and snprintf calls, but only fair to chalk Redis up among the packages I have that rely on a high-level string library.
Huh? My C programming is in the distant past, so I might be forgetting. But strcpy does assume a terminal zero, doesn't it? E.g., http://linux.die.net/man/3/strcpy. It sounds like you are talking about strncpy or memcpy.
Run that regex on some C code, then go look at how the inputs to those functions are used, and then you can probably create some of your own buffer overflows. It's like magic.
I'm confused by your comment. strcpy assumes that the string to be copied ends with a NUL. The case described in the link violated that assumption and caused a segfault.
To be both pedantic and correct: there is no way to define a C function signature that restricts a string input to one that is correctly terminated. So no, at best it's documented that you shouldn't pass an unterminated string, and nothing in the language prevents you from doing it.
To be still more pedantic, there are two kinds of definition of an interface: definition-within-the-type-system and definition-within-the-documentation. The string library is specified in the ISO C standard (I use C99, but you can be all hip and C11 if you want), and passing an unterminated string to strcpy is a constraint violation.
7.1.1 A string is a contiguous sequence of characters terminated by and including the first null character.
7.21.2.3.2 The strcpy function copies the string pointed to by s2 (including the terminating null character) into the array pointed to by s1.
Therefore, code which passes an unterminated string to strcpy is not conforming code (because s2 does not point to a "string" as C99 defines it).
Of course, you should use strncpy anyway. But that's not the point.
No? It ensures that malloc didn't return a NULL pointer and the '&& "memory error"' is a common pattern to add a comment describing why an assert() statement failed.
One problem with it is that release builds typically define NDEBUG, which removes the assertion entirely at compile time. Now the code is no longer guarded against malloc failures and will just segfault. If the intent is to teach people how to handle malloc failures gracefully, it's not that great of an example.
Since virtually no production C code† can actually handle the case of a random malloc() call failing (just like with exceptions in C++, the code to reliably unwind an allocation failure depends on exactly where you're at in your allocation pattern), the simplest and most reliable way of handling malloc() returning NULL is to rig the program to abort.
You're right that assert() isn't the most reliable way to do this (programs have to work whether or not assert() is a no-op; that's the point of assert).
On most platforms, you can just rig malloc to do the abort itself instead of failing --- either with configuration or by preloading a wrapper malloc. Some very, very large shops do exactly this.
Another very common idiom is "xmalloc", which does the malloc/if/abort dance. But xmalloc() misses every place where libraries call malloc(); the most obvious example is strdup(), but the more pernicious issue is 3rd party code that can't know to use x-whatever().
Calling malloc and then immediately assert()'ing success is a reasonable shorthand. That exact same code can be made safe on any mainstream platform just by changing malloc's configuration.
† On general-purpose platforms; I know things are more complicated on embedded platforms.
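The xmalloc idiom mentioned above is short enough to sketch in full (the name is conventional; this is not any particular project's implementation):

```c
#include <stdio.h>
#include <stdlib.h>

/* malloc-or-die: never returns NULL, so callers can skip the
   check. As noted above, this only covers your own allocations,
   not malloc() calls made inside libraries. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "xmalloc: out of memory (%zu bytes)\n", size);
        abort();
    }
    return p;
}
```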
Good point about the 3rd party libraries there, it's true that even if you do your own cleanup after manual mallocs, you can't do much about theirs.
Do you have an example of documentation that shows how to configure malloc to do some cleanup in the case of failure? Based on the Linux manpages I can't really see any obvious way to do this short of preloading some kind of malloc wrapper that's custom-written for the application.
I think it may depend on the distro (on Ubuntu, for instance, you can set MALLOC_CHECK_ to 2 to force aborts on errors --- but this also forces the use of a debugging malloc, and you should care about malloc performance).
So what I'd recommend is, grab Jason Evans' jemalloc package and use that (it's good enough for Facebook!).
In the earlier chapters they set up a Makefile with full debugging on so these don't get removed. It's mostly just for this one exercise, and in the rest of the book they use a set of debug macros I wrote:
"I'd round up rather than down...The end result was a grade distribution that was just slightly more generous than the last time. And hence I perpetuated the cycle."
Next time, why not begin with a slightly harsher grade distribution? Then selective rounding up would result in the same distribution as the previous year.
The proposal considers issues of pay, but crucially it also considers other working conditions, including whether teachers who do better-quality work than their peers are recognized for that achievement.
Very interesting -- thanks for the reference! Will read through it this evening. I sincerely hope it proposes good methods of improving teachers, ones other than the horribly flawed method of judging teachers based on student test scores.